En esta práctica:
correlación intra clase
modelos con efectos aleatorios
1. Cargar/instalar librerías
if (! require ("pacman" )) install.packages ("pacman" ) # solo la primera vez
Loading required package: pacman
pacman:: p_load (lme4,
reghelper,
haven,
stargazer,
ggplot2, # gráficos
texreg, # tablas de regresion (screenreg)
dplyr # manipulacion de datos
) # paquetes a cargar
Para esta práctica utilizaremos la librería lme4
, en particular la función lmer
, así como también la librería haven
, para lectura de base de datos externa. Además stargazer
para visualizar outputs.
2. Leer datos base High School and Beyond (HSB)**
Como en sesión anterior
mlm = read_dta ("http://www.stata-press.com/data/mlmus3/hsb.dta" )
“mlm”” es el nombre que le daremos a la base de datos “High School and Beyond”
Variables relevantes para ejercicios:
Nivel 1:
minority, an indicator for student ethnicity (1 = minority, 0 = other)
female, an indicator for student gender (1 = female, 0 = male)
ses, a standardized scale constructed from variables measuring parental education, occupation, and income
mathach, a measure of mathematics achievement
Nivel 2
size (school enrollment)
sector (1 = Catholic, 0 = public)
pracad (proportion of students in the academic track)
disclim (a scale measuring disciplinary climate)
himnty (1 = more than 40% minority enrollment, 0 = less than 40%)
mnses (mean of the SES values for the students in this school who are included in the level-1 file)
Cluster variable: schoolid
Seleccionar variables
mlm= mlm %>% select (minority,female,ses,mathach,size,sector,pracad,disclim,himinty,mnses,schoolid) %>% as.data.frame ()
names (mlm)
[1] "minority" "female" "ses" "mathach" "size" "sector"
[7] "pracad" "disclim" "himinty" "mnses" "schoolid"
#Tabla estadisticos descriptivos
stargazer (mlm,title= "Estadísticos descriptivos" , type = "text" )
Estadísticos descriptivos
=================================================
Statistic N Mean St. Dev. Min Max
-------------------------------------------------
minority 7,185 0.275 0.446 0 1
female 7,185 0.528 0.499 0 1
ses 7,185 0.0001 0.779 -3.758 2.692
mathach 7,185 12.748 6.878 -2.832 24.993
size 7,185 1,056.862 604.172 100 2,713
sector 7,185 0.493 0.500 0 1
pracad 7,185 0.534 0.251 0.000 1.000
disclim 7,185 -0.132 0.944 -2.416 2.756
himinty 7,185 0.280 0.449 0 1
mnses 7,185 0.0001 0.414 -1.194 0.825
schoolid 7,185 5,277.898 2,499.578 1,224 9,586
-------------------------------------------------
3. Modelos
Modelo 0: Null model
results_0 = lmer (mathach ~ 1 + (1 | schoolid), data = mlm)
summary (results_0)
Linear mixed model fit by REML ['lmerMod']
Formula: mathach ~ 1 + (1 | schoolid)
Data: mlm
REML criterion at convergence: 47116.8
Scaled residuals:
Min 1Q Median 3Q Max
-3.0631 -0.7539 0.0267 0.7606 2.7426
Random effects:
Groups Name Variance Std.Dev.
schoolid (Intercept) 8.614 2.935
Residual 39.148 6.257
Number of obs: 7185, groups: schoolid, 160
Fixed effects:
Estimate Std. Error t value
(Intercept) 12.6370 0.2444 51.71
screenreg (results_0) # de library texreg
========================================
Model 1
----------------------------------------
(Intercept) 12.64 ***
(0.24)
----------------------------------------
AIC 47122.79
BIC 47143.43
Log Likelihood -23558.40
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 8.61
Var: Residual 39.15
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
En este modelo es posible identificar \(\gamma_{00}\) (intercept - fixed effects), así como también los componentes la varianza :
\(\tau_{00}\) (Var:id)
\(\sigma^2\) (Var:Residual).
Con estos últimos coeficientes aleatorios es posible calcular la correlación intraclase (\(\rho\) ).
Correlación intra-clase = ICC = \(\rho=\frac{\tau_{00}}{\tau_{00}+\sigma^2}\)
# Cálculo directo en R:
8.614 / (8.614+39.148 ) # calculo ICC
# O de manera directa con funcion ICC de reghelper
reghelper:: ICC (results_0)
Modelo 1: Predictores de nivel individual
results_1 = lmer (mathach ~ 1 + ses + female + (1 | schoolid), data = mlm)
screenreg (results_1, naive= TRUE )
========================================
Model 1
----------------------------------------
(Intercept) 13.28 ***
(0.20)
ses 2.36 ***
(0.11)
female -1.19 ***
(0.17)
----------------------------------------
AIC 46605.73
BIC 46640.13
Log Likelihood -23297.86
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 4.49
Var: Residual 36.81
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Modelo 2: Predictores nivel 2
results_2 = lmer (mathach ~ 1 + sector + mnses + (1 | schoolid), data = mlm)
screenreg (results_2)
========================================
Model 1
----------------------------------------
(Intercept) 12.13 ***
(0.20)
sector 1.23 ***
(0.31)
mnses 5.33 ***
(0.37)
----------------------------------------
AIC 46956.50
BIC 46990.90
Log Likelihood -23473.25
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.31
Var: Residual 39.16
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Modelo 3: Predictores individuales y grupales
results_3 = lmer (mathach ~ 1 + ses + female + sector + mnses + (1 | schoolid), data = mlm)
screenreg (results_3)
========================================
Model 1
----------------------------------------
(Intercept) 12.74 ***
(0.21)
ses 2.15 ***
(0.11)
female -1.20 ***
(0.16)
sector 1.25 ***
(0.30)
mnses 3.07 ***
(0.37)
----------------------------------------
AIC 46515.61
BIC 46563.77
Log Likelihood -23250.80
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.15
Var: Residual 36.80
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Modelo 4: Pendiente aleatoria
results_4 = lmer (mathach ~ 1 + ses + female + mnses + sector + (1 + ses | schoolid), data = mlm)
screenreg (results_4)
============================================
Model 1
--------------------------------------------
(Intercept) 12.65 ***
(0.21)
ses 2.16 ***
(0.12)
female -1.19 ***
(0.16)
mnses 3.07 ***
(0.38)
sector 1.43 ***
(0.30)
--------------------------------------------
AIC 46514.33
BIC 46576.24
Log Likelihood -23248.16
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.22
Var: schoolid ses 0.42
Cov: schoolid (Intercept) ses 0.27
Var: Residual 36.59
============================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Modelo 5: Interacción entre niveles
$mathach= {00}(intercepto)+ {10}ses_{ij}+ \(\gamma_{01}sector_j+\gamma_{11}ses_{ij} *sector{j}+\) \(\mu_{0j}(intercepto)+\mu_{1j}ses_{ij}+ r_{ij}\)
results_5 = lmer (mathach ~ 1 + female + ses* sector + mnses + (1 + ses | schoolid), data = mlm)
screenreg (results_5)
============================================
Model 1
--------------------------------------------
(Intercept) 12.81 ***
(0.21)
female -1.18 ***
(0.16)
ses 2.73 ***
(0.14)
sector 1.29 ***
(0.29)
mnses 3.04 ***
(0.37)
ses:sector -1.31 ***
(0.21)
--------------------------------------------
AIC 46482.91
BIC 46551.71
Log Likelihood -23231.45
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.11
Var: schoolid ses 0.03
Cov: schoolid (Intercept) ses 0.17
Var: Residual 36.60
============================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Comparación individual, agregado y multinivel
Generar regresiones para comparar
reg_ind= lm (mathach ~ ses + female + mnses + sector, data= mlm)
agg_mlm= mlm %>% group_by (schoolid) %>% summarise_all (funs (mean))
Warning: `funs()` was deprecated in dplyr 0.8.0.
ℹ Please use a list of either functions or lambdas:
# Simple named list: list(mean = mean, median = median)
# Auto named with `tibble::lst()`: tibble::lst(mean, median)
# Using lambdas list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
reg_agg= lm (mathach ~ ses + female + mnses + sector, data= agg_mlm)
Observar: ¿Qué sucede con los coeficientes y errores estándar cuando se comparan los coeficientes y los errores estándar?
screenreg (list (reg_ind, reg_agg, results_3))
=================================================================
Model 1 Model 2 Model 3
-----------------------------------------------------------------
(Intercept) 12.80 *** 13.13 *** 12.74 ***
(0.13) (0.35) (0.21)
ses 2.15 *** 5.19 *** 2.15 ***
(0.11) (0.37) (0.11)
female -1.34 *** -1.97 *** -1.20 ***
(0.15) (0.56) (0.16)
mnses 2.90 *** 3.07 ***
(0.22) (0.37)
sector 1.32 *** 1.25 *** 1.25 ***
(0.16) (0.31) (0.30)
-----------------------------------------------------------------
R^2 0.18 0.67
Adj. R^2 0.18 0.67
Num. obs. 7185 160 7185
AIC 46515.61
BIC 46563.77
Log Likelihood -23250.80
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.15
Var: Residual 36.80
=================================================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Generación de tabla para publicar en HTML
htmlreg (list (reg_ind, reg_agg, results_3),
custom.model.names = c ("Individual" ,"Agregado" ,"Multinivel" ),
custom.coef.names = c ("Intercepto" , "$SES_{ij}$" ,"$Mujer_{ij}$" , "$SES_{j}$" , "$Sector_{j}$" ),
custom.gof.names= c (NA ,NA ,NA ,NA ,NA ,NA ,NA ,
"Var:id ($ \\ tau_{00}$)" ,"Var: Residual ($ \\ sigma^2$)" ),
custom.note = "%stars. Errores estándar en paréntesis" ,
caption= "Comparación de modelos Individual, Agregado y Multinivel" ,
caption.above= TRUE ,
doctype = FALSE )
Comparación de modelos Individual, Agregado y Multinivel
Individual
Agregado
Multinivel
Intercepto
12.80***
13.13***
12.74***
(0.13)
(0.35)
(0.21)
\(SES_{ij}\)
2.15***
5.19***
2.15***
(0.11)
(0.37)
(0.11)
\(Mujer_{ij}\)
-1.34***
-1.97***
-1.20***
(0.15)
(0.56)
(0.16)
\(SES_{j}\)
2.90***
3.07***
(0.22)
(0.37)
\(Sector_{j}\)
1.32***
1.25***
1.25***
(0.16)
(0.31)
(0.30)
R2
0.18
0.67
Adj. R2
0.18
0.67
Num. obs.
7185
160
7185
AIC
46515.61
BIC
46563.77
Log Likelihood
-23250.80
Num. groups: schoolid
160
Var:id (\(\tau_{00}\) )
2.15
Var: Residual (\(\sigma^2\) )
36.80
*** p < 0.001; ** p < 0.01; * p < 0.05. Errores estándar en paréntesis