Chapter 19 - Censored Data, Sample Selection, and Attrition
Examples
------------------------------------------------------------------------------------------
name: SN
log: \iiexample19.smcl
log type: smcl
closed on: 12 May 2020, 19:06:40
. **********************************************
. * Solomon Negash - Examples
. * Wooldridge (2010). Economic Analysis of Cross-Section and Panel Data. 2nd ed.
. * STATA Program, version 16.1.
. * Chapter 19 - Censored Data, Sample Selection, and Attrition
. **********************************************
. // Example 19.6 (Wage Offer Equation for Married Women):
. bcuse mroz, clear nodesc
. eststo OLS: reg lwage educ exper expersq
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(3, 424) = 26.29
Model | 35.0223023 3 11.6741008 Prob > F = 0.0000
Residual | 188.305149 424 .444115917 R-squared = 0.1568
-------------+---------------------------------- Adj R-squared = 0.1509
Total | 223.327451 427 .523015108 Root MSE = .66642
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .1074896 .0141465 7.60 0.000 .0796837 .1352956
exper | .0415665 .0131752 3.15 0.002 .0156697 .0674633
expersq | -.0008112 .0003932 -2.06 0.040 -.0015841 -.0000382
_cons | -.5220407 .1986321 -2.63 0.009 -.9124668 -.1316145
------------------------------------------------------------------------------
. eststo Heckit: heckman lwage educ exper expersq, ///
> select(inlf= educ exper expersq nwifeinc age kidslt6 kidsge6) twostep
Heckman selection model -- two-step estimates Number of obs = 753
(regression model with sample selection) Selected = 428
Nonselected = 325
Wald chi2(3) = 51.53
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage |
educ | .1090655 .015523 7.03 0.000 .0786411 .13949
exper | .0438873 .0162611 2.70 0.007 .0120163 .0757584
expersq | -.0008591 .0004389 -1.96 0.050 -.0017194 1.15e-06
_cons | -.5781033 .3050062 -1.90 0.058 -1.175904 .0196979
-------------+----------------------------------------------------------------
inlf |
educ | .1309047 .0252542 5.18 0.000 .0814074 .180402
exper | .1233476 .0187164 6.59 0.000 .0866641 .1600311
expersq | -.0018871 .0006 -3.15 0.002 -.003063 -.0007111
nwifeinc | -.0120237 .0048398 -2.48 0.013 -.0215096 -.0025378
age | -.0528527 .0084772 -6.23 0.000 -.0694678 -.0362376
kidslt6 | -.8683285 .1185223 -7.33 0.000 -1.100628 -.636029
kidsge6 | .036005 .0434768 0.83 0.408 -.049208 .1212179
_cons | .2700768 .508593 0.53 0.595 -.7267472 1.266901
-------------+----------------------------------------------------------------
/mills |
lambda | .0322619 .1336246 0.24 0.809 -.2296376 .2941613
-------------+----------------------------------------------------------------
rho | 0.04861
sigma | .66362876
------------------------------------------------------------------------------
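In the two-step output above, rho is not a separately estimated parameter: it is the coefficient on the inverse Mills ratio (lambda) divided by sigma. A small arithmetic check using the two numbers printed above; the scalar name `rho_check` is illustrative and not part of the original log.

```stata
* lambda and sigma copied from the two-step Heckit output above
scalar rho_check = 0.0322619 / 0.66362876

* Compare with the rho line of the output (reported to five decimals)
display "lambda / sigma = " %7.5f rho_check
```

A rho this close to zero is consistent with the small, insignificant lambda (z = 0.24): there is little evidence of selection bias in the wage offer equation.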
. esttab OLS Heckit, keep(educ exper expersq _cons) cells(b(nostar fmt(4)) /*
*/ se(par fmt(4))) ti("Table 19.1 Wage Offer Equation for Married Women")
Table 19.1 Wage Offer Equation for Married Women
--------------------------------------
                      (1)          (2)
                    lwage        lwage
                     b/se         b/se
--------------------------------------
main
educ               0.1075       0.1091
                 (0.0141)     (0.0155)
exper              0.0416       0.0439
                 (0.0132)     (0.0163)
expersq           -0.0008      -0.0009
                 (0.0004)     (0.0004)
_cons             -0.5220      -0.5781
                 (0.1986)     (0.3050)
--------------------------------------
inlf
educ                            0.1309
                              (0.0253)
exper                           0.1233
                              (0.0187)
expersq                        -0.0019
                              (0.0006)
_cons                           0.2701
                              (0.5086)
--------------------------------------
N                     428          753
--------------------------------------
. est clear
. heckman lwage educ exper expersq nwifeinc age kidslt6 kidsge6, /*
*/ select(inlf= educ exper expersq nwifeinc age kidslt6 kidsge6) twostep
Heckman selection model -- two-step estimates Number of obs = 753
(regression model with sample selection) Selected = 428
Nonselected = 325
Wald chi2(7) = 53.64
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage |
educ | .1187172 .0340507 3.49 0.000 .051979 .1854553
exper | .0598358 .033673 1.78 0.076 -.0061621 .1258336
expersq | -.0010523 .0006381 -1.65 0.099 -.002303 .0001984
nwifeinc | .0038434 .0044919 0.86 0.392 -.0049607 .0126474
age | -.011158 .0134792 -0.83 0.408 -.0375767 .0152606
kidslt6 | -.1880451 .2308275 -0.81 0.415 -.6404587 .2643684
kidsge6 | -.0122255 .0296063 -0.41 0.680 -.0702527 .0458018
_cons | -.5602853 .4587672 -1.22 0.222 -1.459452 .3388818
-------------+----------------------------------------------------------------
inlf |
educ | .1309047 .0252542 5.18 0.000 .0814074 .180402
exper | .1233476 .0187164 6.59 0.000 .0866641 .1600311
expersq | -.0018871 .0006 -3.15 0.002 -.003063 -.0007111
nwifeinc | -.0120237 .0048398 -2.48 0.013 -.0215096 -.0025378
age | -.0528527 .0084772 -6.23 0.000 -.0694678 -.0362376
kidslt6 | -.8683285 .1185223 -7.33 0.000 -1.100628 -.636029
kidsge6 | .036005 .0434768 0.83 0.408 -.049208 .1212179
_cons | .2700768 .508593 0.53 0.595 -.7267472 1.266901
-------------+----------------------------------------------------------------
/mills |
lambda | .2884636 .4635618 0.62 0.534 -.6201008 1.197028
-------------+----------------------------------------------------------------
rho | 0.41830
sigma | .6896138
------------------------------------------------------------------------------
. reg lwage educ exper expersq nwifeinc age kidslt6 kidsge6
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(7, 420) = 11.78
Model | 36.6476854 7 5.23538363 Prob > F = 0.0000
Residual | 186.679766 420 .444475633 R-squared = 0.1641
-------------+---------------------------------- Adj R-squared = 0.1502
Total | 223.327451 427 .523015108 Root MSE = .66669
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0998844 .0150975 6.62 0.000 .0702084 .1295604
exper | .0407097 .0133723 3.04 0.002 .0144249 .0669946
expersq | -.0007473 .0004018 -1.86 0.064 -.0015371 .0000424
nwifeinc | .0056942 .0033195 1.72 0.087 -.0008307 .0122192
age | -.0035204 .0054145 -0.65 0.516 -.0141633 .0071225
kidslt6 | -.0558726 .0886034 -0.63 0.529 -.230034 .1182889
kidsge6 | -.0176485 .027891 -0.63 0.527 -.0724718 .0371749
_cons | -.3579973 .3182963 -1.12 0.261 -.9836496 .267655
------------------------------------------------------------------------------
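The two-step Heckit point estimates in Example 19.6 can be reproduced by hand: fit the selection probit, build the inverse Mills ratio from its linear index, and add that ratio to the wage regression on the working sample. The sketch below assumes the same MROZ data loaded with `bcuse`; the names `xb_sel`, `imr_sel`, and `b_imr` are made up for illustration.

```stata
bcuse mroz, clear nodesc

* Step 1: selection probit for labor force participation
probit inlf educ exper expersq nwifeinc age kidslt6 kidsge6, nolog
predict double xb_sel, xb

* Inverse Mills ratio: standard normal density over standard normal cdf
generate double imr_sel = normalden(xb_sel) / normal(xb_sel)

* Step 2: wage equation on the selected sample, with the IMR added
regress lwage educ exper expersq imr_sel if inlf == 1
scalar b_imr = _b[imr_sel]

* The coefficient on imr_sel should match lambda from heckman, twostep
quietly heckman lwage educ exper expersq, ///
    select(inlf = educ exper expersq nwifeinc age kidslt6 kidsge6) twostep
display "manual IMR coefficient = " b_imr "   heckman lambda = " e(lambda)
```

Only the coefficients are expected to agree. The OLS standard errors from the manual second step ignore that `imr_sel` is a generated regressor, so they are appropriate only for testing the null of no selection (a zero coefficient on `imr_sel`).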
. // Example 19.7 (Education Endogenous and Sample Selection)
. probit inlf exper expersq nwifeinc age kidslt6 kidsge6 motheduc fatheduc huseduc, nolog
Probit regression Number of obs = 753
LR chi2(9) = 207.10
Prob > chi2 = 0.0000
Log likelihood = -411.32238 Pseudo R2 = 0.2011
------------------------------------------------------------------------------
inlf | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper | .1285092 .0185226 6.94 0.000 .0922056 .1648129
expersq | -.0019474 .0005955 -3.27 0.001 -.0031146 -.0007803
nwifeinc | -.0074294 .0048787 -1.52 0.128 -.0169915 .0021327
age | -.0527657 .0085423 -6.18 0.000 -.0695082 -.0360231
kidslt6 | -.8149255 .1160833 -7.02 0.000 -1.042445 -.5874063
kidsge6 | .0241511 .0432253 0.56 0.576 -.060569 .1088712
motheduc | .0295321 .0185718 1.59 0.112 -.006868 .0659322
fatheduc | .0133487 .0178491 0.75 0.455 -.0216349 .0483324
huseduc | .0161391 .019595 0.82 0.410 -.0222664 .0545446
_cons | 1.146672 .4932706 2.32 0.020 .1798798 2.113465
------------------------------------------------------------------------------
. test motheduc fatheduc huseduc
( 1) [inlf]motheduc = 0
( 2) [inlf]fatheduc = 0
( 3) [inlf]huseduc = 0
chi2( 3) = 8.00
Prob > chi2 = 0.0461
. predict y, xb
. g phi = normalden(y)
. g Phi = normal(y)
. g imr = phi/Phi
. * Full instruments
. eststo With_IMR: qui ivregress 2sls lwage exper expersq imr /*
*/ (educ = nwifeinc age kidslt6 kidsge6 motheduc fatheduc huseduc)
. eststo Without_IMR: qui ivregress 2sls lwage exper expersq /*
*/ (educ = nwifeinc age kidslt6 kidsge6 motheduc fatheduc huseduc)
. esttab With_IMR Without_IMR, keep(educ imr _cons) cells(b(nostar fmt(4)) se(par fmt(4)))
--------------------------------------
                      (1)          (2)
                    lwage        lwage
                     b/se         b/se
--------------------------------------
educ               0.0878       0.0871
                 (0.0213)     (0.0212)
imr                0.0404
                 (0.1326)
_cons             -0.3249      -0.2696
                 (0.3315)     (0.2785)
--------------------------------------
N                     428          428
--------------------------------------
. * Partial instruments
. eststo With_IMR2: qui ivregress 2sls lwage exper expersq imr (educ = motheduc fatheduc huseduc)
. eststo Without_IMR2: qui ivregress 2sls lwage exper expersq (educ = motheduc fatheduc huseduc)
. esttab With_IMR2 Without_IMR2, keep(educ imr _cons) cells(b(nostar fmt(4)) se(par fmt(4)))
--------------------------------------
                      (1)          (2)
                    lwage        lwage
                     b/se         b/se
--------------------------------------
educ               0.0808       0.0804
                 (0.0216)     (0.0217)
imr                0.0361
                 (0.1329)
_cons             -0.2338      -0.1869
                 (0.3351)     (0.2841)
--------------------------------------
N                     428          428
--------------------------------------
. est clear
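Because the 2SLS fits in Example 19.7 were run with `qui`, the test statistic on the inverse Mills ratio never appears in the log. Under the null of no selection bias, the usual 2SLS t statistic on `imr` is a valid test, and it can be recovered from the coefficient and standard error in the two esttab tables. A small arithmetic sketch; the scalar names are illustrative.

```stata
* Coefficient / standard error on imr, copied from the esttab tables

* All exogenous variables used as instruments
scalar t_full = 0.0404 / 0.1326

* Only motheduc, fatheduc, and huseduc used as excluded instruments
scalar t_partial = 0.0361 / 0.1329

display "t on imr, full instrument list    = " %5.2f t_full
display "t on imr, partial instrument list = " %5.2f t_partial
```

Both ratios are far below conventional critical values, which matches how little the `educ` coefficient moves when `imr` is dropped.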
. // Example 19.8 (Wage Offer Equation for Married Women)
. tobit inlf nwifeinc motheduc fatheduc huseduc exper expersq age kidslt6 kidsge6
Iteration 0: log likelihood = -433.27652
Iteration 1: log likelihood = -433.27652
Tobit regression Number of obs = 753
Uncensored = 753
Limits: lower = -inf Left-censored = 0
upper = +inf Right-censored = 0
LR chi2(9) = 212.27
Prob > chi2 = 0.0000
Log likelihood = -433.27652 Pseudo R2 = 0.1968
------------------------------------------------------------------------------
inlf | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
nwifeinc | -.00211 .001505 -1.40 0.161 -.0050646 .0008445
motheduc | .0090896 .0058289 1.56 0.119 -.0023535 .0205327
fatheduc | .0038845 .0055435 0.70 0.484 -.0069982 .0147672
huseduc | .0043307 .0060715 0.71 0.476 -.0075887 .0162501
exper | .0425666 .0056981 7.47 0.000 .0313804 .0537528
expersq | -.0006468 .0001862 -3.47 0.001 -.0010123 -.0002812
age | -.0164771 .0025496 -6.46 0.000 -.0214823 -.0114718
kidslt6 | -.2542708 .0337609 -7.53 0.000 -.3205488 -.1879927
kidsge6 | .0099069 .0133099 0.74 0.457 -.0162226 .0360363
_cons | .8490309 .151716 5.60 0.000 .5511885 1.146873
-------------+----------------------------------------------------------------
var(e.inlf)| .1850598 .0095374 .1672524 .2047631
------------------------------------------------------------------------------
. predict yhat, xb
. g v2 = inlf - yhat
. eststo Heckit: heckman lwage educ exper expersq v2, ///
> select(inlf= educ exper expersq nwifeinc age kidslt6 kidsge6) twostep
Heckman selection model -- two-step estimates Number of obs = 753
(regression model with sample selection) Selected = 428
Nonselected = 325
Wald chi2(4) = 51.45
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage |
educ | .1032574 .0200558 5.15 0.000 .0639487 .142566
exper | .0457018 .0167666 2.73 0.006 .0128398 .0785638
expersq | -.000895 .0004468 -2.00 0.045 -.0017707 -.0000193
v2 | .2618507 .5703443 0.46 0.646 -.8560035 1.379705
_cons | -.5308917 .3224749 -1.65 0.100 -1.162931 .1011474
-------------+----------------------------------------------------------------
inlf |
educ | .1309047 .0252542 5.18 0.000 .0814074 .180402
exper | .1233476 .0187164 6.59 0.000 .0866641 .1600311
expersq | -.0018871 .0006 -3.15 0.002 -.003063 -.0007111
nwifeinc | -.0120237 .0048398 -2.48 0.013 -.0215096 -.0025378
age | -.0528527 .0084772 -6.23 0.000 -.0694678 -.0362376
kidslt6 | -.8683285 .1185223 -7.33 0.000 -1.100628 -.636029
kidsge6 | .036005 .0434768 0.83 0.408 -.049208 .1212179
_cons | .2700768 .508593 0.53 0.595 -.7267472 1.266901
-------------+----------------------------------------------------------------
/mills |
lambda | -.1076511 .3325972 -0.32 0.746 -.7595297 .5442275
-------------+----------------------------------------------------------------
rho | -0.16133
sigma | .66725601
------------------------------------------------------------------------------
. reg lwage educ v2 exper expersq
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(4, 423) = 19.72
Model | 35.0947088 4 8.7736772 Prob > F = 0.0000
Residual | 188.232742 423 .444994663 R-squared = 0.1571
-------------+---------------------------------- Adj R-squared = 0.1492
Total | 223.327451 427 .523015108 Root MSE = .66708
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .1078537 .0141892 7.60 0.000 .0799636 .1357438
v2 | .0928949 .230293 0.40 0.687 -.3597662 .5455561
exper | .0457809 .0168251 2.72 0.007 .0127097 .0788521
expersq | -.0008976 .0004482 -2.00 0.046 -.0017786 -.0000167
_cons | -.5915459 .2631025 -2.25 0.025 -1.108697 -.0743947
------------------------------------------------------------------------------
. reg lwage educ exper expersq nwifeinc age kidslt6 kidsge6, r
Linear regression Number of obs = 428
F(7, 420) = 12.62
Prob > F = 0.0000
R-squared = 0.1641
Root MSE = .66669
------------------------------------------------------------------------------
| Robust
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0998844 .0141577 7.06 0.000 .0720557 .1277131
exper | .0407097 .0153088 2.66 0.008 .0106183 .0708012
expersq | -.0007473 .0004093 -1.83 0.069 -.0015519 .0000572
nwifeinc | .0056942 .0027569 2.07 0.039 .0002753 .0111132
age | -.0035204 .0061766 -0.57 0.569 -.0156613 .0086205
kidslt6 | -.0558726 .1061345 -0.53 0.599 -.2644936 .1527485
kidsge6 | -.0176485 .0295136 -0.60 0.550 -.0756611 .0403642
_cons | -.3579973 .3221853 -1.11 0.267 -.9912939 .2752993
------------------------------------------------------------------------------
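The `tobit inlf` command in Example 19.8 sets no censoring limit, and its output reports all 753 observations as uncensored, so that first stage is effectively a linear model for the 0/1 participation indicator. Below is a minimal sketch of the Tobit-residual approach, assuming the intended first stage is a Tobit for annual hours (`hours` in the MROZ data) censored at zero. The names `xb_hours` and `v2_hours` are made up for illustration, and no output is shown because this was not run as part of the log.

```stata
bcuse mroz, clear nodesc

* First stage: Tobit for hours worked, left-censored at zero
* (assumption: hours, not inlf, is the intended selection variable)
tobit hours nwifeinc motheduc fatheduc huseduc exper expersq age kidslt6 kidsge6, ll(0)
predict double xb_hours, xb

* Tobit residual, defined only where hours are actually observed as positive
generate double v2_hours = hours - xb_hours if hours > 0

* Second stage: wage equation on the working sample with the residual added
regress lwage educ exper expersq v2_hours if hours > 0

* Test of no selection bias: coefficient on the Tobit residual equals zero
test v2_hours
```

As with the inverse Mills ratio, the second-stage standard errors treat `v2_hours` as if it were observed rather than estimated, so they are best read as a test of the null of no selection.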