## Chapter 19 - Censored Data, Sample Selection, and Attrition

### Examples

```
------------------------------------------------------------------------------------------
name:  SN
log:  \iiexample19.smcl
log type:  smcl
closed on:  12 May 2020, 19:06:40
. **********************************************
. * Solomon Negash - Examples
. * Wooldridge (2010). Economic Analysis of Cross-Section and Panel Data. 2nd ed.
. * STATA Program, version 16.1.

. * Chapter 19  -  Censored Data, Sample Selection, and Attrition
. **********************************************

. // Example 19.6 (Wage Offer Equation for Married Women):

. bcuse mroz, clear nodesc

. eststo OLS: reg lwage educ exper expersq

Source |       SS           df       MS      Number of obs   =       428
-------------+----------------------------------   F(3, 424)       =     26.29
Model |  35.0223023         3  11.6741008   Prob > F        =    0.0000
Residual |  188.305149       424  .444115917   R-squared       =    0.1568
Total |  223.327451       427  .523015108   Root MSE        =    .66642

------------------------------------------------------------------------------
lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ |   .1074896   .0141465     7.60   0.000     .0796837    .1352956
exper |   .0415665   .0131752     3.15   0.002     .0156697    .0674633
expersq |  -.0008112   .0003932    -2.06   0.040    -.0015841   -.0000382
_cons |  -.5220407   .1986321    -2.63   0.009    -.9124668   -.1316145
------------------------------------------------------------------------------

. eststo Heckit: heckman lwage educ exper expersq, ///
select(inlf= educ exper expersq nwifeinc age kidslt6 kidsge6) twostep

Heckman selection model -- two-step estimates   Number of obs     =        753
(regression model with sample selection)              Selected    =        428
                                                      Nonselected =        325

                                                Wald chi2(3)      =      51.53
                                                Prob > chi2       =     0.0000

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage        |
educ |   .1090655    .015523     7.03   0.000     .0786411      .13949
exper |   .0438873   .0162611     2.70   0.007     .0120163    .0757584
expersq |  -.0008591   .0004389    -1.96   0.050    -.0017194    1.15e-06
_cons |  -.5781033   .3050062    -1.90   0.058    -1.175904    .0196979
-------------+----------------------------------------------------------------
inlf         |
educ |   .1309047   .0252542     5.18   0.000     .0814074     .180402
exper |   .1233476   .0187164     6.59   0.000     .0866641    .1600311
expersq |  -.0018871      .0006    -3.15   0.002     -.003063   -.0007111
nwifeinc |  -.0120237   .0048398    -2.48   0.013    -.0215096   -.0025378
age |  -.0528527   .0084772    -6.23   0.000    -.0694678   -.0362376
kidslt6 |  -.8683285   .1185223    -7.33   0.000    -1.100628    -.636029
kidsge6 |    .036005   .0434768     0.83   0.408     -.049208    .1212179
_cons |   .2700768    .508593     0.53   0.595    -.7267472    1.266901
-------------+----------------------------------------------------------------
/mills       |
lambda |   .0322619   .1336246     0.24   0.809    -.2296376    .2941613
-------------+----------------------------------------------------------------
rho |    0.04861
sigma |  .66362876
------------------------------------------------------------------------------
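
. * Sketch for reading the /mills block (assumes the usual two-step Heckit setup):
. * lambda is the coefficient on the inverse Mills ratio, and lambda = rho*sigma.
. * Check against the output above: .0322619/.66362876 = 0.04861, the reported rho.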

. esttab OLS Heckit, keep(educ exper expersq _cons) cells(b(nostar fmt(4)) /*
*/ se(par fmt(4))) ti("Table 19.1 Wage Offer Equation for Married Women")

Table 19.1 Wage Offer Equation for Married Women
--------------------------------------
                      (1)          (2)
                    lwage        lwage
                     b/se         b/se
--------------------------------------
main
educ               0.1075       0.1091
                 (0.0141)     (0.0155)
exper              0.0416       0.0439
                 (0.0132)     (0.0163)
expersq           -0.0008      -0.0009
                 (0.0004)     (0.0004)
_cons             -0.5220      -0.5781
                 (0.1986)     (0.3050)
--------------------------------------
inlf
educ                            0.1309
                              (0.0253)
exper                           0.1233
                              (0.0187)
expersq                        -0.0019
                              (0.0006)
_cons                           0.2701
                              (0.5086)
--------------------------------------
N                     428          753
--------------------------------------

. est clear

. heckman lwage educ exper expersq nwifeinc age kidslt6 kidsge6, /*
*/ select(inlf= educ exper expersq nwifeinc age kidslt6 kidsge6) twostep

Heckman selection model -- two-step estimates   Number of obs     =        753
(regression model with sample selection)              Selected    =        428
                                                      Nonselected =        325

                                                Wald chi2(7)      =      53.64
                                                Prob > chi2       =     0.0000

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage        |
educ |   .1187172   .0340507     3.49   0.000      .051979    .1854553
exper |   .0598358    .033673     1.78   0.076    -.0061621    .1258336
expersq |  -.0010523   .0006381    -1.65   0.099     -.002303    .0001984
nwifeinc |   .0038434   .0044919     0.86   0.392    -.0049607    .0126474
age |   -.011158   .0134792    -0.83   0.408    -.0375767    .0152606
kidslt6 |  -.1880451   .2308275    -0.81   0.415    -.6404587    .2643684
kidsge6 |  -.0122255   .0296063    -0.41   0.680    -.0702527    .0458018
_cons |  -.5602853   .4587672    -1.22   0.222    -1.459452    .3388818
-------------+----------------------------------------------------------------
inlf         |
educ |   .1309047   .0252542     5.18   0.000     .0814074     .180402
exper |   .1233476   .0187164     6.59   0.000     .0866641    .1600311
expersq |  -.0018871      .0006    -3.15   0.002     -.003063   -.0007111
nwifeinc |  -.0120237   .0048398    -2.48   0.013    -.0215096   -.0025378
age |  -.0528527   .0084772    -6.23   0.000    -.0694678   -.0362376
kidslt6 |  -.8683285   .1185223    -7.33   0.000    -1.100628    -.636029
kidsge6 |    .036005   .0434768     0.83   0.408     -.049208    .1212179
_cons |   .2700768    .508593     0.53   0.595    -.7267472    1.266901
-------------+----------------------------------------------------------------
/mills       |
lambda |   .2884636   .4635618     0.62   0.534    -.6201008    1.197028
-------------+----------------------------------------------------------------
rho |    0.41830
sigma |   .6896138
------------------------------------------------------------------------------

. reg lwage educ exper expersq nwifeinc age kidslt6 kidsge6

Source |       SS           df       MS      Number of obs   =       428
-------------+----------------------------------   F(7, 420)       =     11.78
Model |  36.6476854         7  5.23538363   Prob > F        =    0.0000
Residual |  186.679766       420  .444475633   R-squared       =    0.1641
Total |  223.327451       427  .523015108   Root MSE        =    .66669

------------------------------------------------------------------------------
lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ |   .0998844   .0150975     6.62   0.000     .0702084    .1295604
exper |   .0407097   .0133723     3.04   0.002     .0144249    .0669946
expersq |  -.0007473   .0004018    -1.86   0.064    -.0015371    .0000424
nwifeinc |   .0056942   .0033195     1.72   0.087    -.0008307    .0122192
age |  -.0035204   .0054145    -0.65   0.516    -.0141633    .0071225
kidslt6 |  -.0558726   .0886034    -0.63   0.529     -.230034    .1182889
kidsge6 |  -.0176485    .027891    -0.63   0.527    -.0724718    .0371749
_cons |  -.3579973   .3182963    -1.12   0.261    -.9836496     .267655
------------------------------------------------------------------------------

.  // Example 19.7 (Education Endogenous and Sample Selection)

. probit inlf exper expersq nwifeinc age kidslt6 kidsge6 motheduc fatheduc huseduc, nolog

Probit regression                               Number of obs     =        753
                                                LR chi2(9)        =     207.10
                                                Prob > chi2       =     0.0000
Log likelihood = -411.32238                     Pseudo R2         =     0.2011

------------------------------------------------------------------------------
inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
exper |   .1285092   .0185226     6.94   0.000     .0922056    .1648129
expersq |  -.0019474   .0005955    -3.27   0.001    -.0031146   -.0007803
nwifeinc |  -.0074294   .0048787    -1.52   0.128    -.0169915    .0021327
age |  -.0527657   .0085423    -6.18   0.000    -.0695082   -.0360231
kidslt6 |  -.8149255   .1160833    -7.02   0.000    -1.042445   -.5874063
kidsge6 |   .0241511   .0432253     0.56   0.576     -.060569    .1088712
motheduc |   .0295321   .0185718     1.59   0.112     -.006868    .0659322
fatheduc |   .0133487   .0178491     0.75   0.455    -.0216349    .0483324
huseduc |   .0161391    .019595     0.82   0.410    -.0222664    .0545446
_cons |   1.146672   .4932706     2.32   0.020     .1798798    2.113465
------------------------------------------------------------------------------

. test motheduc fatheduc huseduc

( 1)  [inlf]motheduc = 0
( 2)  [inlf]fatheduc = 0
( 3)  [inlf]huseduc = 0

chi2(  3) =    8.00
Prob > chi2 =    0.0461

. * Inverse Mills ratio at the probit index: imr = normalden(xb)/normal(xb)

. predict y, xb

. g phi = normalden(y)

. g Phi = normal(y)

. g imr = phi/Phi

. * Full instruments

. eststo With_IMR: qui ivregress 2sls lwage exper expersq imr /*
*/ (educ = nwifeinc age kidslt6 kidsge6 motheduc fatheduc huseduc)

. eststo Without_IMR: qui ivregress 2sls lwage exper expersq /*
*/ (educ = nwifeinc age kidslt6 kidsge6 motheduc fatheduc huseduc)

. esttab With_IMR Without_IMR, keep(educ imr _cons) cells(b(nostar fmt(4)) se(par fmt(4)))

--------------------------------------
                      (1)          (2)
                    lwage        lwage
                     b/se         b/se
--------------------------------------
educ               0.0878       0.0871
                 (0.0213)     (0.0212)
imr                0.0404
                 (0.1326)
_cons             -0.3249      -0.2696
                 (0.3315)     (0.2785)
--------------------------------------
N                     428          428
--------------------------------------

. * Partial instruments

. eststo With_IMR2: qui ivregress 2sls lwage exper expersq imr (educ = motheduc fatheduc huseduc)

. eststo Without_IMR2: qui ivregress 2sls lwage exper expersq  (educ =  motheduc fatheduc huseduc)

. esttab With_IMR2 Without_IMR2, keep(educ imr _cons) cells(b(nostar fmt(4)) se(par fmt(4)))

--------------------------------------
                      (1)          (2)
                    lwage        lwage
                     b/se         b/se
--------------------------------------
educ               0.0808       0.0804
                 (0.0216)     (0.0217)
imr                0.0361
                 (0.1329)
_cons             -0.2338      -0.1869
                 (0.3351)     (0.2841)
--------------------------------------
N                     428          428
--------------------------------------

. est clear

.  // Example 19.8 (Wage Offer Equation for Married Women)

. tobit inlf nwifeinc motheduc fatheduc huseduc exper expersq age kidslt6 kidsge6

Iteration 0:   log likelihood = -433.27652
Iteration 1:   log likelihood = -433.27652

Tobit regression                                Number of obs     =        753
                                                   Uncensored     =        753
Limits: lower = -inf                               Left-censored  =          0
        upper = +inf                               Right-censored =          0

                                                LR chi2(9)        =     212.27
                                                Prob > chi2       =     0.0000
Log likelihood = -433.27652                     Pseudo R2         =     0.1968

------------------------------------------------------------------------------
inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
nwifeinc |    -.00211    .001505    -1.40   0.161    -.0050646    .0008445
motheduc |   .0090896   .0058289     1.56   0.119    -.0023535    .0205327
fatheduc |   .0038845   .0055435     0.70   0.484    -.0069982    .0147672
huseduc |   .0043307   .0060715     0.71   0.476    -.0075887    .0162501
exper |   .0425666   .0056981     7.47   0.000     .0313804    .0537528
expersq |  -.0006468   .0001862    -3.47   0.001    -.0010123   -.0002812
age |  -.0164771   .0025496    -6.46   0.000    -.0214823   -.0114718
kidslt6 |  -.2542708   .0337609    -7.53   0.000    -.3205488   -.1879927
kidsge6 |   .0099069   .0133099     0.74   0.457    -.0162226    .0360363
_cons |   .8490309    .151716     5.60   0.000     .5511885    1.146873
-------------+----------------------------------------------------------------
var(e.inlf)|   .1850598   .0095374                      .1672524    .2047631
------------------------------------------------------------------------------

. predict yhat, xb

. g v2 = inlf - yhat

. eststo Heckit: heckman lwage educ exper expersq v2, ///
> select(inlf= educ exper expersq nwifeinc age kidslt6 kidsge6) twostep

Heckman selection model -- two-step estimates   Number of obs     =        753
(regression model with sample selection)              Selected    =        428
                                                      Nonselected =        325

                                                Wald chi2(4)      =      51.45
                                                Prob > chi2       =     0.0000

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage        |
educ |   .1032574   .0200558     5.15   0.000     .0639487     .142566
exper |   .0457018   .0167666     2.73   0.006     .0128398    .0785638
expersq |   -.000895   .0004468    -2.00   0.045    -.0017707   -.0000193
v2 |   .2618507   .5703443     0.46   0.646    -.8560035    1.379705
_cons |  -.5308917   .3224749    -1.65   0.100    -1.162931    .1011474
-------------+----------------------------------------------------------------
inlf         |
educ |   .1309047   .0252542     5.18   0.000     .0814074     .180402
exper |   .1233476   .0187164     6.59   0.000     .0866641    .1600311
expersq |  -.0018871      .0006    -3.15   0.002     -.003063   -.0007111
nwifeinc |  -.0120237   .0048398    -2.48   0.013    -.0215096   -.0025378
age |  -.0528527   .0084772    -6.23   0.000    -.0694678   -.0362376
kidslt6 |  -.8683285   .1185223    -7.33   0.000    -1.100628    -.636029
kidsge6 |    .036005   .0434768     0.83   0.408     -.049208    .1212179
_cons |   .2700768    .508593     0.53   0.595    -.7267472    1.266901
-------------+----------------------------------------------------------------
/mills       |
lambda |  -.1076511   .3325972    -0.32   0.746    -.7595297    .5442275
-------------+----------------------------------------------------------------
rho |   -0.16133
sigma |  .66725601
------------------------------------------------------------------------------

. reg lwage educ v2 exper expersq

Source |       SS           df       MS      Number of obs   =       428
-------------+----------------------------------   F(4, 423)       =     19.72
Model |  35.0947088         4   8.7736772   Prob > F        =    0.0000
Residual |  188.232742       423  .444994663   R-squared       =    0.1571
Total |  223.327451       427  .523015108   Root MSE        =    .66708

------------------------------------------------------------------------------
lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ |   .1078537   .0141892     7.60   0.000     .0799636    .1357438
v2 |   .0928949    .230293     0.40   0.687    -.3597662    .5455561
exper |   .0457809   .0168251     2.72   0.007     .0127097    .0788521
expersq |  -.0008976   .0004482    -2.00   0.046    -.0017786   -.0000167
_cons |  -.5915459   .2631025    -2.25   0.025    -1.108697   -.0743947
------------------------------------------------------------------------------

. reg lwage educ exper expersq nwifeinc age kidslt6 kidsge6, r

Linear regression                               Number of obs     =        428
                                                F(7, 420)         =      12.62
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1641
                                                Root MSE          =     .66669

------------------------------------------------------------------------------
|               Robust
lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ |   .0998844   .0141577     7.06   0.000     .0720557    .1277131
exper |   .0407097   .0153088     2.66   0.008     .0106183    .0708012
expersq |  -.0007473   .0004093    -1.83   0.069    -.0015519    .0000572
nwifeinc |   .0056942   .0027569     2.07   0.039     .0002753    .0111132
age |  -.0035204   .0061766    -0.57   0.569    -.0156613    .0086205
kidslt6 |  -.0558726   .1061345    -0.53   0.599    -.2644936    .1527485
kidsge6 |  -.0176485   .0295136    -0.60   0.550    -.0756611    .0403642
_cons |  -.3579973   .3221853    -1.11   0.267    -.9912939    .2752993
------------------------------------------------------------------------------

```
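
In the Example 19.8 log above, `tobit inlf` is fit without censoring limits, so Stata reports all 753 observations as uncensored and the fit reduces to a linear model. For comparison, here is a minimal sketch of a Tobit-selection two-step. It assumes the selection variable is hours worked, left-censored at zero, and that `mroz` contains a variable named `hours`. The names `xb_hours` and `v2_hours` are illustrative and do not appear in the log. The regressor list is borrowed from the selection equation in Example 19.6. No output is shown because these commands were not part of the original run.

```stata
* Sketch only: Tobit selection equation, then OLS with the Tobit residual.
* Assumption: mroz contains hours (hours worked), which is zero for non-workers.
bcuse mroz, clear nodesc

* Step 1: Tobit for hours on all observations, left-censored at 0
tobit hours educ exper expersq nwifeinc age kidslt6 kidsge6, ll(0)

* Step 2: Tobit residual, defined only for women with hours > 0
predict double xb_hours, xb
generate double v2_hours = hours - xb_hours if hours > 0

* Step 3: OLS on the selected sample with the residual as an extra regressor.
* lwage and v2_hours are both missing for non-workers, so regress drops them.
regress lwage educ exper expersq v2_hours
```

The t statistic on `v2_hours` can be read as a test of no selection bias. If it rejects, the OLS standard errors in Step 3 should be treated with caution, since they ignore the first-step estimation; bootstrapping both steps together is one possible remedy.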