Chapter 6 - Examples
* Solomon Negash - Replicating Examples
* Wooldridge (2016). Introductory Econometrics: A Modern Approach. 6th ed.
* STATA Program, version 15.1.
* Chapter 6 - Multiple Regression Analysis: Further Issues
* Computer Exercises (Examples)
******************** SETUP *********************
*Table6.1 Effects of data scaling
u bwght, clear
eststo: qui reg bwght cigs faminc
(est1 stored)
eststo: qui reg bwghtlbs cigs faminc
(est2 stored)
eststo: qui reg bwght packs faminc
(est3 stored)
esttab *, se r2 nostar ti("Compare to Table6.1 'Effects of Data Scaling'")
Compare to Table6.1 'Effects of Data Scaling'
                         (1)             (2)             (3)
                       bwght        bwghtlbs           bwght
cigs                  -0.463         -0.0290
                    (0.0916)       (0.00572)
faminc                0.0928         0.00580          0.0928
                    (0.0292)       (0.00182)        (0.0292)
packs                                                 -9.268
_cons                  117.0           7.311           117.0
                     (1.049)        (0.0656)         (1.049)
N                       1388            1388            1388
R-sq                   0.030           0.030           0.030
Standard errors in parentheses
est clear
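// A minimal sketch of why the three columns differ only by scale. It assumes,
// as the variable names suggest, that bwghtlbs = bwght/16 (ounces to pounds)
// and packs = cigs/20; neither definition is checked here.
u bwght, clear
qui reg bwght cigs faminc
scalar b_lbs   = _b[cigs]/16    // rescaling the dependent variable
scalar b_packs = _b[cigs]*20    // rescaling the regressor
display "cigs coef. with bwght in pounds = " b_lbs     // compare with column (2)
display "coef. on packs                  = " b_packs   // compare with column (3)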
*Example6.1. Effects of pollution on housing prices
u hprice2, clear
//Standardizing the variables
foreach x of varlist price nox crime rooms dist stratio {
    egen z`x' = std(`x')
    label var z`x' "`x' - standardized"
}
reg zprice znox zcrime zrooms zdist zstratio
Source | SS df MS Number of obs = 506
-------------+---------------------------------- F(5, 500) = 174.47
Model | 321.011232 5 64.2022464 Prob > F = 0.0000
Residual | 183.988778 500 .367977557 R-squared = 0.6357
-------------+---------------------------------- Adj R-squared = 0.6320
Total | 505.00001 505 1.00000002 Root MSE = .60661
zprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
znox | -.340446 .0445411 -7.64 0.000 -.4279568 -.2529352
zcrime | -.1432828 .0307168 -4.66 0.000 -.2036327 -.0829328
zrooms | .5138878 .0300302 17.11 0.000 .454887 .5728887
zdist | -.2348385 .0430217 -5.46 0.000 -.3193642 -.1503129
zstratio | -.2702799 .0299698 -9.02 0.000 -.3291622 -.2113976
_cons | 6.61e-09 .0269672 0.00 1.000 -.0529829 .0529829
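// A possible shortcut, offered as a sketch rather than the book's procedure:
// regress's beta option reports standardized (beta) coefficients in place of
// the confidence intervals, so the Beta column should reproduce the slopes of
// the z-variable regression above without creating new variables.
u hprice2, clear
reg price nox crime rooms dist stratio, beta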
//Compare the result to Example 4.5.
g ldist=ln(dist)
reg lprice lnox ldist rooms stratio
Source | SS df MS Number of obs = 506
-------------+---------------------------------- F(4, 501) = 175.86
Model | 49.3987586 4 12.3496897 Prob > F = 0.0000
Residual | 35.1834663 501 .07022648 R-squared = 0.5840
-------------+---------------------------------- Adj R-squared = 0.5807
Total | 84.582225 505 .167489554 Root MSE = .265
lprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
lnox | -.9535388 .1167417 -8.17 0.000 -1.182902 -.7241751
ldist | -.1343395 .0431032 -3.12 0.002 -.2190247 -.0496542
rooms | .2545271 .0185303 13.74 0.000 .2181203 .2909338
stratio | -.0524511 .0058971 -8.89 0.000 -.0640372 -.040865
_cons | 11.08386 .3181113 34.84 0.000 10.45887 11.70886
//Equation (6.7)
reg lprice lnox rooms
Source | SS df MS Number of obs = 506
-------------+---------------------------------- F(2, 503) = 265.69
Model | 43.4513652 2 21.7256826 Prob > F = 0.0000
Residual | 41.1308598 503 .081771093 R-squared = 0.5137
-------------+---------------------------------- Adj R-squared = 0.5118
Total | 84.582225 505 .167489554 Root MSE = .28596
lprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
lnox | -.7176736 .0663397 -10.82 0.000 -.8480106 -.5873366
rooms | .3059183 .0190174 16.09 0.000 .268555 .3432816
_cons | 9.233738 .1877406 49.18 0.000 8.864885 9.60259
//Equation (6.12)
u wage1, clear
reg wage exper*
Source | SS df MS Number of obs = 526
-------------+---------------------------------- F(2, 523) = 26.74
Model | 664.266927 2 332.133463 Prob > F = 0.0000
Residual | 6496.14736 523 12.4209319 R-squared = 0.0928
-------------+---------------------------------- Adj R-squared = 0.0893
Total | 7160.41429 525 13.6388844 Root MSE = 3.5243
wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
exper | .2981001 .0409655 7.28 0.000 .2176229 .3785773
expersq | -.0061299 .0009025 -6.79 0.000 -.0079029 -.0043569
_cons | 3.725406 .3459392 10.77 0.000 3.045805 4.405007
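// Sketch of the turning point implied by the quadratic above. For
// wage = b0 + b1*exper + b2*exper^2 the fitted wage peaks at -b1/(2*b2).
// The scalar name tp_exper is illustrative only.
u wage1, clear
qui reg wage exper expersq
scalar tp_exper = -_b[exper]/(2*_b[expersq])
display "Turning point (years of experience) = " tp_exper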
*Example6.2. Effects of pollution on housing prices
u hprice2, clear
g ldis=ln(dist)
g roomsq = rooms^2
reg lprice lnox ldis rooms roomsq stratio
Source | SS df MS Number of obs = 506
-------------+---------------------------------- F(5, 500) = 151.77
Model | 50.9872375 5 10.1974475 Prob > F = 0.0000
Residual | 33.5949875 500 .067189975 R-squared = 0.6028
-------------+---------------------------------- Adj R-squared = 0.5988
Total | 84.582225 505 .167489554 Root MSE = .25921
lprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
lnox | -.901682 .1146869 -7.86 0.000 -1.12701 -.6763544
ldis | -.0867814 .0432807 -2.01 0.045 -.1718159 -.001747
rooms | -.5451128 .1654542 -3.29 0.001 -.8701839 -.2200417
roomsq | .0622612 .012805 4.86 0.000 .037103 .0874194
stratio | -.0475902 .0058542 -8.13 0.000 -.059092 -.0360884
_cons | 13.38548 .5664732 23.63 0.000 12.27252 14.49844
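// Sketch: with a negative coefficient on rooms and a positive one on roomsq,
// the effect of rooms on lprice changes sign at -b_rooms/(2*b_roomsq).
// nlcom gives a delta-method standard error for that ratio; the label
// "turning" is an illustrative name, not part of the original example.
u hprice2, clear
g ldis = ln(dist)
g roomsq = rooms^2
qui reg lprice lnox ldis rooms roomsq stratio
nlcom (turning: -_b[rooms]/(2*_b[roomsq]))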
*Example6.3. Effects of attendance on final exam performance
u attend, clear
g priGPAsq = priGPA^2
g ACTsq = ACT^2
eststo stndfnl: qui reg stndfnl atndrte priGPA ACT priGPAsq ACTsq c.priGPA#c.atndrte
estout , cells(b(nostar fmt(3)) se(par fmt(5))) stats(r2 r2_a N, fmt(%9.3f %9.3f %9.0g)
> labels(R-squared Adj-R-squared)) varlabels(_cons Constant) varwidth(25)
atndrte -0.007
priGPA -1.629
ACT -0.128
priGPAsq 0.296
ACTsq 0.005
c.priGPA#c.atndrte 0.006
Constant 2.050
R-squared 0.229
Adj-R-squared 0.222
N 680
est clear
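// Sketch: because atndrte is interacted with priGPA, its partial effect is
// b_atndrte + b_interaction*priGPA and depends on priGPA. Evaluating at the
// sample mean of priGPA is one common choice, assumed here; lincom also
// reports a standard error. The local macro m is an illustrative name.
u attend, clear
g priGPAsq = priGPA^2
g ACTsq = ACT^2
qui reg stndfnl atndrte priGPA ACT priGPAsq ACTsq c.priGPA#c.atndrte
qui sum priGPA if e(sample)
local m = r(mean)
lincom atndrte + `m'*c.priGPA#c.atndrte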
*Example6.4. CEO compensation and firm performance
u ceosal1.dta, clear
eststo salary: qui reg salary sales roe
eststo lsalary: qui reg lsalary lsales roe
estout , cells(b(nostar fmt(3)) se(par fmt(5))) stats(r2 r2_a N, fmt(%9.3f %9.3f %9.0g)
> labels(R-squared Adj-R-squared)) varlabels(_cons Constant) varwidth(25)
                                   salary         lsalary
                                     b/se            b/se
sales                               0.016
roe                                19.631           0.018
                               (11.07655)       (0.00396)
lsales                                              0.275
Constant                          830.631           4.362
                              (223.90489)       (0.29388)
R-squared                           0.029           0.282
Adj-R-squared                       0.020           0.275
N                                     209             209
est clear
*Example6.5. Confidence interval for predicted college GPA
u gpa2, clear
eststo regression: reg colgpa sat hsperc hsize c.hsize#c.hsize
Source | SS df MS Number of obs = 4,137
-------------+---------------------------------- F(4, 4132) = 398.02
Model | 499.030504 4 124.757626 Prob > F = 0.0000
Residual | 1295.16517 4,132 .313447524 R-squared = 0.2781
-------------+---------------------------------- Adj R-squared = 0.2774
Total | 1794.19567 4,136 .433799728 Root MSE = .55986
colgpa | Coef. Std. Err. t P>|t| [95% Conf. Interval]
sat | .0014925 .0000652 22.89 0.000 .0013646 .0016204
hsperc | -.0138558 .000561 -24.70 0.000 -.0149557 -.0127559
hsize | -.0608815 .0165012 -3.69 0.000 -.0932328 -.0285302
c.hsize#c.hsize | .0054603 .0022698 2.41 0.016 .0010102 .0099104
_cons | 1.492652 .0753414 19.81 0.000 1.344942 1.640362
g sat0 = sat - 1200
g hsperc0 = hsperc - 30
g hsize0 = hsize -5
eststo prediction: reg colgpa sat0 hsperc0 hsize0 c.hsize0#c.hsize0
Source | SS df MS Number of obs = 4,137
-------------+---------------------------------- F(4, 4132) = 398.02
Model | 499.030503 4 124.757626 Prob > F = 0.0000
Residual | 1295.16517 4,132 .313447524 R-squared = 0.2781
-------------+---------------------------------- Adj R-squared = 0.2774
Total | 1794.19567 4,136 .433799728 Root MSE = .55986
colgpa | Coef. Std. Err. t P>|t| [95% Conf. Interval]
sat0 | .0014925 .0000652 22.89 0.000 .0013646 .0016204
hsperc0 | -.0138558 .000561 -24.70 0.000 -.0149557 -.0127559
hsize0 | -.0062785 .0086006 -0.73 0.465 -.0231403 .0105833
c.hsize0#c.hsize0 | .0054603 .0022698 2.41 0.016 .0010102 .0099104
_cons | 2.700075 .0198778 135.83 0.000 2.661104 2.739047
estout , cells(b(nostar fmt(5)) se(par fmt(5))) stats(r2 r2_a N, fmt(%9.3f %9.3f %9.0g)
> labels(R-squared Adj-R-squared)) varlabels(_cons Constant) varwidth(25)
                               regression      prediction
                                     b/se            b/se
sat                               0.00149
hsperc                           -0.01386
hsize                            -0.06088
c.hsize#c.hsize                   0.00546
sat0                                              0.00149
hsperc0                                          -0.01386
hsize0                                           -0.00628
c.hsize0#c.hsize0                                 0.00546
Constant                          1.49265         2.70008
                                (0.07534)       (0.01988)
R-squared                           0.278           0.278
Adj-R-squared                       0.277           0.277
N                                    4137            4137
est clear
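// Sketch of an equivalent route that avoids recentring the regressors: lincom
// evaluates the fitted value at sat = 1200, hsperc = 30, hsize = 5 (so
// hsize^2 = 25) directly from the original regression. Its estimate and
// standard error should match the Constant of the "prediction" column.
u gpa2, clear
qui reg colgpa sat hsperc hsize c.hsize#c.hsize
lincom _cons + 1200*sat + 30*hsperc + 5*hsize + 25*c.hsize#c.hsize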
*Example6.6. Confidence Interval for Future College GPA
u gpa2, clear
reg colgpa sat hsperc hsize c.hsize#c.hsize
Source | SS df MS Number of obs = 4,137
-------------+---------------------------------- F(4, 4132) = 398.02
Model | 499.030504 4 124.757626 Prob > F = 0.0000
Residual | 1295.16517 4,132 .313447524 R-squared = 0.2781
-------------+---------------------------------- Adj R-squared = 0.2774
Total | 1794.19567 4,136 .433799728 Root MSE = .55986
colgpa | Coef. Std. Err. t P>|t| [95% Conf. Interval]
sat | .0014925 .0000652 22.89 0.000 .0013646 .0016204
hsperc | -.0138558 .000561 -24.70 0.000 -.0149557 -.0127559
hsize | -.0608815 .0165012 -3.69 0.000 -.0932328 -.0285302
c.hsize#c.hsize | .0054603 .0022698 2.41 0.016 .0010102 .0099104
_cons | 1.492652 .0753414 19.81 0.000 1.344942 1.640362
margins, at(sat = 1200 hsperc = 30 hsize = 5 )
Adjusted predictions Number of obs = 4,137
Model VCE : OLS
Expression : Linear prediction, predict()
at : sat = 1200
hsperc = 30
hsize = 5
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
_cons | 2.700075 .0198778 135.83 0.000 2.661104 2.739047
display as text "Root MSE = " e(rmse)
Root MSE = .55986384
predict u, res
gen u2 = u^2
mean u2
Mean estimation Number of obs = 4,137
| Mean Std. Err. [95% Conf. Interval]
u2 | .3130687 .0078993 .2975818 .3285556
display sqrt(.313)
//The 95% CI
display as text "Lower Bound = " 2.7 - 1.96*.56
Lower Bound = 1.6024
display as text "Upper Bound = " 2.7 + 1.96*.56
Upper Bound = 3.7976
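// Sketch of the same interval built from stored results instead of rounded,
// hand-typed numbers. It assumes the usual prediction-error formula
// se(e0) = sqrt(se(yhat0)^2 + sigma_hat^2) and uses the t critical value
// rather than 1.96. The scalar names se_e0 and tcrit are illustrative.
u gpa2, clear
g sat0 = sat - 1200
g hsperc0 = hsperc - 30
g hsize0 = hsize - 5
qui reg colgpa sat0 hsperc0 hsize0 c.hsize0#c.hsize0
scalar se_e0 = sqrt(_se[_cons]^2 + e(rmse)^2)   // se of the prediction error
scalar tcrit = invttail(e(df_r), 0.025)         // two-sided 95% critical value
display "se(e0)      = " se_e0
display "Lower Bound = " _b[_cons] - tcrit*se_e0
display "Upper Bound = " _b[_cons] + tcrit*se_e0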
*Example6.7. Predicting CEO log(salary)
u ceosal2.dta, clear
*Step 1
reg lsalary lsales lmktval ceoten
Source | SS df MS Number of obs = 177
-------------+---------------------------------- F(3, 173) = 26.91
Model | 20.5672434 3 6.85574779 Prob > F = 0.0000
Residual | 44.0789697 173 .254791732 R-squared = 0.3182
-------------+---------------------------------- Adj R-squared = 0.3063
Total | 64.6462131 176 .367308029 Root MSE = .50477
lsalary | Coef. Std. Err. t P>|t| [95% Conf. Interval]
lsales | .1628545 .0392421 4.15 0.000 .0853995 .2403094
lmktval | .109243 .0495947 2.20 0.029 .0113545 .2071315
ceoten | .0117054 .0053261 2.20 0.029 .001193 .0222178
_cons | 4.503795 .2572344 17.51 0.000 3.996073 5.011517
predict lsalaryhat, xb
predict uhat, residual
*Step 2
g euhat=exp(uhat)
mean euhat //The Duan smearing estimate (alpha_hat_0)
Mean estimation Number of obs = 177
| Mean Std. Err. [95% Conf. Interval]
euhat | 1.135661 .0523938 1.03226 1.239062
g mhat=exp(lsalaryhat)
reg salary mhat, noc // The coefficient as in equation (6.44)
Source | SS df MS Number of obs = 177
-------------+---------------------------------- F(1, 176) = 562.39
Model | 147352711 1 147352711 Prob > F = 0.0000
Residual | 46113901 176 262010.801 R-squared = 0.7616
-------------+---------------------------------- Adj R-squared = 0.7603
Total | 193466612 177 1093031.71 Root MSE = 511.87
salary | Coef. Std. Err. t P>|t| [95% Conf. Interval]
mhat | 1.116857 .0470953 23.71 0.000 1.023912 1.209801
*Step 3
qui reg lsalary lsales lmktval ceoten
display _b[_cons]+_b[lsales]*log(5000)+_b[lmktval]*log(10000)+_b[ceoten]*10
*Step 4
qui reg salary mhat, noc
display 1.136*exp(7.013) //or
display 1.117*exp(7.013)
*Example6.8. Predicting CEO salary
corr mhat salary
| mhat salary
mhat | 1.0000
salary | 0.4930 1.0000
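// Sketch: the squared correlation between salary and mhat = exp(lsalaryhat)
// acts as an R-squared for the log model measured in levels of salary, which
// makes it comparable to the R-squared of the level regression that follows.
// The data are reloaded and mhat rebuilt so the lines run on their own.
u ceosal2.dta, clear
qui reg lsalary lsales lmktval ceoten
predict lsalaryhat, xb
g mhat = exp(lsalaryhat)
qui corr mhat salary
display "Squared correlation = " r(rho)^2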
u ceosal2.dta, clear
reg salary sales mktval ceoten
Source | SS df MS Number of obs = 177
-------------+---------------------------------- F(3, 173) = 14.53
Model | 12230632.6 3 4076877.52 Prob > F = 0.0000
Residual | 48535332.2 173 280551.053 R-squared = 0.2013
-------------+---------------------------------- Adj R-squared = 0.1874
Total | 60765964.7 176 345261.163 Root MSE = 529.67
salary | Coef. Std. Err. t P>|t| [95% Conf. Interval]
sales | .0190191 .0100561 1.89 0.060 -.0008294 .0388676
mktval | .0234003 .0094826 2.47 0.015 .0046839 .0421167
ceoten | 12.70337 5.618052 2.26 0.025 1.614616 23.79211
_cons | 613.4361 65.23685 9.40 0.000 484.6735 742.1987
log close
