Verbeek 5ed. Chapter 3 - Interpreting and Comparing Regression Models
Examples
----------------------------------------------------------------------------------------------------
name: SN
log: \5iexample3_s.smcl
log type: smcl
opened on: 9 Jun 2020, 23:38:06
. **********************************************
. * Solomon Negash - Examples
. * Verbeek (2017). A Guide to Modern Econometrics. 5ed.
. * Stata program, version 16.1.
. * Chapter 3 - Interpreting and Comparing Regression Models
. **********************************************
. * Table 3.1 OLS results hedonic price function
. u "Data/housing.dta", clear
. g lnprice = log(price)
. g lnlotsize = log(lotsize)
. reg lnprice lnlotsize bedroom bathrms airco
Source | SS df MS Number of obs = 546
-------------+---------------------------------- F(4, 541) = 177.41
Model | 42.790971 4 10.6977427 Prob > F = 0.0000
Residual | 32.6221992 541 .060299814 R-squared = 0.5674
-------------+---------------------------------- Adj R-squared = 0.5642
Total | 75.4131702 545 .138372789 Root MSE = .24556
------------------------------------------------------------------------------
lnprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lnlotsize | .4004218 .0278122 14.40 0.000 .3457886 .455055
bedrooms | .0776997 .0154859 5.02 0.000 .0472798 .1081195
bathrms | .2158305 .0229961 9.39 0.000 .1706578 .2610031
airco | .2116745 .0237213 8.92 0.000 .1650775 .2582716
_cons | 7.093777 .231547 30.64 0.000 6.638935 7.548618
------------------------------------------------------------------------------
. * Table 3.2 OLS results hedonic price function, extended model
. reg lnprice lnlotsize bedroom bathrms airco driveway recroom fullbase gashw garagepl prefarea stor
> ies
Source | SS df MS Number of obs = 546
-------------+---------------------------------- F(11, 534) = 106.33
Model | 51.7748825 11 4.7068075 Prob > F = 0.0000
Residual | 23.6382877 534 .044266456 R-squared = 0.6865
-------------+---------------------------------- Adj R-squared = 0.6801
Total | 75.4131702 545 .138372789 Root MSE = .2104
------------------------------------------------------------------------------
lnprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lnlotsize | .3031258 .0266931 11.36 0.000 .2506895 .3555622
bedrooms | .034399 .0142741 2.41 0.016 .0063588 .0624392
bathrms | .1657644 .0203286 8.15 0.000 .1258306 .2056981
airco | .1664238 .0213386 7.80 0.000 .1245059 .2083417
driveway | .110202 .0282261 3.90 0.000 .0547542 .1656498
recroom | .0579739 .0260528 2.23 0.026 .0067953 .1091524
fullbase | .1044881 .0216916 4.82 0.000 .0618768 .1470994
gashw | .1790231 .0438933 4.08 0.000 .0927984 .2652477
garagepl | .0479543 .0114765 4.18 0.000 .0254097 .070499
prefarea | .131851 .0226692 5.82 0.000 .0873192 .1763827
stories | .0916851 .0126144 7.27 0.000 .0669051 .116465
_cons | 7.745093 .2163352 35.80 0.000 7.32012 8.170065
------------------------------------------------------------------------------
. test driveway= recroom =fullbase= gashw =garagepl =prefarea =stories=0
( 1) driveway - recroom = 0
( 2) driveway - fullbase = 0
( 3) driveway - gashw = 0
( 4) driveway - garagepl = 0
( 5) driveway - prefarea = 0
( 6) driveway - stories = 0
( 7) driveway = 0
F( 7, 534) = 28.99
Prob > F = 0.0000
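The joint F statistic above can be cross-checked by hand from the residual sums of squares of Table 3.1 (restricted model) and Table 3.2 (unrestricted model). The lines below are a minimal sketch, assuming the usual formula F = ((S_R - S_U)/J) / (S_U/(N - K)) with J = 7 restrictions and N - K = 534. The scalar names are hypothetical and the numbers are copied from the two regression outputs. The result should agree with the reported 28.99 up to rounding.

```stata
* Residual SS copied from the log; scalar names are illustrative only
scalar S_R = 32.6221992
scalar S_U = 23.6382877
scalar J   = 7
scalar dof = 534
display "F(7, 534) = " ((S_R - S_U)/J) / (S_U/dof)

* The same joint hypothesis stated with testparm,
* to be run directly after the extended regression
testparm driveway recroom fullbase gashw garagepl prefarea stories
```

`testparm` tests that all listed coefficients are zero, which is the hypothesis the chained equalities in the `test` command reduce to.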
. * Table 3.3. OLS results hedonic price function, linear model
. reg price lotsize bedroom bathrms airco driveway recroom fullbase gashw garagepl prefarea stories
Source | SS df MS Number of obs = 546
-------------+---------------------------------- F(11, 534) = 99.97
Model | 2.6158e+11 11 2.3780e+10 Prob > F = 0.0000
Residual | 1.2703e+11 534 237874666 R-squared = 0.6731
-------------+---------------------------------- Adj R-squared = 0.6664
Total | 3.8860e+11 545 713032635 Root MSE = 15423
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lotsize | 3.546303 .3503 10.12 0.000 2.858168 4.234438
bedrooms | 1832.003 1047 1.75 0.081 -224.7409 3888.748
bathrms | 14335.56 1489.921 9.62 0.000 11408.73 17262.38
airco | 12632.89 1555.021 8.12 0.000 9578.182 15687.6
driveway | 6687.779 2045.246 3.27 0.001 2670.065 10705.49
recroom | 4511.284 1899.958 2.37 0.018 778.9759 8243.592
fullbase | 5452.386 1588.024 3.43 0.001 2332.845 8571.926
gashw | 12831.41 3217.597 3.99 0.000 6510.706 19152.11
garagepl | 4244.829 840.5442 5.05 0.000 2593.65 5896.008
prefarea | 9369.513 1669.091 5.61 0.000 6090.724 12648.3
stories | 6556.946 925.2899 7.09 0.000 4739.291 8374.6
_cons | -4038.35 3409.471 -1.18 0.237 -10735.97 2659.271
------------------------------------------------------------------------------
. * Table 3.4 Forecasting equation S&P 500 excess returns
. u "Data/predictsp5.dta", clear
. eststo Full: reg exret l.b_m l.dfr l.dfy l.logdp l.logdy l.logep l2.infl l.ltr l.lty l.tms winter
> if yyyymm < 200401
Source | SS df MS Number of obs = 646
-------------+---------------------------------- F(11, 634) = 4.67
Model | .085934914 11 .007812265 Prob > F = 0.0000
Residual | 1.06080631 634 .001673196 R-squared = 0.0749
-------------+---------------------------------- Adj R-squared = 0.0589
Total | 1.14674123 645 .001777893 Root MSE = .0409
------------------------------------------------------------------------------
exret | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
b_m |
L1. | -.0368628 .017917 -2.06 0.040 -.0720466 -.0016789
|
dfr |
L1. | .1742668 .1680648 1.04 0.300 -.1557642 .5042979
|
dfy |
L1. | 1.705778 .6871452 2.48 0.013 .3564221 3.055134
|
logdp |
L1. | .0518217 .042463 1.22 0.223 -.0315635 .135207
|
logdy |
L1. | -.0550532 .0408865 -1.35 0.179 -.1353426 .0252362
|
logep |
L1. | .0377153 .0124171 3.04 0.002 .0133317 .062099
|
infl |
L2. | -.3605266 .6476758 -0.56 0.578 -1.632376 .9113227
|
ltr |
L1. | .1760991 .0755754 2.33 0.020 .0276908 .3245074
|
lty |
L1. | -.3499433 .0978919 -3.57 0.000 -.542175 -.1577117
|
tms |
L1. | .414244 .1494562 2.77 0.006 .120755 .7077331
|
winter | .0095454 .0032615 2.93 0.004 .0031406 .0159501
_cons | .2011918 .0629227 3.20 0.001 .0776296 .3247539
------------------------------------------------------------------------------
. * Stepwise regression: excludes regressors with an absolute t-ratio smaller than 1.96
. eststo Stepwise: reg exret l.b_m l.dfy l.logep l.lty l.tms winter if yyyymm < 200401
Source | SS df MS Number of obs = 647
-------------+---------------------------------- F(6, 640) = 7.46
Model | .074984508 6 .012497418 Prob > F = 0.0000
Residual | 1.07191051 640 .00167486 R-squared = 0.0654
-------------+---------------------------------- Adj R-squared = 0.0566
Total | 1.14689502 646 .001775379 Root MSE = .04093
------------------------------------------------------------------------------
exret | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
b_m |
L1. | -.0413953 .0156137 -2.65 0.008 -.0720556 -.0107351
|
dfy |
L1. | 2.012768 .6534208 3.08 0.002 .7296608 3.295876
|
logep |
L1. | .035245 .008954 3.94 0.000 .0176623 .0528277
|
lty |
L1. | -.3664099 .0868687 -4.22 0.000 -.5369921 -.1958278
|
tms |
L1. | .401002 .1347627 2.98 0.003 .1363716 .6656325
|
winter | .0090887 .0032381 2.81 0.005 .0027301 .0154473
_cons | .20808 .0538762 3.86 0.000 .1022846 .3138754
------------------------------------------------------------------------------
. estat ic
Akaike's information criterion and Bayesian information criterion
-----------------------------------------------------------------------------
Model | N ll(null) ll(model) df AIC BIC
-------------+---------------------------------------------------------------
Stepwise | 647 1131.412 1153.286 7 -2292.572 -2261.266
-----------------------------------------------------------------------------
Note: BIC uses N = number of observations. See [R] BIC note.
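The stepwise specification above is typed in by hand. As a rough sketch of how a backward elimination at the 5% level might be automated, Stata's `stepwise` prefix with the `pr()` option can be used. Two assumptions apply: the data are already declared as time series with `tsset` (the lag operators in the log suggest so), and the lagged variables are generated explicitly because the prefix may not accept time-series operators. The names `b_m_1`, `infl_2`, and so on are hypothetical and exist only for this illustration. The variables retained need not coincide with the hand-picked set, and the estimation sample is restricted to observations complete on all candidate regressors.

```stata
* Hypothetical lag variables, created only for this sketch
foreach v in b_m dfr dfy logdp logdy logep ltr lty tms {
    generate `v'_1 = L.`v'
}
generate infl_2 = L2.infl

* Backward elimination: terms with p-value >= 0.05 are removed one at a time
stepwise, pr(0.05): regress exret b_m_1 dfr_1 dfy_1 logdp_1 logdy_1 ///
    logep_1 infl_2 ltr_1 lty_1 tms_1 winter if yyyymm < 200401
```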
. * Maximum adjusted R-squared / minimum AIC
. eststo AIC: reg exret l.b_m l.dfy l.logep l.ltr l.lty l.tms winter if yyyymm < 200401
Source | SS df MS Number of obs = 647
-------------+---------------------------------- F(7, 639) = 6.96
Model | .0812643 7 .011609186 Prob > F = 0.0000
Residual | 1.06563072 639 .001667654 R-squared = 0.0709
-------------+---------------------------------- Adj R-squared = 0.0607
Total | 1.14689502 646 .001775379 Root MSE = .04084
------------------------------------------------------------------------------
exret | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
b_m |
L1. | -.0382198 .0156658 -2.44 0.015 -.0689824 -.0074572
|
dfy |
L1. | 1.737098 .6673099 2.60 0.009 .4267128 3.047484
|
logep |
L1. | .0342268 .0089501 3.82 0.000 .0166517 .051802
|
ltr |
L1. | .1225549 .0631555 1.94 0.053 -.0014625 .2465722
|
lty |
L1. | -.350817 .0870533 -4.03 0.000 -.5217621 -.1798719
|
tms |
L1. | .4175686 .1347432 3.10 0.002 .1529757 .6821616
|
winter | .0091893 .0032316 2.84 0.005 .0028435 .015535
_cons | .2015684 .0538647 3.74 0.000 .095795 .3073417
------------------------------------------------------------------------------
. estat ic
Akaike's information criterion and Bayesian information criterion
-----------------------------------------------------------------------------
Model | N ll(null) ll(model) df AIC BIC
-------------+---------------------------------------------------------------
AIC | 647 1131.412 1155.187 8 -2294.374 -2258.595
-----------------------------------------------------------------------------
Note: BIC uses N = number of observations. See [R] BIC note.
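The criteria printed by `estat ic` can be recomputed from stored results. The sketch below assumes the definitions AIC = -2 ln L + 2k and BIC = -2 ln L + k ln N, with k taken from `e(rank)`. It must be run while the regression of interest is still the active estimate. For the model above (ll = 1155.187, k = 8, N = 647) it should reproduce -2294.374 and -2258.595 up to rounding.

```stata
* e(ll), e(rank) and e(N) are stored by regress
display "AIC = " -2*e(ll) + 2*e(rank)
display "BIC = " -2*e(ll) + e(rank)*ln(e(N))
```

These are likelihood-based versions of the criteria. For a fixed sample size they rank linear models in the same order as the residual-variance-based definitions, but the numerical values differ.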
. * Minimum BIC
. eststo BIC: reg exret l.logep l.ltr l.lty l.tms winter if yyyymm < 200401
Source | SS df MS Number of obs = 647
-------------+---------------------------------- F(5, 641) = 7.92
Model | .066707636 5 .013341527 Prob > F = 0.0000
Residual | 1.08018738 641 .00168516 R-squared = 0.0582
-------------+---------------------------------- Adj R-squared = 0.0508
Total | 1.14689502 646 .001775379 Root MSE = .04105
------------------------------------------------------------------------------
exret | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
logep |
L1. | .0168167 .0044369 3.79 0.000 .0081042 .0255292
|
ltr |
L1. | .1581856 .0620292 2.55 0.011 .0363807 .2799905
|
lty |
L1. | -.2121012 .0624824 -3.39 0.001 -.3347961 -.0894063
|
tms |
L1. | .5126211 .1305532 3.93 0.000 .2562575 .7689848
|
winter | .0101107 .003232 3.13 0.002 .003764 .0164574
_cons | .094022 .0241341 3.90 0.000 .0466306 .1414134
------------------------------------------------------------------------------
. estat ic
Akaike's information criterion and Bayesian information criterion
-----------------------------------------------------------------------------
Model | N ll(null) ll(model) df AIC BIC
-------------+---------------------------------------------------------------
BIC | 647 1131.412 1150.798 6 -2289.596 -2262.761
-----------------------------------------------------------------------------
Note: BIC uses N = number of observations. See [R] BIC note.
. estout Full AIC BIC Stepwise, cells(b(nostar fmt(3)) se(par fmt(3))) stats(r2 r2_p N, fmt(%5.0g %5
> .0g) labels(R-Squared Pseudo_R-Squared N )) varlabels(_cons constant) varwidth(10) ti("Table 3.4 F
> orecasting equation S&P 500 excess returns")
Table 3.4 Forecasting equation S&P 500 excess returns
--------------------------------------------------------------
                   Full          AIC          BIC     Stepwise
                   b/se         b/se         b/se         b/se
--------------------------------------------------------------
L.b_m            -0.037       -0.038                    -0.041
                (0.018)      (0.016)                   (0.016)
L.dfr             0.174
                (0.168)
L.dfy             1.706        1.737                     2.013
                (0.687)      (0.667)                   (0.653)
L.logdp           0.052
                (0.042)
L.logdy          -0.055
                (0.041)
L.logep           0.038        0.034        0.017        0.035
                (0.012)      (0.009)      (0.004)      (0.009)
L2.infl          -0.361
                (0.648)
L.ltr             0.176        0.123        0.158
                (0.076)      (0.063)      (0.062)
L.lty            -0.350       -0.351       -0.212       -0.366
                (0.098)      (0.087)      (0.062)      (0.087)
L.tms             0.414        0.418        0.513        0.401
                (0.149)      (0.135)      (0.131)      (0.135)
winter            0.010        0.009        0.010        0.009
                (0.003)      (0.003)      (0.003)      (0.003)
constant          0.201        0.202        0.094        0.208
                (0.063)      (0.054)      (0.024)      (0.054)
--------------------------------------------------------------
R-Squared          .075         .071         .058         .065
Pseudo_R~d
N                   646          647          647          647
--------------------------------------------------------------
. * Table 3.6 Summary statistics, 1472 individuals
. u "Data/Bwages.dta", clear
. tabstat wage educ exper, by(male) stat(mean sd)
Summary statistics: mean, sd
  by categories of: male

    male |      wage      educ     exper
---------+------------------------------
       0 |  10.26154  3.587219   15.2038
         |  3.808585  1.086521  9.704987
---------+------------------------------
       1 |  11.56223  3.243001  18.52296
         |  4.753789  1.257386  10.25104
---------+------------------------------
   Total |  11.05062  3.378397  17.21739
         |  4.450513  1.204522  10.16667
----------------------------------------
. * Table 3.7 OLS results specification 1
. reg wage male educ exper
Source | SS df MS Number of obs = 1,472
-------------+---------------------------------- F(3, 1468) = 281.98
Model | 10651.6554 3 3550.55181 Prob > F = 0.0000
Residual | 18484.5373 1,468 12.5916467 R-squared = 0.3656
-------------+---------------------------------- Adj R-squared = 0.3643
Total | 29136.1928 1,471 19.8070651 Root MSE = 3.5485
------------------------------------------------------------------------------
wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | 1.346144 .1927364 6.98 0.000 .9680761 1.724212
educ | 1.98609 .0806396 24.63 0.000 1.827909 2.144271
exper | .1922751 .0095831 20.06 0.000 .1734771 .2110731
_cons | .2136922 .386895 0.55 0.581 -.5452338 .9726183
------------------------------------------------------------------------------
. * Table 3.8 OLS results specification 2
. g expersq= exper^2
. reg wage male educ exper expersq
Source | SS df MS Number of obs = 1,472
-------------+---------------------------------- F(4, 1467) = 223.20
Model | 11023.4381 4 2755.85953 Prob > F = 0.0000
Residual | 18112.7546 1,467 12.3467993 R-squared = 0.3783
-------------+---------------------------------- Adj R-squared = 0.3766
Total | 29136.1928 1,471 19.8070651 Root MSE = 3.5138
------------------------------------------------------------------------------
wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | 1.333693 .1908668 6.99 0.000 .9592925 1.708094
educ | 1.988127 .0798526 24.90 0.000 1.831489 2.144764
exper | .3579993 .0316566 11.31 0.000 .2959024 .4200963
expersq | -.0043692 .0007962 -5.49 0.000 -.005931 -.0028073
_cons | -.8924851 .4329127 -2.06 0.039 -1.741679 -.0432912
------------------------------------------------------------------------------
. * Figure 3.1 Residuals versus fitted values - linear model
. predict xb, xb
. predict u, r
. twoway (scatter u xb), ytitle("") ylabel(-20(10)40) xtitle("") caption(Figure 3.1 Residuals versus
> fitted values - linear model)
. graph export figure3_1.png, replace
(note: file figure3_1.png not found)
(file figure3_1.png written in PNG format)
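The figure above is built manually from `predict` and `twoway scatter`. As an alternative sketch, Stata's postestimation command `rvfplot` draws residuals against fitted values in one step after `regress`. The `yline(0)` reference line is an addition for readability and is not part of the original figure.

```stata
* Run while the specification 2 regression is still the active estimate
rvfplot, yline(0)
```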
. * Table 3.9 OLS results specification 3 and the F-test
. g lnexpersq = lnexper^2
. reg lnwage male lneduc lnexper lnexpersq
Source | SS df MS Number of obs = 1,472
-------------+---------------------------------- F(4, 1467) = 223.13
Model | 73.1312577 4 18.2828144 Prob > F = 0.0000
Residual | 120.204562 1,467 .081939033 R-squared = 0.3783
-------------+---------------------------------- Adj R-squared = 0.3766
Total | 193.33582 1,471 .131431557 Root MSE = .28625
------------------------------------------------------------------------------
lnwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | .1179433 .0155711 7.57 0.000 .0873993 .1484874
lneduc | .4421763 .0181921 24.31 0.000 .4064911 .4778616
lnexper | .1098205 .0543838 2.02 0.044 .0031421 .2164988
lnexpersq | .0260073 .0114762 2.27 0.024 .0034958 .0485188
_cons | 1.262706 .0663418 19.03 0.000 1.132571 1.39284
------------------------------------------------------------------------------
. * Figure 3.2 Residuals versus fitted values - loglinear model
. predict lnxb, xb
. predict lnu, r
. twoway (scatter lnu lnxb), ytitle("") xtitle("") ylabel(-2(1)2) xlabel(1 (.5) 3) caption(Figure
> 3.2 Residuals versus fitted values - loglinear model)
. graph export figure3_2.png, replace
(note: file figure3_2.png not found)
(file figure3_2.png written in PNG format)
. * Table 3.10 OLS results specification 4
. reg lnwage male lneduc lnexper
Source | SS df MS Number of obs = 1,472
-------------+---------------------------------- F(3, 1468) = 294.96
Model | 72.7104488 3 24.2368163 Prob > F = 0.0000
Residual | 120.625371 1,468 .082169871 R-squared = 0.3761
-------------+---------------------------------- Adj R-squared = 0.3748
Total | 193.33582 1,471 .131431557 Root MSE = .28665
------------------------------------------------------------------------------
lnwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | .1200777 .0155645 7.71 0.000 .0895467 .1506087
lneduc | .4366162 .0180512 24.19 0.000 .4012072 .4720252
lnexper | .2306474 .0107336 21.49 0.000 .2095925 .2517023
_cons | 1.14473 .0411808 27.80 0.000 1.06395 1.225509
------------------------------------------------------------------------------
. * Table 3.11 OLS results specification 5 and the F-test
. reg lnwage male i.educ lnexper
Source | SS df MS Number of obs = 1,472
-------------+---------------------------------- F(6, 1465) = 161.14
Model | 76.8667701 6 12.8111284 Prob > F = 0.0000
Residual | 116.46905 1,465 .079501058 R-squared = 0.3976
-------------+---------------------------------- Adj R-squared = 0.3951
Total | 193.33582 1,471 .131431557 Root MSE = .28196
------------------------------------------------------------------------------
lnwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | .1176238 .0154565 7.61 0.000 .0873046 .147943
|
educ |
2 | .1436364 .0333578 4.31 0.000 .0782023 .2090705
3 | .3048744 .0320225 9.52 0.000 .2420596 .3676892
4 | .4742768 .0330129 14.37 0.000 .4095192 .5390344
5 | .6391026 .0332227 19.24 0.000 .5739335 .7042718
|
lnexper | .2302227 .010559 21.80 0.000 .2095104 .250935
_cons | 1.271888 .0448344 28.37 0.000 1.183942 1.359835
------------------------------------------------------------------------------
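The table title mentions an F-test, but the log contains no test command for it. One possible reading, offered here as an assumption, is a comparison of specification 4 (restricted, with `lneduc`) against specification 5 (unrestricted, with four education dummies), which implies J = 3 restrictions. A minimal sketch using the residual sums of squares copied from the two outputs, with hypothetical scalar names:

```stata
* Residual SS from Table 3.10 (restricted) and Table 3.11 (unrestricted)
scalar S_R   = 120.625371
scalar S_U   = 116.46905
scalar Fstat = ((S_R - S_U)/3) / (S_U/1465)
display "F(3, 1465) = " Fstat
display "p-value    = " Ftail(3, 1465, Fstat)
```

`Ftail()` returns the upper-tail probability of the F distribution, so a small value leads to rejection of the restricted specification.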
. * Table 3.12 OLS results specification 6 and the F-test
. reg lnwage c.male#i.educ lnexper c.lnexper#c.male i.educ
Source | SS df MS Number of obs = 1,472
-------------+---------------------------------- F(11, 1460) = 89.69
Model | 77.9624965 11 7.08749968 Prob > F = 0.0000
Residual | 115.373323 1,460 .079022824 R-squared = 0.4032
-------------+---------------------------------- Adj R-squared = 0.3988
Total | 193.33582 1,471 .131431557 Root MSE = .28111
----------------------------------------------------------------------------------
lnwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------------+----------------------------------------------------------------
educ#c.male |
1 | .1537538 .0952221 1.61 0.107 -.0330329 .3405406
2 | .057244 .0745444 0.77 0.443 -.0889816 .2034697
3 | -.0130207 .0635555 -0.20 0.838 -.1376906 .1116492
4 | -.0186097 .0624594 -0.30 0.766 -.1411295 .1039101
5 | .0075969 .0612747 0.12 0.901 -.112599 .1277928
|
lnexper | .207439 .0165491 12.53 0.000 .1749765 .2399015
|
c.lnexper#c.male | .040632 .0214915 1.89 0.059 -.0015256 .0827896
|
educ |
2 | .224107 .0675788 3.32 0.001 .0915451 .3566689
3 | .4331904 .06323 6.85 0.000 .3091591 .5572217
4 | .6019133 .0627983 9.58 0.000 .4787287 .7250979
5 | .7549128 .0646697 11.67 0.000 .6280575 .8817682
|
_cons | 1.215836 .0776769 15.65 0.000 1.063466 1.368206
----------------------------------------------------------------------------------
. * Table 3.13 OLS results specification 7
. reg lnwage male c.lnexper##i.educ
Source | SS df MS Number of obs = 1,472
-------------+---------------------------------- F(10, 1461) = 97.90
Model | 77.5717588 10 7.75717588 Prob > F = 0.0000
Residual | 115.764061 1,461 .079236181 R-squared = 0.4012
-------------+---------------------------------- Adj R-squared = 0.3971
Total | 193.33582 1,471 .131431557 Root MSE = .28149
--------------------------------------------------------------------------------
lnwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------------+----------------------------------------------------------------
male | .1159727 .0154778 7.49 0.000 .0856117 .1463337
lnexper | .1631229 .0653945 2.49 0.013 .0348458 .2914
|
educ |
2 | .0672667 .226281 0.30 0.766 -.3766035 .511137
3 | .1352482 .2188858 0.62 0.537 -.2941157 .5646121
4 | .20495 .2194569 0.93 0.351 -.2255342 .6354342
5 | .3412997 .2180759 1.57 0.118 -.0864756 .7690749
|
educ#c.lnexper |
2 | .0193341 .070487 0.27 0.784 -.1189324 .1576007
3 | .0498847 .0682127 0.73 0.465 -.0839205 .18369
4 | .0878362 .068766 1.28 0.202 -.0470544 .2227267
5 | .0999624 .0682177 1.47 0.143 -.0338526 .2337774
|
_cons | 1.48891 .2120301 7.02 0.000 1.072994 1.904826
--------------------------------------------------------------------------------
. log close
name: SN
log: \5iexample3_s.smcl
log type: smcl
closed on: 9 Jun 2020, 23:38:08
----------------------------------------------------------------------------------------------------