Chapter 5. Multiple Regression Analysis: OLS Asymptotics

import statsmodels.formula.api as smf
from statsmodels.iolib.summary2 import summary_col

from wooldridge import dataWoo  # loader for the Wooldridge textbook data sets

Example 5.2 Birth weight equation, Standard Errors

df = dataWoo('bwght')
half= df['cigs'].count()/2
half
694.0
df2 = df[:694]  # first half of the sample (694 of 1388 observations)
bwght_ols_half = smf.ols(formula='lbwght  ~ cigs  + lfaminc + 1', data=df2).fit()
print(bwght_ols_half.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                 lbwght   R-squared:                       0.030
Model:                            OLS   Adj. R-squared:                  0.027
Method:                 Least Squares   F-statistic:                     10.52
Date:                Mon, 11 Dec 2023   Prob (F-statistic):           3.16e-05
Time:                        18:36:37   Log-Likelihood:                 147.30
No. Observations:                 694   AIC:                            -288.6
Df Residuals:                     691   BIC:                            -275.0
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      4.7056      0.027    173.939      0.000       4.652       4.759
cigs          -0.0046      0.001     -3.481      0.001      -0.007      -0.002
lfaminc        0.0194      0.008      2.370      0.018       0.003       0.035
==============================================================================
Omnibus:                      384.000   Durbin-Watson:                   1.859
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             5273.755
Skew:                          -2.170   Prob(JB):                         0.00
Kurtosis:                      15.788   Cond. No.                         22.8
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
bwght_ols = smf.ols(formula='lbwght  ~ cigs  + lfaminc + 1', data=df).fit()
print(bwght_ols.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                 lbwght   R-squared:                       0.026
Model:                            OLS   Adj. R-squared:                  0.024
Method:                 Least Squares   F-statistic:                     18.31
Date:                Mon, 11 Dec 2023   Prob (F-statistic):           1.42e-08
Time:                        18:36:37   Log-Likelihood:                 349.39
No. Observations:                1388   AIC:                            -692.8
Df Residuals:                    1385   BIC:                            -677.1
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      4.7186      0.018    258.631      0.000       4.683       4.754
cigs          -0.0041      0.001     -4.756      0.000      -0.006      -0.002
lfaminc        0.0163      0.006      2.913      0.004       0.005       0.027
==============================================================================
Omnibus:                      610.862   Durbin-Watson:                   1.927
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             5956.668
Skew:                          -1.786   Prob(JB):                         0.00
Kurtosis:                      12.499   Cond. No.                         24.1
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
print(summary_col([bwght_ols_half, bwght_ols], stars=True,float_format='%0.3f',
                  model_names=['bwght_ols_half','bwght_ols'],
                  info_dict={'N':lambda x: "{0:d}".format(int(x.nobs)),
                             'R2':lambda x: "{:.3f}".format(x.rsquared)}))
=======================================
               bwght_ols_half bwght_ols
---------------------------------------
Intercept      4.706***       4.719*** 
               (0.027)        (0.018)  
cigs           -0.005***      -0.004***
               (0.001)        (0.001)  
lfaminc        0.019**        0.016*** 
               (0.008)        (0.006)  
R-squared      0.030          0.026    
R-squared Adj. 0.027          0.024    
N              694            1388     
R2             0.030          0.026    
=======================================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
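The two fits above differ only in sample size, so they give a quick check of the asymptotic result that standard errors shrink at the rate 1/sqrt(n). The sketch below is an illustration rather than part of the original notebook: it copies the coefficients and t statistics from the two summaries, recovers each standard error as coef / t (an approximation, since the printed values are rounded), and compares the full-sample to half-sample ratio with sqrt(694/1388). The dictionary names and layout are assumptions made for this example.

```python
import math

# (coef, t) pairs copied from the printed summaries above.
# The "std err" column is rounded to 3 decimals, so SE is recovered as coef / t.
half_sample = {
    "Intercept": (4.7056, 173.939),
    "cigs": (-0.0046, -3.481),
    "lfaminc": (0.0194, 2.370),
}
full_sample = {
    "Intercept": (4.7186, 258.631),
    "cigs": (-0.0041, -4.756),
    "lfaminc": (0.0163, 2.913),
}

n_half, n_full = 694, 1388

# If se is roughly c / sqrt(n), then se_full / se_half is roughly sqrt(n_half / n_full).
expected_ratio = math.sqrt(n_half / n_full)

se_ratio = {}
for name, (coef_h, t_h) in half_sample.items():
    coef_f, t_f = full_sample[name]
    se_half = coef_h / t_h   # approximate SE in the half sample
    se_full = coef_f / t_f   # approximate SE in the full sample
    se_ratio[name] = se_full / se_half

print("expected ratio: {:.3f}".format(expected_ratio))
for name, ratio in se_ratio.items():
    print("{:<10s} {:.3f}".format(name, ratio))
```

The ratios should land near, but not exactly on, the theoretical value, because the 1/sqrt(n) rate is only an approximation in finite samples and the inputs are rounded.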

Example 5.3 Economic model of crime

df = dataWoo('crime1')
crime_ols = smf.ols(formula='narr86  ~ pcnv  + avgsen + tottime + ptime86 + qemp86 + 1', 
                    data=df).fit()
print(crime_ols.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                 narr86   R-squared:                       0.043
Model:                            OLS   Adj. R-squared:                  0.041
Method:                 Least Squares   F-statistic:                     24.29
Date:                Mon, 11 Dec 2023   Prob (F-statistic):           5.43e-24
Time:                        18:36:37   Log-Likelihood:                -3392.7
No. Observations:                2725   AIC:                             6797.
Df Residuals:                    2719   BIC:                             6833.
Df Model:                           5                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.7061      0.033     21.297      0.000       0.641       0.771
pcnv          -0.1512      0.041     -3.701      0.000      -0.231      -0.071
avgsen        -0.0070      0.012     -0.568      0.570      -0.031       0.017
tottime        0.0121      0.010      1.263      0.207      -0.007       0.031
ptime86       -0.0393      0.009     -4.403      0.000      -0.057      -0.022
qemp86        -0.1031      0.010     -9.915      0.000      -0.123      -0.083
==============================================================================
Omnibus:                     2395.326   Durbin-Watson:                   1.837
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           106869.684
Skew:                           4.001   Prob(JB):                         0.00
Kurtosis:                      32.618   Cond. No.                         16.3
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
crime_ols_r = smf.ols(formula='narr86  ~ pcnv + ptime86 + qemp86 + 1', data=df).fit()
df['resid'] = crime_ols_r.resid  # restricted-model residuals, stored in df for the auxiliary regression
print(crime_ols_r.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                 narr86   R-squared:                       0.041
Model:                            OLS   Adj. R-squared:                  0.040
Method:                 Least Squares   F-statistic:                     39.10
Date:                Mon, 11 Dec 2023   Prob (F-statistic):           9.91e-25
Time:                        18:36:37   Log-Likelihood:                -3394.7
No. Observations:                2725   AIC:                             6797.
Df Residuals:                    2721   BIC:                             6821.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.7118      0.033     21.565      0.000       0.647       0.776
pcnv          -0.1499      0.041     -3.669      0.000      -0.230      -0.070
ptime86       -0.0344      0.009     -4.007      0.000      -0.051      -0.018
qemp86        -0.1041      0.010    -10.023      0.000      -0.124      -0.084
==============================================================================
Omnibus:                     2394.860   Durbin-Watson:                   1.836
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           106169.153
Skew:                           4.002   Prob(JB):                         0.00
Kurtosis:                      32.513   Cond. No.                         8.27
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
# print(" (LM, p, df) = ", crime_ols.compare_lm_test(crime_ols_r))
# Alternatively, regress the restricted-model residuals on the full set of regressors:
crime_resid = smf.ols(formula='resid  ~ pcnv  + avgsen + tottime + ptime86 + qemp86 + 1', 
                      data=df).fit()
print(crime_resid.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  resid   R-squared:                       0.001
Model:                            OLS   Adj. R-squared:                 -0.000
Method:                 Least Squares   F-statistic:                    0.8136
Date:                Mon, 11 Dec 2023   Prob (F-statistic):              0.540
Time:                        18:36:38   Log-Likelihood:                -3392.7
No. Observations:                2725   AIC:                             6797.
Df Residuals:                    2719   BIC:                             6833.
Df Model:                           5                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     -0.0057      0.033     -0.172      0.863      -0.071       0.059
pcnv          -0.0013      0.041     -0.032      0.975      -0.081       0.079
avgsen        -0.0070      0.012     -0.568      0.570      -0.031       0.017
tottime        0.0121      0.010      1.263      0.207      -0.007       0.031
ptime86       -0.0048      0.009     -0.543      0.587      -0.022       0.013
qemp86         0.0010      0.010      0.098      0.922      -0.019       0.021
==============================================================================
Omnibus:                     2395.326   Durbin-Watson:                   1.837
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           106869.684
Skew:                           4.001   Prob(JB):                         0.00
Kurtosis:                      32.618   Cond. No.                         16.3
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
LM = 2725 * 0.0015  # LM = n * R-squared from the auxiliary regression of resid on all regressors
LM
4.0875
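To turn the LM statistic into a test decision, compare it with a chi-square distribution whose degrees of freedom equal the number of excluded variables, here q = 2 (avgsen and tottime). The sketch below is an illustration, not part of the original notebook. It reuses n = 2725 and the rounded R-squared of 0.0015 from the cell above, and it relies on the fact that the chi-square survival function with 2 degrees of freedom has the closed form exp(-x/2), so only the standard library is needed. The variable names are assumptions made for this example.

```python
import math

n = 2725          # observations in crime1
r2_aux = 0.0015   # rounded R-squared of the auxiliary regression (from the cell above)
q = 2             # number of exclusion restrictions: avgsen and tottime

lm = n * r2_aux   # LM statistic, n * R-squared

# For q = 2 only: the chi-square survival function is exp(-x / 2).
p_value = math.exp(-lm / 2)

# 10% critical value for chi-square(2): solve exp(-c / 2) = 0.10 for c.
crit_10 = -2 * math.log(0.10)

reject_at_10 = lm > crit_10

print("LM = {:.4f}, p-value = {:.4f}".format(lm, p_value))
print("10% critical value = {:.3f}, reject H0: {}".format(crit_10, reject_at_10))
```

With these inputs the statistic falls below the 10% critical value, so the joint null that avgsen and tottime have no effect is not rejected at that level. For a number of restrictions other than 2 the closed form no longer applies, and a chi-square routine such as `scipy.stats.chi2.sf` would be the usual choice.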