Chapter 14. Advanced Panel Data Methods#

Home | Stata | R

import pandas as pd
from statsmodels.iolib.summary2 import summary_col

from wooldridge import *

Example 14.1. Effect of Job Training on Firm Scrap Rates#

import warnings
warnings.filterwarnings('ignore')
df = dataWoo("jtrain")
dfp= pd.DataFrame(df.set_index(['fcode', 'year'], inplace=True))
from linearmodels import PanelOLS 
fe1 = PanelOLS.from_formula('lscrap ~1 + d88 + d89 + grant + grant_1 + EntityEffects', data=df).fit()
print(fe1)
                          PanelOLS Estimation Summary                           
================================================================================
Dep. Variable:                 lscrap   R-squared:                        0.2010
Estimator:                   PanelOLS   R-squared (Between):             -0.0177
No. Observations:                 162   R-squared (Within):               0.2010
Date:                Mon, Dec 11 2023   R-squared (Overall):              0.0021
Time:                        18:37:45   Log-likelihood                   -80.946
Cov. Estimator:            Unadjusted                                           
                                        F-statistic:                      6.5426
Entities:                          54   P-value                           0.0001
Avg Obs:                       3.0000   Distribution:                   F(4,104)
Min Obs:                       3.0000                                           
Max Obs:                       3.0000   F-statistic (robust):             6.5426
                                        P-value                           0.0001
Time periods:                       3   Distribution:                   F(4,104)
Avg Obs:                       54.000                                           
Min Obs:                       54.000                                           
Max Obs:                       54.000                                           
                                                                                
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
Intercept      0.5974     0.0677     8.8202     0.0000      0.4631      0.7318
d88           -0.0802     0.1095    -0.7327     0.4654     -0.2973      0.1369
d89           -0.2472     0.1332    -1.8556     0.0663     -0.5114      0.0170
grant         -0.2523     0.1506    -1.6751     0.0969     -0.5510      0.0464
grant_1       -0.4216     0.2102    -2.0057     0.0475     -0.8384     -0.0048
==============================================================================

F-test for Poolability: 24.661
P-value: 0.0000
Distribution: F(53,104)

Included effects: Entity
fe2 = PanelOLS.from_formula('lscrap ~1 + d88 + d89 + grant + EntityEffects', data=df).fit()
print(fe2)
                          PanelOLS Estimation Summary                           
================================================================================
Dep. Variable:                 lscrap   R-squared:                        0.1701
Estimator:                   PanelOLS   R-squared (Between):             -0.0028
No. Observations:                 162   R-squared (Within):               0.1701
Date:                Mon, Dec 11 2023   R-squared (Overall):              0.0129
Time:                        18:37:45   Log-likelihood                   -84.020
Cov. Estimator:            Unadjusted                                           
                                        F-statistic:                      7.1760
Entities:                          54   P-value                           0.0002
Avg Obs:                       3.0000   Distribution:                   F(3,105)
Min Obs:                       3.0000                                           
Max Obs:                       3.0000   F-statistic (robust):             7.1760
                                        P-value                           0.0002
Time periods:                       3   Distribution:                   F(3,105)
Avg Obs:                       54.000                                           
Min Obs:                       54.000                                           
Max Obs:                       54.000                                           
                                                                                
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
Intercept      0.5974     0.0687     8.6960     0.0000      0.4612      0.7337
d88           -0.1401     0.1068    -1.3110     0.1927     -0.3519      0.0718
d89           -0.4270     0.0999    -4.2732     0.0000     -0.6252     -0.2289
grant         -0.0822     0.1263    -0.6511     0.5164     -0.3326      0.1681
==============================================================================

F-test for Poolability: 23.900
P-value: 0.0000
Distribution: F(53,105)

Included effects: Entity

Example 14.2.Has the Return to Education Changed over Time?#

df = dataWoo("wagepan")
year = pd.Categorical(df.year)
dfp= pd.DataFrame(df.set_index(['nr','year'], inplace=True))
df['year'] = year
fe1 = PanelOLS.from_formula('lwage ~ 1 + union + married + year*educ + EntityEffects', data=df, drop_absorbed=True).fit()
print(fe1)
                          PanelOLS Estimation Summary                           
================================================================================
Dep. Variable:                  lwage   R-squared:                        0.1708
Estimator:                   PanelOLS   R-squared (Between):              0.0905
No. Observations:                4360   R-squared (Within):               0.1708
Date:                Mon, Dec 11 2023   R-squared (Overall):              0.1277
Time:                        18:37:46   Log-likelihood                   -1350.7
Cov. Estimator:            Unadjusted                                           
                                        F-statistic:                      48.907
Entities:                         545   P-value                           0.0000
Avg Obs:                       8.0000   Distribution:                 F(16,3799)
Min Obs:                       8.0000                                           
Max Obs:                       8.0000   F-statistic (robust):             48.907
                                        P-value                           0.0000
Time periods:                       8   Distribution:                 F(16,3799)
Avg Obs:                       545.00                                           
Min Obs:                       545.00                                           
Max Obs:                       545.00                                           
                                                                                
                                 Parameter Estimates                                 
=====================================================================================
                   Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
-------------------------------------------------------------------------------------
Intercept             1.3625     0.0162     83.903     0.0000      1.3306      1.3943
union                 0.0830     0.0194     4.2671     0.0000      0.0449      0.1211
married               0.0548     0.0184     2.9773     0.0029      0.0187      0.0909
year[T.1981]         -0.0224     0.1459    -0.1537     0.8779     -0.3084      0.2636
year[T.1982]         -0.0058     0.1459    -0.0395     0.9685     -0.2917      0.2802
year[T.1983]          0.0104     0.1459     0.0715     0.9430     -0.2755      0.2964
year[T.1984]          0.0844     0.1459     0.5785     0.5630     -0.2016      0.3703
year[T.1985]          0.0497     0.1459     0.3409     0.7332     -0.2362      0.3357
year[T.1986]          0.0656     0.1459     0.4497     0.6530     -0.2204      0.3516
year[T.1987]          0.0904     0.1459     0.6201     0.5352     -0.1955      0.3764
year[T.1981]:educ     0.0116     0.0123     0.9448     0.3448     -0.0125      0.0356
year[T.1982]:educ     0.0148     0.0123     1.2061     0.2279     -0.0093      0.0388
year[T.1983]:educ     0.0171     0.0123     1.3959     0.1628     -0.0069      0.0412
year[T.1984]:educ     0.0166     0.0123     1.3521     0.1764     -0.0075      0.0406
year[T.1985]:educ     0.0237     0.0123     1.9316     0.0535     -0.0004      0.0478
year[T.1986]:educ     0.0274     0.0123     2.2334     0.0256      0.0033      0.0515
year[T.1987]:educ     0.0304     0.0123     2.4798     0.0132      0.0064      0.0545
=====================================================================================

F-test for Poolability: 8.0932
P-value: 0.0000
Distribution: F(544,3799)

Included effects: Entity

Example 14.3.Effect of Job Training on Firm Scrap Rates#

df = dataWoo("jtrain")
dfp= pd.DataFrame(df.set_index(['fcode', 'year'], inplace=True))
from linearmodels import PanelOLS 
fe1 = PanelOLS.from_formula('lscrap ~1 + d88 + d89 + grant + grant_1 + lsales + lemploy + EntityEffects', data=df).fit()
print(fe1)
                          PanelOLS Estimation Summary                           
================================================================================
Dep. Variable:                 lscrap   R-squared:                        0.2131
Estimator:                   PanelOLS   R-squared (Between):             -0.0797
No. Observations:                 148   R-squared (Within):               0.2131
Date:                Mon, Dec 11 2023   R-squared (Overall):             -0.0494
Time:                        18:37:46   Log-likelihood                   -68.887
Cov. Estimator:            Unadjusted                                           
                                        F-statistic:                      4.1063
Entities:                          51   P-value                           0.0011
Avg Obs:                       2.9020   Distribution:                    F(6,91)
Min Obs:                       1.0000                                           
Max Obs:                       3.0000   F-statistic (robust):             4.1063
                                        P-value                           0.0011
Time periods:                       3   Distribution:                    F(6,91)
Avg Obs:                       49.333                                           
Min Obs:                       47.000                                           
Max Obs:                       51.000                                           
                                                                                
                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
Intercept      2.1155     3.1084     0.6806     0.4979     -4.0590      8.2900
d88           -0.0040     0.1195    -0.0331     0.9736     -0.2414      0.2335
d89           -0.1322     0.1537    -0.8601     0.3920     -0.4375      0.1731
grant         -0.2968     0.1571    -1.8891     0.0621     -0.6088      0.0153
grant_1       -0.5356     0.2242    -2.3888     0.0190     -0.9809     -0.0902
lsales        -0.0869     0.2597    -0.3345     0.7388     -0.6027      0.4290
lemploy       -0.0764     0.3503    -0.2180     0.8279     -0.7722      0.6194
==============================================================================

F-test for Poolability: 20.748
P-value: 0.0000
Distribution: F(50,91)

Included effects: Entity

Example 14.4. A Wage Equation Using Panel Data#

df = dataWoo("wagepan")
year = pd.Categorical(df.year)
dfp= pd.DataFrame(df.set_index(['nr','year'], inplace=True))
df['year'] = year
FE = PanelOLS.from_formula('lwage ~ 1 + educ + black + hisp + exper + expersq + married + union + year + EntityEffects', data=df, drop_absorbed=True).fit()
from linearmodels import PooledOLS
POLS = PooledOLS.from_formula('lwage ~ 1 + educ + black + hisp + exper + expersq + married + union + year', data=df).fit()
from linearmodels import RandomEffects
RE = RandomEffects.from_formula('lwage ~ 1 + educ + black + hisp + exper + expersq + married + union + year', data=df).fit()

from linearmodels.panel import compare
print(compare({'Pooled':POLS, 'RE':RE, 'FE':FE}))
                            Model Comparison                           
=======================================================================
                                Pooled                RE             FE
-----------------------------------------------------------------------
Dep. Variable                    lwage             lwage          lwage
Estimator                    PooledOLS     RandomEffects       PanelOLS
No. Observations                  4360              4360           4360
Cov. Est.                   Unadjusted        Unadjusted     Unadjusted
R-squared                       0.1893            0.1806         0.1806
R-Squared (Within)              0.1692            0.1799         0.1806
R-Squared (Between)             0.2066            0.1853        -0.0528
R-Squared (Overall)             0.1893            0.1828         0.0552
F-statistic                     72.459            68.409         83.851
P-value (F-stat)                0.0000            0.0000         0.0000
=====================     ============   ===============   ============
Intercept                       0.0921            0.0234         1.0276
                              (1.1761)          (0.1546)       (34.312)
educ                            0.0913            0.0919               
                              (17.442)          (8.5744)               
black                          -0.1392           -0.1394               
                             (-5.9049)         (-2.9054)               
hisp                            0.0160            0.0217               
                              (0.7703)          (0.5078)               
exper                           0.0672            0.1058         0.1321
                              (4.9095)          (6.8706)       (13.450)
expersq                        -0.0024           -0.0047        -0.0052
                             (-2.9413)         (-6.8623)      (-7.3612)
married                         0.1083            0.0638         0.0467
                              (6.8997)          (3.8035)       (2.5494)
union                           0.1825            0.1059         0.0800
                              (10.635)          (5.9289)       (4.1430)
year[T.1981]                    0.0583            0.0404         0.0190
                              (1.9214)          (1.6362)       (0.9353)
year[T.1982]                    0.0628            0.0309        -0.0113
                              (1.8900)          (0.9519)      (-0.5597)
year[T.1983]                    0.0620            0.0202        -0.0420
                              (1.6915)          (0.4840)      (-2.0667)
year[T.1984]                    0.0905            0.0430        -0.0385
                              (2.2566)          (0.8350)      (-1.8938)
year[T.1985]                    0.1092            0.0577        -0.0432
                              (2.5200)          (0.9383)      (-2.1362)
year[T.1986]                    0.1420            0.0918        -0.0274
                              (3.0580)          (1.2834)      (-1.3432)
year[T.1987]                    0.1738            0.1348               
                              (3.5165)          (1.6504)               
======================= ============== ================= ==============
Effects                                                          Entity
-----------------------------------------------------------------------

T-stats reported in parentheses