Wooldridge Econometrics Pdf 128285 | Papke Wooldridge 1996

Partial capture of text on file.
      Econometric Methods for Fractional Response Variables With an Application to 401 (K) Plan
      Participation Rates
      Author(s): Leslie E. Papke and Jeffrey M. Wooldridge
      Source: Journal of Applied Econometrics, Vol. 11, No. 6 (Nov. - Dec., 1996), pp. 619-632
      Published by: John Wiley & Sons
      Stable URL: http://www.jstor.org/stable/2285155 .
JOURNAL OF APPLIED ECONOMETRICS, VOL. 11, 619-632 (1996)

ECONOMETRIC METHODS FOR FRACTIONAL RESPONSE VARIABLES WITH AN APPLICATION TO 401(K) PLAN PARTICIPATION RATES

LESLIE E. PAPKE AND JEFFREY M. WOOLDRIDGE
Department of Economics, Michigan State University, Marshall Hall, East Lansing, MI 48824-1038, USA
                                                                                 SUMMARY 
We develop attractive functional forms and simple quasi-likelihood estimation methods for regression models with a fractional dependent variable. Compared with log-odds type procedures, there is no difficulty in recovering the regression function for the fractional variable, and there is no need to use ad hoc transformations to handle data at the extreme values of zero and one. We also offer some new, robust specification tests by nesting the logit or probit function in a more general functional form. We apply these methods to a data set of employee participation rates in 401(k) pension plans.
                                                                          1.  INTRODUCTION 
Fractional response variables arise naturally in many economic settings. The fraction of total weekly hours spent working, the proportion of income spent on charitable contributions, and participation rates in voluntary pension plans are just a few examples of economic variables bounded between zero and one. The bounded nature of such variables and the possibility of observing values at the boundaries raise interesting functional form and inference issues. In this paper we specify and analyse a class of functional forms with satisfying econometric properties. We also synthesize and expand on the generalized linear models (GLM) literature from statistics and the quasi-likelihood literature from econometrics to obtain robust methods for estimation and inference with fractional response variables.

We apply the methods to estimate a model of employee participation rates in 401(k) pension plans. The key explanatory variable of interest is the plan's 'match rate,' the rate at which a firm matches a dollar of employee contributions. The empirical work extends that of Papke (1995), who studied this problem using linear spline methods. Spline methods are flexible, but they do not ensure that predicted values lie in the unit interval.

To illustrate the methodological issues that arise with fractional dependent variables, suppose that a variable $y$, $0 \le y \le 1$, is to be explained by a $1 \times K$ vector of explanatory variables $x \equiv (x_1, x_2, \ldots, x_K)$, with the convention that $x_1 \equiv 1$. The population model

$$E(y \mid x) = \beta_1 + \beta_2 x_2 + \cdots + \beta_K x_K = x\beta \qquad (1)$$

where $\beta$ is a $K \times 1$ vector, rarely provides the best description of $E(y \mid x)$. The primary reason is that $y$ is bounded between 0 and 1, and so the effect of any particular $x_j$ cannot be constant throughout the range of $x$ (unless the range of $x_j$ is very limited). To some extent this problem can be overcome by augmenting a linear model with non-linear functions of $x$, but predicted values from an OLS regression can never be guaranteed to lie in the unit interval. Thus, the drawbacks of linear models for fractional data are analogous to the drawbacks of the linear probability model for binary data.

CCC 0883-7252/96/060619-14. © 1996 by John Wiley & Sons, Ltd. Received 25 October 1993; revised 19 February 1996.

The most common alternative to equation (1) has been to model the log-odds ratio as a linear function. If $y$ is strictly between zero and one then a linear model for the log-odds ratio is

$$E(\log[y/(1 - y)] \mid x) = x\beta \qquad (2)$$

Equation (2) is attractive because $\log[y/(1 - y)]$ can take on any real value as $y$ varies between 0 and 1, so it is natural to model its population regression as a linear function. Nevertheless, there are two potential problems with equation (2). First, the equation cannot be true if $y$ takes on the values 0 or 1 with positive probability. Consequently, given a set of data, if any observation $y_i$ equals 0 or 1 then an adjustment must be made before computing the log-odds ratio. When the $y_i$ are proportions from a fixed number of groups with known group sizes, adjustments are available in the literature; see, for example, Maddala (1983, p. 30). Estimation of the log-odds model then corresponds to Berkson's minimum chi-square method.
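
As a rough illustration of the two boundary problems described above, the following Python sketch fits equation (1) by OLS to a small set of made-up rates and then attempts the log-odds transformation used in equation (2). The data values and the helper names `ols_fit` and `log_odds` are assumptions introduced for this example; they do not come from the paper.

```python
import math

def ols_fit(xs, ys):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def log_odds(y):
    """Log-odds transform log[y/(1-y)]; only defined for 0 < y < 1."""
    if not 0.0 < y < 1.0:
        raise ValueError("log-odds is undefined at y = 0 or y = 1")
    return math.log(y / (1.0 - y))

# Hypothetical data: x could be a match rate, y a participation rate.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.5, 0.9, 1.0]

# Linear model (1): fitted values are not constrained to the unit interval.
b0, b1 = ols_fit(xs, ys)
fitted = [b0 + b1 * x for x in xs]
outside = [f for f in fitted if f < 0.0 or f > 1.0]

# Log-odds model (2): boundary observations cannot be transformed as they stand.
transformable = [y for y in ys if 0.0 < y < 1.0]
log_odds_values = [log_odds(y) for y in transformable]
n_lost = len(ys) - len(transformable)
```

With these made-up values the OLS fitted value at the largest $x$ exceeds one, and the observation with $y = 1$ cannot enter the log-odds regression without some adjustment.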
Unfortunately, the minimum chi-square method for a fixed number of categories is not applicable to certain economic problems. First, the fraction $y$ may not be a proportion from a discrete group size; for example, $y_i$ could be the fraction of county land area containing toxic waste dumps, or the proportion of income given in charitable contributions. Second, one may be hesitant to adjust the extreme values in the data if a large percentage is at the extremes. In our application to 401(k) plan participation rates, about 40% of the $y_i$ take on the value unity. It seems more natural to treat such examples in a regression-type framework.
Even when model (2) is well defined, there is still a problem. Without further assumptions, we cannot recover $E(y \mid x)$, which is our primary interest. Under model (2) the expected value of $y$ given $x$ is

$$E(y \mid x) = \int_{-\infty}^{\infty} \frac{\exp(x\beta + v)}{1 + \exp(x\beta + v)} \, f(v \mid x) \, dv \qquad (3)$$

where $f(\cdot \mid x)$ denotes the conditional density of $u \equiv \log[y/(1 - y)] - x\beta$ given $x$, and $v$ is a dummy argument of integration. Even if $u$ and $x$ are assumed to be independent, $E(y \mid x) \ne \exp(x\beta)/[1 + \exp(x\beta)]$, although $E(y \mid x)$ can be estimated using, for example, Duan's (1983) smearing method. If $u$ and $x$ are not independent, model (3) cannot be estimated without estimating $f(\cdot \mid x)$. This is either difficult or non-robust, depending on whether a non-parametric or a parametric approach is adopted. Instead, we prefer to specify models for $E(y \mid x)$ directly, without having to estimate the density of $u$ given $x$.
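
The retransformation problem can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the log-odds error $u$ takes the values $-1$ and $+1$ with equal probability, independently of $x$, and that $x\beta = 1$. The function `smearing_mean` averages the inverse log-odds transform over a set of residuals, in the spirit of Duan's (1983) smearing method; the function names and all numbers are assumptions for this example rather than anything taken from the paper.

```python
import math

def logistic(z):
    """Inverse of the log-odds transform: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def smearing_mean(xb, residuals):
    """Average the retransformed values logistic(xb + u_i) over the residuals u_i."""
    return sum(logistic(xb + u) for u in residuals) / len(residuals)

xb = 1.0                  # hypothetical value of the linear index x*beta
residuals = [-1.0, 1.0]   # hypothetical mean-zero log-odds errors

naive = logistic(xb)                     # plugs u = 0 into the inverse transform
smeared = smearing_mean(xb, residuals)   # averages over the error distribution
gap = naive - smeared
```

Here the naive value is about 0.731 while the averaged value is about 0.690, so the naive retransformation does not recover the conditional mean even though the error has mean zero and is independent of $x$.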
Naturally, it is always possible to estimate $E(y \mid x)$ by assuming a particular distribution for $y$ given $x$ and estimating the parameters of the conditional distribution by maximum likelihood. One plausible distribution for fractional $y$ is the beta distribution; Mullahy (1990) suggests this as one possible approach. Unfortunately, the estimates of $E(y \mid x)$ that one obtains are known not to be robust to distributional failure (this follows from Gourieroux, Monfort, and Trognon (1984); more on this below). Clearly, standard distributional assumptions can fail in certain applications. One important limitation of the beta distribution is that it implies that each value in $[0, 1]$ is taken on with probability zero. Thus, the beta distribution is difficult to justify in applications where at least some portion of the sample is at the extreme values of zero or one.
In the next section we specify a reasonable class of functional forms for $E(y \mid x)$ and show how to estimate the parameters using Bernoulli quasi-likelihood methods. These functional forms and estimators circumvent the problems raised above and are easily implemented. Some new specification tests are offered in Section 3, and Section 4 contains the empirical application relating 401(k) plan participation rates to the plan's matching rate and other plan characteristics.
                2.  FUNCTIONAL FORMS AND QUASI-LIKELIHOOD METHODS
We assume the availability of an independent (though not necessarily identically distributed) sequence of observations $\{(x_i, y_i): i = 1, 2, \ldots, N\}$, where $0 \le y_i \le 1$ and $N$ is the sample size. The asymptotic analysis is carried out as $N \to \infty$. Our maintained assumption is that, for all $i$,

$$E(y_i \mid x_i) = G(x_i \beta) \qquad (4)$$

where $G(\cdot)$ is a known function satisfying $0 < G(z) < 1$ for all $z \in \mathbb{R}$. This ensures that the predicted values of $y$ lie in the interval $(0, 1)$. Equation (4) is well defined even if $y_i$ can take on 0 or 1 with positive probability. Typically, $G(\cdot)$ is chosen to be a cumulative distribution function (cdf), with the two most popular examples being $G(z) \equiv \Lambda(z) \equiv \exp(z)/[1 + \exp(z)]$, the logistic function, and $G(z) \equiv \Phi(z)$, where $\Phi(\cdot)$ is the standard normal cdf. However, $G(\cdot)$ need not even be a cdf in what follows.
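
The two choices of $G(\cdot)$ named above can be written down in a few lines. The sketch below uses only Python's standard library; the function names, the coefficient vector `beta`, and the regressor vector `x` are hypothetical values chosen for illustration.

```python
import math

def logistic_cdf(z):
    """Lambda(z) = exp(z) / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def normal_cdf(z):
    """Standard normal cdf Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def conditional_mean(x, beta, G):
    """E(y | x) = G(x * beta) for a row vector x and a coefficient vector beta."""
    index = sum(xj * bj for xj, bj in zip(x, beta))
    return G(index)

# Hypothetical values: x has a leading 1 for the intercept, as in the text.
x = [1.0, 0.5]
beta = [-0.2, 1.5]

mean_logit = conditional_mean(x, beta, logistic_cdf)
mean_probit = conditional_mean(x, beta, normal_cdf)
```

Both fitted means lie strictly inside the unit interval for any finite index, which is the property required of $G(\cdot)$ in equation (4). In floating-point arithmetic a very large index can still round to exactly 0 or 1.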
In stating equation (4) we make no assumption about an underlying structure used to obtain $y_i$. In the special case that $y_i$ is a proportion from a group of known size $n_i$, the methods in this paper ignore the information on $n_i$. There are some advantages to ignoring $n_i$. First, one does not always want to condition on $n_i$, in which case $y_i$ contains all relevant information. Second, the methods here are computationally simple. Third, under the assumptions we impose, the method suggested here need not be less efficient than methods that use information on group size. (See Papke and Wooldridge (1993) for methods that incorporate information on $n_i$ in a similar framework.)

We have stated the functional form directly in terms of $E(y_i \mid x_i)$, where $x_i$ is observable. Stating the model of interest in terms of $E(y_i \mid x_i, \theta_i)$, where $\theta_i$ is unobserved heterogeneity independent of $x_i$, requires one to specify a distribution for $\theta_i$ in order to obtain $E(y_i \mid x_i)$ (which is ultimately of interest in any case). Generally, although not always, this will lead to a different functional form from equation (4). Allowing for functional forms other than the index structure in equation (4) may be worthwhile, but it is not within the scope of this paper. In Section 3 we present a general functional form test that has power against a variety of functional form misspecifications, including those that arise from models of unobserved heterogeneity.
Under equation (4), $\beta$ can be consistently estimated by non-linear least squares (NLS). The fact that equation (4) is non-linear in $\beta$ is perhaps the leading reason a linear model for $y_i$ or for the log-odds ratio is used in applied work. Further, heteroscedasticity is likely to be present since $\mathrm{Var}(y_i \mid x_i)$ is unlikely to be constant when $0 \le y_i \le 1$. [...] the NLS estimates and [...]
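
A minimal sketch of NLS estimation under equation (4) follows, assuming a logistic $G(\cdot)$, a single regressor plus an intercept, and a Gauss-Newton iteration with step halving. The iteration scheme, the function names, and the noise-free data are all assumptions made for this illustration; the paper does not prescribe a particular numerical algorithm here.

```python
import math

def logistic(z):
    """Logistic cdf; the two branches avoid overflow in exp for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def ssr(b0, b1, xs, ys):
    """Sum of squared residuals for the mean function G(b0 + b1 * x)."""
    return sum((y - logistic(b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

def nls_logistic(xs, ys, iters=100, tol=1e-12):
    """Gauss-Newton NLS for E(y | x) = logistic(b0 + b1 * x), with step halving."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        # Accumulate J'J and J'r, where J is the gradient of G w.r.t. (b0, b1).
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            p = logistic(b0 + b1 * x)
            d = p * (1.0 - p)      # derivative of the logistic cdf at the index
            r = y - p              # current residual
            a11 += d * d
            a12 += d * d * x
            a22 += d * d * x * x
            g1 += d * r
            g2 += d * r * x
        det = a11 * a22 - a12 * a12
        s0 = (a22 * g1 - a12 * g2) / det   # solve the 2x2 normal equations
        s1 = (a11 * g2 - a12 * g1) / det
        step, current = 1.0, ssr(b0, b1, xs, ys)
        # Halve the step until the sum of squares does not increase.
        while ssr(b0 + step * s0, b1 + step * s1, xs, ys) > current and step > 1e-8:
            step /= 2.0
        b0, b1 = b0 + step * s0, b1 + step * s1
        if abs(step * s0) + abs(step * s1) < tol:
            break
    return b0, b1

# Hypothetical noise-free data generated from G(0.5 + 1.0 * x).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [logistic(0.5 + 1.0 * x) for x in xs]

b0_hat, b1_hat = nls_logistic(xs, ys)
fitted = [logistic(b0_hat + b1_hat * x) for x in xs]
```

Because the data here are generated without noise, the iteration should recover the generating coefficients up to numerical tolerance. Observations equal to exactly 0 or 1 could be passed to the same routine without any transformation, since only the mean function is modelled.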