Econometric Methods for Fractional Response Variables With an Application to 401(K) Plan Participation Rates
Author(s): Leslie E. Papke and Jeffrey M. Wooldridge
Source: Journal of Applied Econometrics, Vol. 11, No. 6 (Nov. - Dec., 1996), pp. 619-632
Published by: John Wiley & Sons
Stable URL: http://www.jstor.org/stable/2285155

JOURNAL OF APPLIED ECONOMETRICS, VOL. 11, 619-632 (1996)

ECONOMETRIC METHODS FOR FRACTIONAL RESPONSE VARIABLES WITH AN APPLICATION TO 401(K) PLAN PARTICIPATION RATES

LESLIE E. PAPKE AND JEFFREY M. WOOLDRIDGE
Department of Economics, Michigan State University, Marshall Hall, East Lansing, MI 48824-1038, USA

CCC 0883-7252/96/060619-14. © 1996 by John Wiley & Sons, Ltd. Received 25 October 1993. Revised 19 February 1996.

SUMMARY

We develop attractive functional forms and simple quasi-likelihood estimation methods for regression models with a fractional dependent variable. Compared with log-odds type procedures, there is no difficulty in recovering the regression function for the fractional variable, and there is no need to use ad hoc transformations to handle data at the extreme values of zero and one. We also offer some new, robust specification tests by nesting the logit or probit function in a more general functional form. We apply these methods to a data set of employee participation rates in 401(k) pension plans.

1. INTRODUCTION

Fractional response variables arise naturally in many economic settings. The fraction of total weekly hours spent working, the proportion of income spent on charitable contributions, and participation rates in voluntary pension plans are just a few examples of economic variables bounded between zero and one. The bounded nature of such variables and the possibility of observing values at the boundaries raise interesting functional form and inference issues. In this paper we specify and analyse a class of functional forms with satisfying econometric properties. We also synthesize and expand on the generalized linear models (GLM) literature from statistics and the quasi-likelihood literature from econometrics to obtain robust methods for estimation and inference with fractional response variables.

We apply the methods to estimate a model of employee participation rates in 401(k) pension plans. The key explanatory variable of interest is the plan's 'match rate,' the rate at which a firm matches a dollar of employee contributions. The empirical work extends that of Papke (1995), who studied this problem using linear spline methods. Spline methods are flexible, but they do not ensure that predicted values lie in the unit interval.

To illustrate the methodological issues that arise with fractional dependent variables, suppose that a variable $y$, $0 \le y \le 1$, is to be explained by a $1 \times K$ vector of explanatory variables $x \equiv (x_1, x_2, \ldots, x_K)$, with the convention that $x_1 \equiv 1$. The population model

$$E(y \mid x) = \beta_1 + \beta_2 x_2 + \cdots + \beta_K x_K = x\beta \tag{1}$$

where $\beta$ is a $K \times 1$ vector, rarely provides the best description of $E(y \mid x)$. The primary reason is that $y$ is bounded between 0 and 1, and so the effect of any particular $x_j$ cannot be constant throughout the range of $x$ (unless the range of $x_j$ is very limited). To some extent this problem can be overcome by augmenting a linear model with non-linear functions of $x$, but the predicted values from an OLS regression can never be guaranteed to lie in the unit interval. Thus, the drawbacks of linear models for fractional data are analogous to the drawbacks of the linear probability model for binary data.

The most common alternative to equation (1) has been to model the log-odds ratio as a linear function. If $y$ is strictly between zero and one then a linear model for the log-odds ratio is

$$E(\log[y/(1 - y)] \mid x) = x\beta \tag{2}$$

Equation (2) is attractive because $\log[y/(1 - y)]$ can take on any real value as $y$ varies between 0 and 1, so it is natural to model its population regression as a linear function. Nevertheless, there are two potential problems with equation (2). First, the equation cannot be true if $y$ takes on the values 0 or 1 with positive probability. Consequently, given a set of data, if any observation $y_i$ equals 0 or 1 then an adjustment must be made before computing the log-odds ratio. When the $y_i$ are proportions from a fixed number of groups with known group sizes, adjustments are available in the literature; see, for example, Maddala (1983, p. 30).
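The boundary problem with the log-odds transformation can be illustrated with a short sketch. The function names below are hypothetical, and the adjustment shown (adding 1/2 to each count when the group size is known) is one familiar ad hoc correction chosen here as an assumption; it is not necessarily the adjustment given in Maddala (1983).

```python
import math


def log_odds(y):
    """Return log[y / (1 - y)], which exists only for 0 < y < 1."""
    if not 0.0 < y < 1.0:
        raise ValueError("log-odds is undefined at y = 0 or y = 1")
    return math.log(y / (1.0 - y))


def adjusted_log_odds(successes, n):
    """Hypothetical ad hoc fix for grouped data with known group size n.

    Adds 1/2 to both counts so the ratio stays finite at the extremes.
    This is an illustrative assumption, not the paper's method.
    """
    return math.log((successes + 0.5) / (n - successes + 0.5))


# An interior fraction transforms without difficulty.
interior = log_odds(0.5)

# A participation rate of exactly one cannot be transformed directly.
try:
    log_odds(1.0)
    boundary_failed = False
except ValueError:
    boundary_failed = True

# With a known group size the adjusted version is finite at the boundary.
adjusted = adjusted_log_odds(10, 10)
```

The adjusted version needs the group size, so it offers no help when the fraction does not come from a discrete group.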
Estimation of the log-odds model then corresponds to Berkson's minimum chi-square method.

Unfortunately, the minimum chi-square method for a fixed number of categories is not applicable to certain economic problems. First, the fraction $y$ may not be a proportion from a discrete group size: for example, $y_i$ could be the fraction of county land area containing toxic waste dumps, or the proportion of income given in charitable contributions. Second, one may be hesitant to adjust the extreme values in the data if a large percentage is at the extremes. In our application to 401(k) plan participation rates, about 40% of the $y_i$ takes on the value unity. It seems more natural to treat such examples in a regression-type framework.

Even when model (2) is well defined, there is still a problem. Without further assumptions, we cannot recover $E(y \mid x)$, which is our primary interest. Under model (2) the expected value of $y$ given $x$ is

$$E(y \mid x) = \int_{-\infty}^{\infty} \frac{\exp(x\beta + v)}{1 + \exp(x\beta + v)} \, f(v \mid x) \, dv \tag{3}$$

where $f(\cdot \mid x)$ denotes the conditional density of $u \equiv \log[y/(1 - y)] - x\beta$ given $x$ and $v$ is a dummy argument of integration. Even if $u$ and $x$ are assumed to be independent, $E(y \mid x) \ne \exp(x\beta)/[1 + \exp(x\beta)]$, although $E(y \mid x)$ can be estimated using, for example, Duan's (1983) smearing method. If $u$ and $x$ are not independent, model (3) cannot be estimated without estimating $f(\cdot \mid x)$. This is either difficult or non-robust, depending on whether a non-parametric or a parametric approach is adopted. Instead, we prefer to specify models for $E(y \mid x)$ directly, without having to estimate the density of $u$ given $x$.

Naturally, it is always possible to estimate $E(y \mid x)$ by assuming a particular distribution for $y$ given $x$ and estimating the parameters of the conditional distribution by maximum likelihood. One plausible distribution for fractional $y$ is the beta distribution; Mullahy (1990) suggests this as one possible approach. Unfortunately, the estimates of $E(y \mid x)$ that one obtains are known not to be robust to distributional failure (this follows from Gourieroux, Monfort, and Trognon (1984); more on this below). Clearly, standard distributional assumptions can fail in certain applications. One important limitation of the beta distribution is that it implies that each value in $[0, 1]$ is taken on with probability zero. Thus, the beta distribution is difficult to justify in applications where at least some portion of the sample is at the extreme values of zero or one.

In the next section we specify a reasonable class of functional forms for $E(y \mid x)$ and show how to estimate the parameters using Bernoulli quasi-likelihood methods. These functional forms and estimators circumvent the problems raised above and are easily implemented. Some new specification tests are offered in Section 3, and Section 4 contains the empirical application relating 401(k) plan participation rates to the plan's matching rate and other plan characteristics.

2. FUNCTIONAL FORMS AND QUASI-LIKELIHOOD METHODS

We assume the availability of an independent (though not necessarily identically distributed) sequence of observations $\{(x_i, y_i): i = 1, 2, \ldots, N\}$, where $0 \le y_i \le 1$ and $N$ is the sample size. The asymptotic analysis is carried out as $N \to \infty$. Our maintained assumption is that, for all $i$,

$$E(y_i \mid x_i) = G(x_i \beta) \tag{4}$$

where $G(\cdot)$ is a known function satisfying $0 < G(z) < 1$ for all $z \in \mathbb{R}$. This ensures that the predicted values of $y$ lie in the interval $(0, 1)$. Equation (4) is well defined even if $y_i$ can take on 0 or 1 with positive probability. Typically, $G(\cdot)$ is chosen to be a cumulative distribution function (cdf), with the two most popular examples being $G(z) \equiv \Lambda(z) \equiv \exp(z)/[1 + \exp(z)]$ (the logistic function) and $G(z) \equiv \Phi(z)$, where $\Phi(\cdot)$ is the standard normal cdf. However, $G(\cdot)$ need not even be a cdf in what follows.
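As a minimal sketch of equation (4), the two popular choices of $G$ named above can be written with the Python standard library alone. The helper names and the example coefficient values are hypothetical; they are chosen only to show that the index $x_i\beta$ is unrestricted while the implied conditional mean stays inside the unit interval.

```python
import math


def logistic(z):
    """Logistic function exp(z) / [1 + exp(z)], computed stably."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def std_normal_cdf(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_mean(x, beta, G=logistic):
    """Evaluate G(x * beta) for one observation, as in equation (4)."""
    index = sum(xj * bj for xj, bj in zip(x, beta))
    return G(index)


# Hypothetical values: x includes the constant x_1 = 1.
x_i = [1.0, 2.0]
beta = [0.5, -0.25]
mean_logit = conditional_mean(x_i, beta)                     # index is 0
mean_probit = conditional_mean(x_i, beta, G=std_normal_cdf)  # index is 0
```

In exact arithmetic both functions lie strictly between 0 and 1 for every real index; in floating point they can round to 0.0 or 1.0 when the index is very large in absolute value.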
In stating equation (4) we make no assumption about an underlying structure used to obtain $y_i$. In the special case that $y_i$ is a proportion from a group of known size $n_i$, the methods in this paper ignore the information on $n_i$. There are some advantages to ignoring $n_i$. First, one does not always want to condition on $n_i$, in which case $y_i$ contains all relevant information. Second, the methods here are computationally simple. Third, under the assumptions we impose, the method suggested here need not be less efficient than methods that use information on group size. (See Papke and Wooldridge (1993) for methods that incorporate information on $n_i$ in a similar framework.)

We have stated the functional form directly in terms of $E(y_i \mid x_i)$, where $x_i$ is observable. Stating the model of interest in terms of $E(y_i \mid x_i, \phi_i)$, where $\phi_i$ is unobserved heterogeneity independent of $x_i$, requires one to specify a distribution for $\phi_i$ in order to obtain $E(y_i \mid x_i)$ (which is ultimately of interest in any case). Generally, although not always, this will lead to a different functional form from equation (4). Allowing for functional forms other than the index structure in equation (4) may be worthwhile, but it is not within the scope of this paper. In Section 3 we present a general functional form test that has power against a variety of functional form misspecifications, including those that arise from models of unobserved heterogeneity.

Under equation (4), $\beta$ can be consistently estimated by non-linear least squares (NLS). The fact that equation (4) is non-linear in $\beta$ is perhaps the leading reason a linear model for $y_i$ or for the log-odds ratio is used in applied work. Further, heteroscedasticity is likely to be present since $\mathrm{Var}(y_i \mid x_i)$ is unlikely to be constant when $0 \le y_i \le 1$. ... the NLS estimates and