Econometric Methods for Fractional Response Variables With an Application to 401 (K) Plan
Participation Rates
Author(s): Leslie E. Papke and Jeffrey M. Wooldridge
Source: Journal of Applied Econometrics, Vol. 11, No. 6 (Nov. - Dec., 1996), pp. 619-632
Published by: John Wiley & Sons
Stable URL: http://www.jstor.org/stable/2285155 .
JOURNAL OF APPLIED ECONOMETRICS, VOL. 11, 619-632 (1996)
ECONOMETRIC METHODS FOR FRACTIONAL RESPONSE VARIABLES WITH AN APPLICATION TO 401 (K) PLAN PARTICIPATION RATES
LESLIE E. PAPKE AND JEFFREY M. WOOLDRIDGE
Department of Economics, Michigan State University, Marshall Hall, East Lansing, MI 48824-1038, USA
SUMMARY
We develop attractive functional forms and simple quasi-likelihood estimation methods for regression
models with a fractional dependent variable. Compared with log-odds type procedures, there is no
difficulty in recovering the regression function for the fractional variable, and there is no need to use ad
hoc transformations to handle data at the extreme values of zero and one. We also offer some new, robust
specification tests by nesting the logit or probit function in a more general functional form. We apply these
methods to a data set of employee participation rates in 401 (k) pension plans.
1. INTRODUCTION
Fractional response variables arise naturally in many economic settings. The fraction of total
weekly hours spent working, the proportion of income spent on charitable contributions, and
participation rates in voluntary pension plans are just a few examples of economic variables
bounded between zero and one. The bounded nature of such variables and the possibility of
observing values at the boundaries raise interesting functional form and inference issues. In this
paper we specify and analyse a class of functional forms with satisfying econometric properties.
We also synthesize and expand on the generalized linear models (GLM) literature from statistics
and the quasi-likelihood literature from econometrics to obtain robust methods for estimation
and inference with fractional response variables.
We apply the methods to estimate a model of employee participation rates in 401 (k) pension
plans. The key explanatory variable of interest is the plan's 'match rate,' the rate at which a firm
matches a dollar of employee contributions. The empirical work extends that of Papke (1995),
who studied this problem using linear spline methods. Spline methods are flexible, but they do
not ensure that predicted values lie in the unit interval.
To illustrate the methodological issues that arise with fractional dependent variables, suppose
that a variable $y$, $0 \le y \le 1$, is to be explained by a $1 \times K$ vector of explanatory variables
$x \equiv (x_1, x_2, \ldots, x_K)$, with the convention that $x_1 \equiv 1$. The population model
$$E(y \mid x) = \beta_1 + \beta_2 x_2 + \cdots + \beta_K x_K \equiv x\beta \qquad (1)$$
where $\beta$ is a $K \times 1$ vector, rarely provides the best description of $E(y \mid x)$. The primary reason
is that $y$ is bounded between 0 and 1, and so the effect of any particular $x_j$ cannot be constant
throughout the range of $x$ (unless the range of $x_j$ is very limited). To some extent this problem
can be overcome by augmenting a linear model with non-linear functions of $x$, but the predicted
values from an OLS regression can never be guaranteed to lie in the unit interval. Thus, the
drawbacks of linear models for fractional data are analogous to the drawbacks of the linear
probability model for binary data.
CCC 0883-7252/96/060619-14. © 1996 by John Wiley & Sons, Ltd. Received 25 October 1993. Revised 19 February 1996.
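The drawback of the linear model in equation (1) can be seen in a small numerical sketch. The data below are made up purely for illustration (they are not from the 401(k) sample), and the helper name `ols_fit` is hypothetical. The point is only that a least-squares line fitted to a fractional outcome that saturates near one can produce fitted values above one.

```python
# Illustrative data (invented for this sketch, not taken from the paper):
# a fractional outcome that flattens out at one as x grows.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.10, 0.45, 0.80, 0.95, 1.00, 1.00]


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = ols_fit(x, y)

# Fitted values of the linear model at the observed x values.
fitted = [intercept + slope * xi for xi in x]

# Fitted values that fall outside the unit interval.
outside = [f for f in fitted if f < 0.0 or f > 1.0]

print("fitted values:", [round(f, 3) for f in fitted])
print("outside [0, 1]:", [round(f, 3) for f in outside])
```

Adding polynomial or spline terms in `x` would change the fit, but nothing in least squares constrains the fitted values to the unit interval.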
The most common alternative to equation (1) has been to model the log-odds ratio as a linear
function. If $y$ is strictly between zero and one then a linear model for the log-odds ratio is
$$E(\log[y/(1 - y)] \mid x) = x\beta \qquad (2)$$
Equation (2) is attractive because $\log[y/(1 - y)]$ can take on any real value as $y$ varies between
0 and 1, so it is natural to model its population regression as a linear function. Nevertheless,
there are two potential problems with equation (2). First, the equation cannot be true if $y$ takes
on the values 0 or 1 with positive probability. Consequently, given a set of data, if any
observation $y_i$ equals 0 or 1 then an adjustment must be made before computing the log-odds
ratio. When the $y_i$ are proportions from a fixed number of groups with known group sizes,
adjustments are available in the literature; see, for example, Maddala (1983, p. 30). Estimation
of the log-odds model then corresponds to Berkson's minimum chi-square method.
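The first problem with equation (2) is mechanical and can be sketched in a few lines. The function name `log_odds` and the sample values are assumptions made for illustration. The sketch shows that the transformation is defined only for values strictly inside the unit interval, so observations at exactly 0 or 1 must be adjusted or dropped before a log-odds regression can be run.

```python
import math


def log_odds(y):
    """Log-odds transform log[y / (1 - y)], defined only for 0 < y < 1."""
    if not 0.0 < y < 1.0:
        raise ValueError("log-odds is undefined for y <= 0 or y >= 1")
    return math.log(y / (1.0 - y))


# Hypothetical fractional observations, including both boundary values.
rates = [0.0, 0.25, 0.5, 1.0]

transformed = {}   # observations for which the transform exists
undefined = []     # observations that would need an ad hoc adjustment

for y in rates:
    try:
        transformed[y] = log_odds(y)
    except ValueError:
        undefined.append(y)

print("transformed:", transformed)
print("undefined at:", undefined)
```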
Unfortunately, the minimum chi-square method for a fixed number of categories is not
applicable to certain economic problems. First, the fraction $y$ may not be a proportion from a
discrete group size; for example, $y_i$ could be the fraction of county land area containing toxic
waste dumps, or the proportion of income given in charitable contributions. Second, one may be
hesitant to adjust the extreme values in the data if a large percentage is at the extremes. In our
application to 401(k) plan participation rates, about 40% of the $y_i$ take on the value unity. It
seems more natural to treat such examples in a regression-type framework.
Even when model (2) is well defined, there is still a problem. Without further assumptions,
we cannot recover $E(y \mid x)$, which is our primary interest. Under model (2) the expected value
of $y$ given $x$ is
$$E(y \mid x) = \int_{-\infty}^{\infty} \frac{\exp(x\beta + v)}{1 + \exp(x\beta + v)}\, f(v \mid x)\, dv \qquad (3)$$
where $f(\cdot \mid x)$ denotes the conditional density of $u \equiv \log[y/(1 - y)] - x\beta$ given $x$ and $v$ is a
dummy argument of integration. Even if $u$ and $x$ are assumed to be independent,
$E(y \mid x) \ne \exp(x\beta)/[1 + \exp(x\beta)]$, although $E(y \mid x)$ can be estimated using, for example,
Duan's (1983) smearing method. If $u$ and $x$ are not independent, model (3) cannot be estimated
without estimating $f(\cdot \mid x)$. This is either difficult or non-robust, depending on whether a
non-parametric or a parametric approach is adopted. Instead, we prefer to specify models for
$E(y \mid x)$ directly, without having to estimate the density of $u$ given $x$.
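The gap between equation (3) and the naive back-transformation can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's procedure: it assumes $u$ is normal with mean zero and standard deviation `sigma`, independent of $x$, and evaluates the integral in equation (3) by the trapezoid rule. The function names, the index value `xb = 2.0`, and the residuals passed to the smearing function are all hypothetical. The smearing function follows the usual form of Duan's estimator, an average of the back-transformed index over estimated residuals.

```python
import math


def logistic(z):
    """exp(z) / [1 + exp(z)], computed in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def expected_y(xb, sigma, half_width=8.0, steps=4000):
    """Trapezoid approximation to equation (3), ASSUMING u ~ Normal(0, sigma^2)
    independent of x. The normality assumption is for illustration only."""
    lo = -half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.0
    for k in range(steps + 1):
        v = lo + k * h
        density = math.exp(-0.5 * (v / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * logistic(xb + v) * density
    return total * h


def smearing_estimate(xb, residuals):
    """Smearing-type estimate of E(y | x): average the back-transformed
    index over a set of (estimated) log-odds residuals."""
    return sum(logistic(xb + u) for u in residuals) / len(residuals)


xb = 2.0      # hypothetical value of the index x*beta
sigma = 1.5   # hypothetical standard deviation of u

naive = logistic(xb)                  # exp(xb) / [1 + exp(xb)]
integral = expected_y(xb, sigma)      # equation (3) under the normality assumption
smeared = smearing_estimate(xb, [-1.5, 0.0, 1.5])   # made-up residuals

print("naive back-transform:", round(naive, 4))
print("equation (3):        ", round(integral, 4))
print("smearing estimate:   ", round(smeared, 4))
```

With a positive index, both the integral and the smearing estimate come out below the naive back-transform, which illustrates why $E(y \mid x)$ cannot be read off from the log-odds coefficients alone.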
Naturally, it is always possible to estimate $E(y \mid x)$ by assuming a particular distribution for $y$
given $x$ and estimating the parameters of the conditional distribution by maximum likelihood.
One plausible distribution for fractional $y$ is the beta distribution; Mullahy (1990) suggests this
as one possible approach. Unfortunately, the estimates of $E(y \mid x)$ that one obtains are known
not to be robust to distributional failure (this follows from Gourieroux, Monfort, and
Trognon (1984); more on this below). Clearly, standard distributional assumptions can fail in
certain applications. One important limitation of the beta distribution is that it implies that each
value in [0, 1] is taken on with probability zero. Thus, the beta distribution is difficult to justify
in applications where at least some portion of the sample is at the extreme values of zero or one.
In the next section we specify a reasonable class of functional forms for $E(y \mid x)$ and show
how to estimate the parameters using Bernoulli quasi-likelihood methods. These functional
forms and estimators circumvent the problems raised above and are easily implemented. Some
new specification tests are offered in Section 3, and Section 4 contains the empirical application
relating 401 (k) plan participation rates to the plan's matching rate and other plan characteristics.
2. FUNCTIONAL FORMS AND QUASI-LIKELIHOOD METHODS
We assume the availability of an independent (though not necessarily identically distributed)
sequence of observations $\{(x_i, y_i): i = 1, 2, \ldots, N\}$, where $0 \le y_i \le 1$ and $N$ is the sample size.
The asymptotic analysis is carried out as $N \to \infty$. Our maintained assumption is that, for all $i$,
$$E(y_i \mid x_i) = G(x_i\beta) \qquad (4)$$
where $G(\cdot)$ is a known function satisfying $0 < G(z) < 1$ for all $z \in \mathbb{R}$. This ensures that the
predicted values of $y$ lie in the interval (0, 1). Equation (4) is well defined even if $y_i$ can take on
0 or 1 with positive probability. Typically, $G(\cdot)$ is chosen to be a cumulative distribution
function (cdf), with the two most popular examples being $G(z) \equiv \Lambda(z) \equiv \exp(z)/[1 + \exp(z)]$
(the logistic function) and $G(z) \equiv \Phi(z)$, where $\Phi(\cdot)$ is the standard normal cdf.
However, $G(\cdot)$ need not even be a cdf in what follows.
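The two choices of $G(\cdot)$ named above, and the way a model of the form (4) accommodates observations at the boundary, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation. The objective is taken to be the standard Bernoulli log-likelihood, $\sum_i \{y_i \log G(x_i b) + (1 - y_i)\log[1 - G(x_i b)]\}$, with a logistic mean, one regressor, and an intercept. The function names, the Newton iteration, and the data are hypothetical.

```python
import math


def logistic(z):
    """Lambda(z) = exp(z) / [1 + exp(z)], computed stably."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def normal_cdf(z):
    """Phi(z), the standard normal cdf, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_fractional_logit(x, y, iterations=25):
    """Newton iterations maximizing a Bernoulli quasi-log-likelihood with
    mean G(b0 + b1*x), G logistic. Two parameters only, for illustration."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0             # score with respect to (b0, b1)
        h00 = h01 = h11 = 0.0     # negative Hessian entries
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            r = yi - p            # residual; y may be 0, 1, or in between
            w = p * (1.0 - p)
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve (negative Hessian) * step = score for a 2x2 system.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up data with two observations exactly equal to one; no adjustment needed.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.10, 0.45, 0.80, 0.95, 1.00, 1.00]

b0, b1 = fit_fractional_logit(x, y)
fitted = [logistic(b0 + b1 * xi) for xi in x]

print("estimates:", round(b0, 4), round(b1, 4))
print("fitted values:", [round(f, 4) for f in fitted])
```

Every fitted value lies strictly inside the unit interval by construction. Replacing `logistic` with `normal_cdf` in the mean function would give the probit analogue, although the score and Hessian in this sketch are specific to the logistic case and would need to be rederived.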
In stating equation (4) we make no assumption about an underlying structure used to obtain
$y_i$. In the special case that $y_i$ is a proportion from a group of known size $n_i$, the methods in this
paper ignore the information on $n_i$. There are some advantages to ignoring $n_i$. First, one does not
always want to condition on $n_i$, in which case $y_i$ contains all relevant information. Second, the
methods here are computationally simple. Third, under the assumptions we impose, the method
suggested here need not be less efficient than methods that use information on group size. (See
Papke and Wooldridge (1993) for methods that incorporate information on $n_i$ in a similar
framework.)
We have stated the functional form directly in terms of $E(y_i \mid x_i)$, where $x_i$ is observable.
Stating the model of interest in terms of $E(y_i \mid x_i, \phi_i)$, where $\phi_i$ is unobserved heterogeneity
independent of $x_i$, requires one to specify a distribution for $\phi_i$ in order to obtain $E(y_i \mid x_i)$ (which
is ultimately of interest in any case). Generally, although not always, this will lead to a different
functional form from equation (4). Allowing for functional forms other than the index structure
in equation (4) may be worth-while, but it is not within the scope of this paper. In Section 3 we
present a general functional form test that has power against a variety of functional form
misspecifications, including those that arise from models of unobserved heterogeneity.
Under equation (4), $\beta$ can be consistently estimated by non-linear least squares (NLS). The
fact that equation (4) is non-linear in $\beta$ is perhaps the leading reason a linear model for $y_i$ or for
the log-odds ratio is used in applied work. Further, heteroscedasticity is likely to be present
since $\mathrm{Var}(y_i \mid x_i)$ is unlikely to be constant when $0 \le y_i \le 1$.
the NLS estimates and