The effects of birth inputs on birthweight: evidence from quantile estimation on panel data
by Jason Abrevaya* and Christian M. Dahl†
ABSTRACT
Unobserved heterogeneity among childbearing women makes it difficult to isolate the causal effects of smoking and prenatal care on birth outcomes (such as birthweight). Whether or not a mother smokes, for instance, is likely to be correlated with unobserved characteristics of the mother. This paper controls for such unobserved heterogeneity by using state-level panel data on maternally linked births. A quantile-estimation approach, motivated by a correlated random-effects model, is used to estimate the effects of smoking and other observables (number of prenatal-care visits, years of education, etc.) on the entire birthweight distribution.
* Department of Economics, The University of Texas, Austin, TX 78712.
† CREATES and School of Economics and Management, University of Aarhus, Aarhus, Denmark; e-mail: cdahl@econ.au.dk.
1 Introduction
Adverse birth outcomes have been found to result in large economic costs, in the form of both direct medical costs and long-term developmental consequences. It is not surprising, then, that the public-health community has focused efforts on prenatal-care improvements (e.g., through smoking cessation, alcohol-intake reduction, and/or better nutrition) that are thought to improve birth outcomes. Birthweight has served as a leading indicator of infant health, with “low birthweight” (LBW) infants classified as those weighing less than 2500 grams at birth. Observable measures of poor prenatal care, such as smoking, have strong negative associations with birthweight. For instance, according to a report by the Surgeon General, mothers who smoke during pregnancy have babies that, on average, weigh 250 grams less (Centers for Disease Control and Prevention (2001)).
The direct medical costs of low birthweight are quite high. Based upon hospital-discharge data from New York and New Jersey, Almond et al. (2005) report that the hospital costs for newborns peak at around $150,000 (in 2000 dollars) for infants that weigh 800 grams; the costs remain quite high for all “low birthweight” outcomes, with an average cost of around $15,000 for infants that weigh 2000 grams. The infant-mortality rate also increases at lower birthweights.
Other research has examined the long-term effects of low birthweight on cognitive development, educational outcomes, and labor-market outcomes. LBW babies have developmental problems in cognition, attention, and neuromotor functioning that persist until adolescence (Hack et al. (1995)). LBW babies are more likely to delay entry into kindergarten, repeat a grade in school, and attend special-education classes (Corman (1995); Corman and Chaikind (1998)). LBW babies are also more likely to have inferior labor-market outcomes, being more likely to be unemployed and to earn lower wages (Behrman and Rosenzweig (2004); Case et al. (2005); Currie and Hyson (1999)).
Although it has received less attention in the economics literature, high birthweight can also represent an adverse outcome. For instance, babies weighing more than 4000 grams (classified as high birthweight (HBW)) and especially those weighing more than 4500 grams (classified as very high birthweight (VHBW)) are more likely to require cesarean-section births, have higher infant mortality rates, and develop health problems later in life.
A difficulty in evaluating initiatives aimed at improving birth outcomes is accurately estimating the causal effects of prenatal activities on these birth outcomes. Unobserved heterogeneity among childbearing women makes it difficult to isolate causal effects of various determinants of birth outcomes. Whether or not a mother smokes, for instance, is likely to be correlated with unobserved characteristics of the mother. To deal with this difficulty, various studies have used an instrumental-variable approach to estimate the effects of smoking (Evans and Ringel (1999); Permutt and Hebel (1989)), prenatal care (Currie and Gruber (1996); Evans and Lien (2005); Joyce (1999)), and air pollution (Chay and Greenstone (2003a, 2003b)) on birth outcomes.
Another approach has been to utilize panel data (i.e., several births for each mother) to identify these effects from changes in prenatal behavior or maternal characteristics between pregnancies (Abrevaya (2006); Currie and Moretti (2002); Rosenzweig and Wolpin (1991); Royer (2004)). One concern with the panel-data identification strategy is the presence of “feedback effects,” specifically that prenatal care and smoking in later pregnancies may be correlated with birth outcomes in earlier pregnancies. Royer (2004) provides an explicit estimation strategy to deal with such feedback effects (using data on at least three births per mother). Abrevaya (2006) shows that feedback effects are likely to cause the estimated (negative) smoking effect to be too large in magnitude.
Since the costs associated with birthweight have been found to exist primarily at the low end of the birthweight distribution (with costs increasing significantly at the very low end), most studies have estimated the effects of birth inputs on the fraction of births below various thresholds (e.g., 2500 grams for LBW and 1500 grams for “very low birthweight”). As an alternative, this paper considers a quantile-regression approach to estimating the effects of birth inputs on birthweight, so it is useful to compare the two approaches. The threshold-crossing approach fixes a common unconditional threshold for the entire sample, whereas the quantile-regression approach focuses upon particular conditional quantiles of the birthweight distribution. Denoting birthweight by $bw$ and a birth input vector by $x$, a probit-based threshold-crossing model for LBW outcomes would be $\Pr(bw < 2500 \mid x) = \Phi(x'\gamma)$. For each $x$, there is a conditional probability of the LBW outcome ($bw$ below the common threshold), and estimates of $\gamma$ can be used to infer the marginal effects of the birth inputs upon these conditional probabilities. For the quantile approach, a simple (linear) model for, say, the 5% conditional quantile would be $Q_{5\%}(bw \mid x) = x'\beta$. The value of the conditional quantile $Q_{5\%}(bw \mid x)$ may be below the LBW threshold of 2500 grams for some $x$ values and above it for other $x$ values. The estimated marginal effects (inferred from the estimates of $\beta$) would indicate how the 5% conditional quantile would be affected at all $x$ values. These effects are not directly comparable to the probit-based effects.
For the question of economic costs, both the probit approach and the quantile approach have drawbacks: (i) the probit approach is inherently discontinuous and offers only predictions of LBW vs. non-LBW outcomes, and (ii) the quantile approach combines predictions from extremely adverse $x$ values (lower $Q_{5\%}(bw \mid x)$), where the costs are higher, and less adverse $x$ values (higher $Q_{5\%}(bw \mid x)$), where the costs are lower. For the question of what causes LBW outcomes, the simple probit-based approach is certainly sufficient. The quantile approach, however, provides a convenient method for determining how birth inputs affect birthweight at different parts of the distribution. The closest analogy with the threshold-crossing approach would be to continuously alter the threshold value and estimate a series of probit models. Given the different aspects of the birthweight distribution being modeled and estimated by the two approaches, our view is that these approaches should be viewed as complements to each other rather than substitutes.
A recent literature on estimation of quantile treatment effects, including Abadie, Angrist, and Imbens (2002) and Bitler, Gelbach, and Hoynes (2006), has argued that traditional estimation of average (mean) treatment effects may miss important causal impacts. Specifically, an average treatment effect inherently combines the magnitudes of causal effects upon different parts of the conditional distribution. It is quite possible, as in our birthweight application (and also in wage-distribution applications), that societal costs and benefits are more pronounced at the lower quantiles of the conditional distribution. As an example, if one estimated the average causal effect of smoking to be a reduction in birthweight of 150 grams, it could be the case that the effect of smoking on lower quantiles is substantially higher or lower than 150 grams. If a 200-gram effect were estimated at lower quantiles and a 100-gram effect at higher quantiles, this would argue for a stronger policy response than if the effects were instead stronger at the higher quantiles. Ultimately, consideration of how effects vary over the quantiles is an empirical question and one which we attempt to answer in the context of birthweight regressions in this paper.
Previous quantile-estimation approaches to estimating birth-outcome regressions have used cross-sectional data and, therefore, have suffered from an inability to control for unobserved heterogeneity. For instance, Abrevaya (2001) (see also Koenker and Hallock (2001) and Chernozhukov (2005)) uses cross-sectional federal natality data and finds that various observables have significantly stronger associations with birthweight at lower quantiles of the birthweight distribution; unfortunately, one cannot interpret these “effects” as causal since the estimation has a purely reduced-form structure that does not account for unobserved heterogeneity.
The outline of the paper is as follows. Section 2 details the quantile-estimation approach, motivated by the “correlated random effects model” of Chamberlain (1982, 1984). We consider a notion of marginal effects upon conditional quantiles in which we explicitly control for unobserved heterogeneity by allowing the “mother random effect” to be related to observables. Section 3 describes the maternally-linked birth panel data for Washington and Arizona that are used in this study. Section 4 reports the main empirical results of the paper. There are some interesting differences between the panel-data and cross-sectional results. For example, the results from panel-data estimation, which controls for unobserved heterogeneity, indicate that the negative effects of smoking on birthweight are significantly lower (in magnitude) across all quantiles than indicated by the cross-sectional estimates. Section 4.2 provides a general hypothesis testing framework. Section 4.3 discusses issues related to endogeneity (e.g., feedback effects and measurement error) in the panel-data context. Section 5 discusses the theoretical panel-data model in greater detail and highlights directions for future research.
2 Quantile estimation for two-birth panel data
Despite the widespread use of both panel-data methodology and quantile-regression methodology, there has been little work at the intersection of the two methodologies. As discussed in this section, the most likely explanation is the difficulty in extending differencing methods to quantiles. The outline of this section is as follows. Section 2.1 briefly reviews the fixed effects and correlated random effects models for conditional expectations. Building upon the correlated random effects framework of Section 2.1, Section 2.2 extends the notion of marginal effects (and their estimation) to conditional quantile models. Section 2.3 discusses previous related studies.
2.1 Review of conditional expectation models with panel data
Suppose that the data source contains information on exactly two births for a large sample of mothers. A standard linear panel-data model for such a situation would be

$$y_{mb} = x_{mb}'\beta + c_m + u_{mb} \quad (b = 1, 2;\ m = 1, \ldots, M), \tag{1}$$

where $m$ indexes mothers, $b$ indexes births, $y$ denotes a birth outcome (e.g., birthweight), $x$ denotes a vector of observables, $c$ denotes the (unobservable) “mother effect,” and $u$ denotes a birth-specific disturbance. To simplify notation, let $x_m \equiv (x_{m1}, x_{m2})$ denote the covariate values from both births of a given mother. From the basic model in (1), several different types of panel-data models arise from the assumptions concerning the unobservable $c_m$. In the “pure” random-effects version of (1), $c_m$ is assumed to be uncorrelated with $x_m$. This assumption is implausible in the context of our empirical application, so attention is focused upon two models that allow for dependence between $c_m$ and $x_m$: (1) the fixed-effects model and (2) the correlated random-effects model.
Fixed-effects model: The fixed-effects model allows correlation between $c_m$ and $x_m$ in a completely unspecified manner. The “meaning” of the parameter vector $\beta$ is given by

$$\beta = \frac{\partial E(y_{mb} \mid x_m, c_m)}{\partial x_{mb}} \tag{2}$$

under the following assumption:

$$\text{(A1)} \quad E(u_{m1} \mid x_m, c_m) = E(u_{m2} \mid x_m, c_m) = 0 \quad \forall m. \tag{3}$$

It is well known that, under (A1), $\beta$ can be consistently estimated by a first-difference regression (i.e., regressing $y_{m2} - y_{m1}$ on $x_{m2} - x_{m1}$). The reason that this strategy works for the conditional expectation hinges critically upon the fact that an expectation is a linear operator, a property that is not shared by conditional quantiles.
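The first-difference argument can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a scalar covariate, normally distributed disturbances, and a mother effect that enters $x$ with a coefficient of 0.5, all of which are illustrative choices. Under those assumptions, pooled least squares should be biased by the mother effect, while the regression of $y_{m2} - y_{m1}$ on $x_{m2} - x_{m1}$ should land close to $\beta$.

```python
import random

def ols_slope(x, y):
    """Slope from a least-squares regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n_mothers=5000, beta=-2.0, seed=0):
    """Two births per mother; the mother effect c is correlated with x.

    The data-generating process is an assumption made for illustration:
    y_mb = beta * x_mb + c_m + u_mb, as in equation (1) with scalar x.
    """
    rng = random.Random(seed)
    x1, x2, y1, y2 = [], [], [], []
    for _ in range(n_mothers):
        c = rng.gauss(0, 1)              # unobserved mother effect
        a = 0.5 * c + rng.gauss(0, 1)    # first-birth input, depends on c
        b = 0.5 * c + rng.gauss(0, 1)    # second-birth input, depends on c
        x1.append(a)
        x2.append(b)
        y1.append(beta * a + c + rng.gauss(0, 1))
        y2.append(beta * b + c + rng.gauss(0, 1))
    return x1, x2, y1, y2

BETA = -2.0
x1, x2, y1, y2 = simulate(beta=BETA)

# Pooled regression ignores c and picks up its correlation with x.
pooled = ols_slope(x1 + x2, y1 + y2)

# First differences remove c because it is common to both births.
first_diff = ols_slope([b - a for a, b in zip(x1, x2)],
                       [b - a for a, b in zip(y1, y2)])

print("pooled:", round(pooled, 3), "first difference:", round(first_diff, 3))
```

The differencing step works here only because the outcome equation is linear in $c_m$ and the target is a conditional mean.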
Correlated random-effects model: The correlated random-effects model of Chamberlain (1982, 1984) views the unobservable $c_m$ as a linear projection onto the observables plus a disturbance:

$$c_m = \psi + x_{m1}'\lambda_1 + x_{m2}'\lambda_2 + v_m, \tag{4}$$

where $\psi$ is a scalar and $v_m$ is a disturbance that (by definition of linear projections) is uncorrelated with $x_{m1}$ and $x_{m2}$. Combining equations (1) and (4) yields

$$y_{m1} = \psi + x_{m1}'(\beta + \lambda_1) + x_{m2}'\lambda_2 + v_m + u_{m1} \tag{5}$$

$$y_{m2} = \psi + x_{m1}'\lambda_1 + x_{m2}'(\beta + \lambda_2) + v_m + u_{m2}. \tag{6}$$
The parameters $(\psi, \beta, \lambda_1, \lambda_2)$ in (5) and (6) can be estimated by least-squares regression or other methods (see, e.g., Wooldridge (2002, Section 11.3)). The vector $x_{m1}$ affects $y_{m1}$ through two channels: (i) a direct effect (expressed by the $x_{m1}'\beta$ term) and (ii) an indirect effect working through the unobservable effect $c_m$. In contrast, the vector $x_{m1}$ affects $y_{m2}$ only through the unobservable effect $c_m$. In fact, under the additional assumption

$$\text{(A2)} \quad E(v_m \mid x_m) = 0, \tag{7}$$

the “meaning” of $\beta$ is given by the following equation:

$$\beta = \frac{\partial E(y_{m1} \mid x_m)}{\partial x_{m1}} - \frac{\partial E(y_{m2} \mid x_m)}{\partial x_{m1}} = \frac{\partial E(y_{m2} \mid x_m)}{\partial x_{m2}} - \frac{\partial E(y_{m1} \mid x_m)}{\partial x_{m2}}. \tag{8}$$

That is, $\beta$ tells us how much $x_{m1}$ affects $E(y_{m1} \mid x_m)$ above and beyond the effect that works through the unobservable $c_m$.
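Equation (8) can be checked numerically. The following sketch assumes a scalar covariate, made-up parameter values, and independent normal disturbances (so that (A2) holds); it regresses each birth outcome on a constant and both births' covariates, as in (5) and (6), and recovers $\beta$ as a difference of coefficients. The small `ols` helper is a plain normal-equations solver written for this illustration, not a routine taken from the paper.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    for i in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Illustrative (assumed) parameter values for equations (1) and (4).
BETA, LAM1, LAM2, PSI = -2.0, 0.5, 0.3, 1.0

rng = random.Random(0)
rows, y1, y2 = [], [], []
for _ in range(4000):
    x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
    c = PSI + LAM1 * x1 + LAM2 * x2 + rng.gauss(0, 1)  # equation (4)
    rows.append([1.0, x1, x2])
    y1.append(BETA * x1 + c + rng.gauss(0, 1))         # equation (1), b = 1
    y2.append(BETA * x2 + c + rng.gauss(0, 1))         # equation (1), b = 2

b1 = ols(rows, y1)  # estimates (psi, beta + lambda1, lambda2), equation (5)
b2 = ols(rows, y2)  # estimates (psi, lambda1, beta + lambda2), equation (6)

beta_from_x1 = b1[1] - b2[1]  # first equality in (8)
beta_from_x2 = b2[2] - b1[2]  # second equality in (8)
print(round(beta_from_x1, 3), round(beta_from_x2, 3))
```

Both differences estimate the same $\beta$ because the indirect effect through $c_m$ appears identically in the two birth equations and cancels.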
2.2 Estimation of effects on conditional quantiles with panel data
For conditional quantiles, a simple differencing strategy is infeasible since quantiles are not linear operators; that is, in general, $Q_\tau(y_{m2} - y_{m1} \mid x_m) \neq Q_\tau(y_{m2} \mid x_m) - Q_\tau(y_{m1} \mid x_m)$, where $Q_\tau(\cdot \mid \cdot)$ denotes the $\tau$-th conditional quantile function for $\tau \in (0, 1)$. This inherent difficulty has been recognized by others and is summarized nicely in a recent quantile-regression survey by Koenker and Hallock (2000): “Quantiles of convolutions of random variables are rather intractable objects, and preliminary differencing strategies familiar from Gaussian models have sometimes unanticipated effects.” Without being more explicit about the relationship between $c_m$ and $x_m$, it is difficult to envision an appropriate strategy for dealing with conditional quantiles, although Koenker (2004) has made some progress on this front.
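A quick numerical check makes the failure of differencing concrete. The sketch below is an unconditional illustration under assumed independent standard-normal draws for the two outcomes (it does not use the birth data or condition on covariates): the mean of a difference equals the difference of means up to rounding, whereas the 10% quantile of the difference is far from the difference of the 10% quantiles.

```python
import random

def quantile(values, tau):
    """Empirical tau-th quantile using a simple order-statistic rule."""
    s = sorted(values)
    return s[min(int(tau * len(s)), len(s) - 1)]

def mean(values):
    return sum(values) / len(values)

rng = random.Random(1)
n = 20000
y1 = [rng.gauss(0, 1) for _ in range(n)]   # stand-in for first-birth outcomes
y2 = [rng.gauss(0, 1) for _ in range(n)]   # stand-in for second-birth outcomes
diff = [b - a for a, b in zip(y1, y2)]

tau = 0.10
# Expectation is linear: this gap is zero up to floating-point error.
mean_gap = mean(diff) - (mean(y2) - mean(y1))
# Quantiles are not linear: this gap is far from zero.
quantile_gap = quantile(diff, tau) - (quantile(y2, tau) - quantile(y1, tau))

print("mean gap:", mean_gap, "quantile gap:", round(quantile_gap, 3))
```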
To consider the relevant effects of the observables on the conditional quantiles $Q_\tau(y_{mb} \mid x_m)$ (rather than $E(y_{mb} \mid x_m)$), we consider the analogous effects to those given in equation (8). In particular, the effects of the observables on a given conditional quantile are given by

$$\frac{\partial Q_\tau(y_{m1} \mid x_m)}{\partial x_{m1}} - \frac{\partial Q_\tau(y_{m2} \mid x_m)}{\partial x_{m1}} \tag{9}$$

and

$$\frac{\partial Q_\tau(y_{m2} \mid x_m)}{\partial x_{m2}} - \frac{\partial Q_\tau(y_{m1} \mid x_m)}{\partial x_{m2}}. \tag{10}$$

For example, the difference in equation (9) is the effect of $x_{m1}$ (first-birth observables) on $Q_\tau(y_{m1} \mid x_m)$ above and beyond the effect on the $\tau$-th conditional quantile that works through the unobservable.
To estimate the effects given in equations (9) and (10), a model for both $Q_\tau(y_{m1} \mid x_m)$ and $Q_\tau(y_{m2} \mid x_m)$ is needed. Unfortunately, it is non-trivial to explicitly determine the conditional quantile models. Consider, for example, the simple case in which the data-generating process is given by equations (1) and (4) (which then imply equations (5) and (6)). If all of the error disturbances $(u_{m1}, u_{m2}, v_m)$ were independent of $x_m$, then the conditional quantile functions would take a simple form (analogous to that of the conditional expectation function under assumption (A2)):

$$Q_\tau(y_{m1} \mid x_m) = \psi_{1\tau} + x_{m1}'(\beta + \lambda_1) + x_{m2}'\lambda_2 \tag{11}$$

$$Q_\tau(y_{m2} \mid x_m) = \psi_{2\tau} + x_{m1}'\lambda_1 + x_{m2}'(\beta + \lambda_2). \tag{12}$$

Under this independence assumption, the effect of the disturbances is reflected by a locational shift in the conditional quantiles ($\psi_{1\tau}$ and $\psi_{2\tau}$); the slopes do not vary across the conditional quantiles. Without the independence assumption, however, the simple linear form for the conditional quantile functions (like those in equations (11) and (12)) only arises in very special cases. In general, the conditional quantile functions involve more complicated non-linear expressions and, in fact, cannot be explicitly written down without a complete parametric specification of the error disturbances.
Therefore, the conditional quantiles are viewed as somewhat general functions of $x_m$: say, $Q_\tau(y_{m1} \mid x_m) = f_{1\tau}(x_m)$ and $Q_\tau(y_{m2} \mid x_m) = f_{2\tau}(x_m)$. To estimate the effects in (9) and (10), then, reduced-form models for $Q_\tau(y_{m1} \mid x_m)$ and $Q_\tau(y_{m2} \mid x_m)$ are specified. These reduced-form models should be viewed as approximating the “true” conditional quantile functions $f_{1\tau}(x_m)$ and $f_{2\tau}(x_m)$. In this paper, a very simple form for the reduced-form models is considered, in which the conditional quantiles are expressed as linear (and separable) functions of $x_{m1}$ and $x_{m2}$:

$$Q_\tau(y_{m1} \mid x_m) = \phi_{1\tau} + x_{m1}'\theta_{1\tau} + x_{m2}'\lambda_{2\tau} \tag{13}$$

$$Q_\tau(y_{m2} \mid x_m) = \phi_{2\tau} + x_{m1}'\lambda_{1\tau} + x_{m2}'\theta_{2\tau}. \tag{14}$$

A more general model, as well as the appropriateness of linearity and separability, is discussed in greater detail in Section 5. Based upon (13) and (14), the effects of the observables on the conditional quantiles (see (9) and (10)) are equal to $\theta_{1\tau} - \lambda_{1\tau}$ (for the first-birth outcome) and $\theta_{2\tau} - \lambda_{2\tau}$ (for the second-birth outcome). The parameters $(\phi_{1\tau}, \phi_{2\tau}, \theta_{1\tau}, \theta_{2\tau}, \lambda_{1\tau}, \lambda_{2\tau})$ can be consistently estimated with linear quantile regression (Koenker and Bassett (1978)).
Although the linear approximation may at first appear to be restrictive, this strategy is the one usually employed in cross-sectional quantile regression. In the cross-sectional case, even if the data-generating process is linear in the covariates with a mean-zero error, the conditional quantiles will only be linear in the covariates in very special cases (see, e.g., Koenker and Bassett (1982)). Even in cross-sectional applications, then, the specification chosen by an empirical researcher (usually linear) should also be viewed as a reduced-form approximation to the true conditional quantile function. In fact, empirical applications of quantile regression generally start (either explicitly or implicitly) with a reduced-form approximating model of the conditional quantile function rather than the data-generating process (see, e.g., Buchinsky (1994) and Bassett and Chen (2001)). Angrist, Chernozhukov, and Fernandez-Val (2006) provide a framework for analyzing misspecification of the conditional quantile function. Although beyond the scope of this paper, it would be interesting to apply their methodology to the panel-data setting considered here.
The linear approximation approach is also an inherent feature of the correlated random-effects approach for the conditional expectation model given by (1) and (4). As Chamberlain (1982) originally pointed out, if assumption (A2) does not hold, the conditional expectation function is non-linear; in this case, equations (5) and (6) represent linear approximations (projections) and $\beta$ represents the marginal effects of the covariates upon these linear approximations.
For the application in this paper, we impose the additional restriction that the effects on the conditional quantiles are the same for both birth outcomes. This restriction is similar to the implicit restriction embodied in the linear panel-data model (1), where $\beta$ does not vary with $b$. For the conditional quantiles, let $\beta_\tau$ denote the (common) effect vector, so that the restriction is

$$\beta_\tau = \theta_{1\tau} - \lambda_{1\tau} = \theta_{2\tau} - \lambda_{2\tau}. \tag{15}$$

Under this restriction, the conditional quantile functions in (13) and (14) can be re-written as

$$Q_\tau(y_{m1} \mid x_m) = \phi_{1\tau} + x_{m1}'(\beta_\tau + \lambda_{1\tau}) + x_{m2}'\lambda_{2\tau} = \phi_{1\tau} + x_{m1}'\beta_\tau + x_{m1}'\lambda_{1\tau} + x_{m2}'\lambda_{2\tau} \tag{16}$$

$$Q_\tau(y_{m2} \mid x_m) = \phi_{2\tau} + x_{m1}'\lambda_{1\tau} + x_{m2}'(\beta_\tau + \lambda_{2\tau}) = \phi_{2\tau} + x_{m2}'\beta_\tau + x_{m1}'\lambda_{1\tau} + x_{m2}'\lambda_{2\tau}. \tag{17}$$

The simplest estimation strategy, based upon the second equalities in both (16) and (17), is to run a pooled linear quantile regression in which the observations corresponding to both births of a given mother are stacked together as a pair. In particular, a quantile regression (using the estimator for the $\tau$-th quantile) would be run using

$$
\begin{pmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \\ \vdots \\ y_{M1} \\ y_{M2} \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix}
1 & 0 & x_{11}' & x_{11}' & x_{12}' \\
1 & 1 & x_{12}' & x_{11}' & x_{12}' \\
1 & 0 & x_{21}' & x_{21}' & x_{22}' \\
1 & 1 & x_{22}' & x_{21}' & x_{22}' \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 0 & x_{M1}' & x_{M1}' & x_{M2}' \\
1 & 1 & x_{M2}' & x_{M1}' & x_{M2}'
\end{pmatrix}
\tag{18}
$$

as the left-hand-side and right-hand-side variables, respectively. This pooled regression directly estimates $(\phi_{1\tau}, \phi_{2\tau} - \phi_{1\tau}, \beta_\tau, \lambda_{1\tau}, \lambda_{2\tau})$. The difference $\phi_{2\tau} - \phi_{1\tau}$ represents the effect of birth parity. Birth parity cannot be included explicitly in $x$ since the associated components of $\beta_\tau$, $\lambda_{1\tau}$, and $\lambda_{2\tau}$ would not be separately identified. In a traditional panel-data context, the difference $\phi_{2\tau} - \phi_{1\tau}$ would represent the “time effect.” Although the application considered here does not have any birth-invariant explanatory variables (“time-invariant” variables), such variables could be easily incorporated into (18) as additional columns in the RHS matrix; like birth parity, it would not be possible to separately identify the direct effects of these variables on $y$ from the indirect effects (working through $c$) on $y$.
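The stacking in (18) is mechanical and can be sketched in a few lines. The helper name `stack_births` and the toy covariates (a smoking indicator and a prenatal-visit count) are hypothetical and used only for illustration; the function builds the outcome vector and the regressor rows, which could then be handed to any linear quantile-regression routine.

```python
def stack_births(y1, y2, x1, x2):
    """Build the stacked outcome vector and regressor rows of (18).

    y1[m], y2[m]: outcomes for mother m's first and second birth.
    x1[m], x2[m]: covariate lists for mother m's first and second birth.
    Each regressor row is
        [1, second-birth dummy, own-birth x, x_m1, x_m2],
    so the coefficients line up with
        (phi_1, phi_2 - phi_1, beta, lambda_1, lambda_2).
    """
    lhs, rhs = [], []
    for ya, yb, xa, xb in zip(y1, y2, x1, x2):
        lhs.append(ya)
        rhs.append([1, 0] + list(xa) + list(xa) + list(xb))  # first birth
        lhs.append(yb)
        rhs.append([1, 1] + list(xb) + list(xa) + list(xb))  # second birth
    return lhs, rhs

# Toy example: two mothers, covariates = (smoke, number of prenatal visits).
y_first = [3100, 3400]
y_second = [3250, 3500]
x_first = [[1, 10], [0, 12]]
x_second = [[0, 11], [0, 13]]

lhs, rhs = stack_births(y_first, y_second, x_first, x_second)
for outcome, row in zip(lhs, rhs):
    print(outcome, row)
```

With $K$ covariates per birth, each row has $2 + 3K$ entries; the own-birth block is the only one that changes between a mother's two rows, which is what lets the pooled regression separate $\beta_\tau$ from $\lambda_{1\tau}$ and $\lambda_{2\tau}$.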
The only difficulty introduced by the pooled regression approach involves computation of the estimator’s standard errors. Since there is dependence within a mother’s pair of births, the standard asymptotic-variance formula (Koenker and Bassett (1978)) and the standard bootstrap approach, which are both based upon independent observations, cannot be applied. Instead, a given bootstrap sample is created by repeatedly drawing (with replacement) a mother from the sample of $M$ mothers and including both births for that mother, where the draws continue until the desired bootstrap sample size is reached. For a given bootstrap sample, the pooled quantile estimator is computed. After repeating this process for many bootstrap samples, the original estimator’s variance matrix can be estimated by the empirical variance matrix of the bootstrap estimates. Similarly, bootstrap percentile intervals for the parameters can be easily constructed.
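The mother-level resampling scheme can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the bootstrap sample size is assumed to equal the original number of mothers, the data are simulated, and the sample median of birthweight stands in for the pooled quantile estimator so that the example stays self-contained.

```python
import random
import statistics

def pairs_bootstrap(mothers, estimator, n_boot=200, seed=0):
    """Resample mothers (not individual births) with replacement.

    mothers: list of (first_birth, second_birth) records.
    estimator: function mapping a list of births to a scalar estimate.
    Returns the list of bootstrap estimates.
    """
    rng = random.Random(seed)
    n_mothers = len(mothers)
    estimates = []
    for _ in range(n_boot):
        births = []
        for _ in range(n_mothers):
            first, second = mothers[rng.randrange(n_mothers)]
            births.extend([first, second])   # both births enter together
        estimates.append(estimator(births))
    return estimates

def percentile_interval(estimates, level=0.95):
    """Bootstrap percentile interval from the sorted estimates."""
    s = sorted(estimates)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo, hi

# Simulated birthweights with a shared mother effect (illustrative only).
rng = random.Random(42)
mothers = []
for _ in range(300):
    c = rng.gauss(0, 300)
    mothers.append((3400 + c + rng.gauss(0, 300),
                    3490 + c + rng.gauss(0, 300)))

estimates = pairs_bootstrap(mothers, statistics.median)
boot_variance = statistics.variance(estimates)
lo, hi = percentile_interval(estimates)
print("variance:", round(boot_variance, 1), "95% interval:", round(lo), round(hi))
```

Drawing whole mothers preserves the within-pair dependence in every bootstrap sample, which is the feature a birth-level bootstrap would destroy.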
2.3 Review of related studies
In their recent survey of quantile regression, Koenker and Hallock (2000) cite only a single panel-data application. The cited study by Chay (1995) uses quantile regression on longitudinal earnings data to estimate the effect of the 1964 Civil Rights Act on the black-white earnings differential. Chay (1995) allows the individual effect to depend on the racial indicator variable, which amounts to a shift in the conditional quantile function and is a special case of the general approach described in Section 2.2. Interestingly, the application of Chay (1995) involves censored earnings data, so that quantile regression methods for censored data (Powell (1984, 1986)) are needed. Such censored-data quantile methods would also work with the general model of Section 2.2 but are not needed for the application considered in this paper.
A more recent application of quantile regression on panel data is Arias et al. (2001), who estimate the returns to schooling using twins data. To deal with the unobserved “family effect,” the authors include proxy variables (father’s education and sibling’s education) in the model. This proxy-variable approach is related to the correlated random effects model in the sense that the latter specification can be viewed as using the observables $x_{m1}$ and $x_{m2}$ as proxies for the unobserved individual effect. One could also incorporate an external proxy (such as father’s education in the Arias et al. (2001) case) into the correlated random effects framework.
Another panel-data study that is directly related to our empirical application is Royer (2004), who applies a correlated random effects model to maternally linked data from Texas. Royer (2004) estimates the effects of various observables (with a focus upon maternal age) on “binary” birth outcomes (such as premature birth or LBW birth). Fixed-effects estimation is also possible (in the context of the linear probability model), whereas no such alternative is available in the conditional quantile case. Royer (2004) also relaxes the strict exogeneity assumption (required for consistency of the fixed-effects estimator) in several interesting ways. Unfortunately, identification of the least restrictive models requires panel data with at least three births per mother. As a practical matter, this requirement reduces the sample size to an extent that makes the estimated effects of observables rather imprecise and introduces a possible selection bias (see the discussion in Royer (2004, pp. 39ff)). Analogous extensions to the conditional quantile models are left for future research.
3 Data
Detailed “natality data” are recorded for nearly every live birth in the United States. Information on maternal characteristics (age, education, race, etc.), birth outcomes (birthweight, gestation, etc.), and prenatal care (number of prenatal visits, smoking status, etc.) is collected by each state (with federal guidelines on specific data-item requirements). Unfortunately, due to confidentiality restrictions, comprehensive natality data with personal identifiers are not available at the federal level, making it difficult to reliably construct maternally-linked panel data. However, individual states may release such personal identifiers to researchers, subject to confidentiality agreements in most cases. The data used in this study were obtained from two states, Washington and Arizona, and are described in detail below:
1. Washington data: The Washington State Longitudinal Birth Database (WSLBD) was provided by Washington’s Center for Health Statistics. The WSLBD is a panel dataset consisting of all births between 1992 and 2002 that could be accurately linked together as belonging to the same mother. (The original WSLBD has births dating back to 1980, but mother’s education is not available as a data item until 1992.) The matching algorithm used to construct the WSLBD used personal identifying information such as mother’s full maiden name and mother’s date of birth. For two births to be linked together, (i) an exact match on mother’s name, mother’s date of birth, mother’s race, and mother’s state of birth was required, and (ii) consistency of birth parity and the reported interval-since-last-birth was required. Only births that could be uniquely linked together were retained in the WSLBD.

2. Arizona data: The Arizona Department of Health Services provided the authors with data on all births occurring in the state of Arizona between 1993 and 2002. Although names were not provided, the exact dates of birth for both mother and father were provided in the data. To maternally link births together, we followed as closely as possible the algorithm used for the Washington data. For two births to be linked together, (i) an exact match on mother’s date of birth, father’s date of birth, mother’s race, and mother’s state of birth was required, and (ii) consistency of birth parity and the reported interval-since-last-birth was required. As with the Washington data, only births that could be uniquely linked together were retained. Since births could not be linked by maternal name, we decided to also require an exact match on father’s date of birth in order to minimize the chance of false matches entering the sample. (Roughly 3.5% of births that were linked on the basis of mother’s birthdate are dropped when links are also based upon father’s birthdate.) This choice turns out to have very little impact on the estimation results reported in Section 4. The decision to match upon father’s birthdate restricts the Arizona sample to mothers whose children had the same birth father, which is not a restriction of the Washington sample.
For this study, we consider only pairs of first and second births to white mothers. Birth outcomes (and the effects of other variables upon birth outcomes) have been found to differ across different races and at higher birth parities. The choice of subsample circumvents these issues by focusing upon a more homogeneous sample. The resulting estimates, of course, should be interpreted as being applicable to the subpopulation represented by this sample choice.
Estimation was carried out separately for the Washington data and Arizona data. The Washington data have several advantages over the Arizona data: (i) the matching of siblings for the Washington data is of higher quality due to the use of mothers’ names, (ii) the Washington data are not restricted to siblings with the same fathers, and (iii) the Washington data include information on the month of first prenatal visit. For these reasons, most of the detailed analysis will be reported for the Washington data. Results for Arizona will be discussed more briefly, but these results serve as a useful comparison to the Washington results.
Table 1 provides descriptive statistics for the Washington and Arizona samples, broken down by first-child and second-child births. Any mother with missing data items in either of her two births (for the variables summarized in Table 1) was dropped from the sample. The resulting samples used for estimation consist of 45,067 Washington mothers (90,134 births) and 56,201 Arizona mothers (112,402 births). Sample averages are reported for all variables, as well as standard deviations for the non-indicator variables. The “Smoke” (“Drink”) variable is equal to one if the mother reported smoking (drinking alcohol) during pregnancy. Although alcohol consumption during pregnancy is known to be severely under-reported, the “Drink” variable is included in the regressions as it may be a useful proxy for other unobservables. For Washington, the four prenatal-care categories (“No prenatal care,” “1st-trimester care,” “2nd-trimester care,” “3rd-trimester care”) were constructed on the basis of the reported month of the first prenatal-care visit. Unfortunately, the month of first prenatal-care visit is not reported in the Arizona data until 1997. As a result, for Arizona only the number of prenatal visits and an indicator variable for “no prenatal care” (equal to one if there are no prenatal visits) are summarized in Table 1 and used in the empirical analysis of Section 4. The other variables are self-explanatory.
The descriptive statistics in Table 1 indicate that average birthweight increases by 88 grams at the second birth for both Washington mothers and Arizona mothers. For their second birth, women are less likely to smoke and drink and more likely to be married, have a male child, and have a first-trimester prenatal-care visit. Based on the summary statistics, the two samples of mothers are quite similar. On average, Arizona mothers are slightly less educated and have higher birthweight babies. The largest difference between the two samples appears to be the level of smoking: Washington mothers report smoking in 13.7% of pregnancies (close to the national average during this time period), whereas only 4.7% of Arizona mothers report smoking. These smoking percentages are below the overall smoking percentages for pregnant women in these two states during the periods of interest (8.9% in Arizona and 18.4% in Washington), indicating that the matching algorithms result in subsamples that over-represent non-smokers. For instance, unmarried Arizona mothers (for whom the smoking percentage is 12.7%) are far more likely to have father’s date-of-birth missing from the data (45.9% of the time, as compared to 1.1% for married mothers) and, therefore, not to be included in the matched sample. The reported rate of drinking during pregnancy is also lower in Arizona than Washington; these reported percentages are also lower than the overall percentages for pregnant women in the two states (2.7% in Washington, 1.4% in Arizona).
Table 1: Descriptive Statistics, Washington and Arizona Birth Panels

                          Washington                     Arizona
Variable                  1st Child      2nd Child      1st Child      2nd Child
Male child                0.515          0.511          0.520          0.516
Mother’s age              25.27 (5.25)   27.89 (5.35)   25.23 (5.26)   27.85 (5.36)
Mother’s education        13.52 (2.32)   13.72 (2.21)   13.21 (2.68)   13.39 (2.61)
Married                   0.751          0.853          0.780          0.886
No prenatal care          0.004          0.003          0.005          0.006
1st-trimester care        0.879          0.895          --             --
2nd-trimester care        0.107          0.093          --             --
3rd-trimester care        0.014          0.012          --             --
Smoke                     0.143          0.132          0.049          0.044
Drink                     0.017          0.014          0.009          0.007
# prenatal visits         12.06 (3.53)   11.63 (3.25)   11.83 (3.59)   11.73 (3.55)
Quantiles of birthweight:
  10% quantile            2807           2892           2750           2863
  25% quantile            3146           3220           3061           3146
  50% quantile            3458           3543           3373           3445
  75% quantile            3770           3855           3685           3742
  90% quantile            4060           4167           3968           4040
Table 1 also provides the (unconditional) 10%/25%/50%/75%/90% quantiles for first and second births in Washington and Arizona. These quantiles indicate fairly symmetric birthweight distributions, with the median quite close to the mean, the 25% and 75% quantiles roughly equidistant from the median, and the 10% and 90% quantiles roughly equidistant from the median. For both states, there is a positive shift in the entire birthweight distribution from first to second births. The shift is largest in magnitude at the 90% quantile (107 grams) for Washington births and at the 10% quantile (113 grams) for Arizona births. Finally, we note that the LBW cutoff of 2500 grams corresponds to the 3–5% quantiles of the unconditional birthweight distributions, whereas the HBW cutoff of 4000 grams corresponds to the 85–92% quantiles of the unconditional distributions.
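The first-to-second-birth shifts quoted above follow directly from the quantiles in Table 1. The short Python sketch below recomputes them; the dictionary layout and the function names are hypothetical, while the numbers are the Table 1 quantiles.

```python
QUANTILES = (10, 25, 50, 75, 90)

# Unconditional birthweight quantiles (grams) from Table 1.
TABLE1 = {
    "Washington": {
        "first": (2807, 3146, 3458, 3770, 4060),
        "second": (2892, 3220, 3543, 3855, 4167),
    },
    "Arizona": {
        "first": (2750, 3061, 3373, 3685, 3968),
        "second": (2863, 3146, 3445, 3742, 4040),
    },
}


def quantile_shifts(state):
    """Second-birth minus first-birth quantile, keyed by quantile level (%)."""
    first = TABLE1[state]["first"]
    second = TABLE1[state]["second"]
    return {q: s - f for q, f, s in zip(QUANTILES, first, second)}


def largest_shift(state):
    """Return (quantile level, shift in grams) for the largest shift."""
    shifts = quantile_shifts(state)
    level = max(shifts, key=shifts.get)
    return level, shifts[level]


for state in TABLE1:
    print(state, quantile_shifts(state), largest_shift(state))
```

Every shift is positive in both states, which is the sense in which the entire distribution moves to the right at the second birth.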
4 Results
Regression results for the two maternally linked datasets are provided in Section 4.1, within the strict-exogeneity framework introduced in Section 2. A straightforward approach to hypothesis testing is provided in Section 4.2. Section 4.3 provides discussion related to possible violations of strict exogeneity (e.g., feedback effects or mismeasured variables).
4.1 Regression results
In the interest of space, the full set of numerical results (tables) and a detailed discussion are provided only for the Washington data (Section 4.1.1). The Arizona results are reported in a graphical format comparable to the Washington results (Section 4.1.3), but the detailed tables have been omitted and the discussion is limited to comparisons with the Washington results. (Complete tables are available upon request from the authors.)
4.1.1 Washington data
The tables report estimates for the quantiles τ ∈ {0.10, 0.25, 0.50, 0.75, 0.90} (along with least-squares estimates for comparison), although the figures presented in this section consider marginal effects at 2-percent intervals (specifically, τ ∈ {0.04, 0.06, ..., 0.94, 0.96}). Throughout this section, the dependent variable of interest is birthweight (measured in grams). In order to have a relevant comparison for the panel-data results, cross-sectional results (without incorporating the correlated random effects) are also reported. For the cross-sectional results, the panel structure of the data is only used for computing standard errors. Since each mother appears twice in the data, the pair-sampling bootstrap described at the end of Section 2.2 is used.
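As a rough illustration of pair sampling, the Python sketch below resamples mothers (not individual births) with replacement, so that both births of a drawn mother always enter the bootstrap sample together. It is a simplified sketch under stated assumptions: the statistic is an unconditional sample quantile rather than a quantile-regression coefficient, the data are simulated, each replication draws as many pairs as the original sample (the tables instead report bootstrap samples of 10,000 pairs), and all names are hypothetical.

```python
import random
import statistics


def sample_quantile(values, tau):
    """Simple order-statistic estimate of the tau-quantile."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(tau * len(ordered)))
    return ordered[index]


def pair_bootstrap_se(pairs, tau, replications=200, seed=0):
    """Bootstrap standard error of a quantile, resampling mothers as pairs.

    pairs: list of (first-birth weight, second-birth weight), one per mother.
    """
    rng = random.Random(seed)
    n = len(pairs)
    estimates = []
    for _ in range(replications):
        # Draw mothers with replacement; keep both births of each drawn mother.
        drawn = [pairs[rng.randrange(n)] for _ in range(n)]
        births = [weight for pair in drawn for weight in pair]
        estimates.append(sample_quantile(births, tau))
    return statistics.stdev(estimates)


# Simulated data (hypothetical): a mother-specific effect shared by both births.
data_rng = random.Random(1)
pairs = []
for _ in range(500):
    mother_effect = data_rng.gauss(0, 300)
    first = 3400 + mother_effect + data_rng.gauss(0, 400)
    second = 3490 + mother_effect + data_rng.gauss(0, 400)
    pairs.append((first, second))

se = pair_bootstrap_se(pairs, tau=0.5)
```

Resampling whole pairs preserves the within-mother dependence between the two births, which resampling individual births would ignore.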
Tables 2 and 3 report the cross-sectional results and panel-data results, respectively. The model specification includes the variables summarized in Table 1, along with an indicator variable for the second child, quadratic variables for both mother’s age and education, and a full set of year-of-birth dummy variables. For the prenatal-care variables, the omitted category corresponds to first-trimester prenatal care, so the estimates for the other three prenatal-care variables (“No prenatal care,” “2nd-trimester care,” and “3rd-trimester care”) should be interpreted as differences from first-trimester prenatal care. The effect of prenatal care will therefore be captured by (i) the trimester of the first prenatal visit (if any) and (ii) the number of prenatal visits (if any).

Table 2: Cross-Sectional Estimation Results, Washington Data. The dependent variable is birthweight (in grams).

                    10%          25%          50%          75%          90%          OLS
Second child        100.65***    92.87***     93.86***     99.99***     110.37***    98.95***
                    (7.24)       (4.96)       (4.40)       (4.89)       (6.84)       (4.04)
Male child          87.22***     115.87***    128.34***    142.70***    160.83***    124.34***
                    (6.25)       (4.33)       (3.85)       (4.25)       (5.65)       (3.57)
Age                 19.30***     13.81***     7.39**       7.92**       5.47         12.57***
                    (6.35)       (4.02)       (3.58)       (3.93)       (5.17)       (3.41)
Age²                -0.385***    -0.258***    -0.131**     -0.122*      -0.070       -0.228***
                    (0.115)      (0.070)      (0.064)      (0.070)      (0.091)      (0.061)
Education           31.02**      22.46**      30.23***     28.81***     23.87**      26.94***
                    (14.32)      (8.80)       (7.70)       (6.71)       (10.45)      (7.16)
Education²          -0.744       -0.603*      -0.927***    -0.987***    -0.793**     -0.789***
                    (0.525)      (0.324)      (0.285)      (0.250)      (0.388)      (0.264)
Married             36.58***     27.52***     26.74***     22.41***     16.20*       28.11***
                    (10.22)      (7.61)       (6.03)       (6.91)       (9.19)       (6.23)
No prenatal care    -324.75*     -22.76       -36.05       15.54        176.44**     -34.57
                    (177.34)     (55.29)      (41.13)      (44.24)      (74.27)      (47.90)
2nd-trimester care  37.34***     27.93***     24.72***     31.58***     37.96***     38.73***
                    (11.86)      (8.41)       (7.00)       (8.27)       (10.43)      (6.48)
3rd-trimester care  106.60***    59.36***     37.26*       26.07        24.40        66.39***
                    (30.75)      (20.43)      (19.79)      (18.81)      (23.84)      (15.96)
Smoke               -186.05***   -182.29***   -178.57***   -176.05***   -160.02***   -177.97***
                    (11.40)      (7.64)       (6.23)       (7.51)       (9.92)       (6.16)
Drink               -21.80       -20.08       -7.27        -15.01       17.03        3.89
                    (28.14)      (22.67)      (16.10)      (20.00)      (26.55)      (16.47)
# prenatal visits   19.33***     16.52***     15.11***     14.82***     13.92***     18.46***
                    (1.35)       (0.89)       (0.76)       (0.82)       (1.10)       (0.85)

Bootstrapped standard errors in parentheses, using bootstrap sample size of 20,000 (10,000 pairs) and 1,000 bootstrap replications. Year dummies were included in all regressions.
‘*’: significant at 10% level (2-sided); ‘**’: 5% level; ‘***’: 1% level.
It should be pointed out that interpreting the effect of any prenatal-care variable is a bit difficult since the observed prenatal care proxies for both intended prenatal care and pregnancy problems. For instance, if two mothers have identical intentions (at the beginning of pregnancy) with respect to prenatal-care visits, the mother that experiences problems early in her pregnancy would be more likely to have an earlier first prenatal-care visit and to have more prenatal-care visits overall. The estimated effects of the prenatal-care variables, therefore, may reflect the combined effects of intended care and pregnancy complications. This idea has been independently investigated by Conway and Deb (2005), who (i) find that bimodal residuals result from a standard 2SLS regression of birthweight and (ii) use a two-class mixture model to explicitly allow for a difference between “normal” and “complicated” pregnancies. The estimates for the no-prenatal-care indicator variable in both Tables 2 and 3, which are significantly negative at the 10% quantile and significantly positive at the 90% quantile, illustrate this point. A possible explanation for the dramatic difference at the two ends of the distribution is that lack of prenatal care is more likely to proxy for lack of intended care at the lowest quantiles and more likely to proxy for a problem-free pregnancy at the highest quantiles. Alternatively, the positive effect found at higher quantiles could still be consistent with a lack of intended care since HBW outcomes have previously been associated with poor prenatal care and disadvantaged mothers. (Unfortunately, the leading indicators of HBW outcomes are mother’s weight prior to pregnancy and weight gain during pregnancy. Neither of these items is available in the datasets, forcing us to focus less on the effects of birth inputs on HBW outcomes.) At the intermediate quantiles, the effect of the no-prenatal-care indicator is found to be statistically insignificant in both the cross-sectional and panel results.
Overall, the cross-sectional results in Table 2 are very similar to those found in previous studies using federal natality data (Abrevaya (2001); Koenker and Hallock (2001)). For the panel-data results in Table 3, unobserved heterogeneity is modeled as in Section 2.2 (see equations (16) and (17)). For the pooled quantile regressions, Table 3 reports the estimates of the marginal effects βτ. The estimates of the parameters λ1τ and λ2τ are reported in the Appendix (Tables 5 and 6); these estimates measure the extent of the cross-sectional bias (through the relationship of the unobserved heterogeneity with the observables). To provide a more complete view of the variables’ effects on birthweights and to allow an easy comparison with the cross-sectional estimates, Figures 1 and 2 plot the estimated effects from both the panel and cross section. For these figures, the quantile regressions were estimated at 2% intervals, from the 4% quantile through the 96% quantile (inclusively). The panel-data estimates are represented with a solid line, and the 90% confidence intervals (bootstrap percentile intervals) for these estimates are represented with dotted lines. The cross-sectional estimates, computed at the same quantiles, are represented with a dashed line. (To avoid cluttering the figures, confidence intervals for the cross-sectional results (which can be inferred from Table 2) are not reported.) Since both age and education have quadratic terms in the model specification, the marginal-effect plots for age and education are based upon estimates evaluated at specific values for the two variables (25 years old for age and 12 years for education level).
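Because age and education enter with a squared term, a single coefficient does not give the effect of an extra year. One simple way to evaluate it, sketched below in Python, is the derivative b1 + 2·b2·x at a chosen value x. Treating the one-year effect as this derivative is an assumption of the sketch (a discrete one-year difference would differ slightly); the coefficients are the median (50%) cross-sectional estimates from Table 2, and the function name is hypothetical.

```python
def marginal_effect(linear, quadratic, at):
    """Derivative of linear*x + quadratic*x**2 with respect to x, evaluated at `at`."""
    return linear + 2.0 * quadratic * at


# (linear, quadratic) coefficients at the 50% quantile, cross-sectional (Table 2).
TABLE2_MEDIAN = {
    "age": (7.39, -0.131),
    "education": (30.23, -0.927),
}

age_effect = marginal_effect(*TABLE2_MEDIAN["age"], at=25)
education_effect = marginal_effect(*TABLE2_MEDIAN["education"], at=12)

print(f"age effect at 25 years: {age_effect:.2f} grams per year")
print(f"education effect at 12 years: {education_effect:.2f} grams per year")
```

With these median coefficients, the sketch gives an age effect at 25 years of under one gram per year and an education effect at 12 years of roughly 8 grams per year.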
Table 3: Panel Data Estimation Results (βτ), Washington data. The dependent variable is birthweight (in grams).

                    10%          25%          50%          75%          90%          OLS
Second child
                    (11.75)      (8.30)       (7.03)       (7.69)       (11.38)      (5.99)
Male child          100.67***    131.64***    146.69***    160.12***    173.35***    138.75***
                    (7.76)       (5.15)       (4.40)       (4.82)       (6.64)       (3.68)
Age                 -38.73***    -20.88***    -30.13***    -31.52***    -48.18***    -33.06***
                    (11.26)      (7.00)       (6.27)       (6.61)       (9.47)       (5.31)
Age²                0.515***     0.290**      0.467***     0.529***     0.845***     0.483***
                    (0.197)      (0.120)      (0.109)      (0.117)      (0.161)      (0.092)
Education           35.54*       17.26        27.22***     17.96**      5.97         19.16**
                    (18.25)      (11.91)      (10.23)      (8.46)       (13.47)      (7.62)
Education²          -1.436*      -0.881*      -1.029**     -0.761**     -0.594       -0.851**
                    (0.762)      (0.503)      (0.433)      (0.388)      (0.607)      (0.334)
Married             39.71**      15.49        27.99***     21.13*       11.41        27.87***
                    (17.26)      (11.71)      (9.21)       (11.32)      (16.05)      (8.30)
No prenatal care    -323.39*     -13.77       1.66         26.03        271.21***    -18.08
                    (172.94)     (60.05)      (51.94)      (54.87)      (76.06)      (49.01)
2nd-trimester care  26.80*       6.63         -0.20        22.95**      34.39**      22.19***
                    (14.48)      (10.46)      (8.72)       (10.18)      (13.52)      (6.93)
3rd-trimester care  62.87*       71.79***     31.75        30.50        39.22        55.73***
                    (34.96)      (25.25)      (22.67)      (23.91)      (32.04)      (17.32)
Smoke               -24.58       -60.64***    -81.43***    -57.70***    -56.19***    -56.86***
                    (19.47)      (14.23)      (11.45)      (12.09)      (17.58)      (9.10)
Drink               -44.24       -32.19       4.29         -1.57        9.25         2.11
                    (35.84)      (25.36)      (20.00)      (24.77)      (29.79)      (17.11)
# prenatal visits   20.01***     14.79***     12.70***     12.32***     12.65***     17.48***
                    (1.59)       (1.10)       (0.88)       (1.01)       (1.43)       (0.93)

Bootstrapped standard errors in parentheses (1000 bootstrap replications). Year dummies were included in all regressions.
‘*’: significant at 10% level (2-sided); ‘**’: 5% level; ‘***’: 1% level.
Figure 1: Part 1 of the estimated marginal effects on the conditional quantiles for Washington births. The dependent variable is birthweight (in grams). The solid line indicates the panel-data estimates, the dotted lines are 90% confidence bands for the panel-data estimates, and the dashed line indicates the cross-sectional estimates. Panels: Second child; Male; Total age effect at 25 years; Total education effect at 12 years; # prenatal visits.

Figure 2: Part 2 of the estimated marginal effects on the conditional quantiles for Washington births. The dependent variable is birthweight (in grams). The solid line indicates the panel-data estimates, the dotted lines are 90% confidence bands for the panel-data estimates, and the dashed line indicates the cross-sectional estimates. Panels: Married; No prenatal care; 2nd-trimester care; 3rd-trimester care; Smoke; Drink.
The estimated effects of the various variables, as presented in Tables 2 and 3 and Figures 1 and 2, are discussed in more detail below:
Second child: Birthweights are uniformly larger for second children at all quantiles, for both the cross-sectional and panel estimates. The panel estimates of the second-child effect are somewhat larger than the cross-sectional estimates, with the largest effects at the lowest quantiles (e.g., 137 grams at the 10% quantile).
Male child: It is well-known that, on average, male babies weigh more at birth than female babies. The quantile estimates indicate that the positive male-child effect on birthweight is present at all quantiles of the conditional birthweight distribution. The magnitude of the effect increases when one moves from lower quantiles to higher quantiles, with the panel estimates indicating a slightly higher effect (10–20 grams) than the cross-sectional estimates.
Age and education: Figure 1 shows the estimated (one-year) effects of age and education, evaluated at 25 years of age and 12 years of education, respectively. For age, both the cross-sectional and panel estimates are very close to zero in magnitude (and statistically insignificant at a 5% level for all quantiles). For education, the cross-sectional estimates are positive across the quantiles and statistically significant (at a 5% level) except at quantiles above 80%. In contrast, the panel estimates are statistically insignificant across all quantiles. This difference could be due to two factors: (i) the amount of within-mother variation in education is quite small, with the average change in education for the sample being about 0.2 years; and (ii) the level of education may be related to the mother-specific unobservable. For the latter factor, years of schooling is likely positively related to c_m, which would imply an upward bias in the cross-sectional estimates that is consistent with Figure 1. The issue of education being potentially mismeasured is briefly discussed in Section 4.3.2. Results for other age and education levels are reported in Abrevaya and Dahl (2006).
Marital status: The estimated positive effects of marriage on birthweight are quite similar for the cross-sectional and panel specifications, in the 20–50 gram range over the quantiles considered. One should be cautious about interpreting the cross-sectional marriage estimates as causal since marital status is an explanatory variable that a priori would appear to serve as a proxy for mother-specific unobservables (i.e., marital status positively correlated with c_m). The panel estimates are slightly lower than the cross-sectional estimates in the lower quantiles (until around the 40% quantile), suggesting that this might be a factor in the lower quantiles. Somewhat surprisingly, however, the panel estimates of the marriage effect remain positive throughout the range of quantiles and significantly so (at the 10% level) at nearly all the quantiles below 80%. On the whole, the estimates are consistent with a situation in which marriage provides the birth mother with support (financial support, emotional support, etc.) that would lead to a more favorable birth outcome.
Prenatal-care visits: Lack of prenatal care is found to have significant negative effects at lower quantiles and significant positive effects at the upper quantiles. The estimated effects are similar for both the cross-sectional and panel regressions. As discussed above, a logical explanation is that the “No prenatal care” indicator variable may proxy for poor care at lower quantiles but for problem-free pregnancies at upper quantiles. For the third-trimester-care indicator variable, the cross-sectional and panel estimates are also similar, indicating positive effects (as compared to first-trimester care) which become less statistically significant at higher quantiles. For the indicator variables, the largest difference between the cross-sectional and panel results shows up in the second-trimester-care variable; the cross-sectional estimates are statistically significant at all quantiles and range from 25 to 50 grams, whereas the panel estimates are somewhat lower (close to zero in intermediate quantiles) and only significantly positive at the highest quantiles. The effect of the number of prenatal visits is estimated to be significantly positive across all quantiles, with larger effects found at lower quantiles and the effects essentially “flattening out” (at around 14–15 grams per visit for the cross-sectional results and 12–13 grams per visit for the panel results). The estimated effects for the panel specification exhibit a sharper decline, leading to lower estimates (roughly a 2-gram per-visit differential) than the cross-sectional specification. This variable shows up significantly in the λ1τ and λ2τ estimates (see Tables 5 and 6), leading to the differences found and suggesting that the variable is related to the mother-specific unobservable.
Smoking: The most dramatic differences between the cross-sectional and panel results are the estimated effects of smoking. The cross-sectional results indicate that the negative effects of smoking are in the range of 150–200 grams, with larger effects at lower quantiles. The panel estimates are still significantly negative at all but the lowest quantiles, but the estimated effects are much lower in magnitude (mostly in the 50–80 gram range between the 20% and 80% quantiles). The omitted-variables explanation of this large difference would be that the smoking indicator in the cross-sectional specification is negatively related to the error disturbance in the birthweight regression equation. Consistent with this explanation, the smoking coefficients in both λ1τ and λ2τ are found to be significantly negative across the quantiles (Tables 5 and 6). The magnitudes of the panel estimates are also significantly lower than those found in previous work, including quasi-experimental estimates based upon cigarette-tax changes (e.g., Evans and Ringel (1999) and Lien and Evans (2005)) and experimental estimates (e.g., Permutt and Hebel (1989)). These studies have estimated causal (IV) effects of smoking on birthweight which are not statistically different from the OLS estimates; these