The effects of birth inputs on birthweight: evidence from quantile estimation on panel data


by Jason Abrevaya* and Christian M. Dahl†

ABSTRACT

Unobserved heterogeneity among childbearing women makes it difficult to isolate the causal effects of smoking and prenatal care on birth outcomes (such as birthweight). Whether or not a mother smokes, for instance, is likely to be correlated with unobserved characteristics of the mother. This paper controls for such unobserved heterogeneity by using state-level panel data on maternally linked births. A quantile-estimation approach, motivated by a correlated random-effects model, is used in order to estimate the effects of smoking and other observables (number of prenatal-care visits, years of education, etc.) on the entire birthweight distribution.

* Department of Economics, The University of Texas, Austin, TX 78712.

† CREATES and School of Economics and Management, University of Aarhus, Aarhus, Denmark; e-mail: cdahl@econ.au.dk.

1 Introduction

Adverse birth outcomes have been found to result in large economic costs, in the form of both direct medical costs and long-term developmental consequences. It is not surprising, then, that the public-health community has focused efforts on prenatal-care improvements (e.g., through smoking cessation, alcohol-intake reduction, and/or better nutrition) that are thought to improve birth outcomes. Birthweight has served as a leading indicator of infant health, with “low birthweight” (LBW) infants classified as those weighing less than 2500 grams at birth. Observable measures of poor prenatal care, such as smoking, have strong negative associations with birthweight. For instance, according to a report by the Surgeon General, mothers who smoke during pregnancy have babies that, on average, weigh 250 grams less (Centers for Disease Control and Prevention (2001)).

The direct medical costs of low birthweight are quite high. Based upon hospital-discharge data from New York and New Jersey, Almond et al. (2005) report that the hospital costs for newborns peak at around $150,000 (in 2000 dollars) for infants that weigh 800 grams; the costs remain quite high for all “low birthweight” outcomes, with an average cost of around $15,000 for infants that weigh 2000 grams. The infant-mortality rate also increases at lower birthweights.

Other research has examined the long-term effects of low birthweight on cognitive development, educational outcomes, and labor-market outcomes. LBW babies have developmental problems in cognition, attention, and neuromotor functioning that persist until adolescence (Hack et al. (1995)). LBW babies are more likely to delay entry into kindergarten, repeat a grade in school, and attend special-education classes (Corman (1995); Corman and Chaikind (1998)). LBW babies are also more likely to have inferior labor-market outcomes, being more likely to be unemployed and to earn lower wages (Behrman and Rosenzweig (2004); Case et al. (2005); Currie and Hyson (1999)).

Although they have received less attention in the economics literature, high-birthweight outcomes can also represent adverse outcomes. For instance, babies weighing more than 4000 grams (classified as high birthweight (HBW)) and especially those weighing more than 4500 grams (classified as very high birthweight (VHBW)) are more likely to require cesarean-section births, have higher infant mortality rates, and develop health problems later in life.

A difficulty in evaluating initiatives aimed at improving birth outcomes is accurately estimating the causal effects of prenatal activities on these birth outcomes. Unobserved heterogeneity among childbearing women makes it difficult to isolate causal effects of various determinants of birth outcomes. Whether or not a mother smokes, for instance, is likely to be correlated with unobserved characteristics of the mother. To deal with this difficulty, various studies have used an instrumental-variable approach to estimate the effects of smoking (Evans and Ringel (1999); Permutt and Hebel (1989)), prenatal care (Currie and Gruber (1996); Evans and Lien (2005); Joyce (1999)), and air pollution (Chay and Greenstone (2003a, 2003b)) on birth outcomes.

Another approach has been to utilize panel data (i.e., several births for each mother) to identify these effects from changes in prenatal behavior or maternal characteristics between pregnancies (Abrevaya (2006); Currie and Moretti (2002); Rosenzweig and Wolpin (1991); Royer (2004)). One concern with the panel-data identification strategy is the presence of “feedback effects,” specifically that prenatal care and smoking in later pregnancies may be correlated with birth outcomes in earlier pregnancies. Royer (2004) provides an explicit estimation strategy to deal with such feedback effects (using data on at least three births per mother). Abrevaya (2006) shows that feedback effects are likely to cause the estimated (negative) smoking effect to be too large in magnitude.

Since the costs associated with birthweight have been found to exist primarily at the low end of the birthweight distribution (with costs increasing significantly at the very low end), most studies have estimated the effects of birth inputs on the fraction of births below various thresholds (e.g., 2500 grams for LBW and 1500 grams for “very low birthweight”). As an alternative, this paper considers a quantile-regression approach to estimating the effects of birth inputs on birthweight, so it is useful to compare the two approaches. The threshold-crossing approach fixes a common unconditional threshold for the entire sample, whereas the quantile-regression approach focuses upon particular conditional quantiles of the birthweight distribution. Denoting birthweight by \(bw\) and a birth input vector by \(x\), a probit-based threshold-crossing model for LBW outcomes would be \(\Pr(bw < 2500 \mid x) = \Phi(x\gamma)\). For each \(x\), there is a conditional probability of the LBW outcome (\(bw\) below the common threshold) and estimates of \(\gamma\) can be used to infer the marginal effects of the birth inputs upon these conditional probabilities. For the quantile approach, a simple (linear) model for, say, the 5% conditional quantile would be \(Q_{5\%}(bw \mid x) = x\beta\). The value of the conditional quantile \(Q_{5\%}(bw \mid x)\) may be below the LBW threshold of 2500 grams for some \(x\) values and above it for other \(x\) values. The estimated marginal effects (inferred from the estimates of \(\beta\)) would indicate how the 5% conditional quantile would be affected at all \(x\) values. These effects are not directly comparable to the probit-based effects.

For the question of economic costs, both the probit approach and quantile approach have drawbacks: (i) the probit approach is inherently discontinuous and offers only predictions of LBW vs. non-LBW outcomes, and (ii) the quantile approach combines predictions from extremely adverse \(x\) values (lower \(Q_{5\%}(bw \mid x)\)), where the costs are higher, and less adverse \(x\) values (higher \(Q_{5\%}(bw \mid x)\)), where the costs are lower. For the question of what causes LBW outcomes, the simple probit-based approach is certainly sufficient. The quantile approach, however, provides a convenient method for determining how birth inputs affect birthweight at different parts of the distribution. The closest analogy with the threshold-crossing approach would be to continuously alter the threshold value and estimate a series of probit models. Given the different aspects of the birthweight distribution being modeled and estimated by the two approaches, our view is that these approaches should be viewed as complements to each other rather than substitutes.

A recent literature on estimation of quantile treatment effects, including Abadie, Angrist, and Imbens (2002) and Bitler, Gelbach, and Hoynes (2006), has argued that traditional estimation of average (mean) treatment effects may miss important causal impacts. Specifically, an average treatment effect inherently combines the magnitudes of causal effects upon different parts of the conditional distribution. It is quite possible, as in our birthweight application (and also in wage-distribution applications), that societal costs and benefits are more pronounced at the lower quantiles of the conditional distribution. As an example, if one estimated the average causal effect of smoking to be a reduction in birthweight of 150 grams, it could be the case that the effect of smoking on lower quantiles is substantially higher or lower than 150 grams. If a 200-gram effect were estimated at lower quantiles and a 100-gram effect at higher quantiles, this would argue for a stronger policy response than if the effects were instead stronger at the higher quantiles. Ultimately, consideration of how effects vary over the quantiles is an empirical question and one which we attempt to answer in the context of birthweight regressions in this paper.

Previous quantile-estimation approaches to estimating birth-outcome regressions have used cross-sectional data and, therefore, have suffered from an inability to control for unobserved heterogeneity. For instance, Abrevaya (2001) (see also Koenker and Hallock (2001) and Chernozhukov (2005)) uses cross-sectional federal natality data and finds that various observables have significantly stronger associations with birthweight at lower quantiles of the birthweight distribution; unfortunately, one cannot interpret these “effects” as causal since the estimation has a purely reduced-form structure that does not account for unobserved heterogeneity.

The outline of the paper is as follows. Section 2 details the quantile-estimation approach, motivated by the “correlated random effects model” of Chamberlain (1982, 1984). We consider a notion of marginal effects upon conditional quantiles in which we explicitly control for unobserved heterogeneity by allowing the “mother random effect” to be related to observables. Section 3 describes the maternally-linked birth panel data for Washington and Arizona that are used in this study. Section 4 reports the main empirical results of the paper. There are some interesting differences between the panel-data and cross-sectional results. For example, the results from panel-data estimation, which controls for unobserved heterogeneity, indicate that the negative effects of smoking on birthweight are significantly lower (in magnitude) across all quantiles than indicated by the cross-sectional estimates. Section 4.2 provides a general hypothesis testing framework. Section 4.3 discusses issues related to endogeneity (e.g., feedback effects and measurement error) in the panel-data context. Section 5 discusses the theoretical panel-data model in greater detail and highlights directions for future research.


2 Quantile estimation for two-birth panel data

Despite the widespread use of both panel-data methodology and quantile-regression methodology, there has been little work at the intersection of the two methodologies. As discussed in this section, the most likely explanation is the difficulty in extending differencing methods to quantiles. The outline of this section is as follows. Section 2.1 briefly reviews the fixed effects and correlated random effects models for conditional expectations. Building upon the correlated random effects framework of Section 2.1, Section 2.2 extends the notion of marginal effects (and their estimation) to conditional quantile models. Section 2.3 discusses previous related studies.

2.1 Review of conditional expectation models with panel data

Suppose that the data source contains information on exactly two births for a large sample of mothers. A standard linear panel-data model for such a situation would be

\[ y_{mb} = x_{mb}\beta + c_m + u_{mb} \quad (b = 1, 2;\ m = 1, \ldots, M), \tag{1} \]

where \(m\) indexes mothers, \(b\) indexes births, \(y\) denotes a birth outcome (e.g., birthweight), \(x\) denotes a vector of observables, \(c\) denotes the (unobservable) “mother effect,” and \(u\) denotes a birth-specific disturbance. To simplify notation, let \(x_m \equiv (x_{m1}, x_{m2})\) denote the covariate values from both births of a given mother. From the basic model in (1), several different types of panel-data models arise from the assumptions concerning the unobservable \(c_m\). In the “pure” random-effects version of (1), \(c_m\) is assumed to be uncorrelated with \(x_m\). This assumption is implausible in the context of our empirical application, so attention is focused upon two models that allow for dependence between \(c_m\) and \(x_m\): (1) the fixed-effects model and (2) the correlated random-effects model.

Fixed-effects model: The fixed-effects model allows correlation between \(c_m\) and \(x_m\) in a completely unspecified manner. The “meaning” of the parameter vector \(\beta\) is given by

\[ \beta = \frac{\partial E(y_{mb} \mid x_m, c_m)}{\partial x_{mb}} \tag{2} \]

under the following assumption:

\[ \text{(A1)} \quad E(u_{m1} \mid x_m, c_m) = E(u_{m2} \mid x_m, c_m) = 0 \quad \forall m. \tag{3} \]

It is well known that, under (A1), \(\beta\) can be consistently estimated by a first-difference regression (i.e., regressing \(y_{m2} - y_{m1}\) on \(x_{m2} - x_{m1}\)). The reason that this strategy works for the conditional expectation hinges critically upon the fact that an expectation is a linear operator, a property that is not shared by conditional quantiles.
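To illustrate why differencing removes the mother effect, the following minimal simulation sketch (not part of the original analysis) generates hypothetical two-birth data from model (1) with a scalar covariate that is correlated with the mother effect, and compares a pooled least-squares slope with the first-difference slope. The data-generating values, the function names, and the random seed are all assumptions chosen purely for illustration.

```python
import random


def simulate_two_birth_panel(num_mothers, beta, rng):
    """Simulate (x_m1, x_m2, y_m1, y_m2) from model (1) with a scalar x.

    The mother effect c_m enters both x and y, so x and c_m are correlated.
    """
    panel = []
    for _ in range(num_mothers):
        c = rng.gauss(0.0, 1.0)                    # unobserved mother effect c_m
        x1 = c + rng.gauss(0.0, 1.0)               # first-birth covariate
        x2 = c + rng.gauss(0.0, 1.0)               # second-birth covariate
        y1 = beta * x1 + c + rng.gauss(0.0, 1.0)   # equation (1), b = 1
        y2 = beta * x2 + c + rng.gauss(0.0, 1.0)   # equation (1), b = 2
        panel.append((x1, x2, y1, y2))
    return panel


def ols_slope(xs, ys):
    """Slope from a least-squares regression of ys on xs with an intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)
TRUE_BETA = 2.0  # hypothetical value
panel = simulate_two_birth_panel(5000, TRUE_BETA, rng)

# Pooled regression ignores c_m and is contaminated by its correlation with x.
pooled_slope = ols_slope(
    [x for x1, x2, _, _ in panel for x in (x1, x2)],
    [y for _, _, y1, y2 in panel for y in (y1, y2)],
)

# First differences remove c_m, because it is common to both births.
fd_slope = ols_slope(
    [x2 - x1 for x1, x2, _, _ in panel],
    [y2 - y1 for _, _, y1, y2 in panel],
)

print(f"pooled slope: {pooled_slope:.3f}, first-difference slope: {fd_slope:.3f}")
```

Under these assumed parameters the pooled slope should be biased upward by roughly cov(x, c)/var(x) = 0.5, whereas the first-difference slope should stay close to the assumed true value of 2.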


Correlated random-effects model: The correlated random-effects model of Chamberlain (1982, 1984) views the unobservable \(c_m\) as a linear projection onto the observables plus a disturbance:

\[ c_m = \psi + x_{m1}\lambda_1 + x_{m2}\lambda_2 + v_m, \tag{4} \]

where \(\psi\) is a scalar and \(v_m\) is a disturbance that (by definition of linear projections) is uncorrelated with \(x_{m1}\) and \(x_{m2}\). Combining equations (1) and (4) yields

\[ y_{m1} = \psi + x_{m1}(\beta + \lambda_1) + x_{m2}\lambda_2 + v_m + u_{m1} \tag{5} \]

\[ y_{m2} = \psi + x_{m1}\lambda_1 + x_{m2}(\beta + \lambda_2) + v_m + u_{m2}. \tag{6} \]

The parameters \((\psi, \beta, \lambda_1, \lambda_2)\) in (5) and (6) can be estimated by least-squares regression or other methods (see, e.g., Wooldridge (2002, Section 11.3)). The vector \(x_{m1}\) affects \(y_{m1}\) through two channels, (i) a direct effect (expressed by the \(x_{m1}\beta\) term) and (ii) an indirect effect working through the unobservable effect \(c_m\). In contrast, the vector \(x_{m1}\) affects \(y_{m2}\) only through the unobservable effect \(c_m\). In fact, under the additional assumption

\[ \text{(A2)} \quad E(v_m \mid x_m) = 0, \tag{7} \]

the “meaning” of \(\beta\) is given by the following equation:

\[ \beta = \frac{\partial E(y_{m1} \mid x_m)}{\partial x_{m1}} - \frac{\partial E(y_{m2} \mid x_m)}{\partial x_{m1}} = \frac{\partial E(y_{m2} \mid x_m)}{\partial x_{m2}} - \frac{\partial E(y_{m1} \mid x_m)}{\partial x_{m2}}. \tag{8} \]

That is, \(\beta\) tells us how much \(x_{m1}\) affects \(E(y_{m1} \mid x_m)\) above and beyond the effect that works through the unobservable \(c_m\).
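The step from (5) and (6) to (8) can be spelled out as follows. This short derivation is a sketch added for clarity; it uses only (A1), which implies \(E(u_{mb} \mid x_m) = 0\) by iterated expectations, and (A2).

```latex
\begin{align*}
E(y_{m1} \mid x_m)
  &= \psi + x_{m1}(\beta + \lambda_1) + x_{m2}\lambda_2
     + \underbrace{E(v_m \mid x_m)}_{=0 \text{ by (A2)}}
     + \underbrace{E(u_{m1} \mid x_m)}_{=0 \text{ by (A1)}} \\
  &= \psi + x_{m1}(\beta + \lambda_1) + x_{m2}\lambda_2, \\
E(y_{m2} \mid x_m)
  &= \psi + x_{m1}\lambda_1 + x_{m2}(\beta + \lambda_2), \\
\frac{\partial E(y_{m1} \mid x_m)}{\partial x_{m1}}
  - \frac{\partial E(y_{m2} \mid x_m)}{\partial x_{m1}}
  &= (\beta + \lambda_1) - \lambda_1 = \beta, \\
\frac{\partial E(y_{m2} \mid x_m)}{\partial x_{m2}}
  - \frac{\partial E(y_{m1} \mid x_m)}{\partial x_{m2}}
  &= (\beta + \lambda_2) - \lambda_2 = \beta.
\end{align*}
```

Subtracting the cross-birth derivative is what nets out the indirect channel \(\lambda_1\) (or \(\lambda_2\)) that works through \(c_m\).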

2.2 Estimation of effects on conditional quantiles with panel data

For conditional quantiles, a simple differencing strategy is infeasible since quantiles are not linear operators; that is, in general, \(Q_\tau(y_{m2} - y_{m1} \mid x_m) \neq Q_\tau(y_{m2} \mid x_m) - Q_\tau(y_{m1} \mid x_m)\), where \(Q_\tau(\cdot \mid \cdot)\) denotes the \(\tau\)-th conditional quantile function for \(\tau \in (0, 1)\). This inherent difficulty has been recognized by others and is summarized nicely in a recent quantile-regression survey by Koenker and Hallock (2000): “Quantiles of convolutions of random variables are rather intractable objects, and preliminary differencing strategies familiar from Gaussian models have sometimes unanticipated effects.” Without being more explicit about the relationship between \(c_m\) and \(x_m\), it is difficult to envision an appropriate strategy for dealing with conditional quantiles, although Koenker (2004) has made some progress on this front.
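A tiny numerical sketch makes the non-linearity concrete. The three hypothetical outcome pairs below are invented for illustration (they are not birthweight data, and there are no covariates): the mean of the paired differences equals the difference of the means, but the median of the differences does not equal the difference of the medians.

```python
from statistics import mean, median

# Hypothetical paired outcomes (y_m1, y_m2) for three units.
first_birth = [0, 0, 9]
second_birth = [1, 5, 6]
differences = [y2 - y1 for y1, y2 in zip(first_birth, second_birth)]

# Expectation is a linear operator: both quantities coincide.
mean_of_differences = mean(differences)
difference_of_means = mean(second_birth) - mean(first_birth)

# The median (tau = 0.5) is not linear: the two quantities differ.
median_of_differences = median(differences)
difference_of_medians = median(second_birth) - median(first_birth)

print(mean_of_differences, difference_of_means)       # 1 and 1
print(median_of_differences, difference_of_medians)   # 1 and 5
```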

To consider the relevant effects of the observables on the conditional quantiles \(Q_\tau(y_{mb} \mid x_m)\) (rather than \(E(y_{mb} \mid x_m)\)), we consider the analogous effects to those given in equation (8). In particular, the effects of the observables on a given conditional quantile are given by

\[ \frac{\partial Q_\tau(y_{m1} \mid x_m)}{\partial x_{m1}} - \frac{\partial Q_\tau(y_{m2} \mid x_m)}{\partial x_{m1}} \tag{9} \]

and

\[ \frac{\partial Q_\tau(y_{m2} \mid x_m)}{\partial x_{m2}} - \frac{\partial Q_\tau(y_{m1} \mid x_m)}{\partial x_{m2}}. \tag{10} \]

For example, the difference in equation (9) is the effect of \(x_{m1}\) (first-birth observables) on \(Q_\tau(y_{m1} \mid x_m)\) above and beyond the effect on the \(\tau\)-th conditional quantile that works through the unobservable.

To estimate the effects given in equations (9) and (10), a model for both \(Q_\tau(y_{m1} \mid x_m)\) and \(Q_\tau(y_{m2} \mid x_m)\) is needed. Unfortunately, it is non-trivial to explicitly determine the conditional quantile models. Consider, for example, the simple case in which the data-generating process is given by equations (1) and (4) (which then imply equations (5) and (6)). If all of the error disturbances \((u_{m1}, u_{m2}, v_m)\) were independent of \(x_m\), then the conditional quantile functions would take a simple form (analogous to that of the conditional expectation function under assumption (A2)):

\[ Q_\tau(y_{m1} \mid x_m) = \psi_{1\tau} + x_{m1}(\beta + \lambda_1) + x_{m2}\lambda_2 \tag{11} \]

\[ Q_\tau(y_{m2} \mid x_m) = \psi_{2\tau} + x_{m1}\lambda_1 + x_{m2}(\beta + \lambda_2). \tag{12} \]

Under this independence assumption, the effect of the disturbances is reflected by a locational shift in the conditional quantiles (\(\psi_{1\tau}\) and \(\psi_{2\tau}\)); the slopes do not vary across the conditional quantiles. Without the independence assumption, however, the simple linear form for the conditional quantile functions (like those in equations (11) and (12)) only arises in very special cases. In general, the conditional quantile functions involve more complicated non-linear expressions and, in fact, cannot be explicitly written down without a complete parametric specification of the error disturbances.
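The location-shift form in (11) and (12) can be seen in one line. The following is a sketch under the stated independence assumption; the composite disturbance \(\varepsilon_{mb}\) is notation introduced here only for the illustration.

```latex
\begin{align*}
y_{mb} &= \psi + (\text{linear terms in } x_{m1}, x_{m2}) + \varepsilon_{mb},
  \qquad \varepsilon_{mb} \equiv v_m + u_{mb}, \\
Q_\tau(y_{mb} \mid x_m)
  &= \psi + (\text{linear terms in } x_{m1}, x_{m2}) + Q_\tau(\varepsilon_{mb} \mid x_m) \\
  &= \psi + (\text{linear terms in } x_{m1}, x_{m2}) + Q_\tau(\varepsilon_{mb}),
  \qquad\text{so}\qquad
  \psi_{b\tau} = \psi + Q_\tau(v_m + u_{mb}).
\end{align*}
```

The second line uses the fact that quantiles shift one-for-one with a term that is fixed given \(x_m\), and the third uses independence of \(\varepsilon_{mb}\) from \(x_m\). Only the intercept depends on \(\tau\), which is why the slopes in (11) and (12) are constant across quantiles.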

Therefore, the conditional quantiles are viewed as somewhat general functions of \(x_m\): say, \(Q_\tau(y_{m1} \mid x_m) = f_{1\tau}(x_m)\) and \(Q_\tau(y_{m2} \mid x_m) = f_{2\tau}(x_m)\). To estimate the effects in (9) and (10), then, reduced-form models for \(Q_\tau(y_{m1} \mid x_m)\) and \(Q_\tau(y_{m2} \mid x_m)\) are specified. These reduced-form models should be viewed as approximating the “true” conditional quantile functions \(f_{1\tau}(x_m)\) and \(f_{2\tau}(x_m)\). In this paper, a very simple form for the reduced-form models is considered, in which the conditional quantiles are expressed as linear (and separable) functions of \(x_{m1}\) and \(x_{m2}\):

\[ Q_\tau(y_{m1} \mid x_m) = \phi_{1\tau} + x_{m1}\theta_{1\tau} + x_{m2}\lambda_{2\tau} \tag{13} \]

\[ Q_\tau(y_{m2} \mid x_m) = \phi_{2\tau} + x_{m1}\lambda_{1\tau} + x_{m2}\theta_{2\tau}. \tag{14} \]

A more general model, as well as the appropriateness of linearity and separability, is discussed in greater detail in Section 5. Based upon (13) and (14), the effects of the observables on the conditional quantiles (see (9) and (10)) are equal to \(\theta_{1\tau} - \lambda_{1\tau}\) (for the first-birth outcome) and \(\theta_{2\tau} - \lambda_{2\tau}\) (for the second-birth outcome). The parameters \((\phi_{1\tau}, \phi_{2\tau}, \theta_{1\tau}, \theta_{2\tau}, \lambda_{1\tau}, \lambda_{2\tau})\) can be consistently estimated with linear quantile regression (Koenker and Bassett (1978)).

Although the linear approximation may at first appear to be restrictive, this strategy is the one usually employed in cross-sectional quantile regression. In the cross-sectional case, even if the data-generating process is linear in the covariates with a mean-zero error, the conditional quantiles will only be linear in the covariates in very special cases (see, e.g., Koenker and Bassett (1982)). Even in cross-sectional applications, then, the specification chosen by an empirical researcher (linear usually) should also be viewed as a reduced-form approximation to the true conditional quantile function. In fact, empirical applications of quantile regression generally start (either explicitly or implicitly) with a reduced-form approximating model of the conditional quantile function rather than the data-generating process (see, e.g., Buchinsky (1994) and Bassett and Chen (2001)). Angrist, Chernozhukov, and Fernandez-Val (2006) provide a framework for analyzing misspecification of the conditional quantile function. Although beyond the scope of this paper, it would be interesting to apply their methodology to the panel-data setting considered here.

The linear approximation approach is also an inherent feature of the correlated random-effects approach for the conditional expectation model given by (1) and (4). As Chamberlain (1982) originally pointed out, if assumption (A2) does not hold, the conditional expectation function is non-linear; in this case, equations (5) and (6) represent linear approximations (projections) and \(\beta\) represents the marginal effects of the covariates upon these linear approximations.

For the application in this paper, we impose the additional restriction that the effects on the conditional quantiles are the same for both birth outcomes. This restriction is similar to the implicit restriction embodied in the linear panel-data model (1), where \(\beta\) does not vary with \(b\). For the conditional quantiles, let \(\beta_\tau\) denote the (common) effect vector, so that the restriction is

\[ \beta_\tau = \theta_{1\tau} - \lambda_{1\tau} = \theta_{2\tau} - \lambda_{2\tau}. \tag{15} \]

Under this restriction, the conditional quantile functions in (13) and (14) can be re-written as

\[ Q_\tau(y_{m1} \mid x_m) = \phi_{1\tau} + x_{m1}(\beta_\tau + \lambda_{1\tau}) + x_{m2}\lambda_{2\tau} = \phi_{1\tau} + x_{m1}\beta_\tau + x_{m1}\lambda_{1\tau} + x_{m2}\lambda_{2\tau} \tag{16} \]

\[ Q_\tau(y_{m2} \mid x_m) = \phi_{2\tau} + x_{m1}\lambda_{1\tau} + x_{m2}(\beta_\tau + \lambda_{2\tau}) = \phi_{2\tau} + x_{m2}\beta_\tau + x_{m1}\lambda_{1\tau} + x_{m2}\lambda_{2\tau}. \tag{17} \]

The simplest estimation strategy, based upon the second equalities in both (16) and (17), is to run a pooled linear quantile regression in which the observations corresponding to both births of a given mother are stacked together as a pair. In particular, a quantile regression (using the estimator for the \(\tau\)-th quantile) would be run using

\[
\begin{pmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \\ \vdots \\ y_{M1} \\ y_{M2} \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix}
1 & 0 & x_{11} & x_{11} & x_{12} \\
1 & 1 & x_{12} & x_{11} & x_{12} \\
1 & 0 & x_{21} & x_{21} & x_{22} \\
1 & 1 & x_{22} & x_{21} & x_{22} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & 0 & x_{M1} & x_{M1} & x_{M2} \\
1 & 1 & x_{M2} & x_{M1} & x_{M2}
\end{pmatrix}
\tag{18}
\]

as the left-hand-side and right-hand-side variables, respectively. This pooled regression directly estimates \((\phi_{1\tau}, \phi_{2\tau} - \phi_{1\tau}, \beta_\tau, \lambda_{1\tau}, \lambda_{2\tau})\). The difference \(\phi_{2\tau} - \phi_{1\tau}\) represents the effect of birth parity. Birth parity cannot be included explicitly in \(x\) since the associated components of \(\beta_\tau\), \(\lambda_{1\tau}\), and \(\lambda_{2\tau}\) would not be separately identified. In a traditional panel-data context, the difference \(\phi_{2\tau} - \phi_{1\tau}\) would represent the “time effect.” Although the application considered here does not have any birth-invariant explanatory variables (“time-invariant” variables), such variables could be easily incorporated into (18) as additional columns in the RHS matrix; like birth parity, it would not be possible to separately identify the direct effects of these variables on \(y\) from the indirect effects (working through \(c\)) on \(y\).
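The stacking in (18) can be sketched in a few lines. The function below is a hypothetical helper written for illustration (its name, argument layout, and the toy numbers are assumptions, not the authors' code); it only builds the left-hand-side vector and right-hand-side rows, which could then be passed to any linear quantile-regression routine.

```python
def stack_two_birth_panel(outcomes, covariates):
    """Build the stacked LHS vector and RHS rows described in display (18).

    outcomes[m]   = (y_m1, y_m2)
    covariates[m] = (x_m1, x_m2), each a list of K covariate values.
    Each RHS row is [1, parity dummy, x_mb, x_m1, x_m2], so the coefficients
    line up with (phi_1, phi_2 - phi_1, beta, lambda_1, lambda_2).
    """
    lhs, rhs = [], []
    for (y1, y2), (x1, x2) in zip(outcomes, covariates):
        for parity, y_b, x_b in ((0, y1, x1), (1, y2, x2)):
            lhs.append(y_b)
            rhs.append([1, parity] + list(x_b) + list(x1) + list(x2))
    return lhs, rhs


# Hypothetical toy data: two mothers, one covariate (say, a smoking indicator).
outcomes = [(3200, 3350), (2900, 3100)]
covariates = [([1], [0]), ([0], [0])]

lhs, rhs = stack_two_birth_panel(outcomes, covariates)
for y_value, row in zip(lhs, rhs):
    print(y_value, row)
```

Both rows for a given mother repeat the full pair \((x_{m1}, x_{m2})\) in the last columns; only the parity dummy and the own-birth covariate block change between the two rows.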

The only difficulty introduced by the pooled regression approach involves computation of the estimator’s standard errors. Since there is dependence within a mother’s pair of births, the standard asymptotic-variance formula (Koenker and Bassett (1978)) and the standard bootstrap approach, which are both based upon independent observations, cannot be applied. Instead, a given bootstrap sample is created by repeatedly drawing (with replacement) a mother from the sample of \(M\) mothers and including both births for that mother, where the draws continue until the desired bootstrap sample size is reached. For a given bootstrap sample, the pooled quantile estimator is computed. After repeating this process for many bootstrap samples, the original estimator’s variance matrix can be estimated by the empirical variance matrix of the bootstrap estimates. Similarly, bootstrap percentile intervals for the parameters can be easily constructed.
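The mother-level resampling scheme can be sketched as follows. This is an illustrative outline only: the simulated birthweight pairs, the number of replications, the choice of \(M\) draws per bootstrap sample, and the stand-in estimator (the median of the stacked outcomes, i.e. an intercept-only fit at \(\tau = 0.5\)) are all assumptions; in practice the pooled quantile-regression estimator would take the place of `pooled_median`.

```python
import random
from statistics import median, variance


def mother_bootstrap(pairs, estimator, num_replications, rng):
    """Resample mothers with replacement, keeping both births of each draw."""
    num_mothers = len(pairs)
    estimates = []
    for _ in range(num_replications):
        # Draw mothers (not individual births), so within-mother dependence
        # is preserved in every bootstrap sample.
        resample = [pairs[rng.randrange(num_mothers)] for _ in range(num_mothers)]
        estimates.append(estimator(resample))
    return estimates


def pooled_median(pairs):
    """Stand-in estimator: median of the stacked first and second outcomes."""
    return median([y for pair in pairs for y in pair])


def percentile_interval(estimates, level=0.95):
    """Simple bootstrap percentile interval from the sorted estimates."""
    ordered = sorted(estimates)
    tail = (1.0 - level) / 2.0
    lower_index = int(tail * (len(ordered) - 1))
    upper_index = int((1.0 - tail) * (len(ordered) - 1))
    return ordered[lower_index], ordered[upper_index]


# Hypothetical data: 200 mothers with a shared mother effect in both births.
rng = random.Random(1)
pairs = []
for _ in range(200):
    mother_effect = rng.gauss(0.0, 400.0)
    pairs.append((3400.0 + mother_effect + rng.gauss(0.0, 300.0),
                  3490.0 + mother_effect + rng.gauss(0.0, 300.0)))

estimates = mother_bootstrap(pairs, pooled_median, 200, rng)
boot_variance = variance(estimates)
lower, upper = percentile_interval(estimates)
print(f"bootstrap variance: {boot_variance:.1f}, 95% interval: ({lower:.0f}, {upper:.0f})")
```

For a vector-valued estimator, the same loop applies, with the scalar variance replaced by the empirical variance matrix of the bootstrap estimates.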

2.3 Review of related studies

In their recent survey of quantile regression, Koenker and Hallock (2000) cite only a single panel-data application. The cited study by Chay (1995) uses quantile regression on longitudinal earnings data to estimate the effect of the 1964 Civil Rights Act on the black-white earnings differential. Chay (1995) allows the individual effect to depend on the racial indicator variable, which amounts to a shift in the conditional quantile function and is a special case of the general approach described in Section 2.2. Interestingly, the application of Chay (1995) involves censored earnings data, so that quantile regression methods for censored data (Powell (1984, 1986)) are needed. Such censored-data quantile methods would also work with the general model of Section 2.2 but are not needed for the application considered in this paper.

A more recent application of quantile regression on panel data is Arias et al. (2001), who estimate the returns to schooling using twins data. To deal with the unobserved “family effect,” the authors include proxy variables (father’s education and sibling’s education) in the model. This proxy-variable approach is related to the correlated random effects model in the sense that the latter specification can be viewed as using the observables \(x_{m1}\) and \(x_{m2}\) as proxies for the unobserved individual effect. One could also incorporate an external proxy (such as father’s education in the Arias et al. (2001) case) into the correlated random effects framework.

Another panel-data study that is directly related to our empirical application is Royer (2004), who applies a correlated random effects model to maternally linked data from Texas. Royer (2004) estimates the effects of various observables (with a focus upon maternal age) on “binary” birth outcomes (such as premature birth or LBW birth). Fixed-effects estimation is also possible (in the context of the linear probability model) whereas no such alternative is available in the conditional quantile case. Royer (2004) also relaxes the strict exogeneity assumption (required for consistency of the fixed-effects estimator) in several interesting ways. Unfortunately, identification of the least restrictive models requires panel data with at least three births per mother. As a practical matter, this requirement reduces the sample size to an extent that makes the estimated effects of observables rather imprecise and introduces a possible selection bias (see the discussion in Royer (2004, pp. 39ff)). Analogous extensions to the conditional quantile models are left for future research.

3 Data

Detailed “natality data” are recorded for nearly every live birth in the United States. Information on maternal characteristics (age, education, race, etc.), birth outcomes (birthweight, gestation, etc.), and prenatal care (number of prenatal visits, smoking status, etc.) is collected by each state (with federal guidelines on specific data-item requirements). Unfortunately, due to confidentiality restrictions, comprehensive natality data with personal identifiers are not available at the federal level, making it difficult to reliably construct maternally-linked panel data. However, individual states may release such personal identifiers to researchers, subject to confidentiality agreements in most cases. The data used in this study were obtained from two states, Washington and Arizona, and are described in detail below:


1. Washington data: The Washington State Longitudinal Birth Database (WSLBD) was provided by Washington’s Center for Health Statistics. The WSLBD is a panel dataset consisting of all births between 1992 and 2002 that could be accurately linked together as belonging to the same mother. (The original WSLBD has births dating back to 1980, but mother’s education is not available as a data item until 1992.) The matching algorithm used to construct the WSLBD used personal identifying information such as mother’s full maiden name and mother’s date of birth. For two births to be linked together, (i) an exact match on mother’s name, mother’s date of birth, mother’s race, and mother’s state of birth was required, and (ii) consistency of birth parity and the reported interval-since-last-birth was required. Only births that could be uniquely linked together were retained in the WSLBD.

2. Arizona data: The Arizona Department of Health Services provided the authors with data on all births occurring in the state of Arizona between 1993 and 2002. Although names were not provided, the exact dates of birth for both mother and father were provided in the data. To maternally link births together, we followed as closely as possible the algorithm used for the Washington data. For two births to be linked together, (i) an exact match on mother’s date of birth, father’s date of birth, mother’s race, and mother’s state of birth was required, and (ii) consistency of birth parity and the reported interval-since-last-birth was required. As with the Washington data, only births that could be uniquely linked together were retained. Since births could not be linked by maternal name, we decided to also require an exact match on father’s date of birth in order to minimize the chance of false matches entering the sample. (Roughly 3.5% of births that were linked on the basis of mother’s birthdate are dropped when links are also based upon father’s birthdate.) This choice turns out to have very little impact on the estimation results reported in Section 4. The decision to match upon father’s birthdate restricts the Arizona sample to mothers whose children had the same birth father, which is not a restriction of the Washington sample.

For this study, we consider only pairs of first and second births to white mothers. Birth outcomes (and the effects of other variables upon birth outcomes) have been found to differ across different races and at higher birth parities. The choice of subsample circumvents these issues by focusing upon a more homogeneous sample. The resulting estimates, of course, should be interpreted as being applicable to the subpopulation represented by this sample choice.

Estimation was carried out separately for the Washington data and Arizona data. The Washington data have several advantages over the Arizona data: (i) the matching of siblings for the Washington data is of higher quality due to the use of mothers’ names, (ii) the Washington data are not restricted to siblings with the same fathers, and (iii) the Washington data include information on the month of first prenatal visit. For these reasons, most of the detailed analysis will be reported for the Washington data. Results for Arizona will be discussed more briefly, but these results serve as a useful comparison to the Washington results.

Table 1 provides descriptive statistics for the Washington and Arizona samples, broken down by first-child and second-child births. Any mother with missing data items in either of her two births (for the variables summarized in Table 1) was dropped from the sample. The resulting samples used for estimation consist of 45,067 Washington mothers (90,134 births) and 56,201 Arizona mothers (112,402 births). Sample averages are reported for all variables, as well as standard deviations for the non-indicator variables. The “Smoke” (“Drink”) variable is equal to one if the mother reported smoking (drinking alcohol) during pregnancy. Although alcohol consumption during pregnancy is known to be severely under-reported, the “Drink” variable is included in the regressions as it may be a useful proxy for other unobservables. For Washington, the four prenatal-care categories (“No prenatal care,” “1st-trimester care,” “2nd-trimester care,” “3rd-trimester care”) were constructed on the basis of the reported month of the first prenatal-care visit. Unfortunately, the month of first prenatal-care visit is not reported in the Arizona data until 1997. As a result, for Arizona only the number of prenatal visits and an indicator variable for “no prenatal care” (equal to one if there are no prenatal visits) are summarized in Table 1 and used in the empirical analysis of Section 4. The other variables are self-explanatory.

The descriptive statistics in Table 1 indicate that average birthweight increases by 88 grams at the second birth for both Washington mothers and Arizona mothers. For their second birth, women are less likely to smoke and drink and more likely to be married, have a male child, and have a first-trimester prenatal-care visit. Based on the summary statistics, the two samples of mothers are quite similar. On average, Arizona mothers are slightly less educated and have higher birthweight babies. The largest difference between the two samples appears to be the level of smoking: Washington mothers report smoking in 13.7% of pregnancies (close to the national average during this time period), whereas only 4.7% of Arizona mothers report smoking. These smoking percentages are below the overall smoking percentages for pregnant women in these two states during the periods of interest (8.9% in Arizona and 18.4% in Washington), indicating that the matching algorithms result in subsamples that over-represent non-smokers. For instance, unmarried Arizona mothers (for whom the smoking percentage is 12.7%) are far more likely to have father’s date-of-birth missing from the data (45.9% of the time, as compared to 1.1% for married mothers) and are, therefore, not included in the matched sample. The reported rate of drinking during pregnancy is also lower in Arizona than Washington; these reported percentages are also lower than the overall percentages for pregnant women in the two states (2.7% in Washington, 1.4% in Arizona).


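Table 1: Descriptive Statistics, Washington and Arizona Birth Panels

| Variable | Washington, 1st child | Washington, 2nd child | Arizona, 1st child | Arizona, 2nd child |
|---|---|---|---|---|
| Male child | 0.515 | 0.511 | 0.520 | 0.516 |
| Mother’s age | 25.27 (5.25) | 27.89 (5.35) | 25.23 (5.26) | 27.85 (5.36) |
| Mother’s education | 13.52 (2.32) | 13.72 (2.21) | 13.21 (2.68) | 13.39 (2.61) |
| Married | 0.751 | 0.853 | 0.780 | 0.886 |
| No prenatal care | 0.004 | 0.003 | 0.005 | 0.006 |
| 1st-trimester care | 0.879 | 0.895 | n/a | n/a |
| 2nd-trimester care | 0.107 | 0.093 | n/a | n/a |
| 3rd-trimester care | 0.014 | 0.012 | n/a | n/a |
| Smoke | 0.143 | 0.132 | 0.049 | 0.044 |
| Drink | 0.017 | 0.014 | 0.009 | 0.007 |
| # prenatal visits | 12.06 (3.53) | 11.63 (3.25) | 11.83 (3.59) | 11.73 (3.55) |
| Quantiles of birthweight (grams): | | | | |
| 10% quantile | 2807 | 2892 | 2750 | 2863 |
| 25% quantile | 3146 | 3220 | 3061 | 3146 |
| 50% quantile | 3458 | 3543 | 3373 | 3445 |
| 75% quantile | 3770 | 3855 | 3685 | 3742 |
| 90% quantile | 4060 | 4167 | 3968 | 4040 |

Note: Entries are sample averages, with standard deviations in parentheses for the non-indicator variables.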


Table 1 also provides the (unconditional) 10%/25%/50%/75%/90% quantiles for first and second births in Washington and Arizona. These quantiles indicate fairly symmetric birthweight distributions, with the median quite close to the mean, the 25% and 75% quantiles roughly equidistant from the median, and the 10% and 90% quantiles roughly equidistant from the median. For both states, there is a positive shift in the entire birthweight distribution from first to second births. The shift is largest in magnitude at the 90% quantile (107 grams) for Washington births and at the 10% quantile (113 grams) for Arizona births. Finally, we note that the LBW cutoff of 2500 grams corresponds to the 3–5% quantiles of the unconditional birthweight distributions, whereas the HBW cutoff of 4000 grams corresponds to the 85–92% quantiles of the unconditional distributions.
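The first-to-second-birth shifts quoted above can be checked directly from the quantiles in Table 1. The following is a minimal sketch for illustration; the names `TABLE1`, `quantile_shifts`, and `largest_shift` are hypothetical helpers, not part of the original analysis.

```python
# Unconditional birthweight quantiles (grams) transcribed from Table 1.
QUANTILES = (10, 25, 50, 75, 90)
TABLE1 = {
    "Washington": {"first": (2807, 3146, 3458, 3770, 4060),
                   "second": (2892, 3220, 3543, 3855, 4167)},
    "Arizona": {"first": (2750, 3061, 3373, 3685, 3968),
                "second": (2863, 3146, 3445, 3742, 4040)},
}


def quantile_shifts(state):
    """Second-birth minus first-birth quantile, keyed by quantile (percent)."""
    first = TABLE1[state]["first"]
    second = TABLE1[state]["second"]
    return {q: s - f for q, f, s in zip(QUANTILES, first, second)}


def largest_shift(state):
    """Return (quantile, shift in grams) where the shift is largest."""
    shifts = quantile_shifts(state)
    q = max(shifts, key=shifts.get)
    return q, shifts[q]


if __name__ == "__main__":
    for state in TABLE1:
        print(state, quantile_shifts(state), largest_shift(state))
```

Every shift is positive in both states, and the largest values (107 grams at the 90% quantile for Washington, 113 grams at the 10% quantile for Arizona) match the figures cited in the text.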

4 Results

Regression results for the two maternally linked datasets are provided in Section 4.1, within the strict-exogeneity framework introduced in Section 2. A straightforward approach to hypothesis testing is provided in Section 4.2. Section 4.3 provides discussion related to possible violations of strict exogeneity (e.g., feedback effects or mismeasured variables).

4.1 Regression results

In the interest of space, the full set of numerical results (tables) and a detailed discussion are provided only for the Washington data (Section 4.1.1). The Arizona results are reported in a graphical format comparable to the Washington results (Section 4.1.3), but the detailed tables have been omitted and the discussion is limited to comparisons with the Washington results. (Complete tables are available upon request from the authors.)

4.1.1 Washington data

The tables report estimates for the quantiles τ ∈ {0.10, 0.25, 0.50, 0.75, 0.90} (along with least-squares estimates for comparison), although the figures presented in this section consider marginal effects at 2-percent intervals (specifically, τ ∈ {0.04, 0.06, ..., 0.94, 0.96}). Throughout this section, the dependent variable of interest is birthweight (measured in grams). In order to have a relevant comparison for the panel-data results, cross-sectional results (without incorporating the correlated random effects) are also reported. For the cross-sectional results, the panel structure of the data is only used for computing standard errors. Since each mother appears twice in the data, the pair-sampling bootstrap described at the end of Section 2.2 is used.
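The idea behind pair sampling is that mothers, rather than individual births, are resampled, so both births of a mother always enter a bootstrap replication together. The sketch below illustrates this under simplifying assumptions: the function names are hypothetical, the data are synthetic, and a pooled median stands in for the quantile-regression estimator, which is not reproduced here. The `n_pairs` argument allows drawing fewer pairs than the full sample in each replication.

```python
import random
import statistics


def pair_bootstrap_se(pairs, estimator, n_reps=1000, n_pairs=None, seed=0):
    """Bootstrap standard error of `estimator`, resampling whole mothers.

    pairs: list with one entry per mother, each holding both of her births.
    estimator: function mapping a list of such entries to a single number.
    n_pairs: pairs drawn per replication (defaults to len(pairs)).
    """
    rng = random.Random(seed)
    k = len(pairs) if n_pairs is None else n_pairs
    draws = []
    for _ in range(n_reps):
        # Draw mothers with replacement; a mother's two births stay together.
        sample = rng.choices(pairs, k=k)
        draws.append(estimator(sample))
    return statistics.stdev(draws)


def pooled_median(sample):
    """Median birthweight over all births of the resampled mothers."""
    return statistics.median(w for pair in sample for w in pair)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic (first birth, second birth) weights in grams, illustration only.
    mothers = []
    for _ in range(500):
        base = rng.gauss(3400, 400)  # shared mother-level component
        mothers.append((base + rng.gauss(0, 300), base + 90 + rng.gauss(0, 300)))
    print(pair_bootstrap_se(mothers, pooled_median, n_reps=200))
```

Resampling births independently would ignore the within-mother correlation induced by the shared component, which is why the pair is treated as the sampling unit.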

Tables 2 and 3 report the cross-sectional results and panel-data results, respectively. The model specification includes the variables summarized in Table 1, along with an indicator variable for the second child, quadratic variables for both mother’s age and education, and a full set of year-of-birth dummy variables. For the prenatal-care variables, the omitted category corresponds to first-trimester prenatal care, so the estimates for the other three prenatal-care variables (“No prenatal care,” “2nd-trimester care,” and “3rd-trimester care”) should be interpreted as differences from first-trimester prenatal care. The effect of prenatal care will therefore be captured by (i) the trimester of the first prenatal visit (if any) and (ii) the number of prenatal visits (if any).

Table 2: Cross-Sectional Estimation Results, Washington Data. The dependent variable is birthweight (in grams).

Variable            10%         25%         50%         75%         90%         OLS
Second child        100.65***   92.87***    93.86***    99.99***    110.37***   98.95***
                    (7.24)      (4.96)      (4.40)      (4.89)      (6.84)      (4.04)
Male child          87.22***    115.87***   128.34***   142.70***   160.83***   124.34***
                    (6.25)      (4.33)      (3.85)      (4.25)      (5.65)      (3.57)
Age                 19.30***    13.81***    7.39**      7.92**      5.47        12.57***
                    (6.35)      (4.02)      (3.58)      (3.93)      (5.17)      (3.41)
Age²                -0.385***   -0.258***   -0.131**    -0.122*     -0.070      -0.228***
                    (0.115)     (0.070)     (0.064)     (0.070)     (0.091)     (0.061)
Education           31.02**     22.46**     30.23***    28.81***    23.87**     26.94***
                    (14.32)     (8.80)      (7.70)      (6.71)      (10.45)     (7.16)
Education²          -0.744      -0.603*     -0.927***   -0.987***   -0.793**    -0.789***
                    (0.525)     (0.324)     (0.285)     (0.250)     (0.388)     (0.264)
Married             36.58***    27.52***    26.74***    22.41***    16.20*      28.11***
                    (10.22)     (7.61)      (6.03)      (6.91)      (9.19)      (6.23)
No prenatal care    -324.75*    -22.76      -36.05      15.54       176.44**    -34.57
                    (177.34)    (55.29)     (41.13)     (44.24)     (74.27)     (47.90)
2nd-trimester care  37.34***    27.93***    24.72***    31.58***    37.96***    38.73***
                    (11.86)     (8.41)      (7.00)      (8.27)      (10.43)     (6.48)
3rd-trimester care  106.60***   59.36***    37.26*      26.07       24.40       66.39***
                    (30.75)     (20.43)     (19.79)     (18.81)     (23.84)     (15.96)
Smoke               -186.05***  -182.29***  -178.57***  -176.05***  -160.02***  -177.97***
                    (11.40)     (7.64)      (6.23)      (7.51)      (9.92)      (6.16)
Drink               -21.80      -20.08      -7.27       -15.01      17.03       3.89
                    (28.14)     (22.67)     (16.10)     (20.00)     (26.55)     (16.47)
# prenatal visits   19.33***    16.52***    15.11***    14.82***    13.92***    18.46***
                    (1.35)      (0.89)      (0.76)      (0.82)      (1.10)      (0.85)

Bootstrapped standard errors in parentheses, using bootstrap sample size of 20,000 (10,000 pairs) and 1,000 bootstrap replications. Year dummies were included in all regressions.
‘*’: significant at 10% level (2-sided); ‘**’: 5% level; ‘***’: 1% level.

It should be pointed out that interpreting the effect of any prenatal-care variable is somewhat difficult, since the observed prenatal care proxies for both intended prenatal care and pregnancy problems. For instance, if two mothers have identical intentions (at the beginning of pregnancy) with respect to prenatal-care visits, the mother who experiences problems early in her pregnancy would be more likely to have an earlier first prenatal-care visit and to have more prenatal-care visits overall. The estimated effects of the prenatal-care variables, therefore, may reflect the combined effects of intended care and pregnancy complications. This idea has been independently investigated by Conway and Deb (2005), who (i) find that bimodal residuals result from a standard 2SLS regression of birthweight and (ii) use a two-class mixture model to explicitly allow for a difference between “normal” and “complicated” pregnancies. The estimates for the no-prenatal-care indicator variable in both Tables 2 and 3, which are significantly negative at the 10% quantile and significantly positive at the 90% quantile, illustrate this point. A possible explanation for the dramatic difference at the two ends of the distribution is that lack of prenatal care is more likely to proxy for lack of intended care at the lowest quantiles and more likely to proxy for a problem-free pregnancy at the highest quantiles. Alternatively, the positive effect found at higher quantiles could still be consistent with a lack of intended care, since HBW outcomes have previously been associated with poor prenatal care and disadvantaged mothers. (Unfortunately, the leading indicators of HBW outcomes are mother’s weight prior to pregnancy and weight gain during pregnancy. Neither of these items is available in the datasets, forcing us to focus less on the effects of birth inputs on HBW outcomes.) At the intermediate quantiles, the effect of the no-prenatal-care indicator is found to be statistically insignificant in both the cross-sectional and panel results.

Overall, the cross-sectional results in Table 2 are very similar to those found in previous studies using federal natality data (Abrevaya (2001); Koenker and Hallock (2001)). For the panel-data results in Table 3, unobserved heterogeneity is modeled as in Section 2.2 (see equations (16) and (17)). For the pooled quantile regressions, Table 3 reports the estimates of the marginal effects βτ. The estimates of the parameters λ1τ and λ2τ are reported in the Appendix (Tables 5 and 6); these estimates measure the extent of the cross-sectional bias (through the relationship of the unobserved heterogeneity with the observables). To provide a more complete view of the variables’ effects on birthweights and to allow an easy comparison with the cross-sectional estimates, Figures 1 and 2 plot the estimated effects from both the panel and cross section. For these figures, the quantile regressions were estimated at 2% intervals, from the 4% quantile through the 96% quantile (inclusively). The panel-data estimates are represented with a solid line, and the 90% confidence intervals (bootstrap percentile intervals) for these estimates are represented with dotted lines. The cross-sectional estimates, computed at the same quantiles, are represented with a dashed line. (To avoid cluttering the figures, confidence intervals for the cross-sectional results (which can be inferred from Table 2) are not reported.) Since both age and education have quadratic terms in the model specification, the marginal-effect plots for age and education are based upon estimates evaluated at specific values for the two variables (25 years old for age and 12 years for education level).
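With a linear and a squared term, the effect of one more year depends on the level at which it is evaluated. The sketch below shows two common ways of computing such an effect, using the cross-sectional OLS coefficients from Table 2. It assumes the effect depends only on the linear and squared coefficients; whether the plotted effects use the derivative or a discrete one-year change is not stated in this section, so both are shown, and the function names are hypothetical.

```python
def derivative_effect(b_linear, b_squared, x):
    """Slope of b_linear*x + b_squared*x**2 evaluated at x."""
    return b_linear + 2.0 * b_squared * x


def one_year_effect(b_linear, b_squared, x):
    """Change in b_linear*x + b_squared*x**2 when x rises to x + 1."""
    return b_linear + b_squared * (2.0 * x + 1.0)


# Cross-sectional OLS coefficients (linear, squared) transcribed from Table 2.
AGE = (12.57, -0.228)
EDUCATION = (26.94, -0.789)

if __name__ == "__main__":
    print(round(derivative_effect(*AGE, 25), 3))        # 1.17 grams
    print(round(one_year_effect(*AGE, 25), 3))          # 0.942 grams
    print(round(derivative_effect(*EDUCATION, 12), 3))  # 8.004 grams
    print(round(one_year_effect(*EDUCATION, 12), 3))    # 7.215 grams
```

Under either definition, the cross-sectional OLS age effect at 25 years is close to zero, while the education effect at 12 years is positive at roughly 7 to 8 grams per year.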


Table 3: Panel Data Estimation Results (βτ), Washington data. The dependent variable is birthweight (in grams).

Variable            10%         25%         50%         75%         90%         OLS
Second child        ...         ...         ...         ...         ...         ...
                    (11.75)     (8.30)      (7.03)      (7.69)      (11.38)     (5.99)
Male child          100.67***   131.64***   146.69***   160.12***   173.35***   138.75***
                    (7.76)      (5.15)      (4.40)      (4.82)      (6.64)      (3.68)
Age                 -38.73***   -20.88***   -30.13***   -31.52***   -48.18***   -33.06***
                    (11.26)     (7.00)      (6.27)      (6.61)      (9.47)      (5.31)
Age²                0.515***    0.290**     0.467***    0.529***    0.845***    0.483***
                    (0.197)     (0.120)     (0.109)     (0.117)     (0.161)     (0.092)
Education           35.54*      17.26       27.22***    17.96**     5.97        19.16**
                    (18.25)     (11.91)     (10.23)     (8.46)      (13.47)     (7.62)
Education²          -1.436*     -0.881*     -1.029**    -0.761**    -0.594      -0.851**
                    (0.762)     (0.503)     (0.433)     (0.388)     (0.607)     (0.334)
Married             39.71**     15.49       27.99***    21.13*      11.41       27.87***
                    (17.26)     (11.71)     (9.21)      (11.32)     (16.05)     (8.30)
No prenatal care    -323.39*    -13.77      1.66        26.03       271.21***   -18.08
                    (172.94)    (60.05)     (51.94)     (54.87)     (76.06)     (49.01)
2nd-trimester care  26.80*      6.63        -0.20       22.95**     34.39**     22.19***
                    (14.48)     (10.46)     (8.72)      (10.18)     (13.52)     (6.93)
3rd-trimester care  62.87*      71.79***    31.75       30.50       39.22       55.73***
                    (34.96)     (25.25)     (22.67)     (23.91)     (32.04)     (17.32)
Smoke               -24.58      -60.64***   -81.43***   -57.70***   -56.19***   -56.86***
                    (19.47)     (14.23)     (11.45)     (12.09)     (17.58)     (9.10)
Drink               -44.24      -32.19      4.29        -1.57       9.25        2.11
                    (35.84)     (25.36)     (20.00)     (24.77)     (29.79)     (17.11)
# prenatal visits   20.01***    14.79***    12.70***    12.32***    12.65***    17.48***
                    (1.59)      (1.10)      (0.88)      (1.01)      (1.43)      (0.93)

Bootstrapped standard errors in parentheses, using 1,000 bootstrap replications. Year dummies were included in all regressions.
‘*’: significant at 10% level (2-sided); ‘**’: 5% level; ‘***’: 1% level.

[Figure 1 panels: Second Child; Male; Total Age Effect at 25 years; Total Education Effect at 12 years; # prenatal visits. Horizontal axis in each panel: quantile.]

Figure 1: Part 1 of the estimated marginal effects on the conditional quantiles for Washington births. The dependent variable is birthweight (in grams). The solid line indicates the panel-data estimates, the dotted lines are 90% confidence bands for the panel-data estimates, and the dashed line indicates the cross-sectional estimates.

[Figure 2 panels: Married; No prenatal care; 2nd-trimester care; 3rd-trimester care; Smoke; Drink. Horizontal axis in each panel: quantile.]

Figure 2: Part 2 of the estimated marginal effects on the conditional quantiles for Washington births. The dependent variable is birthweight (in grams). The solid line indicates the panel-data estimates, the dotted lines are 90% confidence bands for the panel-data estimates, and the dashed line indicates the cross-sectional estimates.

The estimated effects of the various variables, as presented in Tables 2 and 3 and Figures 1 and 2, are discussed in more detail below:

Second child: Birthweights are uniformly larger for second children at all quantiles, for both the cross-sectional and panel estimates. The panel estimates of the second-child effect are somewhat larger than the cross-sectional estimates, with the largest effects at the lowest quantiles (e.g., 137 grams at the 10% quantile).

Male child: It is well known that, on average, male babies weigh more at birth than female babies. The quantile estimates indicate that the positive male-child effect on birthweight is present at all quantiles of the conditional birthweight distribution. The magnitude of the effect increases when one moves from lower quantiles to higher quantiles, with the panel estimates indicating a slightly higher effect (10–20 grams) than the cross-sectional estimates.

Age and education: Figure 1 shows the estimated (one-year) effects of age and education, evaluated at 25 years of age and 12 years of education, respectively. For age, both the cross-sectional and panel estimates are very close to zero in magnitude (and statistically insignificant at a 5% level for all quantiles). For education, the cross-sectional estimates are positive across the quantiles and statistically significant (at a 5% level) except at quantiles above 80%. In contrast, the panel estimates are statistically insignificant across all quantiles. This difference could be due to two factors: (i) the amount of within-mother variation in education is quite small, with the average change in education for the sample being about 0.2 years; and (ii) the level of education may be related to the mother-specific unobservable. For the latter factor, years of schooling is likely positively related to c_m, which would imply an upward bias in the cross-sectional estimates that is consistent with Figure 1. The issue of education being potentially mismeasured is briefly discussed in Section 4.3.2. Results for other age and education levels are reported in Abrevaya and Dahl (2006).

Marital status: The estimated positive effects of marriage on birthweight are quite similar for the cross-sectional and panel specifications, in the 20–50 gram range over the quantiles considered. One should be cautious about interpreting the cross-sectional marriage estimates as causal, since marital status is an explanatory variable that a priori would appear to serve as a proxy for mother-specific unobservables (i.e., marital status positively correlated with c_m). The panel estimates are slightly lower than the cross-sectional estimates in the lower quantiles (until around the 40% quantile), suggesting that this might be a factor in the lower quantiles. Somewhat surprisingly, however, the panel estimates of the marriage effect remain positive throughout the range of quantiles and significantly so (at the 10% level) at nearly all the quantiles below 80%. On the whole, the estimates are consistent with a situation in which marriage provides the birth mother with support (financial support, emotional support, etc.) that would lead to a more favorable birth outcome.

Prenatal-care visits: Lack of prenatal care is found to have significant negative effects at lower quantiles and significant positive effects at the upper quantiles. The estimated effects are similar for both the cross-sectional and panel regressions. As discussed above, a logical explanation is that the “No prenatal care” indicator variable may proxy for poor care at lower quantiles but for problem-free pregnancies at upper quantiles. For the third-trimester-care indicator variable, the cross-sectional and panel estimates are also similar, indicating positive effects (as compared to first-trimester care) which become less statistically significant at higher quantiles. For the indicator variables, the largest difference between the cross-sectional and panel results shows up in the second-trimester-care variable; the cross-sectional estimates are statistically significant at all quantiles and range from 25 to 50 grams, whereas the panel estimates are somewhat lower (close to zero in intermediate quantiles) and only significantly positive at the highest quantiles. The effect of the number of prenatal visits is estimated to be significantly positive across all quantiles, with larger effects found at lower quantiles and the effects essentially “flattening out” (at around 14–15 grams per visit for the cross-sectional results and 12–13 grams per visit for the panel results). The estimated effects for the panel specification exhibit a sharper decline, leading to lower estimates (roughly a 2-gram per-visit differential) than the cross-sectional specification. This variable shows up significantly in the λ1τ and λ2τ estimates (see Tables 5 and 6), leading to the differences found and suggesting that the variable is related to the mother-specific unobservable.

Smoking: The most dramatic differences between the cross-sectional and panel results are the estimated effects of smoking. The cross-sectional results indicate that the negative effects of smoking are in the range of 150–200 grams, with larger effects at lower quantiles. The panel estimates are still significantly negative at all but the lowest quantiles, but the estimated effects are much lower in magnitude (mostly in the 50–80 gram range between the 20% and 80% quantiles). The omitted-variables explanation of this large difference would be that the smoking indicator in the cross-sectional specification is negatively related with the error disturbance in the birthweight regression equation. Consistent with this explanation, the smoking coefficients in both λ1τ and λ2τ are found to be significantly negative across the quantiles (Tables 5 and 6). The magnitudes of the panel estimates are also significantly lower than those found in previous work, including quasi-experimental estimates based upon cigarette-tax changes (e.g., Evans and Ringel (1999) and Lien and Evans (2005)) and experimental estimates (e.g., Permutt and Hebel (1989)). These studies have estimated causal (IV) effects of smoking on birthweight which are not statistically different from the OLS estimates; these
