
Chapter 11: Examples of Hierarchical Linear Modeling (HLM) and Meta-Analysis

黄炽森

Introduction

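In the previous three chapters we introduced experimental design, regression analysis, and structural equation modeling, each with research examples. These three methods are currently the most widely used in organizational behavior and human resource management (OBHR) research. This chapter introduces hierarchical linear modeling (HLM) and meta-analysis. These two methods differ somewhat from those of the previous three chapters: although they are also very useful in OBHR research, they are designed for rather specific research questions, so not every researcher will need them. For example, the studies I have taken part in happened never to involve research questions that called for either method. Nevertheless, we must understand the principles behind both methods, or we may be unable to follow papers that apply them. This chapter therefore gives a brief introduction to the principles of each method and then illustrates each with one published paper. Reaching the level of detail needed to actually apply them will require a careful study of the references listed at the end of this chapter.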

Principles of Hierarchical Linear Modeling (HLM)

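Hierarchical linear modeling (HLM) is needed when the independent variables that affect the dependent variable (Y) come from two or more levels. For example, in Liao and Chuang (2004), one of the dependent variables is the service performance of individual service employees (Y). The constructs that affect this dependent variable come from at least two levels. The first consists of factors specific to each employee (for example personality, X), which differ from employee to employee. The second consists of factors at the level of the store where the employee works (for example the human resource management practices the store provides, W), which differ from store to store but not between employees of the same store. In addition, HLM applies only when the dependent variable is at the first level rather than at the second or a higher level.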

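To simplify the discussion below, we assume that there are only two levels and that each level has only one independent variable. To examine the effect of the level-1 independent variable (X) on the dependent variable (Y), we can run a regression on the employees of each store. For the j-th store, the regression model is: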

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}$$

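Here the subscript i refers to the i-th employee in store j; β0j is the intercept of this model and β1j is the slope of X for store j; rij is the random error of the regression and is therefore random and normally distributed. If we have n stores, we obtain n values of b0j (the sample estimate of β0j) and n values of b1j (the sample estimate of β1j). These estimates cannot be identical across stores, so they too form a distribution. To see whether the stores differ in their average Y (service performance), we can use the distribution of b0j in our sample to test the hypothesis that the variance of β0j (usually denoted τ00) equals zero. If we have no evidence to reject this hypothesis, we can accept that store-level factors (W) have no direct effect on Y. If instead we conclude that the variance of β0j is not zero, can this variance be explained by W? To test the effect of W on β0j, we need the following regression model: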

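$$\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}$$

Here γ00 is the intercept of this model and u0j is its random error. If the statistical test leads us to conclude that γ01 is not zero, the effect of W on Y is supported.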
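
The level-1 step can be made concrete with a small sketch: one ordinary least squares regression of Y on X per store, collecting an intercept estimate b0j and a slope estimate b1j for each. The store names, the data, and the helper `ols_fit` are hypothetical illustrations, not taken from Liao and Chuang (2004).

```python
from statistics import mean

def ols_fit(x, y):
    """Simple OLS of y on x; returns (intercept b0, slope b1)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# HYPOTHETICAL data: store -> list of (X_ij, Y_ij) pairs for its employees
stores = {
    "store_A": [(1, 3.0), (2, 5.0), (3, 7.0)],            # Y = 1.0 + 2.0 X
    "store_B": [(1, 2.0), (2, 2.5), (3, 3.0), (4, 3.5)],  # Y = 1.5 + 0.5 X
}

# One regression per store j gives one (b0j, b1j) pair per store
estimates = {}
for store, rows in stores.items():
    x = [row[0] for row in rows]
    y = [row[1] for row in rows]
    estimates[store] = ols_fit(x, y)

for store, (b0, b1) in estimates.items():
    print(f"{store}: b0j = {b0:.2f}, b1j = {b1:.2f}")
```

The b0j and b1j values collected this way are the quantities whose variation across stores the level-2 equation tries to explain.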

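The description above may suggest that we can first regress Y on X within each store, collect the b0 from every store, and then regress b0 on W, thereby testing the factors at the two levels one after the other. In fact this approach is quite problematic. First, if the b1 estimates (the effect of X on Y) differ across stores, and some do not even reach statistical significance, how can we judge from so many b1 estimates what the effect of X on Y really is? Second, because the sample sizes we can collect differ across stores, among other reasons, the estimates of β0 (and/or β1) differ in precision from store to store, yet the second-step regression (β0j = γ00 + γ01Wj + u0j) ignores this. (Note: the precision of the b computed for each store is the standard error of its sampling distribution; the larger the standard error, the less precise the estimate of b.) Finally, the usual regression tests do not carry over directly to the second step: the test of whether the variance of u0j (τ00) is zero is based on a chi-square (χ²) statistic rather than on the normal-theory tests of ordinary regression, so the analysis has to be adjusted accordingly.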
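
The second objection, that store-level estimates differ in precision, can be illustrated with a short sketch. It fits the same hypothetical data pattern once with 4 employees and once with 8 (the same pattern repeated) and compares the standard errors. The data and the helper `ols_with_se` are assumptions made only for illustration.

```python
from math import sqrt
from statistics import mean

def ols_with_se(x, y):
    """OLS of y on x; returns (b0, b1, se_b0, se_b1)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                              # residual variance
    se_b1 = sqrt(s2 / sxx)                          # standard error of the slope
    se_b0 = sqrt(s2 * (1 / n + x_bar ** 2 / sxx))   # standard error of the intercept
    return b0, b1, se_b0, se_b1

# HYPOTHETICAL stores: same pattern, but the large store has twice the employees
x_small = [1, 2, 3, 4]
y_small = [2.1, 2.9, 4.2, 4.8]
x_large = x_small * 2
y_large = y_small * 2

small = ols_with_se(x_small, y_small)
large = ols_with_se(x_large, y_large)

print(f"small store: b1 = {small[1]:.3f}, SE(b1) = {small[3]:.3f}")
print(f"large store: b1 = {large[1]:.3f}, SE(b1) = {large[3]:.3f}")
```

Both stores yield the same slope estimate, but the larger store estimates it with a smaller standard error. A second-step regression that treats the two estimates as equally reliable discards this information.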

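In essence, HLM addresses the shortcomings of analyzing the two levels separately: it analyzes both levels simultaneously and takes the problems above into account. In principle, however, it is not very different from analyzing the two levels separately.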

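If our theory holds that W has a direct effect on Y, we naturally test the effect of W on β0j. If instead our theory holds that W moderates the relationship between X and Y, we should analyze the effect of W on β1j, using the following regression model: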

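$$\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}$$

Here γ10 is the intercept of this model and u1j is its random error (the test of its variance likewise relies on a chi-square, χ², statistic). If the statistical test leads us to conclude that γ11 is not zero, W is shown to moderate the relationship between X and Y. The specific form of the moderation depends on the estimated values of β1 and γ11. For example, if β1 is positive and γ11 is also positive, an increase in W strengthens the effect of X on Y; if β1 is positive and γ11 is negative, an increase in W weakens the effect of X on Y.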
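
To see why analyzing both levels simultaneously is natural, it may help to substitute the two level-2 equations into the level-1 equation. The derivation below is a sketch under one labeled assumption: the intercept of the slope equation is written γ10, the conventional label, so that it is not confused with γ00 in the intercept equation.

```latex
\begin{aligned}
Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
       &= (\gamma_{00} + \gamma_{01} W_j + u_{0j})
          + (\gamma_{10} + \gamma_{11} W_j + u_{1j})\, X_{ij} + r_{ij} \\
       &= \gamma_{00} + \gamma_{01} W_j + \gamma_{10} X_{ij}
          + \gamma_{11} W_j X_{ij}
          + \left( u_{0j} + u_{1j} X_{ij} + r_{ij} \right)
\end{aligned}
```

In this combined form γ01 is the direct effect of W, γ11 is the coefficient of the cross-level interaction W × X (the moderating effect), and the term in parentheses is a composite error that depends on X and is shared by employees of the same store. This is why ordinary regression on the pooled data is not appropriate.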

An Example of Hierarchical Linear Modeling (HLM)

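The discussion above used the study by Liao and Chuang (2004) as an example. In that paper the level-1 factors are actually the Big Five personality constructs (five X variables) and there are four level-2 factors (four W variables): the store's service climate, employee involvement in important decisions, service training, and performance incentives. The study proposes a series of hypotheses about the effects of X and W on Y (service performance) and about the moderating effects of W on the relationship between X and Y, and tests them with a sample of 257 employees and 44 managers working in 25 restaurants. Their HLM analysis and results are as follows: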


「The service performance model to be tested was hierarchical, with the dependent variable, employee service performance, being an individual construct, and the predicting variables spanning the individual and store levels. The data were also hierarchical, since employees were “nested” in restaurants. We therefore adopted the hierarchical linear modeling (HLM; Bryk & Raudenbush, 1992) method and tested the model in four steps. First, we estimated a null model that had no predictors at either level 1 (the individual level) or level 2 (the store level) to partition the service performance variance into within- and between-stores components. Second, in a level 1 analysis, within each restaurant, service performance was regressed on grand-mean-centered individual-level predictors of personality. A regression line was estimated for each of the 25 stores in this step. In the third step, or the level 2 analysis, we used the intercept estimates obtained from level 1 as outcome variables and regressed these on the store-level predictors, including service climate and HR practices, to assess the main effects of the store-level factors. In the last step, we regressed the slope estimates obtained from level 1 on the store-level factors to detect cross-level interaction effects. We also computed the proportion of variance in service performance explained by individual-level factors (R² within-store) as well as by store-level factors (R² between-stores) using procedures described in Bryk and Raudenbush (1992).」

「Null model. Our hypotheses predict that both individual- and store-level variables would be significantly related to employee service performance. In order for these hypotheses to be supported, there had to be significant between-store variance in employee service performance. Thus, using HLM, we estimated a null model in which no predictors were specified for either the level 1 or level 2 function to test the significance level of the level 2 residual variance of the intercept (τ00=.35, p<.001)…

Individual-level predictors only. Hypotheses 1a, 1b, 1c, and 1d predict that individual personalities will be associated with individual employees’ service performance. We estimated a level 1 model including these variables, with no predictors specified for the level 2 model. As a block, the personality variables explained 24 percent of the within-store variance. Specifically, conscientiousness (γ=.58, p<.001) and extraversion (γ=.37, p<.001) had significantly positive relationships with employee service performance…

Adding store-level predictors. To test Hypotheses 2a, 2b, 2c, and 2d, we estimated an HLM model in which the personality variables were the level 1 predictors and then regressed the intercept coefficients obtained from level 1 on the measures of store-level service climate and HR practices at level 2. As reported in Table 2, both service climate (γ=.45, p<.01) and employee involvement (γ=.39, p<.05) demonstrated significant relationships with service performance, after we had accounted for individual-level predictors… As a group, the specified store-level variables accounted for 29 percent of the between-stores variance in service performance…

Testing cross-level interactions. Hypothesis 3 posits that the store-level variables will moderate the relationship between personalities and individual employees’ service performance. A prerequisite for testing these cross-level interactions was that there be significant random variance for the personality variables in the intercepts-as-outcomes models estimated in the previous step. As reported in Table 2, in which estimates of the random-variance components appear in parentheses, only neuroticism had significant random variance (γ22=.18, p<.01), suggesting significant variability in the level 1 neuroticism-service performance relationship across stores. We then examined whether this variance could be explained by store-level factors; none of these variables was significantly related to the neuroticism slopes. Therefore, Hypothesis 3 was not supported.」
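
The variance partitioning behind the null model and the "percent of variance explained" figures in the quoted passage can be sketched as follows. Only τ00 = .35 comes from the quotation. The within-store variances are HYPOTHETICAL values, chosen so that the level-1 reduction works out to the 24 percent reported, because the excerpt does not give the actual estimates. The function names are likewise illustrative.

```python
def intraclass_correlation(tau00, sigma2):
    """Share of total variance in Y that lies between stores (null model)."""
    return tau00 / (tau00 + sigma2)

def proportion_explained(var_null, var_model):
    """Proportional reduction in a variance component after adding predictors."""
    return (var_null - var_model) / var_null

tau00_null = 0.35        # between-store variance reported in the quoted passage
sigma2_null = 1.00       # HYPOTHETICAL within-store variance of the null model
sigma2_level1 = 0.76     # HYPOTHETICAL within-store variance after adding the X's

icc = intraclass_correlation(tau00_null, sigma2_null)
r2_within = proportion_explained(sigma2_null, sigma2_level1)

print(f"ICC = {icc:.3f}")
print(f"R2 within-store = {r2_within:.2f}")
```

The same `proportion_explained` calculation applied to τ00 before and after adding the store-level predictors would give the between-stores R².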


HLM Summary

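The introduction above covers mainly the basic principles of HLM. Once these are understood, we can broadly follow research papers that use HLM as an analytical tool. They are not enough, however, for mastering the operational details; for those, consult 张雷、雷震、郭伯良 (2003) and Bryk & Raudenbush (1992). In addition, applications of HLM are still developing. For example, there is not yet a widely accepted method for validly testing cross-level mediators with HLM, and this awaits further exploration.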

Principles of Meta-Analysis

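In Chapter 3 we discussed an important characteristic of social science and the requirements it places on social science research. Let us first review that discussion: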

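「By contrast, the subjects of social science research are mainly people, or groups made up of people, and they have free will. Therefore, even when the external conditions are in place, we cannot be one hundred percent certain how every person or group will react. For example, we can say that under certain external conditions a person will flinch when some part of the body feels pain, but we can never be sure that under those conditions every person will flinch from pain, because a person can "choose" to endure the pain and not flinch. This fundamental difference gives social science two important characteristics.

First, social science theories are "probabilistic" rather than "absolute". Continuing the example of pain and flinching, a social science theory would read: "In general, under certain external conditions, when some part of a person's body feels pain, the probability that the person flinches is proportional to the pain endured." Although this theory describes the phenomenon of flinching from pain, it also makes clear that whether a person actually flinches is only a matter of probability. For example, at a given level of pain more than 95 percent of people may flinch, but we can never be certain that everyone will. Nor can we be sure that the same person, under the same conditions and the same level of pain, will flinch again; we can only say that if he flinched the first time, the probability that he flinches the second time is above 95 percent.

Second, to test a social science theory we need to test it repeatedly on different samples. For example, if a theory states that a certain management practice causes a certain employee behavior, then to test the theory we must study firms of different types and industries and employees of different regions, types, and ethnicities before we can know whether the theory's description is accurate and whether it has boundary conditions. Furthermore, beyond varying the samples, social science also needs to test the same theory with more diverse research methods, because if different research methods all lead to the same conclusion, that conclusion is more likely to be correct. With this understanding of the characteristics of social science, let us explore further the research methods of social science.」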

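Because of these characteristics, in OBHR research the relationship between two constructs (say X and Y) may have to be measured repeatedly in different samples. Once a substantial number of empirical studies on different samples have accumulated, we can analyze their results together in the hope of reaching a clearer conclusion that is closer to the truth. Meta-analysis is a quantitative method for integrating previous studies. It was developed in the late 1970s by a group of human resource management researchers (Hunter & Schmidt, 1982; Hunter, Schmidt & Jackson, 1982).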

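The problem these human resource management researchers faced at the time was that studies of whether general mental ability (X) predicts job performance (Y) produced very different results across samples. In some samples the correlation coefficient (r) was positive and statistically significant, in others it was not significant, and in a small number of samples the correlation between X and Y was even negative. Many people therefore began to doubt whether X and Y were really related, or whether the relationship was subject to many moderators that made the correlations between X and Y look so different from sample to sample. Besides puzzling researchers, these results led some institutions that funded research on predictors of job performance, such as the United States government, to question whether this type of research was worth continuing to support.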

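The researchers who developed meta-analysis argued that it is perfectly normal for correlation coefficients computed from different samples to differ in value. When different samples are drawn from the same population and the correlation coefficient (r) is computed in each, these coefficients follow a distribution, the sampling distribution of the correlation coefficient. The sample size (n) of each draw is related to the variability of this sampling distribution (that is, the square of the standard error, Se²): the larger the sample size, the smaller the variability of the correlation coefficients across samples. The equation for Se² and the figure below show this: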

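$$S_e^2 = \frac{(1 - \rho^2)^2}{n - 1}$$

[Figure: sampling distributions of the sample correlation r for n = 30 and n = 100 when ρ = 0.38]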

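In the figure above, when the true population correlation (ρ) is 0.38 and the sample size (n) is only 30, the sample correlations (r) are more widely spread, and therefore more variable, than when n is 100; the equation for the variance also shows this relationship between n and Se². Therefore, if the variance of the correlations we observe across samples is close to the variance that sampling alone would normally produce (which we call sampling error), we cannot claim that the samples come from different populations with different values of the correlation between X and Y.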
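
As a quick numerical check of the relationship between n and Se², the sketch below evaluates the sampling error variance of r, (1 − ρ²)² / (n − 1), for the two cases discussed in the text (ρ = 0.38 with n = 30 and n = 100). The function name is an illustrative choice.

```python
from math import sqrt

def sampling_error_variance(rho, n):
    """Sampling error variance of r: (1 - rho^2)^2 / (n - 1)."""
    return (1 - rho ** 2) ** 2 / (n - 1)

var_n30 = sampling_error_variance(0.38, 30)
var_n100 = sampling_error_variance(0.38, 100)

print(f"n = 30:  Se^2 = {var_n30:.5f}, Se = {sqrt(var_n30):.3f}")
print(f"n = 100: Se^2 = {var_n100:.5f}, Se = {sqrt(var_n100):.3f}")
```

With n = 30 the variance is about 0.025 (a standard error of roughly 0.16), while with n = 100 it falls to about 0.007 (a standard error of roughly 0.09). Sample correlations scattered well away from 0.38 are therefore unsurprising in small samples.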

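In this situation, the most accurate way to combine the correlations from all samples into an estimate of the population correlation is of course the average of the correlations weighted by sample size. With k samples (assumed in all the equations that follow), this is: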

$$\hat{\rho} = \bar{r} = \frac{\sum_{i=1}^{k} n_i r_i}{\sum_{i=1}^{k} n_i}$$

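where ni is the sample size of the i-th sample and ri is its correlation coefficient. Using this estimate of the population correlation (ρ), we can compute the variance due to sampling error contributed by each sample. For the i-th sample the equation is: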

$$S_{e_i}^2 = \frac{(1 - \hat{\rho}^2)^2}{n_i - 1}$$

If all samples come from the same population, the best estimate of the total variance due to sampling error (Se²) is the average of all Sei² weighted by sample size:

$$S_e^2 = \frac{\sum_{i=1}^{k} n_i S_{e_i}^2}{\sum_{i=1}^{k} n_i}$$

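This variance expected from sampling error (Se²) can be compared with the variance of the correlations actually observed across samples (Sr²). Because sample sizes differ, Sr² should also be weighted: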

$$S_r^2 = \frac{\sum_{i=1}^{k} n_i (r_i - \hat{\rho})^2}{\sum_{i=1}^{k} n_i}$$

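Comparing Se² with Sr², meta-analysis researchers hold that if Se² is 75% or more of Sr², this is already strong evidence that the samples actually come from the same population. Many other factors also contribute to Sr² (for example, the independent and/or dependent variables may be subject to different range restrictions in different samples, and the reliability of the measurement instruments may differ across samples). If the variance due to sampling error alone (Se²) already accounts for 75% or more of the variance across our samples (Sr²), we really have no grounds for attributing Sr² to the samples coming from different populations.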
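
The chain of calculations above (weighted mean correlation, expected sampling error variance, observed variance, and the 75% comparison) can be sketched in a few lines. The four studies are HYPOTHETICAL (sample size, correlation) pairs, and the function and key names are illustrative, not taken from any meta-analysis package.

```python
def bare_bones_meta(samples):
    """samples: list of (n_i, r_i) pairs. Returns the summary quantities."""
    total_n = sum(n for n, _ in samples)
    # Sample-size-weighted mean correlation (estimate of rho)
    rho_hat = sum(n * r for n, r in samples) / total_n
    # Weighted average of each sample's sampling error variance
    s_e2 = sum(n * (1 - rho_hat ** 2) ** 2 / (n - 1) for n, _ in samples) / total_n
    # Weighted variance of the observed correlations around rho_hat
    s_r2 = sum(n * (r - rho_hat) ** 2 for n, r in samples) / total_n
    # Share of observed variance that sampling error alone would produce
    ratio = s_e2 / s_r2 if s_r2 > 0 else float("inf")
    return {"rho_hat": rho_hat, "s_e2": s_e2, "s_r2": s_r2, "ratio": ratio}

# HYPOTHETICAL studies: (n_i, r_i)
studies = [(50, 0.20), (100, 0.30), (150, 0.35), (200, 0.25)]
result = bare_bones_meta(studies)

same_population = result["ratio"] >= 0.75   # the 75% rule described in the text
print(result, same_population)
```

In this made-up example the expected sampling error variance exceeds the observed variance, so by the 75% rule there is no reason to look for moderators. A ratio well below 0.75 would point the other way.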


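In addition, from Se² and the estimate of ρ we can compute a confidence interval for the population correlation (ρ). If this confidence interval does not include zero, we can conclude that, taking the evidence of all samples together, X and Y are correlated. Similarly, from Sr² and the distribution of r across samples we can estimate the range within which sample correlations are likely to be observed when a study is conducted (the credibility interval).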
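
One possible way to compute the two intervals is sketched below. It rests on assumptions the text does not spell out: the k samples are treated as coming from one population, so the standard error of the mean correlation is taken as the square root of Se² / k; both intervals use normal-curve multipliers; and the credibility interval is built from Sr² as the text describes. Many Hunter and Schmidt style procedures instead first subtract Se² from Sr². All input values are HYPOTHETICAL.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z(level):
    """z multiplier for a central interval covering `level` of a normal curve."""
    return NormalDist().inv_cdf(0.5 + level / 2)

def confidence_interval(rho_hat, s_e2, k, level=0.90):
    # ASSUMPTION: one common population, so SE of the mean is sqrt(Se^2 / k)
    half_width = two_sided_z(level) * sqrt(s_e2 / k)
    return rho_hat - half_width, rho_hat + half_width

def credibility_interval(rho_hat, s_r2, level=0.80):
    # Follows the text: spread of correlations across samples taken from Sr^2
    half_width = two_sided_z(level) * sqrt(s_r2)
    return rho_hat - half_width, rho_hat + half_width

# HYPOTHETICAL summary values from a meta-analysis of k = 4 samples
rho_hat, s_e2, s_r2, k = 0.285, 0.0068, 0.0025, 4

ci = confidence_interval(rho_hat, s_e2, k)
cr = credibility_interval(rho_hat, s_r2)
print(f"90% confidence interval:  ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"80% credibility interval: ({cr[0]:.3f}, {cr[1]:.3f})")
```

The confidence interval describes the uncertainty of the mean correlation, while the credibility interval describes how widely individual correlations are expected to vary. The two can therefore lead to different conclusions about whether zero is excluded.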

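If we really believe that the samples of past studies come from different populations with different population correlations (ρ), we can place the samples thought to come from the same population in the same group and carry out the meta-analysis described above separately for each group. If the confidence intervals computed for the groups do not overlap, this supports the view that the samples come from different populations, and the criterion we used to group the samples is a moderator of the relationship between X and Y.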

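Finally, meta-analysis researchers recommend that before the analysis we first use the equation introduced in Chapter 5 to compute the true correlation between the two constructs (Rt) from the reliabilities of X and Y (r1 and r2) and the observed correlation (Ro). As you may recall, the equation is:

$$R_t = \frac{R_o}{\sqrt{r_1 \times r_2}}$$

The meta-analysis is then carried out on Rt rather than Ro. Meta-analysis researchers consider this the correct approach because what we ultimately want to estimate is the true relationship between the two constructs, not the observed correlation as distorted by measurement error. This recommendation has been widely accepted and has become a standard step in meta-analysis.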
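
The correction for measurement error can be sketched as a one-line function: the observed correlation divided by the square root of the product of the two reliabilities. The function name and the numerical values are HYPOTHETICAL and serve only to illustrate the arithmetic.

```python
from math import sqrt

def correct_for_attenuation(r_observed, rel_x, rel_y):
    """Estimate the true correlation Rt from the observed correlation Ro
    and the reliabilities r1 (of X) and r2 (of Y): Rt = Ro / sqrt(r1 * r2)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / sqrt(rel_x * rel_y)

# HYPOTHETICAL example: observed r = .30, reliabilities .80 and .68
r_true = correct_for_attenuation(0.30, 0.80, 0.68)
print(f"corrected correlation = {r_true:.3f}")
```

Because reliabilities are at most 1, the corrected correlation is never smaller in absolute value than the observed one. With perfectly reliable measures (both reliabilities equal to 1) the two coincide.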

An Example of Meta-Analysis

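As an example, let us look at how Judge, Heller and Mount (2002) collected their data and reported their meta-analytic results when integrating past studies of the relationship between personality (the Big Five) and job satisfaction: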


「To identify all possible studies of the relationship between the Big Five traits and job satisfaction, we searched the PsycINFO database (1887-2000) for studies (articles, book chapters, dissertations, and unpublished reports) that referenced personality and job satisfaction. In addition to searching for keywords such as personality, Big Five, Agreeableness, Conscientiousness, Extraversion, Openness to Experience, and Neuroticism, we searched for a list of additional traits and measures that were included in Barrick and Mount’s (1991) review. These efforts resulted in the identification of 1,277 abstracts (including doctoral dissertations). Of these 1,277 abstracts, 737 were obtained by searching for the keywords “personality and job satisfaction.” An additional 540 records were obtained by using names of personality inventories, common specific traits, and the Big Five traits in combination with job satisfaction. In reviewing these abstracts, we eliminated most because (a) they did not appear to measure any discernible personality trait, (b) they assessed a trait that was not classifiable in terms of the five-factor model, or (c) it was clear that they did not report data (e.g., as was the case with most book chapters). We also used several raw data sets that were available to the authors.

For the remaining 430 journal articles and doctoral dissertations, we examined each study to determine whether it contained the necessary information. Eighty-two articles and 53 doctoral dissertations met these criteria. Several studies contained multiple independent samples. Thus, in all, 163 independent samples and 334 correlations were included in analyses.」

「We used the meta-analytic procedures of Hunter and Schmidt (1990) to correct observed correlations for sampling error and unreliability in measures of personality and job satisfaction. Correlations were corrected individually. When authors of original studies reported an overall internal consistency reliability for personality or job satisfaction, we used this value to correct observed correlation for attenuation. When reliabilities for personality or job satisfaction were not reported, we used the mean reliability for job satisfaction or the relevant Big Five trait for those studies that did report a reliability estimate. Finally, three original studies used single-item measures of job satisfaction; consequently, no internal consistency reliabilities were reported. In these cases, we used meta-analytically derived estimates of the reliability of single-item measures of job satisfaction (Wanous, Reichers, & Hudy, 1997). Hence, we assumed a reliability of .68 for single-item satisfaction scales.

In addition to reporting estimates of the mean true score correlations, it is also important in meta-analysis to describe variability in the correlations. Accordingly, we report 80% credibility intervals and 90% confidence intervals around the estimated population correlations. Although some meta-analyses report only confidence intervals (e.g., Ernst Kossek & Ozeki, 1998) whereas others report only credibility intervals (e.g., Vinchur, Schippmann, Switzer, & Roth, 1998), it is important to report both because each tells us different things about the nature of the correlations. Confidence intervals provide an estimate of the variability around the estimated mean correlation; a 90% confidence interval excluding zero indicates that we can be 95% confident that the average true correlation is nonzero (5% of average correlations would lie beyond the upper limit of the distribution). Credibility intervals provide an estimate of the variability of individual correlations across studies; an 80% credibility interval excluding zero indicates that 90% of the individual correlations in the meta-analysis excluded zero (for positive correlations, 10% are zero or less and 10% lie at or beyond the upper bound of the interval).」

「The moderators were determined by examining the articles and coding the necessary information. For most of the moderators, this information was easily obtained (e.g., longitudinal vs. cross-sectional design). In terms of measures, most articles reported the measure of personality and job satisfaction… Measures of job satisfaction were classified into the following categories: the Brayfield and Rothe (1951) measure (17%), the Hoppock (1935) Job Satisfaction Blank (8%), the Job Descriptive Index (P.C. Smith et al., 1969) (13%), Minnesota Satisfaction Questionnaire (Weiss, Dawis, England, & Lofquist, 1967) (17%), other validated measures (21%), and ad hoc (previously unvalidated) measures (24%).」

「Results of the meta-analyses relating the Big Five traits to job satisfaction are provided in Table 1. Neuroticism (ρ=-.29) was the strongest correlate of job satisfaction, followed closely by Conscientiousness (ρ=.26) and Extraversion (ρ=.25). Both the confidence intervals and credibility intervals excluded zero for two traits: Neuroticism and Extraversion. For two other traits, Conscientiousness and Agreeableness, the confidence intervals excluded zero, indicating that we can be confident that these average correlations are distinguishable from zero. However, the 80% credibility interval included zero for these traits, suggesting that the relationship of Conscientiousness and Agreeableness with job satisfaction does not fully generalize across studies (e.g., in about 10% of studies, the relationship between Conscientiousness and job satisfaction was zero or negative). Finally, Openness to Experience showed a weak correlation with job satisfaction (ρ=.02) that was indistinguishable from zero.

… The moderator results by job satisfaction measure are provided in the Appendix. As is shown in the Appendix, across the traits, personality–job satisfaction correlations tended to be higher for several measures, most notably the Brayfield and Rothe (1951) measure. It also is noteworthy that the personality–job satisfaction correlations generally were not lower for ad hoc, or previously unvalidated, measures of job satisfaction.」

Meta-Analysis Summary

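We have introduced the main principles of meta-analysis. Although the correlation coefficient is the most widely analyzed statistic, meta-analysis researchers have also developed ways of integrating other statistics, on the same principles as the analysis of correlations. Some scholars have also proposed criteria for deciding whether a search for possible moderators is needed. We do not cover these in detail here.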


References

Bryk, A.S., & Raudenbush, S.W. (1992). Hierarchical linear models. Newbury Park, CA: Sage.

Hunter, J.E., Schmidt, F.L., & Jackson, G.B. (1982). Meta-analysis: Cumulating research findings across studies. Beverly Hills, CA: Sage.

Hunter, J.E., & Schmidt, F.L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage.

Hedges, L.V., & Olkin, I. (1985). Statistical methods for meta-analysis. New York: Academic Press.

Judge, T., Heller, D., & Mount, M. (2002). Five-factor model of personality and job satisfaction: A meta-analysis. Journal of Applied Psychology, 87(3), 530-541.

Liao, H., & Chuang, A. (2004). A multilevel investigation of factors influencing employee service performance and customer outcomes. Academy of Management Journal, 47(1), 41-58.

张雷、雷震、郭伯良 (2003). 多层线性模型应用 (Applied Multilevel Data Analysis). 北京: 教育科学出版社.
