A Meta-Analysis of
the Effectiveness of Bilingual Education
by Jay P. Greene
Assistant Professor of Government
University of Texas at Austin
March 2, 1998
Sponsored by
The Tomas Rivera Policy Institute
The Public Policy Clinic of the Department of Government, University of
Texas at Austin
The Program on Education Policy and Governance at Harvard University
Introduction
The voters in California are being asked to consider an initiative this
June that would ban the use of foreign languages in the instruction of
younger children with limited English proficiency. Both advocates and opponents
of this initiative claim that scholarly research supports their case, but
their reading of the literature is often selective, exaggerated, and distorted.
With the sponsorship of the Tomas Rivera Policy Institute, the Public Policy
Clinic of the University of Texas' Government Department, and Harvard University's
Program on Education Policy and Governance, I have conducted a systematic,
statistical review of the literature on the effectiveness of bilingual
education. Using this technique, known as meta-analysis, to summarize the
scholarly research, I find that children with limited English proficiency
who are taught using at least some of their native language perform significantly
better on standardized tests than similar children who are taught only
in English. In other words, an unbiased reading of the scholarly research
suggests that bilingual education helps children who are learning English.
Estimated Benefit of Bilingual Education
This conclusion is based on the statistical combination of eleven studies
that meet minimal standards for the quality of their research design from
a total of seventy-five studies reviewed. These eleven studies include
standardized test score results from 2,719 students, 1,562 of whom were
enrolled in bilingual programs, in thirteen different states. The estimated
benefit of using at least some native language in instruction on all scores
measured in English is .18 of a standard deviation on standardized tests.
The average student in these bilingual programs was tested in third grade
after two years of bilingual instruction. Bilingual programs produce .21
of a standard deviation improvement on reading tests and .12 of a standard
deviation improvement on math tests measured in English. The gain in all
test scores measured in Spanish is .74 of a standard deviation. All of
these gains, except for math, are statistically significant, meaning that
they are unlikely to have been produced by chance. (See Table 1 for summary
of results.)
Interpreting Standard Deviations
To put the size of this benefit in perspective, the gap between the
scores of minority and white students on standardized tests nationwide
is about 1 standard deviation. The estimated benefits of bilingual education
are also comparable to the improvements produced by the school choice program
in Milwaukee that I have studied, where students gained between 1/3 and
½ of a standard deviation after four years of participation (Greene
et al., 1997). Education researchers generally consider a gain of .1 standard
deviation as slight, .2 or .3 of a standard deviation as moderate, and
.5 of a standard deviation as large (Hanushek, 1996; Hedges and Greenwald,
1996).
In more concrete terms, we can imagine two identical students with limited
English proficiency who enter first grade scoring at the 30th percentile
on the reading component of the Iowa Test of Basic Skills (ITBS), meaning
that 70% of those who take the same test in first grade perform better
than they do. After two years in which one student was in a bilingual program
and the other student was in an English-only program, the bilingual student
would be performing about 1/5 of a standard deviation better than the English-only
student on the ITBS reading test. If the English-only student scored at
the 26th percentile at the end of those two years, we would expect the
bilingual student to score at the 34th percentile. (See Figure 1) The English-only
student would be five months behind grade level, while the student in the
bilingual program would be only two months behind grade level. According
to this hypothetical, students in bilingual programs receive the equivalent
of roughly three additional months of learning over a two-year period compared
to similar students in English-only programs.
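The percentile arithmetic in this hypothetical can be sketched in a few lines, assuming test scores are approximately normally distributed (an assumption made for illustration; actual ITBS norms need not be exactly normal). The function name `shifted_percentile` is hypothetical, and the gain of .21 is the reading estimate reported above.

```python
from statistics import NormalDist


def shifted_percentile(start_percentile, gain_in_sd):
    """Percentile reached after adding a gain (in standard deviations).

    Assumes a normal score distribution, which is an illustrative
    simplification rather than a property of any particular test.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_start = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(z_start + gain_in_sd)


# English-only student at the 26th percentile; bilingual student is
# about .21 of a standard deviation higher (reading estimate in the text).
bilingual_percentile = shifted_percentile(26, 0.21)
print(round(bilingual_percentile))
```

Under the normal approximation the bilingual student lands at roughly the 33rd percentile, within a point of the 34th percentile used in the hypothetical above.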
Estimated Benefits of Bilingual Education from Random Assignment
Studies
Random assignment to treatment and control groups, as in medical experiments,
is the highest quality research design because it increases the confidence
in the conclusion that any differences between the groups after a period
of treatment can be attributed to that treatment. The results from the
five studies in which subjects were randomly assigned to bilingual and
control programs favor bilingual education even more strongly. The estimated
benefit of bilingual programs on all test scores in English according to
these studies with random assignment is .26 of a standard deviation. The
positive effect on reading scores is .41 of a standard deviation among
the studies with random assignment. And the improvement in scores measured
in Spanish is .92 of a standard deviation in the studies with random assignment
to treatment and control groups. All of these estimated benefits of bilingual
education from studies with random assignment are extremely unlikely to
have been produced by chance (the odds are fewer than 1 in 100). (See Table
2 for summary of results.) The fact that the studies of bilingual programs
with random assignment, the highest quality research design, have even
stronger results greatly increases the confidence in the conclusion that
bilingual education positively affects educational attainment.
A meta-analysis of the 11 studies that meet minimal standards for the
quality of their research design, as well as of the 5 highest quality studies
based on random assignment, shows positive, statistically significant benefits
for bilingual education. The results of this meta-analysis are similar
to those of the meta-analysis conducted by Ann Willig in 1985 based on the Baker
and de Kanter review of the literature in 1981. While few acceptable-quality
studies have been conducted in the intervening years, the conclusions that
Willig drew from the literature are still true today: the evidence that
is available suggests that native language instruction has a significant,
positive impact on children learning English.
Method for Selecting Studies and Computing Results
The eleven studies included in this meta-analysis are drawn from a list
of 75 "methodologically acceptable" studies compiled by Christine
Rossell and Keith Baker, two vocal critics of bilingual education, in a
1996 literature review (Rossell and Baker 1996). The Rossell and Baker
list is used as the pool of studies examined for this meta-analysis for
a few reasons despite the potential for bias in their selections. First,
Rossell and Baker claim to have selected their methodologically acceptable
studies based on criteria that I believe are reasonable. To be acceptable
the studies had to meet four requirements:
1) students in a bilingual program had to be compared to a control group of similar
students,
2) differences between the treatment and control groups had to be controlled
statistically or assignment to treatment and control groups had to be random,
3) results had to be based on standardized test scores in English, and
4) differences between the scores of treatment and control groups had
to be determined by applying appropriate statistical tests.
In addition to these requirements this meta-analysis only included studies
that measured the effects of bilingual programs after at least one academic
year. Bilingual programs were defined as ones in which students with limited
English proficiency are taught using at least some of their native language.
An appropriate control group was one in which students were taught only
in English. If students were not assigned to treatment and control groups
randomly, adequate statistical controls for this non-random assignment
were defined as requiring controls for individual previous test scores as
well as at least some of the individual demographic factors that influence
test scores (e.g., family income and parental education). Rossell and
Baker identify 72 studies that they say meet these standards (although
there are actually 75 citations listed under their heading for acceptable
studies in a mimeo provided by Rossell).
Second, critics of Rossell and Baker's literature review have not offered
additional studies that meet the above criteria. Stephen Krashen, a vocal
proponent of bilingual education, for example, has instead suggested that
the standards are too strict and has proposed that Rossell and Baker include
additional studies favorable to bilingual education even though they do
not meet the criteria (Krashen 1996). Here I have to agree with Rossell
and Baker that their standards are reasonable and reject considering Krashen's
additional studies. The inability of others to advance the names of more
studies that meet Rossell and Baker's criteria lends credence to the assumption
that their list is a comprehensive pool from which to select acceptable
studies for a meta-analysis.
Unfortunately, only 11 of the 75 studies identified as acceptable by
Rossell and Baker actually meet their own criteria for an acceptable study.
Fifteen of the studies duplicate the evaluations found in some of the remaining
60 studies. That is, 15 of the 75 are separately released reports of the
same programs by the same authors that are already included in Rossell
and Baker's list. Where appropriate I combine results so that each remaining
observation represents an independent evaluation of a program. Despite
our best efforts, an additional 5 studies in Rossell and Baker's list could
not be found. While Christine Rossell was very helpful in locating some
of the more difficult to find studies, she did not have these 5 nor were
they available from the library at the University of Texas, which has one
of the world's largest collections. (See the annotated bibliography for
a list of studies and the reasons for their exclusion or inclusion in the
meta-analysis).
Of the remaining 55 studies, 3 are excluded because they are not evaluations
of bilingual programs. One is about "direct instruction" (Becker
1982) and makes no mention of foreign language learning. Another is a list
of exemplary bilingual programs (Campeau 1975), not an evaluation of programs.
And yet another is primarily about the effects of retention (being held
back a grade) (Webb 1987).
An additional 14 studies are excluded because they do not have adequate
control groups. In most of these studies both the treatment and control
groups receive bilingual instruction, meaning that all students are taught
in both their native language and in the target language in varying amounts.
I only include in the meta-analysis studies that compare bilingual instruction
(meaning the use of at least some native language in instruction) to "English-only"
instruction. There are several reasons for this choice. First, comparing
the use of some native language to English-only instruction is the clearest
division possible in the literature. Program labels, such as transitional
bilingual education, English as a second language, immersion, submersion,
and maintenance bilingual education, have no consistent meaning in the
evaluations, nor are the detailed features of many programs fully described.
The only division of programs that can accurately and consistently be applied
is whether native languages are used in instruction or not.
Second, the most policy-relevant question and the issue raised by the
initiative in California is whether it is desirable to ban the use of native
language instruction in the education of younger students with limited
English proficiency. The question is not whether it is better to use a
modest amount of native language versus a large amount, nor is the issue
whether it is better to have children in bilingual programs for a short
versus long time. Thus only studies that speak to the policy-relevant issue
of comparing bilingual to English-only instruction are included in this
meta-analysis. In addition, it is not possible to extrapolate results from
studies that compare different amounts or lengths of bilingual instruction
to whether bilingual instruction is desirable at all. Similarly, if one
wanted to know whether acetaminophen was effective in treating headaches,
it would be incorrect to infer an answer from a study that gave different
doses to a treatment and control group. Giving 500 mg of acetaminophen
to one group may cure their headaches and giving 10,000 mg to another group
may kill them. It would then be wrong to extrapolate from these results
to the claim that acetaminophen is harmful in any dose. The only way to
evaluate whether the use of any native language instruction is harmful
or helpful is to compare students who receive any bilingual instruction
to those who are taught only in English.
Of the remaining 38 studies, 2 are excluded because they measure the
effects of bilingual programs after an unreasonably short period of time.
One study evaluates a program after 7 weeks of bilingual instruction for
35 minutes a day (Barclay 1969). The other evaluates a program after 10
weeks (Layden 1972). Every study included in the meta-analysis measures
effects after at least one academic year (about 40 weeks). While the requirement
that studies evaluate the effects of bilingual programs after at least
one academic year was not one of Rossell and Baker's original criteria
for identifying acceptable studies, this is a reasonable standard to add.
To make the analogy to headache cures again, measuring bilingual programs
after 7 or 10 weeks is like measuring the effects of aspirin after 1 minute.
No valuable information can be gained from evaluating such a short period
of treatment.
An additional 25 studies are excluded because they inadequately control
for the differences between students assigned to bilingual programs and
students assigned to English-only control groups. If students are randomly
assigned to these two groups, then no controls are necessary and one can
place high confidence in the results. But when students are not randomly
assigned, it is necessary to control statistically for the differences
between the groups that may affect their future performance. Three of these
25 studies make no effort to control for the differences between bilingual
and English-only students (Curiel 1979, Valladolid 1991, and Yap 1988).
Some studies do not control for individual characteristics of students
but instead match aggregate characteristics of students in a program with
aggregate characteristics of a control group (see Stebbins 1977 for example).
Without controlling for individual level factors these studies suffer from
the "ecological fallacy" where the uncontrolled individual factors
that contribute to improved performance seriously bias the aggregate results.
Most of the rest of these 25 studies excluded for inadequate background controls,
however, control only for earlier test scores or IQ test scores when
estimating the effects of bilingual instruction.
For these to be adequate controls for the differences between the groups,
one would have to assume that the rate of test score gains, absent any
treatment, would be the same for students with different initial test scores
or different IQs. Yet considerable evaluation research (see Campbell and
Erlebacher 1970) has shown that students who begin with different test
scores often have different rates of growth in their test scores. For example,
a student with low initial scores may have those scores in part because
she is poor and does not have parents involved in her education. Those
same factors that contributed to the low initial score may continue to
reduce her educational progress in the future. Unless one controls for
the differences in initial scores, as well as some of the important factors
that produce those different scores, evaluations of educational progress
are likely to be significantly biased.
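The screening steps described above can be tallied to confirm that the exclusion counts are consistent with one another. This is only a bookkeeping sketch: the counts come from the text, while the names `POOL`, `EXCLUSIONS`, and `remaining_after_each` are hypothetical labels introduced for illustration.

```python
# Size of the Rossell and Baker list used as the starting pool.
POOL = 75

# (reason for exclusion, number of studies), in the order given in the text.
EXCLUSIONS = [
    ("redundant with another report on the list", 15),
    ("could not be found", 5),
    ("not an evaluation of a bilingual program", 3),
    ("no English-only control group", 14),
    ("effects measured after too short a period", 2),
    ("inadequate controls for group differences", 25),
]


def remaining_after_each(pool, exclusions):
    """Return the number of studies left after each exclusion step."""
    remaining = []
    for _reason, count in exclusions:
        pool -= count
        remaining.append(pool)
    return remaining


steps = remaining_after_each(POOL, EXCLUSIONS)
for (reason, count), left in zip(EXCLUSIONS, steps):
    print(f"excluded {count:2d} ({reason}); {left} remain")
```

The running totals pass through the 60, 55, and 38 studies mentioned in the text and end at the 11 studies included in the meta-analysis.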
The remaining 11 studies that are included in the meta-analysis consist
of 5 studies in which students are randomly assigned and 6 in which there
is non-random assignment but some effort to control for the individual
background characteristics as well as test scores that separate those in
bilingual and English-only programs. A single, average effect size was
calculated for each study for each subject area and for all tests in English
and Spanish. The effect sizes were standardized and adjusted for their
sample size into corrected units of standard deviations known as Hedges'
g. The mean of the 11 Hedges' g values was then computed as the reported estimated
effect of bilingual programs. A single, average z-score (a statistical
measure of confidence in the estimated effect) was also calculated for
each study for each subject area and for all tests in English and Spanish.
The z-scores were combined by adding them and then dividing by the square
root of the number of studies to compute a combined z-score. P values can
then be calculated from the combined z-score. These techniques are described
at greater length in Rosenthal 1991 and Cook, et al. 1992.
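The computation described in this section can be sketched as follows, using the effect sizes and z-scores for all tests in English listed in Table 3. Several pieces are assumptions rather than the report's documented procedure: `hedges_g` uses the common textbook small-sample correction (the report does not spell out its exact formula), `two_tailed_p` assumes a two-tailed test under a standard normal distribution, and all function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with a common small-sample correction.

    This is the textbook approximation, assumed here for illustration.
    """
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)
    return correction * (mean_t - mean_c) / sqrt(pooled_var)


def combined_z(z_scores):
    """Add the z-scores and divide by the square root of the number of studies."""
    return sum(z_scores) / sqrt(len(z_scores))


def two_tailed_p(z):
    """Two-tailed p value for a z-score (two-tailed testing is an assumption)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# (study, effect size, z-score, random assignment) for all tests in English,
# copied from Table 3.
ENGLISH_RESULTS = [
    ("Bacon 1982", 0.79, 2.39, False),
    ("Covey 1973", 0.34, 2.94, True),
    ("Danoff 1977", -0.03, -0.39, False),
    ("Huzar 1973", 0.18, 0.83, True),
    ("Kaufman 1968", 0.20, 0.72, True),
    ("Plante 1976", 0.52, 1.34, True),
    ("Powers 1978", 0.001, 0.01, False),
    ("Ramirez 1991", 0.01, 0.08, False),
    ("Rossell 1990", -0.01, 0.03, False),
    ("Rothfarb 1987", 0.05, 0.24, True),
    ("Skoczylas 1972", -0.05, -0.18, False),
]


def summarize(results):
    """Return (mean effect size, combined z, two-tailed p) for a set of studies."""
    z = combined_z([row[2] for row in results])
    return mean(row[1] for row in results), z, two_tailed_p(z)


all_g, all_z, all_p = summarize(ENGLISH_RESULTS)
rand_g, rand_z, rand_p = summarize([row for row in ENGLISH_RESULTS if row[3]])
print(f"all 11 studies:    g = {all_g:.3f}, z = {all_z:.3f}, p = {all_p:.4f}")
print(f"random assignment: g = {rand_g:.3f}, z = {rand_z:.3f}, p = {rand_p:.4f}")
```

Up to rounding, the two summaries should land close to the values for all tests in English in Tables 1 and 2 (.18 and 2.41 for all 11 studies; .26 and 2.71 for the 5 studies with random assignment). The `hedges_g` function is included only to show how a single study's effect size might be computed from group means, standard deviations, and sample sizes; it is not applied here because the report lists the resulting effect sizes rather than the underlying summary statistics.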
Differences between These Results and Rossell and Baker's Results
It is important to note that the positive estimated effects of bilingual
education in this meta-analysis are not simply a product of the selection
of these 11 acceptable studies. Of the 38 studies that evaluate bilingual
versus English-only programs in Rossell and Baker's list, 21 have an average
positive estimated effect and 17 have an average negative estimated effect.
Simply counting positive and negative findings, however, is less precise
than a meta-analysis because it does not consider the magnitude or confidence
level of effects. In addition, once we include unacceptable studies from
Rossell and Baker's list we would also have to consider the methodologically
unacceptable studies advanced by Krashen and other supporters of bilingual
education. Nevertheless, even when the studies from Rossell and Baker's list with
inadequate background controls or short measurement periods are included,
we still find that the scholarly literature favors the use of native language
in instruction.
Rossell and Baker report a different number of positive and negative
studies for a few reasons. First, they include in their report studies
that are redundant with other studies, are not available, are not evaluations of
bilingual programs, or do not have English-only control groups. Second,
they do not apply any consistent rule for classifying studies as positive
or negative. For example, Ramirez 1991 is classified as showing "no
difference" despite having significant, positive effects for bilingual
instruction in reading. Similarly, Educational Operations Concepts 1991 is
classified as showing that bilingual education has a negative effect on
reading scores despite having no statistically significant effects (and
the average effect is actually positive, not negative). One of the advantages
of meta-analysis is that it forces one to be consistent in summarizing
other research. Third, there are some studies in their categories of positive
and negative studies that are not found in their list of acceptable studies
(such as Olesini 1971 and Elizondo 1972). It is clear that Rossell and
Baker's review of studies is useful as a pool for a meta-analysis, but
the lack of rigor and consistency in how they classify studies and summarize
results prevents their conclusions from being reliable.
Conclusion
While it would be desirable to have a meta-analysis based on a greater
number of studies, the unfortunate reality is that the vast majority of
evaluations of bilingual programs are so methodologically flawed in their
design that their results offer more noise than signal. Adding seriously
flawed studies would bias the results of this meta-analysis in ways that
are nearly impossible to predict or correct. In addition, including studies
that do not meet minimal criteria would require identifying the entire
universe of inadequate studies and including all or a random sample of
those studies in a meta-analysis. The incredible amount of effort that
would require is not justified given the low amount of information that
could be gained. Focusing on studies that meet certain "bright-line"
criteria, such as all studies that control for individual background characteristics
as well as pretest scores or on the smaller group of studies based on random
assignment, provides an unbiased sample of studies that can offer useful
information on the effects of bilingual education. Despite the relatively
small number of studies, the strength and consistency of these results,
especially from the highest quality randomized experiments, increase confidence
in the conclusion that bilingual programs are effective at increasing standardized
test scores measured in English.
The limited number of useful studies, however, makes it difficult to
address other important issues, such as the ideal length of time students
should be in bilingual programs, the ideal amount of native language that
should be used in instruction, and the age groups in which these techniques
are most appropriate. It is possible that the individual needs of students
are so varied that there may be no simple set of ideal policies. But if
we want to learn more about how to develop public policy that is most effective
at addressing the needs of students with limited English proficiency, we
need to conduct a series of experiments in which students are randomly
assigned to different types of programs. These randomized experiments yield
the clearest and most precise information to help guide policymaking. The
results from the 5 randomized experiments examined here clearly suggest
that native language instruction is useful. We need additional randomized
experiments to determine how best to design those bilingual programs.
Table 1: Results from the Meta-Analysis of the Effects of Bilingual Education
| | All tests in English | Reading (in English) | Math (in English) | All tests in Spanish |
| Benefit of Bilingual Programs in Standard Deviations (Hedges' g) | .18 | .21 | .12 | .74 |
| z-score | 2.41 | 2.46 | 1.65 | 3.53 |
| p-value < | .05 | .05 | .10 | .01 |
Table 2: Results from the Meta-Analysis of the Effects of Bilingual Education for Studies with Random Assignment to Bilingual and Control Programs
| | All tests in English | Reading (in English) | Math (in English) | All tests in Spanish |
| Benefit of Bilingual Programs in Standard Deviations (Hedges' g) | .26 | .41 | .15 | .92 |
| z-score | 2.71 | 3.47 | 1.25 | 5.21 |
| p-value < | .01 | .01 | .21 | .01 |
Table 3: Summary of Results from Studies Included in Meta-Analysis
| Study | English ES | English Z | Reading ES | Reading Z | Spanish ES | Spanish Z | Treatment N | Control N | Random Assignment (Yes/No) |
| Bacon, 1982 | .79 | 2.39 | .68 | 2.07 | NA | NA | 18 | 18 | No |
| Covey, 1973 | .34 | 2.94 | .74 | 4.87 | NA | NA | 86 | 89 | Yes |
| Danoff, 1977 | -.03 | -.39 | -.12 | -1.50 | NA | NA | 955 | 523 | No |
| Huzar, 1973 | .18 | .83 | .18 | .83 | NA | NA | 43 | 43 | Yes |
| Kaufman, 1968 | .20 | .72 | .20 | .72 | 1.65 | 6.05 | 43 | 31 | Yes |
| Plante, 1976 | .52 | 1.34 | .52 | 1.34 | 1.09 | 2.89 | 16 | 12 | Yes |
| Powers, 1978 | .001 | .01 | -.33 | -1.53 | NA | NA | 44 | 43 | No |
| Ramirez, 1991 | .01 | .08 | .12 | .73 | NA | NA | 88 | 160 | No |
| Rossell, 1990 | -.01 | .03 | -.05 | -.20 | NA | NA | 174 | 173 | No |
| Rothfarb, 1987 | .05 | .24 | NA | NA | .01 | .09 | 70 | 49 | Yes |
| Skoczylas, 1972 | -.05 | -.18 | .13 | .46 | .20 | .68 | 25 | 25 | No |
ES = Average effect size measured in standard deviations (Hedges'
g)
N = Largest number of subjects in any analysis in the study. For
Huzar, 1973 and Rossell, 1990 the number of subjects in the treatment and
control groups had to be estimated by halving the total reported sample.
Annotated Bibliography
Methodologically Acceptable Studies Included in the Meta-Analysis
Bacon, Herbert L., Kidd, Gerald D., et al. 1982. "The Effectiveness
of Bilingual Instruction with Cherokee Indian Students." Journal
of American Indian Education. February. pp. 34-43.
Covey, D. D. 1973. "An Analytical Study of Secondary Freshmen Bilingual
Education and its Effects on Academic Achievement and Attitudes of Mexican
American Students." Ph.D. dissertation. Arizona State University.
Random assignment.
Danoff, Malcom N., Arias, B.M., Coles, Gary J., and Others. 1977a. Evaluation
of the Impact of ESEA Title VII Spanish/English Bilingual Education Program.
American Institutes for Research. Palo Alto.
Huzar, Helen. 1973. "The Effects of an English-Spanish Primary Grade
Reading Program on Second and Third Grade Students." M.Ed. thesis.
Rutgers University.
Random assignment.
Kaufman, Maurice. 1968. "Will Instruction in Reading Spanish Affect
Ability in Reading English?" Journal of Reading. Vol. 11. pp.
521-527.
Random assignment.
Plante, Alexander J. 1976. A Study of Effectiveness of the Connecticut
"Pairing" Model of Bilingual/Bicultural Education. Connecticut
Staff Development Cooperative. Hamden.
Random assignment.
Powers, Stephen. 1978. "The Influence of Bilingual Instruction on
Academic Achievement and Self-Esteem of Selected Mexican American Junior
High School Students." Ph.D. dissertation. University of Arizona.
Ramirez, J. David, Pasta, David J., Yuen, Sandra, Billings, David K., and
Ramey, Dena R. 1991. Final Report: Longitudinal Study of Structural
Immersion Strategy, Early-Exit, and Late-Exit Transitional Bilingual Education
Programs for Language-Minority Children. Aguirre International (Report
to the U.S. Department of Education). San Mateo.
Rossell, Christine H. 1990. "The Effectiveness of Educational Alternatives
for Limited-English-Proficient Children." In Imhoff, Gary. (ed.). Learning
in Two Languages. Transaction Publishers. New Brunswick.
Rothfarb, Sylvia H., Ariza, Maria J. and Urrutia, Rafael. 1987. Evaluation
of the Bilingual Curriculum Content (BCC) Project: A Three-Year Study,
Final Report. Office of Educational Accountability. Dade County.
Random assignment.
Skoczylas, Rudolph V. 1972. "An Evaluation of Some Cognitive and Affective
Aspects of a Spanish Bilingual Education Program." Ph.D. dissertation.
University of New Mexico.
Studies Excluded Because They are Redundant
Ariza, Maria. 1988. "Evaluating Limited English Proficient Students'
Achievement: Does Curriculum Content in the Home Language Make a Difference?"
Paper presented at the April meetings of the American Educational Research
Association. New Orleans.
Redundant with Rothfarb et al, 1987.
Barik, Henri, and Swain, Merrill. 1978. Evaluation of a Bilingual Education
Program in Canada: The Elgin Study Through Grade Six. Commission Interuniversitaire
Suisse de Linguistique Appliquee. Switzerland.
Redundant with Barik et al 1977.
Cohen, Andrew D., Fathman, Ann K., and Merino, Barbara. 1976. The
Redwood City Bilingual Education Report, 1971-1974: Spanish and English
Proficiency, Mathematics, and Language-Use Over Time. Ontario Institute
for Studies in Education. Toronto.
Redundant with Cohen 1975.
Curiel, Herman, Stenning, Walter, and Cooper-Stenning, Peggy. 1980. "Achieved
Reading Level, Self-Esteem, and Grades as Related to Length of Exposure to
Bilingual Education." Hispanic Journal of Behavioral Sciences.
Vol. 2. pp. 389-400.
Redundant with Curiel, 1979.
Danoff, Malcom N., Coles, Gary J., McLaughlin, Donald H., and Reynolds,
Dorothy J. 1977b. Evaluation of the Impact of ESEA Title VII Spanish/English
Bilingual Education Programs, Vol. I: Study Design and Interim Findings.
American Institutes for Research. Palo Alto.
Redundant with Danoff et al 1977a.
--------------------- 1978. Evaluation of the Impact of ESEA Title VII
Spanish/English Bilingual Education Programs, Vol. III: Year Two Impact
Designs. American Institutes for Research. Palo Alto.
--------------------- 1978b. Evaluation of the Impact of ESEA Title
VII Spanish/English Bilingual Education Programs, Vol. IV: Overview of
the Study and Findings. American Institutes for Research. Palo Alto.
Educational Operations Concepts, Inc. 1991b. An Evaluation of the
Title VII ESEA Bilingual Education Program for Hmong and Cambodian Students
in Kindergarten and First Grade. St. Paul.
Redundant with Educational Operations Concepts, Inc 1991a.
El Paso Independent School District. 1992. Bilingual Education Evaluation.
Office for Research and Evaluation. El Paso.
Redundant with El Paso 1987.
El Paso Independent School District. 1990. Bilingual Education Evaluation:
The Sixth Year in a Longitudinal Study. Office for Research and Evaluation.
El Paso.
Genesee, Fred, Lambert, Wallace E., and Tucker, G. R. 1977. An Experiment
in Trilingual Education. McGill University. Montreal.
Redundant with Genesee et al 1983.
McConnell, Beverly Brown. 1980b. "Individualized Bilingual Instruction,
Final Evaluation, 1978-1979 Program." Pullman.
Redundant with McConnell 1980a.
-------------- 1980c. "Individualized Bilingual Instruction for
Migrants." Paper presented at the October meeting of the International
Congress for Individualized Instruction. Windsor.
McSpadden, J.R. 1980. Arcadia Bilingual Bicultural Education Program:
Interim Evaluation Report, 1979-80. Lafayette Parish.
Redundant with McSpadden 1979.
Teschner, Richard V. 1990. "Adequate Motivation and Bilingual Education."
Southwest Journal of Instruction. Vol. 9, pp. 1-42.
Redundant with El Paso, 1990.
Studies Excluded Because They are Unavailable
American Institutes for Research. 1975b. "Bilingual Education Program
(Aprendamos En Dos Idiomas)." Corpus Christi. Identification and Description
of Exemplary Bilingual Education Programs. Palo Alto.
Lambert, Wallace E., and Tucker, G. R. 1972. Bilingual Education
of Children: The St. Lambert Experience. Newbury House. Rowley.
McSpadden, J.R. 1979. Arcadia Bilingual Bicultural Education Program:
Interim Evaluation Report, 1978-79. Lafayette Parish.
Morgan, Judith Claire. 1971. "The Effects of Bilingual Instruction
of the English Language Arts Achievement of First Grade Children."
Ph.D. dissertation. Northwestern State University of Louisiana.
Ramos, M., Aguilar, J.V., and Sibayan, B.F. 1967. The Determination
and Implementation of Language Policy. Philippine Center for Language
Study: Monograph Series 2. Quezon City.
Studies Excluded Because They are not Evaluations of Bilingual Programs
Becker, Wesley C. and Gersten, Russell. 1982. "A Follow-Up of Follow
Through: The Later Effects of the Direct Instruction Model on Children
in Fifth and Sixth Grades." American Educational Research Journal.
Vol. 19. pp. 75-92.
Campeau, Peggie L., Roberts, A. Oscar H., Bowers, John E., Austin, Melanie,
and Roberts, Sarah J. 1975. The Identification and Description of Exemplary
Bilingual Education Programs. American Institutes for Research. Palo
Alto.
Webb, John A., Clerc, R.J., and Gavito, Alfredo. 1987. Houston Independent
School District: Comparison of Bilingual and Immersion Programs Using Structural
Modeling. Houston Independent School District.
Studies Excluded Because There is not an Appropriate Control Group
Barik, Henri, Swain, Merrill, and Nwanunobi, E. A. 1977. "English-French
Bilingual Education: The Elgin Study Through Grade Five." Canadian
Modern Language Review. Vol. 33. pp. 459-475.
Bruck, Margaret, Lambert, Wallace E., and Tucker, G. Richard. 1977.
"Cognitive Consequences of Bilingual Schooling: The St. Lambert Project
Through Grade Six." Linguistics. Vol. 24. pp. 13-33.
Burkheimer, Graham J., Conger, A.J., Dunteman, G.H., Elliott, B.G.,
and Mowbray, K.A. 1989. Effectiveness of Services for Language-Minority
Limited-English-Proficient Students. Report to the U.S. Department
of Education.
Day, Elaine M., and Shapson, Stan M. 1988. "Provincial Assessment
of Early and Late French Immersion Programs in British Columbia, Canada."
Paper presented at the April meetings of the American Educational Research
Association. New Orleans.
No background controls or individual level data reported.
El Paso Independent School District. 1987. Interim Report of the
Five-Year Bilingual Education Pilot 1986-1987 School Year. Office for
Research and Evaluation. El Paso.
No background or pretest controls.
Genesee, Fred, and Lambert, W. E. 1983. "Trilingual Education
for Majority-Language Children." Child Development. Vol. 54.
pp. 105-114.
No background controls.
Genesee, Fred, Holobow, Naomi E., Lambert, Wallace E., and Chartrand,
Louise. 1989. "Three Elementary School Alternatives for Learning Through
a Second Language." The Modern Language Journal. Vol. 73. pp. 250-263.
No background controls.
Gersten, Russell. 1985. "Structured Immersion for Language-Minority
Students: Results of a Longitudinal Evaluation." Educational Evaluation
and Policy Analysis. Vol. 7. pp. 187-196.
No background controls.
Malherbe, E. C. 1946. The Bilingual School. Longmans Green. London.
No background or pretest controls.
McConnell, Beverly Brown. 1980a. "Effectiveness of Individualized
Bilingual Instruction for Migrant Students." Ph.D. dissertation. Washington
State University.
Medina, Marcello, and Escamilla, Kathy. 1992. "Evaluation of Transitional
and Maintenance Bilingual Programs." Urban Education. Vol.
27. No. 3. pp. 263-290.
Melendez, William Anselmo. 1980. "The Effect of the Language of Instruction
on the Reading Achievement of Limited English Speakers in Secondary Schools."
Ph.D. dissertation. Loyola University of Chicago.
No background controls.
Stern, Carolyn. 1975. Final Report to the Compton Unified School District's
Title VII Bilingual/Bicultural Project: September 1969 Through June 1975.
Compton City Schools. Compton.
Vasquez, Miriam. 1990. "A Longitudinal Study of Cohort Academic
Success and Bilingual Education." Ph.D. dissertation. University of
Rochester.
No background controls.
Studies Excluded Because the Effects are Measured after an Unreasonably
Short Period
Barclay, Lisa. 1969. "The Comparative Efficacies of Spanish, English,
and Bilingual Cognitive Verbal Instruction with Mexican American Head Start
Children." Ph.D. dissertation. Stanford University.
Positive Average Effect.
Layden, Russell Glenn. 1972. "The Relationship Between the Language
of Instruction and the Development of Self-Concept, Classroom Climate,
and Achievement of Spanish Speaking Puerto Rican Children." Ph.D.
dissertation. University of Maryland.
Negative Average Effect.
Studies Excluded Because They Inadequately Control for Differences
between Bilingual and English-Only Students
Alvarez, Juan. 1975. "Comparison of Academic Aspirations and Achievement
in Bilingual Versus Monolingual Classrooms." Ph.D. dissertation. UT
Austin.
Negative Average Effect.
Ames, J., and Bicks, Pat. 1978. An Evaluation of Title VII Bilingual/Bicultural
Program, 1977-1978 School Year, Final Report. Community School District
22. Brooklyn. School District of New York.
Positive Average Effect.
Balasubramonian, K., Seelye, H., and Elizondo de Weffer, R. 1973. "Do
Bilingual Education Programs Inhibit English Language Achievement: A Report
on An Illinois Experiment." Paper presented at the 7th Annual Convention
of Teachers of English to Speakers of Other Languages. San Juan.
Positive Average Effect.
Barik, Henri, and Swain, Merrill. 1975. "Three Year Evaluation
of a Large-Scale Early Grade French Immersion Program: The Ottawa Study."
Language Learning. Vol. 25. No. 1. pp. 1-30.
Negative Average Effect.
Bates, Enid May Buswell. 1970. "The Effects of One Experimental Bilingual
Program on Verbal Ability and Vocabulary of First Grade Pupils." Ph.D.
dissertation. Texas Tech University.
Negative Average Effect.
Carsrud, Karen, and Curtis, John. 1980. ESEA Title VII Bilingual Program:
Final Report. Austin Independent School District. Austin.
No statistical tests reported.
Positive Average Effect.
Ciriza, Frank. 1990a. Evaluation Report of the Preschool Project
for Spanish-Speaking Children, 1989-1990. Planning, Research and Evaluation
Division. San Diego City Schools. San Diego.
Positive Average Effect.
Cohen, Andrew D. 1975. A Sociolinguistic Approach to Bilingual Education.
Newbury House Press. Rowley, MA.
Negative Average Effect.
Cottrell, Milford C. 1971. "Bilingual Education in San Juan Co.,
Utah: A Cross-Cultural Emphasis." Paper presented at the April meetings
of the American Educational Research Association. New York City.
Negative Average Effect.
Curiel, Herman. 1979. "A Comparative Study Investigating Achieved
Reading Level, Self-Esteem, and Achieved Grade Point Average Given Varying
Participation." Ph.D. dissertation. Texas A&M.
Negative Average Effect.
de Weffer, Rafaela de Carmen Elizondo. 1972. "Effects of First
Language Instruction in Academic and Psychological Development of Bilingual
Children." Ph.D. dissertation. Illinois Institute of Technology.
Positive Average Effect.
de la Garza, Jesus Valenzuela, and Medina, Marcello. 1985. "Academic
Achievement as Influenced by Bilingual Instruction for Spanish-Dominant
Mexican American Children." Hispanic Journal of Behavioral Sciences.
Vol. 7. No. 3. pp. 247-259.
Positive Average Effect.
Educational Operations Concepts, Inc. 1991a. An Evaluation of the
Title VII ESEA Bilingual Education Program for Hmong and Cambodian Students
in Junior and Senior High School. St. Paul.
Positive Average Effect.
Lampman, Henry P. 1973. "Southeastern New Mexico Bilingual Program:
Final Report." Artesia Public Schools. Artesia.
Positive Average Effect.
Legarreta, Dorothy. 1979. "The Effects of Program Models on Language
Acquisition by Spanish-Speaking Children." TESOL Quarterly.
Vol. 13. No. 4. pp. 521-534.
Positive Average Effect.
Lum, John Bernard. 1971. "An Effectiveness Study of English as
a Second Language (ESL) and Chinese Bilingual Methods." Ph.D. dissertation.
U.C. Berkeley.
Negative Average Effect.
Maldonado, Jesus Ruben. 1974. "The Effect of the ESEA Title VII
Program on the Cognitive Development of Mexican American Students."
Ph.D. dissertation. University of Houston.
Negative Average Effect.
Matthews, T. 1979. "An Investigation of the Effects of Background
Characteristics and Special Language Services on the Reading Achievement
and English Fluency of Bilingual Students." Seattle Public Schools:
Department of Planning, Research, and Evaluation. Seattle.
Negative Average Effect.
Moore, Fernie B. and Parr, Gerald D. 1978. "Models of Bilingual
Education: Comparisons of Effectiveness." The Elementary School
Journal. Vol. 79. pp. 93-97.
Negative Average Effect.
Pena-Hughes, Eva, and Solis, Juan. 1980. ABC's. McAllen Independent
School District. McAllen.
Positive Average Effect.
Prewitt Diaz, Joseph O. 1979. "An Analysis of the Effects of a
Bicultural Curriculum on Monolingual Spanish Ninth Graders as Compared
with Monolingual English and Bilingual Ninth Graders with Regard to Language
Development, Attitude Toward School, and Self-Concept." Ph.D.
dissertation. University of Connecticut.
Positive Average Effect.
Stebbins, Linda B., St. Pierre, Robert G., Proper, Elizabeth C., Anderson,
Richard B., and Cerva, Thomas. 1977. "Education as Experimentation:
A Planned Variation Model, Vol. IV-A. An Evaluation of Follow Through."
ABT Associates. Cambridge.
Positive Average Effect.
Valladolid, Lupe A. 1991. "The Effects of Bilingual Education on
Students' Academic Achievement as They Progress Through a Bilingual Program."
Ph.D. dissertation. United States International University.
No background or pretest controls.
Negative Average Effect.
Yap, Kim O., Enoki, Donald Y., and Ishitani, Patricia. 1988. "LEP
Student Achievement: Some Pertinent Variables and Policy Implications."
Paper presented at the April meetings of the American Educational Research
Association. New Orleans.
No background or pretest controls.
Negative Average Effect.
Zirkel, Perry A. 1972. "An Evaluation of the Effectiveness of Selected
Experimental Bilingual Education Programs in Connecticut." Ph.D. dissertation.
University of Connecticut.
Positive Average Effect.
Other Sources
Baker, Keith. 1987. "Comment on Willig's 'A Meta-Analysis of Selected
Studies in the Effectiveness of Bilingual Education.'" Review of
Educational Research. Vol. 57, pp. 351-362.
Baker, K.A. and de Kanter, A.A. 1981. Effectiveness of bilingual
education: A review of the literature. Washington, D.C.: U.S. Department
of Education, Office of Planning, Budget and Evaluation.
Campbell, D. T. and Erlebacher, A. E. 1970. How regression artifacts
in quasi-experimental evaluations can mistakenly make compensatory education
look harmful. In J. Hellmuth (Ed.) Compensatory Education: A National
Debate. Vol. 3, Disadvantaged Child. New York: Brunner/Mazel.
Cook, T. D., et al. 1992. Meta-Analysis for Explanation: A Casebook.
New York: Russell Sage Foundation.
Greene, J.P., Peterson, P.E., and Du, J. 1997. Effectiveness of School
Choice: The Milwaukee Experiment. Harvard Program on Education Policy
and Governance Working Paper 97-1.
Hanushek, Eric A. 1996. "School Resources and Student Performance."
In Does Money Matter, ed. Gary Burtless. Washington, D.C.: Brookings,
pp. 43-73.
Hedges, Larry V. and Rob Greenwald. 1996. "Have Times Changed?"
In Does Money Matter, ed. Gary Burtless. Washington, D.C.: Brookings,
pp. 74-92.
Krashen, S. 1996. Under Attack: The Case Against Bilingual Education.
Culver City, CA: Language Education Associates.
Rosenthal, R. 1991. Meta-Analytic Procedures for Social Research.
Newbury Park: Sage Publications.
Rossell, C.H. and Baker, K. 1996. "The Educational Effectiveness
of Bilingual Education." Research in the Teaching of English,
Vol. 30, no. 1.
Willig, A. 1985. "A Meta-Analysis of Selected Studies on the Effectiveness
of Bilingual Education," Review of Educational Research, Vol.
55, no. 3.
Willig, A. 1987. "Examining Bilingual Education Research Through
Meta-Analysis and Narrative Review: A Response to Baker." Review
of Educational Research, Vol. 57, no. 3.
Research assistance was provided by Luis Guevera. I also want to
thank Larry Bernstein, Elsa Del Valle-Gaster, Rudy de la Garza, Charles
Glenn, Aleza Greene, Kenji Hakuta, Stephen Krashen, Michael Kwiatkowski,
Tse Min Lin, Gary Orfield, Harry Pachon, Paul Peterson, Joel Spalter, and
Ann Willig for their helpful comments. In addition to helpful suggestions,
Christine Rossell and James Yates provided some of the harder-to-find studies.
See A Note on Greene's
Meta-Analysis of the Effectiveness of Bilingual Education, by Stephen
Krashen, University of Southern California