Sherman: Evidence for the Educational Benefits of Diversity in Higher Education:
By Patricia Gurin
The National Association of Scholars (NAS) released a report, authored by Thomas E. Wood, Executive Director of the California Association of Scholars, and Malcolm J. Sherman, Associate Professor of Mathematics and Statistics, State University of New York at Albany, which the NAS claims refutes "the University of Michigan's diversity theory." Part of the Wood-Sherman report is a critique of my expert reports in the Gratz and Grutter cases, referred to by the NAS as the "Gurin Report." As author of the expert report, I feel compelled to respond here to each of the central charges. These criticisms were not raised during my deposition in these cases, nor did any expert witness step forward to counter my testimony in court. The NAS raised some of these issues in a general way in an Amicus brief filed in the summer of 2000, which I answered in a supplemental expert report in January 2001. However, it was only after the opinion by Judge Duggan in the Gratz, et al. v. Bollinger, et al. case that the NAS released the Wood-Sherman Report. In no way does the report refute the testimony I provided in the Gratz or the Grutter cases.
The University provided my statistical analyses to the lawyers for the plaintiffs in discovery; Wood and Sherman apparently obtained them from those lawyers. Wood and Sherman suggest that what my report refers to as structural diversity (i.e., percent minority students in a student body) does not have many statistically reliable direct effects on educational outcomes. Their repeated demonstration of this is beside the point, and statistically and logically flawed.
The key conclusion emerging from the extensive social science analysis in my Expert Report, informed by social psychological theory and the work of many others in the field, involves the consistent impact of interactions occurring among diverse students on important educational outcomes. The University has a deliberate policy, not only of building a diverse student body, but also of promoting diversity experiences for students that in turn are related to educational outcomes. This is not a policy of simply recruiting a diverse student body and then neglecting the intellectual environment in which students interact. To do so would be irresponsible. Like all resources, structural diversity must be used intelligently to fulfill its potential.
Wood and Sherman argue that structural diversity must by itself be sufficient for achieving desired outcomes if the university policy is to be justified. But if that were true, then having good buildings, high faculty salaries, and good libraries would all be sufficient to ensure a good education. No one with the responsibility to run a university would make such an argument, precisely because the nature of educational activities and the extent to which students avail themselves of these resources are crucial to achieving an excellent education. Similarly, students must be engaged with diverse peers if we expect learning and development to occur as observed in the national and University of Michigan contexts. A diverse student body is a resource and a necessary condition for engagement with diverse peers that permits higher education to achieve these educational goals.
At Issue: The Meaning of Campus Diversity
The heart of the Wood-Sherman critique is their suggestion that the only evidence relevant to the educational benefits of diversity is evidence showing that percentage of minority students on a campus directly affects student outcomes. This contention is flawed.
In this paper, I address the main Wood/Sherman points and show that my expert report findings hold up under stringent conditions of control for alternative explanations. They are consistent across many measures of educational outcomes and across national and institutional contexts as well as across different racial/ethnic groups. In sum, diversity experiences during college with diverse peers produce the kind of educational outcomes we desire for a skilled population and a democracy of many peoples.
Justice Powell's Diversity Argument
The interpretation of what Justice Powell meant in providing the diversity rationale for the use of race as one of many factors in university admissions is, of course, up to the courts. I believe, nonetheless, that Justice Powell understood that actual interaction with diverse peers is the vehicle by which diversity benefits students.
Wood and Sherman cite Justice Powell's decisive statement as support for their claim that diversity means only percentage of minority students:
"The atmosphere of 'speculation, experiment, and creation' so essential to the quality of higher education is widely believed to be promoted by a diverse student body. As the Court noted in Keyishian, it is not too much to say that the 'nation's future depends upon leaders trained through wide exposure' to the ideas and mores of students as diverse as this Nation of many peoples." Regents of the University of California v. Bakke, 438 U.S. 265, 312-13 (1978).
Wood and Sherman leave out the reference to a crucial footnote in this passage that appears between "diverse student body" and "As the Court". Footnote 48 (on page 312) reads as follows:
"The president of Princeton University has described some of the benefits derived from a diverse student body: '(A) great deal of learning occurs informally. It occurs through interactions among students of both sexes; of different races, religions, and backgrounds; who come from cities and rural areas, from various states and countries; who have a wide variety of interests, talents, and perspectives; and who are able, directly or indirectly, to learn from their differences and to stimulate one another to reexamine even their most deeply held assumptions about themselves and their world. As a wise graduate of ours observed in commenting on this aspect of the educational process, 'People do not learn very much when they are surrounded only by the likes of themselves.''"
It is clear from this passage that Justice Powell understood that actual interaction with diverse peers is precisely how campus diversity has its effects. My expert testimony's focus on interaction with diverse peers in what we call diversity experiences follows the logic of Justice Powell as to how a diverse student body improves understanding and personal growth.
Multiple Databases Are Relevant for
Because Wood and Sherman contend that the only evidence relevant to the court comes from showing direct effects of percentage of minority students, they also argue that the only relevant data must come from a nationally representative sample of campuses with sufficient variance in the proportions of minorities. They further assert that there is one such data set, a product of the Cooperative Institutional Research Program (CIRP).
This conclusion depends completely upon accepting the NAS premise that the Powell argument requires proof about direct effects of percentage of minority students on a campus.
I do not accept that premise. Other databases are relevant because actual experience with diversity is the process or mechanism by which campus diversity affects students. Multiple databases -- the CIRP national database, the Michigan Student Study, and a study of a specific classroom program at Michigan -- are in fact critically important for supporting the conclusions I reached about the impact of diversity experiences. I am able to show across these databases consistent effects of actual experience with diversity on a variety of educational outcomes.
There is nothing automatic about the impact of percentage of minority students on a college campus. Having diverse students on the campus is necessary, but universities also have to make use of structural diversity. Universities have to create educational programs and to foster actual interaction with diverse peers for campus racial diversity to have an impact on students. That is exactly what the University of Michigan aims to do. It does so, first, through admissions policies that create a student body that is diverse in a variety of ways, including by race and ethnicity. It does so, second, by promoting curricular and student life policies that help shape the very interactions that are critical to the positive impact of diversity, according to Justice Powell and according to the theory I offer in my expert testimony on diversity.
All three of the databases that I used are important in assessing the impact of experience with diversity and in showing consistency across studies. Given their argument, Wood and Sherman treat the results from the Michigan studies, and the consistency of results across studies, as simply irrelevant. As a social scientist, I believe that evidence across multiple levels of data and across groups is necessary to confirm the impact of diversity experiences on educational outcomes.
Actual Experience with Diversity Is the Key
Social psychological theory and the research of many other scholars in the field that I surveyed in my expert report provide the explanation for the educational benefits of experience with diversity. The keys to these benefits are: (1) that race remains an important issue in American life, and (2) that most students at schools like the University of Michigan, who come to universities at a critical stage of their development, come from racially segregated backgrounds and have had little or no significant contact with members of different racial and ethnic groups. Most University of Michigan undergraduate students -- about 90% of the white students and 50% of the African American students -- lived in racially homogeneous neighborhoods and attended racially homogeneous high schools before enrolling at the University. For them, interaction with diverse peers is new, unfamiliar, and discrepant with their pre-college racially segregated lives. Novelty and discrepancy are the very conditions that foster active engagement in thinking and learning. Engagement and motivated cognition are what I call learning outcomes in my Expert Report. Encountering novelty, difference, and discrepancy involves actual interaction with diverse peers. Learning "from their differences" and stimulating "one another to reexamine even their most deeply held assumptions about themselves and the world" (Bowen quoted by Justice Powell) does not come simply from observing people who look different. The elements of what we call democracy educational outcomes -- learning to take the perspectives of others, understanding that difference is congenial to commonality, learning to work with diverse others, and participating in citizen activities -- also require actual experience with diversity.
In applying this social psychological theory in my testimony, I was clear that mere numbers of minority students are not sufficient to produce educational benefits. For example, on pages 22-23 in the section on "Conceptual Model of the Impact of Diversity" that follows my theoretical discussion and presents the rationale for my choice of the three studies and data analyses that follow, I state:
"The impact of structural diversity depends greatly on classroom and informal interactional diversity. Structural diversity is essential, but, by itself, usually not sufficient to produce substantial benefits; in addition to being together on the same campus, students from diverse backgrounds must also learn about each other in the courses that they take and in informal interaction outside the classroom. For new learning to occur, institutions of higher education have to make appropriate use of structural diversity. They have to make college campuses authentic public places, where students from different backgrounds can take part in conversations and share experiences that help them develop an understanding of the perspectives of other people. Formal classroom activities and interaction with diverse peers in the informal college environment must prompt students to think in pluralistic and complex ways, and to encourage them to become committed to life-long civic action. In order to capitalize amply on such opportunities for cognitive growth, institutions of higher education must bring diverse students together, provide stimulating courses covering historical, cultural, and social bases of diversity and community, and create opportunities and expectations for students to interact across racial and other divides. Otherwise, many students will retreat from the opportunities offered by a diverse campus to find settings within their institutions that are familiar and that replicate their home environments."
As I indicated in my review of the research literature in Appendix B of my Expert Report, my focus on the significance of interaction with diverse peers conforms to the emphases in the other higher education research on diversity. This literature supports the view that actual interaction with diverse students is the major process through which diversity affects students. The impact of campus diversity is not a matter of simply observing people who look different but rather of actually interacting with students from diverse backgrounds who were not part of the pre-college environment.
Experience with Diversity Requires the Presence of Diverse Peers
Wood and Sherman argue, in effect, that a student can have experience with diversity without diverse others. So, too, did the NAS Amicus brief submitted to the district court in the summer of 2000. Their argument seems to be that any effects we demonstrated could result simply from readings, lectures, and teaching about race and ethnicity. They are especially critical because my Expert Report did not include data on the racial and ethnic distribution of students in racial and ethnic courses. They surmise that effects of this measure of classroom diversity in the CIRP data tell nothing about the presence of diverse peers.
My own experience as a professor at the University of Michigan and data from the University of Michigan's Registrar's Office confirm that such classes tend to draw diverse students. I am confident, at least, that the measure of classroom diversity in the Michigan Student Study refers to classes that were composed of diverse students. In 1994 when these students were seniors, they had to have taken at least one course to meet the graduation race and ethnicity requirement. By 1994, 111 courses had been approved by the LS&A Curriculum Committee to meet that requirement. I obtained the racial/ethnic distribution of students enrolled in these 111 courses for 1993/94, the year that the MSS gathered senior data. Two-thirds of these courses had between 20 and 80% students of color in them. Thus, there is a very strong probability that a significant number of racially and ethnically diverse students were present in the class that the students were referring to in the MSS measure of classroom diversity. 1 These data suggest that my measure of classroom diversity is actually a combination of course content diversity and interactional diversity. If anything, the distinction I have drawn in my report between classroom and interactional diversity underestimates the impact of interracial and interethnic interaction. 2
I know from my years of teaching experience that learning from peers and especially from diverse peers is vitally important in undergraduate classrooms. I have every reason to believe that diversity in law school classrooms is equally important. For example, in a first-year seminar on groups and community, a particularly poignant class event revealed the power of real interaction with diverse peers. In general, students find the idea of groups a bit uncomfortable. They long to just be individuals, which, of course, they are even as they are also members of race, class, gender, age, geographic, religious, and other groups. In one class session, a white woman student who had grown up in a homogeneously white town in Michigan expressed, with considerable emotion, that she was tired of being categorized as white. "I'm just an individual. No one knows if I hold similar beliefs to those of other white students just by looking at me. I hate being seen just as white." She ended in tears. An African-American male student who had grown up in a virtually all-white city in Connecticut replied as he walked toward her across the classroom. "I just want to be an individual also. But every day as I walk across this campus just as I am walking across this room right now I am categorized. No one knows what my thoughts are, or if my thoughts align with other African-American students. They just see me as a black male. And at night, they often change their pace to stay away from me. The point is groups do matter. They matter in my life and (as he approached the other student whose hand he then took), they matter in your life." There was silence in the room. The students learned about the meaning of groups and the meaning of individuals in a way that they won't soon forget.
I could not have taught this from a lecture. Real interaction with diverse others in a classroom makes this learning powerful and indelible.
Social Science and Higher Education Research Supports My Argument
In both my original and supplemental expert reports, I refer to supporting theory and evidence that have come from fifty years of research on interracial social contact and from the higher education research on the impact of college on students. Both bodies of research confirm that one should not automatically expect direct effects of proportion of minorities or the mere co-existence of diverse others in a college.
I noted in my original expert report that:
This conclusion from the higher education research literature is important because it indicates that the significance of indirect rather than direct effects of institutional characteristics does not reflect a special limitation of structural diversity. Rather, this is a phenomenon that applies generally to institutional characteristics of universities.
Despite the suggestion by Wood and Sherman, I did not shift my argument. I always argued that structural diversity operates by providing opportunities for interaction with diverse others and that it is the interaction in classrooms and in the informal campus environment that produces student outcomes.
The Effects of Structural Diversity: Presumed Refutation of My Testimony
The central contention of the Wood-Sherman report -- and the major new material they add to the previous NAS Amicus brief of summer 2000 -- is that the regressions on which I based my original expert report show that structural diversity (as indicated by the percent minority of the student body) is not empirically related to the educational outcomes I investigated. If this is true, that is, if assembling a diverse student body does not predict the outcomes of interest, they argue, the case for ensuring ethnic diversity collapses.
The premise of this argument (that their tables indicate that percentage minority is unrelated to outcomes) is flawed statistically. Moreover, even if the premise were true, the conclusion (that the case for using race in admissions collapses) is flawed logically.
The Statistical Flaw. The flaw in the Wood-Sherman statistical approach is that their assessment of the association between structural diversity (percent minority students) and outcomes is based on regressions (specifically, the step 3 regressions in my original Expert Report) that control for diversity experiences (the ways in which students interact with diverse others). Thus, in evaluating the impact of structural diversity, Wood and Sherman are looking at its effects controlling for the very mechanisms through which structural diversity must operate if it is to affect the outcomes.
This fundamental error is quite remarkable because it is so basic and widely recognized in the statistical community. Indeed, the danger associated with controlling for such explanatory mechanisms (referred to statistically as endogenous covariates) is emphasized in every first-year course in applied statistics.
Consider a causal model for the effect of smoking on lung cancer. Suppose our theory is that smoking damages lung tissue and that the damaged lung tissue tends to become cancerous. In this case, smoking is the causal variable of interest, lung cancer is the outcome, and damaged tissue is the mechanism by which smoking causes lung cancer. Now suppose we study the association between smoking and lung cancer, controlling background variables, and we find an association between smoking and lung cancer. What would happen if we then controlled tissue damage as an additional predictor in our model? If our theory is correct, there would be no association between smoking and lung cancer once tissue damage is controlled. Here tissue damage is an "endogenous covariate": "endogenous" means that it is an outcome of smoking; it is a "covariate" because it is another explanatory variable used in the analysis. If we want to know whether a variable X (in this case smoking) is causally related to a variable Y (in this case lung cancer) we would never control for an endogenous covariate (in this case tissue damage). On the other hand, if we want to study the mechanism by which smoking causes lung cancer, we would control tissue damage. In fact, the disappearance of the smoking effect, once tissue damage is controlled in the analysis, would provide evidence in favor of the theory that smoking causes lung cancer by damaging lung tissue. What the disappearance of the smoking effect would indicate is that damage to the lung tissue is the complete explanation for the effect of smoking on cancer. If not for the fact that smoking causes lung tissue damage, smoking would be a benign habit. But since smoking does damage lung tissue, the disappearance of the smoking effect does not tell us that smoking is not hazardous to one's health.
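The logic of this example can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not an analysis from the Expert Report or from the Wood-Sherman report: the variable names, the effect sizes (0.8 and 0.5), the sample size, and the random seed are all hypothetical, and the simulated world is built so that X affects Y only through the mechanism M. Under those assumptions, the association between X and Y should be clearly positive when M is left out of the regression and close to zero once M is controlled.

```python
import random

def centered(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def simple_slope(x, y):
    """Slope on x in a least-squares regression of y on x (with intercept)."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)

def slope_controlling(x, m, y):
    """Slope on x in a least-squares regression of y on x and m (with intercept)."""
    xc, mc, yc = centered(x), centered(m), centered(y)
    sxx, smm, sxm = dot(xc, xc), dot(mc, mc), dot(xc, mc)
    sxy, smy = dot(xc, yc), dot(mc, yc)
    det = sxx * smm - sxm * sxm
    return (smm * sxy - sxm * smy) / det

# Hypothetical data: X (e.g. smoking) -> M (e.g. tissue damage) -> Y (e.g. cancer risk).
# By construction, X has no path to Y except through M.
rng = random.Random(0)          # fixed seed so the sketch is repeatable
n = 20000                       # hypothetical sample size
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.8 * xi + rng.gauss(0, 1) for xi in x]   # assumed effect of X on M
y = [0.5 * mi + rng.gauss(0, 1) for mi in m]   # assumed effect of M on Y

total_effect = simple_slope(x, y)            # M not controlled
direct_effect = slope_controlling(x, m, y)   # M (endogenous covariate) controlled

print("X-Y slope, mechanism not controlled:", round(total_effect, 3))
print("X-Y slope, mechanism controlled:    ", round(direct_effect, 3))
```

In this constructed example the near-zero slope in the second regression does not mean that X is harmless; it means only that M carries the effect.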
There are many similar examples in social science. In asking whether the amount of education people attain increases their earnings we would not control cognitive skill, an endogenous covariate, because it is an outcome of education and a mechanism by which education would plausibly affect earnings. However, if we wanted to see whether the effect of education on earnings works through cognitive skill, we would control cognitive skill.
In formulating and evaluating admission policy at Michigan, mechanism is all-important. Given a somewhat diverse student body, do students of varied background interact in ways that promote desired outcomes? If they do, the policy is achieving its aims. My analyses specify the kinds of interactions among diverse students that promote the desired outcomes. By controlling percent minority students (in what I call step 3 regressions in my Expert Report), I attempted to ensure that the analysis isolated the active ingredient -- the interactions among diverse others -- that is essential. I also controlled two other institutional features related to diversity -- overall campus emphasis on diversity, and faculty emphasis on diversity -- for the same reason. I wanted to be assured that the diversity experiences students have in classes and in interaction with diverse peers outside of classes are the actual mechanism through which diversity affects student outcomes. This was entirely appropriate given my theoretical understanding of how diversity works in education.
These analyses (the step 3 regressions), however, supply no evidence about the causal effect of percent minority students. That was not my aim. And it is entirely inappropriate for Wood and Sherman to use those regressions with that aim in mind. To do so controls for theoretically and empirically compelling endogenous covariates, inevitably biasing inferences about the causal effect of percent minority students on outcomes. The bias, as the discussion above shows, is toward finding few or no effects of percent minority students, which is, of course, what they find.
This fundamental error runs through table after table of the Wood-Sherman report. They claim that these tables indicate that percentage of minority students on campus does not affect outcomes. One might ask whether the authors understood that they were indeed controlling for important endogenous covariates. They did know this; remarkably, they thought it was the correct thing to do. This is unmistakably clear in several places in the Wood-Sherman report. Thus on page 82 we read:
". . . the four campus experience variables that Gurin considers (i.e., the things that Astin typically calls "diversity activities") are controlled for in the ACE-HERI-CIRP regressions (their emphasis). After all these variables have been controlled, the regressions fail to find significant correlations between racial diversity and final student outcomes. This means that Gurin cannot argue that racial diversity produces educational benefits even when it is conjoined with these other factors (their emphasis). This finding completely devastates Gurin's 'in turn' hypothesis." (p. 82)
The flaw in this reasoning is so basic and so fatal to the conclusions of the Wood-Sherman critique that I must comment further at the risk of redundancy. It is precisely after controlling for diversity experiences that effects of structural diversity (here percent minority students) ought to disappear under my theory and the work of many scholars in the field. Just as the smoking-lung cancer association ought to disappear (or be diminished) once we control damaged lung tissue, just as the education-earnings association ought to disappear (or be diminished) after we control cognitive skill, my theory predicts that the percent minority student associations with outcomes ought to disappear (or be diminished) once diversity experiences are taken into account. Whereas Wood and Sherman argue that "conjoining" the causal variable with the endogenous covariates ought to strengthen the causal variable's association with the outcome, just the opposite is true. Controlling the endogenous covariate removes or otherwise biases the causal effect of the predictor of interest toward zero.
The exact same error is even more succinctly advocated on pages 85-86. Referring to me, we read:
"Her task was to demonstrate that structural diversity has direct effects, not indirect ones. That she fails to do, just as Astin did."
This statement is completely wrong. The direct effect of structural diversity (percent minority students) is its association with outcomes, after controlling diversity experiences. That effect, under my theory, should be null. The indirect effect is indeed what is of interest here, because it specifies how a diverse student body, once constructed, can achieve the educational aims of the university.
A Further Logical Fallacy. Wood and Sherman claim that their tables show that structural diversity has no impact on outcomes. We have just seen that this claim is baseless. Rather than refuting the impact of diversity, the Wood-Sherman analysis has simply highlighted the importance of diversity experiences in accounting for any correlation between structural diversity and outcomes.
But what if Wood and Sherman had shown no causal effect of structural diversity? Would this imply that the argument in favor of the University's admission policies collapses? The answer, clearly, is "No."
Wood and Sherman go beyond saying that structural diversity is insufficient for achieving the desired outcomes. The report asserts that structural diversity is not a necessary condition for the diversity experiences I studied. As indicated earlier, Wood and Sherman argue that one can have diversity experiences without diversity.
Referring to the step 3 regressions again (in which diversity experiences are related to outcomes, controlling for structural diversity), the report states:
"... according to her model, the size of the effect (of diversity experiences) is independent of the number of minorities on campus. Thus the claimed beneficial effects of her campus experience variables would remain statistically significant even if the number of minorities on campus were to drop." (p. 85)
This statement may be true within limits. That is, over some range of percent of minority students, the diversity experiences may well have the same effect. However, as the number of minorities declines, the number of persons exposed to the effect would inevitably decline. Wood and Sherman do not consider this fact, which is extremely salient to those responsible for running the University. Thus there is nothing illogical about saying 1) the model is correct (diversity experiences have good effects when they occur in institutions with lower as well as higher percentages of minorities), and 2) the summary effect of those experiences on the student body depends strongly on structural diversity. The summary effect of diversity experiences is the product of a) the magnitude of the effect on those exposed to diversity experiences, and b) the number exposed. If minorities constitute a tiny percent of the student enrollment, the interactions those minorities have with other students may well benefit those few other students who happen to have the interactions. However, the number of other students so affected must be constrained by the tiny number of minority students. Thus, having a small percent minority severely limits the impact of diversity experiences in the student body as a whole.
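The product described in this paragraph can be written compactly. The notation below is an illustrative assumption and does not appear in the expert reports: $\delta$ stands for the average effect on a student who has diversity experiences, $N_{\text{exposed}}$ for the number of students who have them, $N_{\text{minority}}$ for the number of minority students, and $c$ for a hypothetical cap on how many other students each minority student can meaningfully interact with.

```latex
E_{\text{summary}} = \delta \times N_{\text{exposed}},
\qquad
N_{\text{exposed}} \le c \cdot N_{\text{minority}}
```

On this reading, $\delta$ can stay constant as the percent minority falls while $E_{\text{summary}}$ still shrinks, because the bound on $N_{\text{exposed}}$ shrinks with $N_{\text{minority}}$.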
The argument that Wood and Sherman are making here is thus illogical. Their argument is that structural diversity can be unimportant even if diversity experiences are important. This cannot be true because structural diversity governs the number of students exposed to diversity experiences.
I discussed the extent of the constraint that is imposed by level of structural diversity in a supplemental expert report filed in Gratz, et al. v. Bollinger, et al., No. 97-75231 (E.D. Mich.) and in Grutter, et al. v. Bollinger, et al., No. 97-75928 (E.D. Mich.). I wrote that my own teaching experience suggests that the benefits of diversity are maximized when there are at least three members of a minority group in the class. This increases the likelihood that all students will be exposed to heterogeneity within the minority group and thus reduces stereotypic thinking. I also reviewed social psychological research showing that the presence of more than a token number of minority students does decrease the likelihood that those minority individuals will be stereotyped by other students. This research also demonstrates a number of other negative effects on minority individuals when they are just tokens in a learning and performance setting.
A supplemental report submitted by Stephen Raudenbush in Gratz, et al. v. Bollinger, et al. (July 13, 2000) and in Grutter, et al. v. Bollinger, et al. (December 21, 2000) shows that the probability of a white student having contact with three or more minority members would be dramatically reduced by the adoption of a race-blind admissions policy in LS&A and in the Law School. For example, consider classes small enough for students actually to interact with each other: a first-year seminar in LS&A and a half-section course in the Law School. The probability of having that experience in a first-year seminar is 56% under current policy and 16% under race-blind admissions; in a Law School half-section course it is 96% under current policy and 24% under race-blind admissions. The probability of encountering this amount of diversity in informal educational settings such as undergraduate student activities with seven-to-eight other students or such as Law School moot court exercises and legal clinics would also be greatly reduced. It is especially in small settings that students can get to know each other and learn from each other. These small settings would be most affected by a race-blind admissions system. Raudenbush's work also demonstrates the sharply increased probability for a minority student of being the only one -- or token -- in various educational settings in LS&A and the Law School.
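Why a probability of this kind is so sensitive to the share of minority students can be seen in a simple binomial sketch. This is not Raudenbush's method, and it does not reproduce his figures: the class size of 20, the two minority shares (15% and 5%), and the assumption that class members are drawn independently from the student body are all hypothetical choices made only for illustration.

```python
from math import comb

def prob_at_least(k_min, class_size, minority_share):
    """P(at least k_min minority students in a class), assuming each seat is
    filled independently with probability minority_share (a simplifying
    assumption, not a description of how real classes are formed)."""
    prob_fewer = sum(
        comb(class_size, k)
        * minority_share ** k
        * (1 - minority_share) ** (class_size - k)
        for k in range(k_min)
    )
    return 1 - prob_fewer

# Hypothetical inputs, chosen only to show the direction of the effect.
CLASS_SIZE = 20
higher_share = prob_at_least(3, CLASS_SIZE, 0.15)
lower_share = prob_at_least(3, CLASS_SIZE, 0.05)

print("P(3 or more) at 15% minority share:", round(higher_share, 3))
print("P(3 or more) at 5% minority share: ", round(lower_share, 3))
```

Under these assumed inputs, cutting the minority share to a third of its value cuts the chance of a class with three or more minority students by far more than two-thirds, which is the qualitative pattern the paragraph above describes for small settings.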
Evidence: Effects of Experience with Diversity Are Consistent and Meaningful
Wood and Sherman criticize my testimony now, just as the NAS did in its amicus brief last year, because the effects I report are modest. Wood and Sherman also charge that I was trying to deceive readers by putting "little black boxes" in appendix tables indicating when the relationships between measures of diversity experience and measures of educational outcomes are statistically significant.
The tables that I included in the appendix, far from trying to fool anyone, provided the original product-moment correlation for each type of diversity experience and each educational outcome. Since I indicate a "little black box" for statistically reliable effects after controlling for entering characteristics of students (step 1 regressions), for classroom diversity in assessing the effect of informal interaction with diverse peers (step 2 regressions), and for institutional characteristics (step 3 regressions), any reader would know that these effects could not be larger than the original product-moment correlations between diversity experiences and educational outcomes. I used the "little black boxes" indicating a statistically reliable association in order to make the appendix tables easier to grasp, not to fool the reader.
Consistency of Effects. The effect of any single measure of diversity experience on any single measure of educational outcomes would be expected to be small. Using single-item measures makes it possible to assess the level of consistency of statistically reliable effects across many measures. The level of consistency that we found is striking. Wood and Sherman attempt to trivialize the importance of consistency of findings.
The graphic presentation of the consistency across measures that our analyses of the CIRP data show, in Figure 1 below, for white students (who constitute approximately 90 percent of the students in the CIRP) speaks for itself. These are the effects of diversity experience on educational outcomes, controlling for student characteristics and institutional characteristics, and in the case of the effects of informal interaction with diverse peers controlling for classroom diversity as well. In many instances, these effects also control the students' scores on these outcome measures when they entered college.
Figure 1 shows that, even using the most stringent criterion of statistical reliability, 45 percent of the effects of diversity experiences on the multiple measures of learning outcomes are statistically reliable. Eighty-three percent of the diversity effects on the measures of democracy outcomes are reliable, and slightly over 30 percent on the measures of living in a diverse society after college are as well. As I said in my original Expert Report, this is an impressive level of consistency across measures in this field.
Any one effect of a single measure of diversity experience on a single measure of educational outcomes is bound to be small, especially after instituting these various control measures. The very large literature on college impact concludes that this would be expected, not only of diversity experience but of other college experiences as well.
Pascarella and Terenzini, in the same classic overview of college impact research cited above, emphasize that no single college experience has a very big effect.
"Most theoretical models of development in no way guarantee that any single experience will be an important determinant of change for all students. A majority of important changes that occur during college are probably the cumulative result of a set of interrelated experiences sustained over an extended period of time. Consequently, research that focuses on the impact of a single or isolated experience, a characteristic of most investigations of within-college influence, is unlikely to yield strong effects." (P 610)
Because research is able to demonstrate only small effects of any single college experience, they stress the importance of consistency of effects, just as I do.
"This conclusion (that single experiences don't have strong effects) implies that the enhancement of educational impact of a college is most likely if policy and programmatic efforts are broadly conceived and diverse. It also implies that they should be consistent and integrated." (P 655)
Analyses Using Multiple-Item Measures. In my response to the NAS Amicus brief of summer 2000, I presented evidence that subsequent analyses of both the CIRP and Michigan databases in which we formed multiple-item indices of both diversity experiences and of educational outcomes show, as would be expected, larger effects. Tables 1 and 2 below present these effects.
Table 1 shows the effects of diversity experiences on educational outcomes for white, African-American, and Latino students from analyses of the CIRP databases. The measure of classroom diversity in these analyses is still the single item, having taken an ethnic studies course, as this is the only measure available in the CIRP. The measure of informal interactional diversity is a three-item index of responses to three questions asked in 1989 of the extent to which students, over their college years, had socialized with someone from a different racial/ethnic group, had discussed racial issues, and had attended a racial/cultural workshop. The measures of learning and democracy outcomes also involve multiple items. The beta weights in the table indicate the effect of diversity experiences, controlling student characteristics, institutional characteristics, and where available the very same measure of educational outcome taken when the students first entered college (as described below in the section on Control Variables).
Table 2 shows the effects of diversity experiences on educational outcomes for white and African American students in the Michigan Student Study database. Here the measure of classroom diversity is the same two-item index used in my original Expert Report. The measure of informal interactional diversity is an index summarizing the amount of contact students had with groups other than their own; how much such interactions had involved "meaningful, and honest discussions about race and ethnic relations;" how much "sharing of personal feelings and problems;" and how many of their six closest friends at Michigan were not from their own racial/ethnic group. The measures of learning and democracy outcomes also involve multiple items. The beta weights in the tables indicate the effect of diversity experiences, controlling the same student characteristics as in the CIRP analyses and where available the very same measure of educational outcomes taken when the students first entered the University of Michigan.
These tables show effect sizes, which range from .10 to .35, that compare favorably with many findings in prestigious journals in psychology and education.
These tables demonstrate that:
Effect Size and Policy. Wood and Sherman further criticize my expert testimony for viewing small effects as relevant to policy. I disagree. The size of these effects is commonly viewed in social science as highly consequential for policy, especially when outcomes and predictors are likely to be measured with substantial random error, as they typically are in studies of college impact. It is widely known that the kinds of processes and outcomes of interest here are difficult to measure with high precision and that measurement error in both predictor and outcome sharply diminishes standardized effect sizes.
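The attenuating role of measurement error can be made concrete with a minimal sketch based on the classical test theory formula for a simple correlation, in which the observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The reliabilities and the true correlation below are hypothetical, not estimates from the CIRP or Michigan data, and the formula is exact only for a bivariate correlation with random, uncorrelated errors; regression coefficients estimated with multiple controls behave in a more complicated way.

```python
from math import sqrt


def attenuated_r(true_r, reliability_x, reliability_y):
    """Observed correlation implied by classical test theory.

    r_observed = r_true * sqrt(reliability_x * reliability_y)
    """
    return true_r * sqrt(reliability_x * reliability_y)


def disattenuated_r(observed_r, reliability_x, reliability_y):
    """Inverse of the formula above (correction for attenuation)."""
    return observed_r / sqrt(reliability_x * reliability_y)


if __name__ == "__main__":
    # Hypothetical values: a true correlation of .30 measured with
    # predictor and outcome reliabilities of .60 each.
    print(attenuated_r(0.30, 0.60, 0.60))
```

With single-item measures, whose reliabilities are often modest, an underlying association of moderate size would therefore be expected to appear noticeably smaller in the observed data.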
A second point is technical. Wood and Sherman claim that diversity experiences explain small proportions of variance in outcomes four years or nine years after students entered college. However, small proportions of variance explained by a given variable can, in fact, reflect practically important effects for individuals. Perhaps the classic article on this is by Robert Rosenthal and Donald Rubin (1982, Journal of Educational Psychology), who provided a number of examples in which experimental treatments with important effects explained small proportions of variance. Their conclusion was to warn social scientists not to use proportion of variance as a measure of effect magnitude.
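One well-known way of showing this, associated with Rosenthal and Rubin, is the binomial effect size display, which re-expresses a correlation r as the difference between two "success rates" of 0.5 - r/2 and 0.5 + r/2. The sketch below applies that display to an illustrative correlation; the value of r is chosen for illustration and is not a result from my analyses, and the display is a heuristic for a two-group, two-outcome comparison rather than a literal prediction.

```python
def besd(r):
    """Binomial effect size display for a correlation r.

    Returns (lower_rate, higher_rate): the success rates in the
    comparison and treatment groups implied by the display.
    """
    return 0.5 - r / 2.0, 0.5 + r / 2.0


def variance_explained(r):
    """Proportion of variance accounted for by a correlation r."""
    return r * r


if __name__ == "__main__":
    r = 0.32  # illustrative value only
    low, high = besd(r)
    print(f"r = {r}: variance explained = {variance_explained(r):.4f}, "
          f"rates = {low:.2f} vs {high:.2f}")
```

A correlation of .32 accounts for only about 10 percent of the variance, yet in the display it corresponds to a difference between a 34 percent and a 66 percent success rate, which few would regard as trivial.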
Statistical Controls, the CIRP Sample, and Self-Report Measures
To summarize, the key argument in the Wood-Sherman report is that, even if diversity experiences create positive student outcomes, structural diversity itself is not linked to those outcomes. This argument is flawed, as I have shown.
However, Wood and Sherman also criticize my testimony on a variety of specific grounds, including number of controls, choice of sample, and self-report measures.
My Analyses Exercise Appropriate Controls
Controls Used. I chose controls that helped to rule out alternative explanations, giving me a rigorous test of the impact of diversity experiences on educational outcomes. The regressions I carried out for both CIRP and the Michigan Student Study always controlled personal characteristics of the students at time of entrance to college. I also controlled institutional characteristics in the CIRP analyses.
I chose three sets of personal characteristics.
These personal characteristics are controlled in all the step-one regressions that are included in Appendix D of my testimony.
I also controlled two other sets of variables.
Wood and Sherman imply that it was my responsibility to control extremely large numbers of student background and institutional characteristics. One can control a multitude of measures, so long as the size of the sample is very large, without any theoretical rationale for including those controls. It is far more reasonable to choose controls dictated by a theoretical rationale. I was also constrained in the number of controls I could use because I wanted to use the same controls across the CIRP and Michigan studies, and across the samples of white students and students of color. Only the white sample in the CIRP study had a large enough N to allow an open-ended (and in my view non-theoretical) approach to controls.
As I indicated above, I used controls that my theory and familiarity with the literature in the field suggested were the most important to rule out alternative explanations. Moreover, even if I had extremely large samples of all groups of interest in my study, it would be my responsibility to find a balance between controlling those variables that, if uncontrolled, would plausibly bias the estimates of the effects of diversity experiences, while avoiding a mindless approach that would add every conceivable variable. The latter approach can lead to a problem of "over-fitting" in which all relationships are estimated unreliably. I believe that I achieved the appropriate balance.
Wood and Sherman also harshly criticize me for not controlling all relevant explanatory variables. Unfortunately, this criticism is closely connected to the fallacy discussed earlier (see discussion above on controlling endogenous covariates). Thus we read on page 85:
"For all of her rhetoric, all she (Gurin) is doing is running regressions that leave out relevant explanatory variables. If she were to include all of her four diversity activities in one model, it is likely that the effect of percentage minorities on campus would disappear."
As discussed above, the disappearance of that effect is exactly what my theory would predict. Percent minority could hardly affect outcomes except through experiences students have interacting with each other. NAS is again recommending the biased procedure of controlling endogenous covariates.
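The logic can be illustrated with a small simulation. This is a sketch on synthetic data, not a reanalysis of the CIRP or Michigan data: the variable names, the path coefficients (0.6 and 0.5), and the sample size are all hypothetical. The data are generated so that structural diversity affects the outcome only through interaction with diverse peers; the regression of the outcome on structural diversity alone recovers a clear effect, and that effect vanishes once the mediating variable is added as a control.

```python
import random


def ols_slopes(y, x1, x2=None):
    """OLS slopes of y on x1 (and optionally x2), with an intercept.

    Variables are mean-centered, so the intercept drops out and the
    one- and two-predictor normal equations can be solved directly.
    """
    n = len(y)

    def center(v):
        m = sum(v) / n
        return [a - m for a in v]

    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    yc, x1c = center(y), center(x1)
    if x2 is None:
        return (dot(x1c, yc) / dot(x1c, x1c),)
    x2c = center(x2)
    s11, s22, s12 = dot(x1c, x1c), dot(x2c, x2c), dot(x1c, x2c)
    s1y, s2y = dot(x1c, yc), dot(x2c, yc)
    det = s11 * s22 - s12 * s12
    return ((s22 * s1y - s12 * s2y) / det,
            (s11 * s2y - s12 * s1y) / det)


def simulate(n=5000, seed=0):
    """Full mediation: structural -> interaction -> outcome."""
    rng = random.Random(seed)
    structural = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical path: structural diversity raises interaction.
    interaction = [0.6 * s + rng.gauss(0, 1) for s in structural]
    # Hypothetical path: the outcome depends on interaction only.
    outcome = [0.5 * m + rng.gauss(0, 1) for m in interaction]
    total = ols_slopes(outcome, structural)[0]
    direct, via_interaction = ols_slopes(outcome, structural, interaction)
    return total, direct, via_interaction


if __name__ == "__main__":
    total, direct, via_interaction = simulate()
    print("structural alone:", round(total, 3))
    print("structural, controlling interaction:", round(direct, 3))
    print("interaction:", round(via_interaction, 3))
```

In this setup the near-zero coefficient on structural diversity in the second regression does not show that structural diversity is irrelevant; it shows only that its effect is carried by the interaction it makes possible.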
I Chose Colleges/Universities in the CIRP to Provide the Most Relevant
Wood and Sherman contend that I don't say why I chose my sample of institutions within the CIRP database. I do so on pages 5 and 6 of Appendix C in my Expert Report. I included students who had data at all three points in time (1985 entering college, 1989 in the senior year, and 1994 presumably for most students five years into the post-college years). My analyses are explicitly longitudinal and thus I wanted only those students with data across all of the years.
Wood and Sherman cite two kinds of schools that I dropped: two-year colleges and historically black colleges. They seem to feel that my choice not to analyze these schools is very bad (page 80; see especially page 84 where the choice to drop two-year schools is labeled "not scholarly"). Two-year schools usually have open admissions and would tell us little about diversity in selective schools, such as the University of Michigan. The context of diversity is completely different in historically black colleges. The greater risk to the validity of my analysis would have been to include these schools.
The policy and legal issues in this case involve a four-year university, the University of Michigan, that has, over most of its history, been overwhelmingly white in its enrollment. Of late, the University of Michigan has adopted a policy that increases the number of minority students admitted. Of greatest value in providing evidence for this case are other four-year schools having ethnic compositions that vary across the range of interest to policy and the law in this case. Under these conditions, data from two-year schools and from nearly all-black schools may provide little or no useful evidence. Indeed, a well-known error in statistical modeling is "extrapolation," that is, generalization from unlike cases. Statistical models that use extrapolation often produce unreliable or even nonsensical results.
My Use of Self-Report Measures Is Standard
Wood and Sherman further criticize my use of self-report measures of student outcomes. Contrary to their assertions, self-assessments are credible and widely accepted methods for measuring learning. As part of the recent national concern with accountability and the need for assessment of educational outcomes, in 1991 the National Center for Education Statistics (NCES) began hosting a series of workshops to examine the feasibility of creating measures of college student learning similar to the NAEP. A result of these workshops, as well as the deliberations of other higher education groups, has been the recommendation that, given the difficulties and time that will be required to develop national assessment instruments, alternative measures should be used as proxies for the proposed national assessment. Student self-reports have been particularly strongly proposed as appropriate proxies. For example, NCES contracted with the National Center for Higher Education Management Systems (NCHEMS) to review the research on a variety of possible indicators of college outcomes. One of their conclusions was that self-report data on academic development and experiences have moderate to high potential as proxies for a national test, and as possible indicators for decision-making in higher education (Ewell et al., 1993).
The recommendation to use self-ratings for the assessment of college student impact is based on a long history of research. In their review of over 2600 studies on the impact of college on students, Pascarella & Terenzini (1991) review and integrate the research on the acquisition of specific academic skills as well as more general cognitive competencies. They review the studies that used self-reports as well as those that used standardized tests of skills and competencies, and of gains in these skills and competencies. They use these studies interchangeably as mutual supports for the conclusions they draw, showing that they do not believe that self-reports and standardized tests should be distinguished as totally different types of assessments. They justify their use of self-assessments by the research they reviewed that indicates that self-reports of learning outcomes are positively correlated with standard tests of achievement (just as I did on pages 5 and 6 of Appendix C of my Expert Report).
Further evidence for the convergence of self-reports and standardized test measures of learning outcomes comes from the fact that both methods have tended to yield similar results when related to student entrance characteristics and to student college experiences (Pascarella & Terenzini, 1991; Ewell and Jones, 1993). A recent study (Anaya, 1999) should be particularly noted because it comes from analyses conducted on a sub-sample of students who had taken the GRE drawn from the same cohort of CIRP that was used in the Gurin Report. Anaya examines what conclusions would be drawn using GRE, GPA, and student-reported growth as measures of learning. The results show that similar substantive conclusions can be made about the relationships of college experiences to learning outcomes, regardless of which measure of learning was used.
Democracy Outcomes Are Important Aspects
Contrary to Wood and Sherman and the earlier NAS Amicus brief, my inclusion of measures of democracy outcomes specifically addresses a major mission of higher education. From the earliest times in the United States, and certainly since the time of Thomas Jefferson, who felt that citizens must be created through education and who made the founding of the University of Virginia the primary work of his post-presidential years, a central mission of universities has been to produce an educated leadership for our democracy.
Benjamin Barber, a political scientist at Rutgers, stresses that all traditional political theories -- liberal, republican, and democratic -- have viewed citizens as created, not born. He asks the question: "Does a university have a civic mission? Of course, for it is (his emphasis) a civic mission. The cultivation of free community of civility itself" (Barber, 1998).
But how does diversity foster civic preparedness? It plays a role in two critically important theories discussed in my Expert Report: Aristotle's theory of democracy that is built on difference rather than on similarity, and Piaget's theory of moral development. Both of them emphasize the following conditions: the presence of diverse others who bring multiple, and sometimes conflicting, perspectives; discussion among peers who are equals; and discussion under rules of civil discourse.
Higher education must prepare students today to be leaders in an incredibly and increasingly diverse society. This is exactly what the General Motors Amicus brief contends that major corporations are also looking to the University of Michigan to do:
"Diversity in academic institutions is essential to teaching students the human relations and analytic skills they need to thrive and lead in the work environments of the twenty-first century. These skills include the abilities to work well with colleagues and subordinates from diverse backgrounds; to view issues from multiple perspectives; and to anticipate and respond with sensitivity to the needs and cultural differences of highly diverse customers, colleagues, employees, and global business partners." [See brief of General Motors Corp. as Amicus Curiae No. 97-75928, at 2, (E.D. Mich., July 17, 2000)]
The rationale for including democracy outcomes is therefore about preparing students for a future as leaders of democracy and of our major economic/other societal institutions. It is the mission of higher education to:
Students themselves articulate the impact of diversity on these kinds of democracy outcomes. A student's self-reflection paper in a course on intergroup relations at the University of Michigan stresses the impact of informal interaction with diverse peers on citizenship engagement:
"Before coming to the University of Michigan and getting to know so many different kinds of people, I never thought of myself a citizen. I am ashamed to say that I didn't even vote in the last election, though I was old enough. Now I realize that was because I didn't want to bother about any part of the world outside of my own little social circle. Politics was 'out there'. I was 'in here' in my own little world. Ironically, it was by hearing stories from African American students from Detroit and Latino students from the southwest that opened my eyes to the limits of always being 'in here.' I no longer want the walls. I'd rather have a full life. Part of that life is that I do see myself now as being a citizen and making a difference in the communities in which I will eventually live."
The Wood-Sherman Report Does Not Refute My Testimony
The Wood-Sherman Report does not refute the Michigan diversity evidence.
Allport, Gordon (1954). The nature of prejudice. Reading, MA: Addison-Wesley.
Amir, Y. (1976). The role of intergroup contact in change of prejudice and ethnic relations. In P. A. Katz (Ed.), Towards the elimination of racism. New York: Pergamon Press, Inc.
Anaya, Guadalupe (1999). College Impact on Student Learning: Comparing the Use of Self-Reported Gains, Standardized Test Scores, and College Grades. Research in Higher Education, 40 (5), 499-526.
Barber, Benjamin (1998). A Passion for Democracy. Princeton: Princeton University Press.
Cook, S. W. (1984). Cooperative interaction in multiethnic contexts. In N. Miller and M. B. Brewer (Eds.), Groups in contact: The psychology of desegregation. New York: Academic Press.
Ewell, Peter T. and Jones, Dennis P. (1993). Actions Matter: The Case for Indirect Measures in Assessing Higher Education's Progress on the National Educational Goals. The Journal of General Education, 42, 123-148.
Pascarella, Ernest T. and Terenzini, Patrick T. (1991). How College Affects Students: Findings and Insights from Twenty Years of Research. San Francisco: Jossey-Bass.
Pettigrew, T. F. (1991). Normative theory in intergroup relations: Explaining both harmony and conflict. Psychology and Developing Societies, 3, 3-16.
Rosenthal, R., and Rubin, D. B. (1982). Further Meta-Analytic Procedures for Assessing Cognitive Gender Differences. Journal of Educational Psychology, 74 (5), 708-712.