Daubert in the Practice Areas
Daubert in Employment Litigation
Examples Of Statistical Proof Of Employment Discrimination
1. [§13.41] In General
Statistical testing has been an element of proof in employment discrimination litigation since the appearance of the binomial model in Castaneda v. Partida, 430 U.S. 482, 97 S.Ct. 1272, 51 L.Ed.2d 498 (1977) (concerning race of jurors), and Hazelwood School District v. United States, 433 U.S. 299, 97 S.Ct. 2736, 53 L.Ed.2d 768 (1977) (concerning race of newly hired teachers). Since that time, statistical testing has been used extensively to compare the expected number of members of some protected group to the actual number of members of that protected group who have been hired, fired, or otherwise involved in significant employment actions. See, e.g., Hazelwood School District; Sheehan v. Daily Racing Form, Inc., 104 F.3d 940 (7th Cir. 1997). The important role of Daubert in employment class actions is discussed in §§13.49–13.55.
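The comparison of expected and actual counts under a binomial model can be illustrated with a short calculation. The following Python sketch uses entirely hypothetical figures (200 hires, a 25 percent availability rate for the protected group, and 30 actual hires from that group) and a hypothetical helper name, binomial_z; none of these figures is drawn from the cited cases. Assuming that each selection is an independent draw at the availability rate, it computes the expected count, the binomial standard deviation, and the number of standard deviations separating the actual count from the expected count.

```python
import math


def binomial_z(n, p, observed):
    """Compare an observed count with its binomial expectation.

    n        -- number of selections (hypothetical: hires)
    p        -- protected group's share of the relevant pool (assumed known)
    observed -- actual number of protected-group members selected
    """
    expected = n * p                       # expected count under neutral selection
    sd = math.sqrt(n * p * (1 - p))        # binomial standard deviation
    z = (observed - expected) / sd         # shortfall or excess in standard deviations
    return expected, sd, z


# Hypothetical figures, chosen only for illustration.
expected, sd, z = binomial_z(n=200, p=0.25, observed=30)
print(f"expected = {expected:.1f}, sd = {sd:.2f}, z = {z:.2f}")
```

Under these assumed figures, the actual count falls more than three standard deviations below the expected count. The calculation is only as good as its inputs: it says nothing about why the shortfall arose, and the availability rate must itself reflect a properly defined pool.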
The persuasiveness of statistical proof in employment discrimination matters seems well established. See International Brotherhood of Teamsters v. United States, 431 U.S. 324, 97 S.Ct. 1843, 52 L.Ed.2d 396 (1977) (noting importance of role of statistical analyses in establishing prima facie case of racial discrimination in both jury selection and employment discrimination cases). See also Bazemore v. Friday, 478 U.S. 385, 106 S.Ct. 3000, 92 L.Ed.2d 315 (1986). The notion of using regression analysis in employment discrimination litigation dates at least to 1975 and the publication of a student note that advocated the idea. See Note, Beyond the Prima Facie Case in Employment Discrimination Law: Statistical Proof and Rebuttal, 89 Harv.L.Rev. 387 (1975). The applicability of regression to the analysis of discrimination has been extensively discussed and documented since then. See Finkelstein, The Judicial Reception of Multiple Regression Studies in Race and Sex Discrimination Cases, 80 Colum.L.Rev. 737 (1980); Fienberg, The Increasing Sophistication of Statistical Assessments as Evidence in Discrimination Litigation, 77 J.Am.Stat.Ass’n 784 (1982); Note, Title VII, Multiple Linear Regression Models, and the Courts: An Analysis, 46 Law & Contemp. Probs. 283 (1983); Rubinfeld, Econometrics in the Courtroom, 85 Colum.L.Rev. 1048 (1985); Lempert, Statistics in the Courtroom: Building on Rubinfeld, 85 Colum.L.Rev. 1098 (1985).
The regression issues raised in §§13.17–13.24 and expanded on in §§13.25–13.40 inform the use of regression analysis in employment discrimination cases much as they do in antitrust and securities litigation. Regression models must be properly specified and must meet the basic regression assumptions in employment discrimination cases just as they must in other types of litigation. With respect to model specification (the term economists use for the choice of explanatory variables, a properly specified model being one that includes all relevant explanatory variables), it is appropriate to repeat here the conflict between Daubert, which says that regression must meet the standards that economists would apply to their nonlitigation research, and Bazemore, which seems to suggest that a regression analysis is not fatally flawed just because it leaves out a relevant variable. A Seventh Circuit opinion, Sheehan, highlights why an omitted variable is fatal to the ability of statistical analysis to inform the finder of fact, and the scholarly literature cited throughout this chapter discusses the undesirable results that occur when regression models are incorrectly specified.
2. [§13.42] Model Specification Error And Inadmissibility Of Spurious Statistics: Sheehan v. Daily Racing Form, Inc.
In Sheehan v. Daily Racing Form, Inc., 104 F.3d 940 (7th Cir. 1997), the plaintiff, Sheehan, was a well-regarded older employee of a racing newspaper company that used manual layout procedures to generate its papers. The defendant purchased a similar company that used computerized layout procedures and converted the existing operation to the computerized techniques.
In subsequent layoffs, Sheehan and most of the other older employees were terminated, while most of the younger workers were retained. Sheehan brought a lawsuit for age discrimination, and his expert proffered a statistical study that showed a strong correlation between age and the pattern of dismissal. The court excluded the expert’s testimony, noting that the expert had failed to “correct for any potential explanatory variables other than age.” Id. at 942. Especially important was the expert’s failure to consider computer skill as an explanatory variable in his analysis of terminations, given that computer skill was correlated with age. If Daily Racing Form had terminated employees who lacked computer skills, and the older workers tended to lack computer skills, a study that omitted computer skills as an explanatory variable would find a correlation between dismissal and age, regardless of whether age was a criterion for dismissal.
The opinion does not identify the type of statistical analysis employed, but this failure is an example of the class of misspecification problems discussed throughout this chapter. When a regression model omits explanatory variables that are correlated with included explanatory variables, the regression coefficients and their tests and error rate calculations lose the desirable properties that make them reliable. Sheehan is a prime example of why a regression that omits an important variable must be excluded by the gatekeeper. When important explanatory variables are omitted, the statistical analysis is unreliable. It not only misleads but also lacks the capacity to inform, so it cannot be shown to have probative value. In terms of Fed.R.Evid. 403, the analysis has no probative value but surely has the capacity to misinform the jury, so the danger of misleading the jury necessarily and substantially outweighs its probative value. Analogous statements hold for nonregression statistical models.
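The omitted-variable mechanism at work in Sheehan can be illustrated with a small constructed example. The Python sketch below builds a hypothetical 80-person workforce (not the actual Daily Racing Form data, which the opinion does not report) in which termination depends only on computer skill and older workers are less likely to have that skill. It then fits a linear probability model by ordinary least squares twice: once with age group as the only explanatory variable, and once with computer skill added. The helper ols and every count in the data are assumptions made for illustration.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical workforce: (older, skilled, head count).
# Older workers are mostly unskilled; younger workers are mostly skilled.
cells = [(1, 1, 10), (1, 0, 30), (0, 1, 30), (0, 0, 10)]
workers = [(older, skilled) for older, skilled, n in cells for _ in range(n)]

# Termination rule assumed for illustration: everyone without computer
# skill is terminated, regardless of age.
terminated = [1 - skilled for _, skilled in workers]

# Misspecified model: intercept and age group only.
short_fit = ols([[1, older] for older, _ in workers], terminated)
# Properly specified model: intercept, age group, and computer skill.
long_fit = ols([[1, older, skilled] for older, skilled in workers], terminated)

print("age coefficient, skill omitted: ", round(short_fit[1], 6))
print("age coefficient, skill included:", round(long_fit[1], 6))
```

By construction, age plays no role in who is terminated, yet the regression that omits computer skill assigns older workers a termination rate 50 percentage points higher than that of younger workers. Once computer skill is included, the age coefficient falls to zero.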
3. [§13.43] Model Specification Error And Admissibility Of Spurious Statistics: Obrey
The notions discussed in §13.42 are not universally understood. In Obrey v. Johnson, 400 F.3d 691 (9th Cir. 2005), the plaintiff, Obrey, alleged that the defendant, the Secretary of the Navy, had engaged in a pattern or practice of discriminating against qualified candidates of Asian-Pacific ancestry in favor of Caucasian applicants for senior management positions at the Pearl Harbor Shipyard. The district court excluded the principal expert evidence supporting Obrey’s pattern or practice claim, judgment was entered against him, and he appealed, claiming that the district court abused its discretion in failing to admit a statistical report showing a correlation between race and promotion at the Shipyard. The Ninth Circuit reversed and remanded, saying that leaving some variables out of a statistical analysis goes to weight, not to admissibility. It is instructive to compare the court’s reasoning with the Seventh Circuit’s more sophisticated analysis by Judge Posner of a similar statistical study in Sheehan v. Daily Racing Form, Inc., 104 F.3d 940 (7th Cir. 1997).
In Obrey, the government argued that “the statistical analysis was inadmissible because it failed to account for the relative qualifications of the applicants being studied,” but the court ruled that
Obrey’s statistical evidence was not rendered irrelevant under Rule 402 simply because it failed to account for the relative qualifications of the applicant pool. . . . A statistical study may fall short of proving the plaintiff’s case, but still remain relevant to the issues in dispute. . . . Thus, objections to a study’s completeness generally go to “the weight, not the admissibility[,] of the statistical evidence.”
Id. at 694–695, quoting Mangold v. California Public Utilities Commission, 67 F.3d 1470, 1476 (9th Cir. 1995). That “Obrey’s statistical evidence was not rendered irrelevant under Rule 402 simply because it failed to account for the relative qualifications of the applicant pool,” id., will come as startling news to those analysts, scientists, and statisticians who imagine that the relative qualifications of the applicant pool have something to do with who gets job offers, which is to say, all of those who understand the basic statistical concepts that govern this type of work.
The court in Obrey cited Kumho Tire Co. and correctly stated that “[t]he Rule 702 inquiry is a ‘flexible one’ whose ‘overarching subject is the scientific validity’” of the expert’s methods. Obrey, 400 F.3d at 696. But scientific validity is impossible when conclusions are based on statistical models that use the wrong set of explanatory variables.
In a final irony, the court in Obrey cited Metabolife International, Inc. v. Wornick, 264 F.3d 832, 843 (9th Cir. 2001), for the proposition that “[r]ather than disqualify the study because of ‘incompleteness’ . . . , the district court should examine the soundness of the methodology employed,” apparently missing the fact that, for this kind of incompleteness, the two inquiries are the same. Omitting a variable such as relative qualifications from a study of promotions renders the study methodologically unsound at the most elementary level. Such a study cannot be reliable and should not be admitted. Judge Posner caught this error in Sheehan, and the better rule is that of Sheehan.