Biographical data, or biodata, are measures of key aspects of individuals’ life experiences intended to predict job applicants’ future performance in organizations, whether that performance is task-specific job performance, teamwork, or counterproductive behavior such as shoplifting. Although biodata can be developed to measure a wide array of experiences and psychological constructs, the fundamental and general premises underlying the predictive power of biodata measures are that
- individuals in free societies shape their life experiences, and they also are shaped by them;
- this process of reciprocal influence between personality and situations unfolds over a long time span; and therefore,
- measures of past experience should predict future work behavior, especially given a relatively unconstrained environment where employees’ typical performance can be wide-ranging.
In light of these premises, items on a biodata measure can be relatively personality oriented or covert in nature (e.g., “To what extent does your happiness depend on how things are going at work?”), or they can be relatively situation oriented and overt in nature (e.g., “Approximately how many books have you read in the past three months?”). In either case, responding involves some cognitive processing: test takers must recall and summarize information, the accuracy of which depends on the accuracy of prior perception and storage and, in many cases, on the salience or recency of the event.
Although biodata can vary widely in their content and constructs measured and can be scored in different ways, they have consistently demonstrated moderate to high levels of validity across job types (approximately .30); they also demonstrate incremental validity beyond ability and personality measures in predicting performance. Constituent biodata items either explicitly or implicitly reflect constructs such as ability, personality, motivation, interpersonal skills, and interests. They can be relatively pure measures of these constructs; however, biodata items that ask test takers about their experiences may be related to a combination of constructs, not just one. Analyses of the latter type of items may result in a weak general factor in a factor analysis or a low alpha reliability coefficient. Both test-retest reliability (stability over time) and alpha reliability (internal consistency) should be considered when evaluating the dependability of scores on biodata measures.
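To make the alpha side of that distinction concrete, coefficient alpha can be computed directly from a respondent-by-item score matrix. The sketch below is a minimal illustration using NumPy and hypothetical data; it is not drawn from the biodata literature itself.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical data: three respondents, two perfectly consistent items.
scores = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
alpha = cronbach_alpha(scores)  # → 1.0
```

With perfectly consistent items, alpha is exactly 1.0; a biodata measure whose items tap a combination of constructs would typically yield a much lower value, as noted above.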
F. A. Mael proposed an outline of 10 major attributes of biodata items, as follows:
- Historical versus hypothetical (past behaviors versus predicted behaviors in the future, or behaviors in what-if scenarios)
- External versus internal (behaviors versus attitudes)
- Objective versus subjective (observable or countable events versus self-perceptions)
- Firsthand versus secondhand (self-descriptions versus how people would say others describe them)
- Discrete versus summative (single events versus averaging over a period of time)
- Verifiable versus nonverifiable
- Controllable versus noncontrollable (circumstances that could or could not be influenced by a decision)
- Equal access versus unequal access (access to opportunities with respect to the group being tested)
- Job relevant versus nonjob relevant
- Noninvasive versus invasive
Historically, biodata measures have developed out of a tradition of strong empiricism, and therefore a wide variety of scoring methods have been proposed. The criterion-keying approach involves taking individuals’ responses to a given biodata item and calculating the mean criterion score or the criterion-related validity for each response option. This is done for each item, and these values are used as item response weights for scoring purposes. Weights may be rationally adjusted when nonlinear patterns in relatively continuous response options are found or when some weights are based on small sample sizes. A similar approach to criterion keying can be taken when keying biodata items not to criteria but rather to personality or temperament measures. This approach is particularly interesting for keying a set of objective or verifiable biodata items, which tend to be less susceptible to faking but often are harder to assign to single psychological constructs. (Even if such keying is not done, it remains helpful to place the biodata measure within a nomological net of cognitive and noncognitive constructs.) When biodata items can be assigned to constructs in a relatively straightforward manner, such as by developing item content around constructs or through an a priori or post hoc subject matter expert (SME) item-sorting procedure, a straightforward scoring of each item along a single underlying continuum may be possible, as is done with traditional Likert-scale self-report measures of personality.
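As a minimal sketch of the criterion-keying step described above, the function below weights each response option of a single hypothetical item by the mean criterion score of the respondents who chose it; the `min_n` fallback is a crude stand-in for the rational adjustment of weights based on small sample sizes.

```python
import numpy as np
from collections import defaultdict

def criterion_key(responses, criterion, min_n=10):
    """Empirically key one biodata item: each response option is weighted
    by the mean criterion score of respondents who chose it. Options chosen
    by fewer than min_n respondents fall back to the grand criterion mean."""
    grand = float(np.mean(criterion))
    by_option = defaultdict(list)
    for resp, crit in zip(responses, criterion):
        by_option[resp].append(crit)
    return {opt: (float(np.mean(vals)) if len(vals) >= min_n else grand)
            for opt, vals in by_option.items()}

# Hypothetical data: 40 applicants answering one item, each with a criterion score.
responses = [1] * 20 + [2] * 20
criterion = [3.0] * 20 + [5.0] * 20
weights = criterion_key(responses, criterion)  # {1: 3.0, 2: 5.0}
```

In a full application this keying would be repeated per item, and an applicant’s total score would be the sum of the weights attached to his or her chosen options.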
Configural scoring is an entirely different approach to scoring biodata items, because it involves grouping individuals into representative profiles of biodata scores. Subgroups are defined, both conceptually and methodologically, as internally consistent yet externally distinct, similar to the interpretation of statistically significant group differences in the analysis of variance. Individuals are often assigned to subgroups based on their similarity to a subgroup mean, such as in k-means analysis; or sometimes a set of data is aggregated until the appropriate balance between parsimony and descriptiveness is reached, such as in Ward’s method. Subgroup profiles may then be labeled (e.g., goal-oriented social leaders or emotional underachievers) and then related to relevant external criteria, or profiles of criteria, for purposes such as personnel selection and placement; or subgroup profiles can be used in their own right for training and development.
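The subgrouping idea can be sketched with a bare-bones k-means (Lloyd’s algorithm) over hypothetical biodata profiles; in practice one would use a vetted implementation (e.g., scikit-learn’s KMeans) and compare cluster solutions carefully before labeling subgroups.

```python
import numpy as np

def kmeans_subgroups(profiles, k, n_iter=50, seed=0):
    """Assign each row (one person's biodata profile) to the nearest of
    k subgroup centroids, then recompute centroids (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    centroids = profiles[rng.choice(len(profiles), size=k, replace=False)]
    labels = np.zeros(len(profiles), dtype=int)
    for _ in range(n_iter):
        # Distance from every profile to every centroid, shape (n, k).
        dists = np.linalg.norm(profiles[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = profiles[labels == j].mean(axis=0)
    return labels, centroids

# Hypothetical profiles on two biodata dimensions; two clearly separated subgroups.
profiles = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
labels, _ = kmeans_subgroups(profiles, k=2)
```

The recovered labels, not the raw scores, are what would then be profiled, named, and related to external criteria.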
Two general points regarding the scoring of biodata items are worth noting. First, any appropriate scoring method should be informed by both rational and empirical approaches. Being purely rational or theory based ignores important empirical data that could serve to revise the theoretical underpinnings that generated the biodata items in the first place—or at least it could revise subsequent item-development rules. Conversely, being purely empirical in the absence of a theoretical or conceptual rationale would impede, if not preclude, appropriate item development, item revision, and score use and interpretation. Second, item-scoring methods that are developed on one sample should be cross-validated on an independent sample, such as a holdout sample from the original data set or an entirely different sample. Doing so helps ensure that the features of the model are generalizable and not sample specific; for example, cross-validation can ensure that increased validity, reduction of group mean differences, or a cleaned up exploratory factor analysis result achieved in one sample by selectively reweighting or removing biodata items can then be achieved in an independent sample using the same subset of items, so that the original results (in large part, at least) cannot be attributed to capitalization on chance. The same concern applies to regression models, where least-squares regression weights may capitalize on chance and thus artificially inflate validity. In this case, cross-validation formulas can be applied to the whole sample, to estimate what the shrinkage in validity would be should those weights be applied to an independent sample of the same size.
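One classic formula of this kind is Wherry’s adjustment, which estimates the shrunken squared multiple correlation from the sample R², the sample size n, and the number of predictors k (Browne’s formula, not shown here, targets cross-validity more directly). A small sketch with hypothetical numbers:

```python
def wherry_adjusted_r2(r2, n, k):
    """Wherry's formula: estimated population R^2 after shrinkage, given
    the sample R^2, sample size n, and number of predictors k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# A validity of r = .30 gives R^2 = .09; with n = 100 and 5 predictors:
shrunken = wherry_adjusted_r2(0.09, 100, 5)  # ≈ .042
```

Note how quickly an apparently respectable validity shrinks as predictors are added relative to the sample size, which is precisely why empirically keyed weights demand cross-validation.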
Because biodata items vary widely in content, no useful general statement about race differences can be made. At a more specific level, however, biodata containing culturally relevant content have demonstrated Black-White subgroup differences in terms of differential item functioning (DIF). Black-White differences in biodata have also been found in the domain of swimming proficiency. Other race differences are likely when biodata measures are aligned with constructs on which race differences are known to exist, such as general cognitive ability or certain personality traits.
Meta-analysis indicates that studies using biodata measures generally show a favorability (i.e., job relevance and fairness) rating at about the midpoint of the scale, with measures such as interviews, resumes, and cognitive ability tests showing greater favorability and personal contacts and integrity tests showing less favorability. Although the meta-analytic mean across studies is stable, nontrivial variability in favorability ratings across studies exists; this is likely because of the variety of biodata measures that can be developed. This highlights a consistent theme in the research literature: Biodata measures tend to be viewed more favorably when they are perceived as relevant to the job at hand and part of a fair personnel selection system.
- Dean, M. A., & Russell, C. J. (2005). An examination of biodata theory-based constructs in a field context. International Journal of Selection and Assessment, 2, 139-149.
- Mael, F. A. (1991). A conceptual rationale for the domain and attributes of biodata items. Personnel Psychology, 44, 763-792.
- Mount, M. K., Witt, L. A., & Barrick, M. R. (2000). Incremental validity of empirically keyed biodata scales over GMA and the five factor personality constructs. Personnel Psychology, 53, 299-323.
- Oswald, F. L., Schmitt, N., Ramsay, L. J., & Gillespie, M. A. (2004). Developing a biodata measure and situational judgment inventory as predictors of college performance. Journal of Applied Psychology, 89, 187-207.
- Ouellette, J. A., & Wood, W. (1998). Habit and intention in everyday life: The multiple processes by which past behavior predicts future behavior. Psychological Bulletin, 124, 54-74.
- Reiter-Palmon, R., & Connelly, M. S. (2000). Item selection counts: A comparison of empirical key and rational scale validities in theory-based and non-theory-based item pools. Journal of Applied Psychology, 85, 143-151.