‘What qualities does a good selection technique have?’

A Scientific Approach to Predicting Job Performance

Using reliable and valid assessment techniques is essential in predicting future job performance and development needs.  Assessment techniques also serve as a public-relations exercise, selling the organisation to candidates, and demonstrate adherence to employment legislation through the use of defensible methods. 

To fulfil the basic functions of a selection technique, methods used must have sound psychometric properties, in terms of both reliability (‘the extent to which a measure is free from error’) and validity (‘the extent to which a measure is measuring that which it purports to measure’). 

The reliability of an assessment procedure needs to be examined in terms of three main factors: temporal stability (would we get the same score over time?), internal consistency (would we get the same score across all parts of the technique intended to measure that part of the domain?) and inter-rater reliability (would different judges reach the same conclusions?).
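Internal consistency, for instance, is commonly indexed by Cronbach's alpha. The sketch below (in Python, with invented item scores purely for illustration) shows how alpha is computed from a candidates-by-items matrix of scores:

```python
# Cronbach's alpha: a standard index of internal consistency.
# Illustrative sketch only -- the item scores below are invented.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: candidates x items matrix of scores on one attribute."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of candidates' totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Five candidates answering four items intended to tap the same attribute.
scores = np.array([
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
])
print(round(cronbach_alpha(scores), 2))  # -> 0.96, a highly consistent scale
```

Because the four hypothetical items rank the candidates almost identically, alpha is high; items measuring unrelated things would drive it towards zero.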

The validity of a selection technique is assessed using four main criteria.  Face validity asks whether the test ‘looks’ like it measures what it claims to.  Content validity asks whether the test covers everything it needs to assess.  Criterion validity combines concurrent validity (does the test produce a similar score to an existing established measure?) and predictive validity (does the test predict some future measure of performance?).  Finally, construct validity examines the validity of the theoretical construct on which the selection technique is based.

Smith and Robertson (1993) discussed the major sequence of events involved in the design and validation of a personnel selection system.  The traditional system begins with a detailed analysis of the job.  The psychological attributes required by an individual who will fit the position effectively are derived from that job analysis.  Personnel selection methods are then designed to allow selectors to attract candidates and evaluate their capabilities on these attributes.  Finally, a validation process assesses the extent to which the selection methods provide valid predictors of job performance or of other criterion variables such as absenteeism or turnover. 

The most significant change within personnel selection research has been the increased confidence in the validity of most selection methods arising from meta-analytic studies (Hunter and Schmidt, 1990).  Meta-analyses of a wide variety of selection methods have indicated that when the artefactual effects of sampling error, range restriction and measurement unreliability are removed, the ‘true’ validity of personnel selection methods is much higher than originally believed.  
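The corrections involved are standard psychometric formulas. As an illustration only (the figures below are hypothetical and not drawn from any study cited here), an observed validity can be corrected for range restriction in the predictor and for unreliability in the criterion measure:

```python
# Sketch of the psychometric corrections behind meta-analytic
# 'true' validity estimates. All numbers are illustrative.
import math

def correct_for_criterion_unreliability(r_obs: float, r_yy: float) -> float:
    """Disattenuate an observed validity for unreliability in the
    criterion (e.g. supervisor ratings of job performance)."""
    return r_obs / math.sqrt(r_yy)

def correct_for_range_restriction(r: float, u: float) -> float:
    """Thorndike Case II correction. u = restricted SD / unrestricted SD
    of the predictor (u < 1 when only high scorers were hired)."""
    return (r / u) / math.sqrt(1 + r**2 * (1 / u**2 - 1))

r_obs = 0.25   # hypothetical validity observed among job incumbents
r_true = correct_for_criterion_unreliability(
    correct_for_range_restriction(r_obs, u=0.7), r_yy=0.52)
print(round(r_true, 2))  # -> 0.48: nearly double the observed coefficient
```

Under these assumed values, a modest observed coefficient of 0.25 nearly doubles once the two artefacts are removed, which is the pattern the meta-analytic literature reports.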

Schmidt and Hunter (1998) reviewed 17 methods of selection to show their validity where job performance ratings were used as the criterion.  Several of the methods (cognitive ability tests, personality questionnaires, interviews, assessment centres and biodata) have been found to show reasonable validity.  However, a major area of concern for researchers and practitioners has been the fairness and adverse impact of selection methods.  

Adverse impact occurs when members of one sub-group are selected disproportionately more or less often than members of another sub-group.  Cognitive ability tests create the greatest adverse-impact problems, even when combined with methods that show lower adverse impact (Bobko, Roth and Potosky, 1999).  Selection methods that do not show adverse impact, for example personality questionnaires, are being more widely used (Shackleton and Newell, 1997).  Other methods such as biodata, which show minimal adverse impact and good levels of validity, continue to be used only in a relatively small capacity (Bliesener, 1996). 

The main techniques that are critically evaluated for their feasibility as valid and reliable selection methods are cognitive ability tests, personality questionnaires, interviews, assessment centres and biodata.

 

Cognitive ability, i.e. the concept of intelligence, has been one of the most relied-upon means of discriminating between candidates and predicting subsequent job performance.  Studies performed during the 1980s (Schmidt and Hunter, 1988) produced clear findings indicating the validity (r = 0.55) of cognitive ability and the extent of its fairness when testing people from different ethnic groups.  Carroll’s (1993) three-stratum structure of cognitive ability indicates an Upper Stratum (general mental ability), a Middle Stratum (for example broad cognitive speediness, broad retrieval ability or crystallised intelligence), and a Lower Stratum (numerous specialised abilities). 

Cognitive ability has been shown to provide criterion-related validity across the majority of occupational areas, and to predict work performance with similar accuracy across ethnic groups.  However, this does not mean that cognitive tests should be used for all selection purposes.  It has already been noted that adverse impact poses problems, given that some members of minority groups obtain lower scores on such tests.  

There is thus no simple solution for those designing such selection tests.  Further findings have indicated that the Upper Stratum, the core dimension of cognitive ability, is the key component in predicting subsequent job performance; the use of specific abilities does not enhance the predictions of the Upper Stratum itself.  The factors in the Lower Stratum nevertheless still underlie the majority of cognitive ability tests today.  One area of interest related to cognitive ability concerns the development of ‘practical intelligence’ (Sternberg and Wagner, 1986, 1995).  

Practical intelligence, the set of abilities people use in everyday life to attain their goals, can be distinguished from the kind of intelligence that lies behind success in academic pursuits.  However interesting these ideas may be, there is little evidence that practical intelligence is better at predicting future job performance, or that it is significantly different from general mental ability.  Few published articles with reasonably sized samples have investigated the criterion-related validity of tacit knowledge, and where this has been done (e.g. Evans and Rotman, 1997, cited in Salgado, 1999), the results show that the validity is modest and provides little gain beyond what is already obtainable from general mental ability.  

Emotional intelligence relates to the way people perceive, understand and manage emotion (Goleman, 1996).  However, although this concept is related to practical intelligence, studies have failed to demonstrate the criterion-related validity of emotional intelligence for any specific occupational area.

 

Personality can be defined as ‘those relatively stable and enduring aspects of an individual which distinguish them from other people, making them unique, but which at the same time permit a comparison between individuals’.  Until quite recently, personality inventories were not a very popular selection technique.  Guion and Gottier (1965) concluded that it was impossible to review the criterion-related validity of personality because too few studies were available in the literature.  However, the 1990s saw a huge growth in the use of personality assessment within personnel selection practice, and in research studies designed to evaluate and explore the role of personality within personnel selection (e.g. Barrick and Mount, 1991).  

Barrick and Mount (1991) investigated the relation of the ‘Big Five’ personality dimensions to job proficiency, training proficiency and personnel data.  The studies, all of which adopted a meta-analytic procedure, provide positive evidence for the criterion-related validity of personality, and have given researchers confidence that personality can play a role in effective personnel selection.  

Several interesting questions have been posed: the level of analysis to be used when applying personality to personnel selection and assessment (e.g. the Big Five level); the extent to which integrity acts as a single best predictor within personality, much as general mental ability does in the cognitive ability domain; and the incremental validity that personality provides over more established selection methods such as general mental ability.  

The research on the level of analysis best used for personality assessment is, in many ways, directly related to the extent to which broad factors such as conscientiousness provide most of the essential predictive power of personality.  Ones and Viswesvaran (1996) maintain that broad measures such as the Big Five or similar frameworks provide the best level of analysis for personality assessment, whereas others suggest using more specific personality characteristics.  Ones suggests that, for overall job performance, broader measures such as conscientiousness and integrity provide greater validity coefficients.  

However, other researchers, such as Robertson (1994), argue that broad measures such as conscientiousness do not provide good measures of personality.  Much research into faking when personality is used as a selection procedure suggests there is little cause for concern.  There is evidence that participants distort their responses (Hough, 1998), but it has more recently been found that motivational distortion, self-deception and impression management have little effect on validity (Barrick and Mount, 1996), and that intentional distortion can be reduced if participants are warned before the test is administered. 

 

Interviews have been researched considerably as a selection technique and remain the most widely used selection device.  The predictive validity of interviews can be improved by altering their structure: typical validity coefficients reported by Salgado (1999) are 0.56 for a highly structured interview and 0.20 for an unstructured interview.  The two main ways of structuring an interview are situational interviews and behavioural description interviews; situational interviews obtain higher validities than the latter (0.50 v 0.39).  Salgado also found that the concurrent validity of interviews is often higher than their predictive validity.

Unlike cognitive ability tests and personality tests, interviews tend to assess many different candidate attributes rather than specific constructs.  Much recent work has therefore focused on the construct validity of interviews, attempting to determine what interviews actually measure: for example, cognitive factors, job knowledge, or the less structured aspects of social skill and personality.  

Data derived from correlations across various studies suggest that interviews primarily measure social skills, experience and job knowledge.  General mental ability has only a moderate correlation with interview performance, and the contribution of conscientiousness seems quite small.  Extraversion and emotional stability appear to make small but notable contributions, while agreeableness and openness to experience also seem to contribute only a little.  Schuler developed a multi-modal interview that is divided into four parts: self-presentation, vocational questions, biographical questions and situational questions.  

A study using 306 participants showed that self-presentation and situational questions correlate highly with social skills.  Gibb et al. (2000) compared telephone and face-to-face interviews of 70 applicants, and found face-to-face interviews more successful.  This is possibly because telephone interviewing focuses only on the verbal aspects of behaviour, whereas face-to-face interviews give credit for non-verbal responses. 

Interviews tend to have lower reliabilities than is usually accepted for devices used for individual prediction, and unreliability therefore remains a serious source of attenuation for any validity coefficients that might be found.  Nevertheless, the interview remains the most widely practised method of selection, and it has been suggested that its stubborn refusal to go away makes it an increasingly interesting phenomenon to investigate.

 

Assessment centres have been found to hold good criterion validity and also to have low adverse impact (Hough and Oswald, 2000).  However, even though their criterion validity is high, there has been some concern about which constructs they measure.  Factor-analytic studies have shown that the key factors that emerge relate strongly to the exercises performed, rather than to the dimensions or psychological constructs that are meant to be assessed.  

Hough and Oswald (2000) have suggested several features that may improve the ratings and psychometric quality of assessment centres, for example using several psychology-trained assessors, having only a few conceptually distinct constructs, or using cross-exercise assessment.  Scholz and Schuler (1993) conducted a meta-analysis of assessment centre data to explore the key constructs measured by the overall assessment rating.  They found the overall assessment rating to be highly correlated with general intelligence (0.43), achievement motivation (0.40), social competence (0.41), self-confidence (0.32) and dominance (0.30).  

These results suggest that the primary construct measured within assessment centres relates to mental ability (Smith and Chung, 1998).  However, concerns have been raised about the extent to which assessment centres provide utility in the personnel selection process.  Assessment centres are clearly expensive and require constant monitoring; if their predictive ability is no better than that of cheaper methods such as paper-based psychometric testing, their cost needs serious consideration.  

Another questionable attribute of assessment centres is whether they can be used to indicate a candidate’s strengths and weaknesses and so provide a basis for further development.  Concern over the construct validity of the dimensions assessed raises questions over the validity and reliability of assessments of specific competencies derived from assessment-centre scores.

 

Biodata can be defined as ‘autobiographical data: objective or scoreable items of information provided by an individual about previous experience (demographic, experiential, attitudinal) which can be presumed or demonstrated to be related to personality structure, personal adjustment or success in social, educational or occupational pursuits’.  Biodata are usually collected via an application form comprising multiple-choice items that assess experiences and attitudes in ‘hard’ and ‘soft’ terms.  

Hard items relate to historical, verifiable information, whereas soft items are usually more abstract, covering for example attitudes, aspirations, motivations and expectations.  Although soft items are easier to fake, they allow constructs such as conscientiousness or assertiveness to be measured.  Although biodata are used significantly less than traditional selection methods such as the interview, they have attracted a great deal of research attention.  

Salgado (1999) suggested that biodata have considerable and generalisable criterion validity, and that their construct validity is well established.  In an authoritative meta-analysis, Bliesener (1996) found the validity of biodata scales to be 0.30, though several factors appear to moderate this finding: concurrent validity studies yielded a higher figure of 0.35, and the criterion used also had a significant effect, with studies using objective criteria obtaining validities of 0.53.  Mount, Witt and Barrick (2000) showed that empirically keyed biodata scales had incremental validity over a combination of tests of general mental ability and personality.  

Biodata have been applied to a wide range of occupations, such as clerical jobs in the private sector, along with attempts to predict job interests (Wilkinson, 1997) and the ability to deal with people from a wide range of backgrounds and ethnic groups (Douthitt, Eby and Simon, 1999).  However, research has placed little emphasis on whether items should be ‘hard’ or ‘soft’ biodata.  Biodata questionnaires were originally designed to measure success in a job with one score for overall suitability.  

Studies now use biodata to produce scores on dimensions that can be combined to make a prediction.  A wide range of items can be measured on biodata scales, for example money management, interest in home repairs or emotional stability.

The whole purpose of a personnel selection process is to identify the candidates who are most, or least, suited to the occupational area applied for.  Currently, and historically, the major quality standard for personnel selection methods has been criterion-related validity.  There are, however, difficulties in interpreting the evidence concerning the criterion-related validity of selection methods, and the contribution of meta-analytic studies to the literature should therefore not be dismissed. 

Construct validity also needs to be considered when discussing the validity evidence for different personnel selection methods.  Only two of the selection methods are directly associated with the specific constructs they measure: mental ability testing and personality testing.  Other selection methods tend to be defined by the procedures adopted, not by the specific constructs they measure.  

The International Personnel Management Association specifies ten elements that identify an assessment centre, for example job analysis, multiple assessment techniques and multiple assessors.  None of these elements relates to constructs; they concern only the procedures and structures of the assessment process.  Thus, in comparing validities for assessment centres, structured interviews, cognitive ability tests and personality tests, we are not comparing similar approaches.  More research needs to be done into the constructs being measured by each selection technique; without it, comparative evaluation of validity is almost meaningless.  

A key question for researchers evaluating personnel selection methods concerns the extent to which different methods provide unique and overlapping information about candidates’ likely performance.  Personnel selection researchers have been actively exploring the incremental validity of different techniques.  Cortina et al. (2000) used a meta-analytic study to assess the relationship between cognitive ability, conscientiousness and interviews.  

These results were combined with criterion-related validity results from previous meta-analyses to estimate the extent to which each individual method provided unique validity.  The results suggested that interview scores (particularly from highly structured interviews) helped to predict job performance beyond the information provided by cognitive and personality test scores.  In time, researchers hope to identify which combination of selection procedures provides the optimal selection process.
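Incremental validity of this kind is typically assessed by comparing the variance in performance explained with and without the additional predictor. The sketch below uses simulated data (all coefficients invented purely for illustration, not taken from Cortina et al.) to show the logic of such a comparison:

```python
# Illustration of an incremental-validity check: does adding interview
# scores improve prediction of job performance beyond cognitive ability
# and conscientiousness? All data are simulated.
import numpy as np

rng = np.random.default_rng(0)
n = 500
cognitive = rng.normal(size=n)
conscientiousness = rng.normal(size=n)
interview = 0.4 * cognitive + rng.normal(size=n)  # interviews partly tap ability
performance = (0.5 * cognitive + 0.3 * conscientiousness
               + 0.3 * interview + rng.normal(size=n))

def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """Proportion of criterion variance explained by a linear model."""
    X = np.column_stack([np.ones(len(y)), X])    # add intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

base = r_squared(np.column_stack([cognitive, conscientiousness]), performance)
full = r_squared(np.column_stack([cognitive, conscientiousness, interview]),
                 performance)
print(f"R2 base={base:.2f}, full={full:.2f}, increment={full - base:.2f}")
```

Because the simulated interview scores carry information not already captured by the two test scores, the full model explains more variance; the size of that increment is the quantity at issue in the incremental-validity literature.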

References:

Barrick, M.R. and Mount, M.K. (1991) The Big Five personality dimensions and job performance: a meta-analysis. Personnel Psychology, 44, 1-26.

Barrick, M.R. and Mount, M.K. (1996) Effects of impression management and self-deception on the predictive validity of personality constructs. Journal of Applied Psychology, 81, 261-272.

Bliesener, T. (1996) Methodological moderators in validating biographical data in personnel selection. Journal of Occupational and Organisational Psychology, 69, 107-120.

Bobko, P., Roth, P.L. and Potosky, D. (1999) Derivation and implications of a meta-analytic matrix incorporating cognitive ability, alternative predictors, and job performance. Personnel Psychology, 52, 1-31.

Carroll, J.B. (1993) Human Cognitive Abilities: a survey of factor-analytic studies. Cambridge: Cambridge University Press.

Cortina, J.M., Goldstein, N.B., Payne, S.C., Davidson, H.K. and Gilland, S.W. (2000).  The Incremental Validity of interview scores over and above cognitive ability and conscientiousness scores. Personnel Psychology, 53, 325-351.

Douthitt, S.S., Eby, L.T. and Simon, S.A. (1999) Diversity of life experiences: the development of biographical measures of receptiveness to dissimilar others. International Journal of Selection and Assessment, 7, 112-125.

Goleman, D. (1996) Emotional Intelligence. London: Bloomsbury.

Guion, R.M. and Gottier, R.F. (1965) Validity of Personality Measures in personnel selection.  Personnel Psychology, 18, 135-164.

Hough, L.M. (1998) Effects of intentional distortion in personality measurement and evaluation of suggested palliatives. Human Performance, 11, 209-244.

Hough, L.M. and Oswald F.L. (2000) Personnel Selection: looking toward the future – remembering the past. Annual Review of Psychology, 51, 631-644.

Hunter, J.E. and Schmidt, F.L. (1990) Methods of meta-analysis: correcting error and bias in research findings. Newbury Park, CA: Sage.

Mount, M.K., Witt, L.A. and Barrick, M.R.  (2000) Incremental Validity of empirically keyed Biodata scales over GMA and the five factor personality constructs. Personnel Psychology, 53, 299-323.

Ones, D.S. and Viswesvaran, C. (1996) What do pre-employment customer service scales measure? Explorations in construct validity and implications for personnel selection. Presented at the Annual Meeting of the Society for Industrial and Organizational Psychology, San Diego, CA.

Salgado, J.F. (1999) Personnel selection methods. In C.L. Cooper and I.T. Robertson (Eds), International Review of Industrial and Organizational Psychology. New York: Wiley.

Scholz, G. and Schuler, H. (1993) The nomological network of the assessment centre: a meta-analysis. Zeitschrift für Arbeits- und Organisationspsychologie, 37, 73-85.

Shackleton, V. and Newell, S. (1997) International assessment and selection. In N. Anderson and P. Herriot (Eds), International Handbook of Selection and Assessment. Chichester, UK: Wiley.

Smith, M. and Robertson, I.T. (1993) Systematic Personnel Selection. London: Macmillan.

Wilkinson, L.J. (1997) Generalisable Biodata?  An application to the vocational interests of managers.  Journal of Occupational and Organizational Psychology, 70, 49-60.