Current Human Capital Management

Current Human Capital Management: Predictive Analysis

Prior to beginning work on this discussion, research two articles on human capital predictive analysis and review this week’s required and recommended articles.

Using the two articles you researched on human capital predictive analysis, as well as any of this week’s required or recommended articles, discuss how predictive analysis is being used to help make human resource decisions. Additionally, address how, as managers, you might use predictive analysis to create strategic global competitiveness from a company’s human assets. Be sure to give specific company examples to support your discussion and position on the topic.

Your initial response should be a minimum of 220 words. Support your response with at least one scholarly resource in addition to the text.

Cited Source: Cascio, W. F., & Aguinis, H. (2019). Applied psychology in talent management (8th ed.). Retrieved from https://www.vitalsource.com

 

Aggarwal, Varun (CTO, Aspiring Minds). (2015, December 24). Data science and predictive analytics enabling better hiring mechanisms for enterprises. Dataquest (Gurgaon).


ABSTRACT

According to Deloitte research, companies that incorporate analytics in HR achieve two to three times better results in quality of hire, leadership development, and employee turnover.

FULL TEXT

It is essential to leverage the predictive power of data science to bring in structured information and enable the building of an effective recruitment mechanism for the future.

The cost an organization bears for a bad hire goes beyond the monetary: time spent in recruiting and training, loss of productivity, and impaired employee morale are some of the aftermaths of a bad decision. At present, the HR function is adversely affected by a lack of structured information which, if analyzed, could provide key insights into the system. While effective recruitment is at the core of HR’s responsibilities and may seem intrinsic, hiring the right candidate for the right job in a cost- and time-effective manner is one of the key challenges faced by modern-day HR.

Moreover, more than a million candidates enter the workforce every year, further complicating the task of talent filtering.

Recruiters typically receive a résumé that contains a lot of information but no clear statement of skills or competencies. Candidates are then scrutinized by a couple of line managers who may or may not be trained in interviewing and who may form an opinion based solely on their interaction. In this entire process, decision making becomes quite subjective. The question is: how can we bring objectivity to this process?

BENEFITS ASSOCIATED WITH DATA SCIENCE

With the growing amount of data around us, data science and predictive analytics can revolutionize hiring mechanisms to make hiring more objective and democratic. The problem here is simple: there is supply (a pool of job seekers) and demand (a pool of jobs), but there is no matchmaking of job seekers with the right jobs. The answer lies in building a conscious movement toward a culture of data science. Data science, in simple terms, is an inference science that helps us make objective decisions and also allows us to know how effective those decisions are likely to be.

Many organizations have only just begun to invest in analytics, but given the benefits associated with data-driven decisions, the time is not far off when data science will seep into every aspect of recruitment. To streamline hiring, an organization can extract data for all the employees it hired in the previous year and quantify parameters such as educational qualifications, experience, test scores, and skills to predict which of those hires have been successful and which have not.
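As a minimal sketch of how such a prediction could be set up in practice (assuming Python with pandas and scikit-learn; the file name, column names, and feature set below are hypothetical, not taken from the article):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical export of last year's hires, one row per employee.
hires = pd.read_csv("last_year_hires.csv")

# Quantified basis parameters (illustrative, numerically coded columns).
features = ["education_level", "years_experience", "test_score", "skill_rating"]
X = hires[features]
y = hires["successful"]  # 1 = hire judged successful, 0 = not

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# How well does the model separate successful from unsuccessful hires?
print(roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

A model fitted this way can then score incoming applicants, producing the "scientifically shortlisted pool" the next paragraph refers to.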

The insights collected from such analytics will allow organizations to move beyond subjective hiring and recruit the right talent from a more scientifically shortlisted pool. In a recent case, Xerox was able to reduce attrition in its call centers by using algorithm-driven recruitment techniques. The insights showed that employees without any call center experience were just as successful as those who had experience, and that creative people were more likely to stay longer. This allowed the company to widen its hiring pool, improve the quality of hires, and cut attrition by 20%.

THE RIGHT RECRUITMENT STANDARDS

Gathering vast amounts of data can help recruiters identify the right talent by classifying information into trends and narrowing down the talent pool. This also helps save the cost, time, and resources that would otherwise be spent in the recruitment process. However, many organizations still rely on conventional hiring methods that are largely unscalable, which leads them to lose out on the right talent. With data science, enterprises can significantly improve hiring output and efficiency, using the numbers to hire the right talent with the least effort.

Data science coupled with objective scores becomes a very powerful tool for companies to determine the right recruitment standards. In a recent case, a Fortune 500 company wanted to establish hiring criteria based on objective measures. It worked with an assessment partner to conduct a job analysis and identify the knowledge, skills, abilities, and other prerequisites for the profile. Based on the analysis, the company hired through a set of skill-based pre-employment tests. In the following year, the company found that the percentage of high performers among new hires had increased from 23% to 39%, a roughly 70% increase in high performers, alongside a 65% reduction in low performers.
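The 70% figure is simply the relative change implied by moving from 23% to 39% high performers:

\[ \frac{0.39 - 0.23}{0.23} \approx 0.70 \]

that is, roughly a 70% relative increase in the proportion of high performers among new hires.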

Recruitment is not the only aspect of HR that can benefit from data analytics; it can also help enterprises deal better with retention once a candidate is on the job. Usually, a higher salary package is offered to retain an employee, which may only be a short-term solution. According to Deloitte research, companies that incorporate analytics in HR achieve two to three times better results in quality of hire, leadership development, and employee turnover.

Studying resignation patterns, common features of exiting employees, and job satisfaction levels, among other information, can yield insights that help HR adopt the right approach with employees and cut the attrition rate, which is critical in today’s increasingly competitive landscape. Predictive analytics can help employers understand the workforce in the same way that marketing teams analyze data to understand and predict customer behaviour. These issues were previously unquantifiable, but modern data science methods are changing the way organizations can benefit from this data.
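One hypothetical illustration of studying resignation patterns with ordinary tooling (Python/pandas; the data extract and column names are invented for this sketch):

import pandas as pd

# Hypothetical HRIS extract: one row per current or former employee.
staff = pd.read_csv("employee_history.csv")

# Compare exiting and staying employees on tenure, satisfaction, and pay growth.
print(staff.groupby("has_resigned")[["tenure_months", "satisfaction_score",
                                     "pay_growth_pct"]].mean())

# Resignation rate by department shows where retention efforts matter most.
print(staff.groupby("department")["has_resigned"].mean().sort_values(ascending=False))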

Human resources is a relatively new domain for this technological invasion, but structuring this vast data can benefit not just organizations but the employment ecosystem as a whole. Investing in resources and knowledge should be a priority for enterprises, as the value-driven insights will help address real business problems.

Copyright 2015 Cyber Media (India) Ltd., distributed by Contify.com

Credit: Varun Aggarwal, CTO, Aspiring Minds

DETAILS

Subject: Recruitment; Skills; Science; Decision making

Identifier / keyword: top stories aspiring minds data science features

Publication title: Dataquest; Gurgaon

Publication year: 2015

Publication date: Dec 24, 2015

Publisher: Athena Information Solutions Pvt. Ltd.

 

 

 


Place of publication: Gurgaon

Country of publication: India, Gurgaon

Publication subject: Computers–Data Base Management, Computers

ISSN: 0970034X

Source type: Trade Journals

Language of publication: English

Document type: News

ProQuest document ID: 1751474203

Document URL: https://search.proquest.com/trade-journals/data-science-predictive-analytics-enabling-better/docview/1751474203/se-2?accountid=32521

Copyright: Copyright 2015 Cyber Media (India) Ltd., distributed by Contify.com

Last updated: 2015-12-24

Database: ProQuest Central

 

  • Data science and predictive analytics enabling better hiring mechanisms for enterprises

Cited: Cascio, W. F., & Aguinis, H. (2019). Applied psychology in talent management (8th ed.). Retrieved from https://www.vitalsource.com

 

Chapter 12 Selection Methods

 

Personal History Data

 

Selection and placement decisions often begin with an examination of personal history data (i.e., biodata) typically found in application forms, biographical inventories, and résumés. Undoubtedly one of the most widely used selection procedures is the application form. Like tests, application forms can be used to sample past or present behavior briefly but reliably. Studies of the application forms used by 200 organizations indicated that questions generally focused on information that was job related and necessary for the employment decision (Lowell & DeLoach, 1982; Miller, 1980). However, over 95% of the applications included one or more legally indefensible questions. To avoid potential problems, consider omitting any question that

 

Might lead to an adverse impact on members of protected groups,

Does not appear job related or related to a bona fide occupational qualification, or

Might constitute an invasion of privacy (Miller, 1980).

 

What can applicants do when confronted by a question that they believe is irrelevant or an invasion of privacy? Some may choose not to respond. However, research indicates that employers tend to view such a nonresponse as an attempt to conceal facts that would reflect poorly on an applicant. Hence, applicants (especially those who have nothing to hide) are ill advised not to respond (Stone & Stone, 1987).

 

Psychometric principles can be used to quantify responses or observations, and the resulting numbers can be subjected to reliability and validity analyses in the same manner as scores collected using other types of measures. Statistical analyses of such group data are extremely useful in specifying the personal characteristics indicative of later job success.

 

Opinions vary regarding exactly what items should be classified as biographical, since such items may vary along a number of dimensions—for example, verifiable–unverifiable; historical–futuristic; actual behavior–hypothetical behavior; firsthand–secondhand; external–internal; specific–general; and invasive–noninvasive (see Table 12.1). This is further complicated by the fact that “contemporary biodata questions are now often indistinguishable from personality items in content, response format, and scoring” (Schmitt & Kunce, 2002, p. 570). Nevertheless, the core attribute of biodata items is that they pertain to historical events that may have shaped a person’s behavior and identity (Mael, 1991).

 

Some observers have advocated that only historical and verifiable experiences, events, or situations be classified as biographical items. Using this approach, most items on an application form would be considered biographical (e.g., rank in high school graduating class, work history). By contrast, if only historical, verifiable items are included, then questions such as the following would not be asked: “Did you ever build a model airplane that flew?” Cureton (see Henry, 1965, p. 113) commented that this single item, although it cannot easily be verified for an individual, was almost as good a predictor of success in flight training during World War II as the entire Air Force Battery.

Weighted Application Blanks

 

A priori one might suspect that certain aspects of an individual’s total background (e.g., years of education, previous experience) should be related to later job success in a specific position. The weighted application blank (WAB) technique provides a means of identifying which of these aspects reliably distinguish groups of effective and ineffective employees. Weights are assigned in accordance with the predictive power of each item, so that a total score can be derived for each individual. A cutoff score then can be established, which, if used in selection, will eliminate the maximum number of potentially unsuccessful candidates. Hence, one use of the WAB technique is as a rapid screening device, but it may also be used in combination with other data to improve selection and placement decisions. The technique is appropriate in any organization having a relatively large number of employees doing similar kinds of work and for whom adequate records are available. It is particularly valuable for use with positions requiring long and costly training, with positions where turnover is abnormally high, or in employment situations where large numbers of applicants are seeking a few positions (England, 1971).
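A minimal sketch of the WAB idea, assuming a simple difference-in-proportions weighting scheme and invented item and file names (actual WAB development uses formal weighting tables and, as the next paragraph notes, cross-validation of the weights):

import pandas as pd

# Hypothetical records of past employees: application item responses plus a
# criterion flag (1 = effective, 0 = ineffective).
past = pd.read_csv("past_employees.csv")
items = ["education", "prior_experience", "referral_source"]

weights = {}
for item in items:
    # Proportion of effective vs. ineffective employees giving each response option.
    pct = (past.groupby("effective")[item]
               .value_counts(normalize=True)
               .unstack(fill_value=0))
    # Weight each option by how strongly it separates the two groups.
    weights[item] = pct.loc[1] - pct.loc[0]

def wab_score(applicant):
    # Total score = sum of option weights for the applicant's responses.
    return sum(weights[item].get(applicant[item], 0.0) for item in items)

# An illustrative cutoff (in practice, set and checked on a separate holdout sample).
applicant = {"education": "bachelors", "prior_experience": "2-5 yrs",
             "referral_source": "employee_referral"}
print(wab_score(applicant) >= 0.15)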

 

Weighting procedures are simple and straightforward (Owens, 1976), but, once weights have been developed in this manner, it is essential that they be cross-validated. Since WAB procedures represent raw empiricism in the extreme, many of the observed differences in weights may reflect not true differences, but only chance fluctuations.

 

Biographical Information Blanks

 

The biographical information blank (BIB) technique is closely related to the WAB technique. Like WABs, BIBs involve a self-report instrument; although items are exclusively in a multiple-choice format, typically a larger sample of items is included, and frequently items are included that are not normally covered in a WAB. Glennon, Albright, and Owens (1966) and Mitchell (1994) have published comprehensive catalogs of life history items covering various aspects of the applicant’s past (e.g., early life experiences, hobbies, health, social relations), as well as present values, attitudes, interests, opinions, and preferences. Although primary emphasis is on past behavior as a predictor of future behavior, BIBs frequently rely also on present behavior to predict future behavior. Usually BIBs are developed specifically to predict success in a particular type of work. One of the reasons they are so successful is that often they contain all the elements of consequence to the criterion (Asher, 1972). The mechanics of BIB development and item weighting are essentially the same as those used for WABs (Mumford & Owens, 1987; Mumford & Stokes, 1992).

Résumés

 

Résumés are a source of personal history data in most employee selection situations. Although résumés are now usually submitted electronically, as far back as 1975, the estimate was that about 1 billion paper résumés were screened each year (Brown & Campion, 1994). When examiners extract personal history data from a résumé, they are particularly prone to cognitive biases and heuristics because information is often limited to one or two pages. Specifically, applicants are likely to be placed into stereotype-based categories in a rather automatic fashion, and then attributes believed to be typical of the group are assigned to individual applicants—even if those beliefs are factually incorrect. Many so-called “paper people” or “vignette” studies (Aguinis & Bradley, 2014) have been conducted in which résumés of hypothetical applicants are presented to judges, who have to provide ratings regarding each applicant’s job suitability (Derous, Ryan, & Serlie, 2015).

 

Social categorization can take place on more than one category. For example, Derous et al. (2015) conducted an experiment in which 60 Dutch recruiters rated the job suitability of applicants whose résumés included information on ethnicity (Dutch, Arab) and gender (female, male). Results showed that ratings were influenced by applicants’ ethnicity (i.e., Arabs were rated more negatively) and gender (i.e., men were rated more negatively), raters’ prejudice (i.e., those with more negative attitudes toward a particular group rated members of those groups more negatively), and job characteristics (i.e., results were more pronounced when jobs included more client contact).

 

A recent innovation is the use of video résumés, which are recorded video and audio messages in which job applicants can present themselves to potential employers. Video résumés allow applicants to express themselves in a way that is not possible using the more traditional paper format. It is also possible to create multimedia résumés, in which job applicants also include animations and text (Hiemstra, Derous, Serlie, & Born, 2012). Not much research is yet available on video résumés; however, Hiemstra et al. (2012) conducted a study involving 445 unemployed job seekers who had received a two-day job-application training in the Netherlands and found that they perceived video résumés to be more fair compared to traditional paper résumés regardless of applicant ethnicity (i.e., Dutch, Turkish, Moroccan, Surinamese/Antillean, other non-Westerners, and other Western applicants).

 

Overall, given the many factors that influence raters’ evaluation of personal history data based on résumé screening, it is important to (a) train raters to make sure they focus on job-related factors and (b) assess interrater reliability (Brown & Campion, 1994). Another important concern is the extent to which applicants may distort the information they provide, hoping to increase their chances of receiving a job offer. We discuss this topic later in the chapter.

Credit History

 

The big data movement has provided organizations with personal history data that were unthinkable just a few years ago. For example, a survey of members of the Society for Human Resources Management revealed that about 50% of employers conduct credit background checks on at least some applicants (Bernerth, 2012). One type of personal history data, credit scores, seem to be an objective indicator of a job applicant’s conscientiousness and even integrity—two clearly desirable KSAs for many jobs. If an applicant fails to keep a promise to his or her financial institution, this may be an indicator that he or she will similarly fail to keep a promise at work. Also, perhaps individuals who are under financial duress may be more prone to engaging in counterproductive behaviors at work (e.g., theft) (Bernerth, 2012).

 

Using credit background checks for employment purposes is legally permissible in the United States under the Fair Credit Reporting Act if applicants provide written authorization (Bernerth, 2012). However, some states, including California, Colorado, Connecticut, Delaware, Hawaii, Illinois, Maryland, Nevada, Oregon, Vermont, and Washington, as well as Washington, D.C., have restricted the use of credit histories of applicants and employees. For example, Colorado’s Employment Opportunity Act (SB13-018) prohibits an employer’s use of consumer credit information for employment purposes if the information is unrelated to the job. Moreover, it requires an employer to disclose to an employee or applicant if the employer uses consumer credit information to take adverse action against the employee or applicant and the particular credit information upon which the employer relied. It also authorizes an aggrieved employee to sue for an injunction, damages, or both.

 

The regulations in these jurisdictions seem justified, given evidence that credit scores are related to several demographic variables that in many cases are unrelated to job performance. For example, Bernerth (2012) collected Fair Isaac Corporation (FICO) scores for 112 university employees and alumni and conducted a regression analysis using credit scores as the criterion and the following demographic variables as predictors: minority status (nonminority, minority); gender (male, female); marital status (never been divorced, divorced); educational attainment (high school degree/GED, some college, 2-year college degree, 4-year college degree, some graduate or professional education, graduate degree); and age. The five predictors combined accounted for 34% of variance in credit scores and the predictors (a) minority status (minority status associated with lower scores), (b) educational attainment (less education associated with lower scores), and (c) age (younger applicants received lower scores) had the strongest effects. Although educational attainment may be a job-related factor for some occupations and positions, the strong relation between ethnicity and credit scores guarantees that the use of this particular type of personal history data will result in adverse impact. In addition to legal issues, the use of credit scores has ethical connotations. Specifically, “critics of credit scores contend that using such information to make hiring decisions unfairly disadvantages individuals with low scores and traps them in a ‘vicious downward spiral’ where unemployment damages personal credit which, in turn, can hurt their job prospects” (Bernerth, 2012, p. 245). As is the case for all types of predictors, validity information is required—and this is particularly important in the presence of adverse impact.
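A sketch of the kind of analysis Bernerth (2012) reports, regressing credit scores on demographic predictors (Python/statsmodels; the data file and variable codings are hypothetical, and this is not the study’s actual code):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical sample with FICO scores and the five demographic predictors.
df = pd.read_csv("credit_sample.csv")

model = smf.ols(
    "fico_score ~ C(minority_status) + C(gender) + C(marital_status)"
    " + education_level + age",
    data=df).fit()

print(model.rsquared)  # Bernerth reported the predictors jointly explaining ~34% of variance
print(model.params)    # sign and size of each predictor's effect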

Response Distortion in Personal History Data

 

Can job applicants intentionally distort personal history data? The answer is yes. For example, the “sweetening” of résumés is not uncommon, and one study reported that 20—25% of all résumés and job applications include at least one major fabrication (LoPresto, Mitcham, & Ripley, 1986). The extent of self-reported distortion was found to be even higher when data were collected using the randomized-response technique, which absolutely guarantees response anonymity and thereby allows for more honest self-reports (Donovan, Dwight, & Hurtz, 2003).

 

A study in which participants were instructed to “answer questions in such a way as to make you look as good an applicant as possible” and to “answer questions as honestly as possible” resulted in scores almost two standard deviations higher for the “fake good” condition (McFarland & Ryan, 2000). In fact, the difference between the “fake good” and the “honest” experimental conditions was larger for a biodata inventory than for other measures including personality traits such as extraversion, openness to experience, and agreeableness. In addition, individuals differed in the extent to which they were able to fake (as measured by the difference between individuals’ scores in the “fake good” and “honest” conditions). So, if they want to, individuals can distort their responses, but some people are more able than others to do so.
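The “standard deviation” differences reported in such faking studies are standardized mean differences of the form

\[ d = \frac{\bar{X}_{\text{fake good}} - \bar{X}_{\text{honest}}}{SD_{\text{pooled}}} \]

so a value near 2 means the average biodata score under faking instructions sat about two pooled standard deviations above the average score under honest instructions.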

 

Fortunately, there are situational characteristics that an examiner can influence, which may make it less likely that job applicants will distort personal history information. The first such characteristic is the extent to which information can be verified. More objective and verifiable items are less amenable to distortion (Kluger & Colella, 1993). The concern with being caught seems to be an effective deterrent to faking. Second, option-keyed items are less amenable to distortion (Kluger, Reilly, & Russell, 1991). With this strategy, each item-response option (alternative) is analyzed separately and contributes to the score only if it correlates significantly with the criterion. Third, distortion is less likely if applicants are warned of the presence of a lie scale (Kluger & Colella, 1993) and if biodata are used in a non-evaluative, classification context (Fleishman, 1988). A fourth approach involves asking job applicants to elaborate on their answers. These elaborations require job applicants to describe more fully the manner in which their responses are true or to describe incidents to illustrate and support their answers (Schmitt & Kunce, 2002). For example, for the question “How many work groups have you led in the past 5 years?” the elaboration request can be “Briefly describe the work groups and projects you led” (Schmitt & Kunce, 2002, p. 586). The rationale for this approach is that requiring elaboration forces the applicant to remember more accurately and to minimize managing a favorable impression. The use of the elaboration approach led to a reduction in scores of about .6 standard deviation units in a study including 311 examinees taking a pilot form of a selection instrument for a federal civil service job (Schmitt & Kunce, 2002). Similarly, a study including more than 600 undergraduate students showed that those in the elaboration condition provided responses much lower than those in the non-elaboration condition (Schmitt, Oswald, Kim, Gillespie, & Ramsay, 2003).

Validity of Personal History Data

 

Properly cross-validated biodata have been developed for many occupations, including life insurance agents; law enforcement officers; service station managers; sales clerks; unskilled, clerical, office, production, and management employees; engineers; architects; research scientists; and Army officers. Criteria include turnover (by far the most common), absenteeism, rate of salary increase, performance ratings, number of publications, success in training, creativity ratings, sales volume, and employee theft.

 

Evidence indicates that the validity of personal history data as a predictor of future work behavior is quite good. For example, Reilly and Chao (1982) reviewed 58 studies that used biographical information as a predictor. Over all criteria and over all occupations, the average validity was .35. A subsequent meta-analysis of 44 such studies revealed an average validity of .37 (Hunter & Hunter, 1984). A later meta-analysis that included results from eight studies of salespeople’s performance that used supervisory ratings as the criterion found a mean validity coefficient (corrected for criterion unreliability) of .33 (Vinchur, Schippmann, Switzer, & Roth, 1998).
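Here, “corrected for criterion unreliability” refers to the standard correction for attenuation in the criterion,

\[ r_{\text{corrected}} = \frac{r_{\text{observed}}}{\sqrt{r_{yy}}} \]

where r_yy is the reliability of the criterion measure (supervisory ratings, in the Vinchur et al. meta-analysis).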

 

As a specific illustration of the predictive power of these types of data, consider a study that used a concurrent validity design including more than 300 employees in a clerical job. A rationally selected, empirically keyed, and cross-validated biodata inventory accounted for incremental variance in the criteria over that accounted for by measures of personality and general cognitive abilities (Mount, Witt, & Barrick, 2000). Specifically, biodata accounted for about 6% of incremental variance for quantity and quality of work, about 7% for interpersonal relationships, and about 9% for retention. As a result, we now have empirical support for the following statement by Owens (1976) from more than four decades ago:

Personal history data also broaden our understanding of what does and does not contribute to effective job performance. An examination of discriminating item responses can tell a great deal about what kinds of employees remain on a job and what kinds do not, what kinds sell much insurance and what kinds sell little, or what kinds are promoted slowly and what kinds are promoted rapidly. Insights obtained in this fashion may serve anyone from the initial interviewer to the manager who formulates employment policy. (p. 612)
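Incremental variance of the kind reported by Mount, Witt, and Barrick (2000) is typically estimated with hierarchical regression: the predictors already in use (cognitive ability, personality) are entered first, biodata is added second, and the change in R-squared is examined. A minimal sketch, with invented column and file names:

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical concurrent-validation data set for a clerical job.
d = pd.read_csv("clerical_validation.csv")

base = smf.ols("work_quality ~ gma + conscientiousness + extraversion", data=d).fit()
full = smf.ols("work_quality ~ gma + conscientiousness + extraversion + biodata_score",
               data=d).fit()

# Incremental variance explained by biodata beyond ability and personality.
print(full.rsquared - base.rsquared)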

 

A caution is in order, however. Commonly, biodata keys are developed on samples of job incumbents, and it is assumed that the results generalize to applicants. However, a large-scale field study that used more than 2,200 incumbents and 2,700 applicants found that 20% or fewer of the items that were valid in the incumbent sample were also valid in the applicant sample. Clearly motivation and job experience differ in the two samples. The implication: Match incumbent and applicant samples as closely as possible, and do not assume that predictive and concurrent validities are similar for the derivation and validation of BIB scoring keys (Stokes, Hogan, & Snell, 1993).

Bias and Adverse Impact

 

Since the passage of Title VII of the 1964 Civil Rights Act, personal history items have come under intense legal scrutiny. While not unfairly discriminatory per se, such items legitimately may be included in the selection process only if it can be shown that (a) they are job related and (b) they do not unfairly discriminate against either minority or nonminority subgroups.

 

In one study, Cascio (1976b) reported cross-validated validity coefficients of .58 (minorities) and .56 (nonminorities) for female clerical employees against a tenure criterion. When separate expectancy charts were constructed for the two groups, no significant differences in WAB scores for minorities and nonminorities on either predictor or criterion measures were found. Hence, the same scoring key could be used for both groups.

 

Results from several studies have concluded that biodata inventories are relatively free of adverse impact, particularly when items do not reflect cognitive abilities (Breaugh, 2009). However, a meta-analysis by Bobko and Roth (2013) emphasized that most results are based on concurrent validity designs using incumbent samples, which likely decrease observed ethnicity-based differences. They estimated that the black—white mean standardized difference is d = .31, which was based on biodata that included a large number of KSAs.

 

Unfortunately, other than the degree of cognitive abilities saturation, when differences exist, we often do not know why. This reinforces the idea of using a rational (as opposed to an entirely empirical) approach to developing biodata inventories, because it has the greatest potential for allowing us to understand the underlying constructs, how they relate to criteria of interest, and how to minimize between-group score differences. As noted by Stokes and Searcy (1999):

 

With increasing evidence that one does not necessarily sacrifice validity to use more rational procedures in development and scoring biodata forms, and with concerns for legal issues on the rise, the push for rational methods of developing and scoring biodata forms is likely to become more pronounced. (p. 84)

 

What Do Biodata Mean?

 

Criterion-related validity is not the only consideration in establishing job relatedness. Items that bear no rational relationship to the job in question (e.g., “applicant does not wear eyeglasses” as a predictor of theft) are unlikely to be acceptable to courts or regulatory agencies, especially if total scores produce adverse impact on a protected group. Nevertheless, external or empirical keying is the most popular scoring procedure and consists of focusing on the prediction of an external criterion using keying procedures at either the item or the item-option level (Stokes & Searcy, 1999). As defined by Mael (1991), “[T]he core attribute of biodata items is that the items pertain to historical events that may have shaped the person’s behavior and identity” (p. 763). Accordingly, as shown in Table 12.1, items measure behavioral intentions, self-descriptions of personality traits, and personal interests, among other constructs. Note, however, that biodata inventories resulting from a purely empirical approach do not help us understand what constructs are measured.

 

More prudent and reasonable is the rational approach, including job analysis information to deduce hypotheses concerning success on the job under study and to seek from existing, previously researched sources either items or factors that address these hypotheses (Stokes & Cooper, 2001). Essentially, we are asking the following questions: “What do biodata mean?” “Why do past behaviors and performance or life events predict non-identical future behaviors and performance?” (Breaugh, 2009; Dean & Russell, 2005). Thus, in a study of recruiters’ interpretations of biodata items from résumés and application forms, Brown and Campion (1994) found that recruiters deduced language and math abilities from education-related items, physical ability from sports-related items, and leadership and interpersonal attributes from items that reflected previous experience in positions of authority and participation in activities of a social nature. Nearly all items were thought to tell something about a candidate’s motivation. The next step is to identify hypotheses about the relationship of such abilities or attributes to success on the job in question. This rational approach has the advantage of enhancing both the utility of selection procedures and our understanding of how and why they work (cf. Mael & Ashforth, 1995). Moreover, it is probably the only legally defensible approach for the use of personal history data in employment selection.

 

The rational approach to developing biodata inventories has proven fruitful beyond employment testing contexts. For example, Douthitt, Eby, and Simon (1999) used this approach to develop a biodata inventory to assess people’s degree of receptiveness to dissimilar others (i.e., general openness to dissimilar others). As an illustration, for the item “How extensively have you traveled?” the rationale is that travel provides for direct exposure to dissimilar others and those who have traveled to more distant areas have been exposed to more differences than those who have not. Other items include “How racially (ethnically) integrated was your high school?” and “As a child, how often did your parent(s) (guardian(s)) encourage you to explore new situations or discover new experiences for yourself?” Results of a study including undergraduate students indicated that the rational approach paid off because there was strong preliminary evidence in support of the scale’s reliability and validity. However, even if the rational approach is used, the validity of biodata items can be affected by the life stage in which the item is anchored (Dean & Russell, 2005). In other words, framing an item around a specific, hypothesized developmental time (i.e., childhood versus past few years) is likely to help applicants provide more accurate responses by giving them a specific context to which to relate their response.

Recommendations and Reference Checks

 

Another source of personal history data is information provided by others in the form of recommendations and reference checks. Many prospective users ask a practical question: “Are recommendations and reference checks worth the amount of time and money it costs to process and consider them?” In general, four kinds of information are obtainable: (1) employment and educational history (including confirmation of degree and class standing or grade point average); (2) evaluation of the applicant’s character, personality, and interpersonal competence; (3) evaluation of the applicant’s job performance ability; and (4) willingness to rehire.

 

For a recommendation to make a meaningful contribution to the screening and selection process, however, certain preconditions must be satisfied. The recommender must have had an adequate opportunity to observe the applicant in job-relevant situations, he or she must be competent to make such evaluations, he or she must be willing to be open and candid, and the evaluations must be expressed so that the potential employer can interpret them in the manner intended (McCormick & Ilgen, 1985). Although the value of recommendations can be impaired by deficiencies in any one or more of the four preconditions, unwillingness to be candid is probably the most serious. However, to the extent that the truth of any unfavorable information cannot be demonstrated and it harms the reputation of the individual in question, providers of references may be guilty of defamation in their written (libel) or oral (slander) communications (Ryan & Lasek, 1991).

 

Written recommendations are considered by some to be of little value. For example, consider the opinions based on a survey of about 600 HR professionals with titles such as recruiting manager, employment lawyer, personnel consultant, and human resources specialist (Nicklin & Roch, 2009). About 80% of respondents agreed with the statement that “letter inflation is a problem that will never be entirely alleviated.” To a large extent, this opinion is justified, since the available evidence indicates that the average validity of recommendations is .14 (Reilly & Chao, 1982). A meta-analysis focused exclusively on academic performance found similar results: the average observed correlation with GPA in medical school was .13 (N = 916) and the correlation with clinical and internship performance was .12 (N = 1,120). The average correlation with GPA in college seems higher, r = .28 (N = 5,155) (Kuncel, Kochevar, & Ones, 2014). But, meta-regression analysis (Gonzalez-Mulé & Aguinis, in press) showed that letters of recommendation contributed only .003 additional proportion of variance to the prediction of grade point average in graduate school and only .011 to the prediction of faculty ratings of performance above and beyond undergraduate GPA and verbal and quantitative GRE exam scores. Results were slightly more encouraging regarding the proportion of additional variance explained in the prediction of degree attainment: .024.

 

One of the biggest problems, and possibly the main reason for their overall lack of value-added predictive power, is that such recommendations rarely include unfavorable information and, therefore, do not discriminate among candidates. In addition, the affective disposition of letter writers has an impact on letter length, which, in turn, has an impact on the favorability of the letter (Judge & Higgins, 1998). In many cases, therefore, the letter may be providing more information about the person who wrote it than about the person described in the letter.

 

The fact is that decisions are made on the basis of letters of recommendation, particularly in academic settings (Nicklin & Roch, 2009). If such letters are to be meaningful, they should contain the following information (Knouse, 1987):

Degree of writer familiarity with the candidate: This should include time known and time observed per week.

Degree of writer familiarity with the job in question: To help the writer make this judgment, the person soliciting the recommendation should supply to the writer a description of the job in question.

Specific examples of performance: This should cover such aspects as goals, task difficulty, work environment, and extent of cooperation from coworkers.

Individuals or groups to whom the candidate is compared.

 

Unfortunately, many employers believe that reference checks are not permissible under the law. This is not true (Hedricks, Robie, & Oswald, 2013). In fact, employers may do the following: seek information about applicants, interpret and use that information during selection, and share the results of reference checking with another employer (Sewell, 1981). In fact, employers may be found guilty of negligent hiring if they should have known at the time of hire about the unfitness of an applicant (e.g., prior job-related convictions, propensity for violence) that subsequently causes harm to an individual (Gregory, 1988; Ryan & Lasek, 1991). In other words, failure to check closely enough could lead to legal liability for an employer.

 

Reference checking is a valuable screening tool (see Box 12.1). An average validity of .26 was found in a meta-analysis of reference-checking studies (Hunter & Hunter, 1984). To be most useful, however, reference checks should be

 

Consistent: If an item is grounds for denial of a job to one person, it should be the same for any other person who applies.

Relevant: Employers should stick to items of information that really distinguish effective from ineffective employees.

Written: Employers should keep written records of the information obtained to support the ultimate hiring decision made.

Based on public records: Such records include court records, workers’ compensation, and bankruptcy proceedings. (Ryan & Lasek, 1991; Sewell, 1981)

 

Reference checking can also be done via telephone interviews (Taylor, Pajo, Cheung, & Stringfield, 2004). Implementing a procedure labeled structured telephone reference check (STRC), a total of 448 telephone reference checks were conducted on 244 applicants for customer-contact jobs (about two referees per applicant) (Taylor et al., 2004). STRCs took place over an eight-month period; they were conducted by recruiters at one of six recruitment consulting firms, and they lasted on average 13 minutes. Questions focused on measuring three constructs: conscientiousness, agreeableness, and customer focus. Recruiters asked each referee to rate the applicant compared to others they have known in similar positions, using the following scale: 1 = below average, 2 = average, 3 = somewhat above average, 4 = well above average, and 5 = outstanding. Note that the scale used is a relative, versus absolute, rating scale so as to minimize leniency in ratings. As an additional way to minimize leniency, referees were asked to elaborate on their responses. As a result of the selection process, 191 of the 244 applicants were hired, and data were available regarding the performance of 109 of these employees (i.e., those who were still employed at the end of the first performance appraisal cycle). A multiple-regression model predicting supervisory ratings of overall performance based on the three dimensions assessed by the STRC resulted in R2 = .28, but customer focus was the only one of the three dimensions that predicted supervisory ratings (i.e., standardized regression coefficient of .28).
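A sketch of how the Taylor et al. (2004) analysis could be reproduced in outline (Python/pandas/statsmodels; the file and column names are invented for illustration):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical STRC data: one row per referee response (about two per applicant),
# with 1-5 relative ratings on the three constructs.
refs = pd.read_csv("strc_responses.csv")
per_applicant = refs.groupby("applicant_id")[
    ["conscientiousness", "agreeableness", "customer_focus"]].mean()

# Supervisory ratings from the first appraisal cycle for applicants who were hired.
perf = pd.read_csv("first_appraisal.csv").set_index("applicant_id")
data = per_applicant.join(perf, how="inner")

model = smf.ols(
    "overall_performance ~ conscientiousness + agreeableness + customer_focus",
    data=data).fit()
print(model.rsquared)  # Taylor et al. reported R-squared of .28 for this model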

In closing, few organizations are willing to abandon altogether the practice of recommendation and reference checking, despite all the shortcomings. One need only listen to a grateful manager thanking the HR department for the good reference checking that “saved” him or her from making a bad offer to understand why. Also, from a practical standpoint, a key issue to consider is the extent to which the constructs assessed by recommendations and reference checks provide unique information above and beyond other data collection methods that we describe later in this chapter (e.g., employment interview).

Polygraph Tests

 

Polygraph instruments are intended to detect deception and are based on the measurement of physiological processes (e.g., heart rate) and changes in those processes. An examiner infers whether a person is telling the truth or lying based on charts of physiological measures in response to the questions posed and observations during the polygraph examination. Although they are often used for event-specific investigations (e.g., after a crime), they are also used (on a limited basis) for both employment and preemployment screening.

 

The use of polygraph tests has been severely restricted by a federal law passed in 1988. This law, the Employee Polygraph Protection Act, prohibits private employers (except firms providing security services and those manufacturing controlled substances) from requiring or requesting preemployment polygraph exams. Polygraph exams of current employees are permitted only under very restricted circumstances. Nevertheless, many agencies (e.g., U.S. Department of Energy) are using polygraph tests, given the security threats imposed by international terrorism.

 

Although much of the public debate over the polygraph as a lie detector focuses on ethical problems (Aguinis & Handelsman, 1997a, 1997b), at the heart of the controversy is validity—the relatively simple question of whether physiological measures can assess truthfulness and deception (Saxe, Dougherty, & Cross, 1985). An analysis of the scientific evidence on this issue is contained in a report by the National Research Council, which operates under a charter granted by the U.S. Congress. Its Committee to Review the Scientific Evidence on the Polygraph (2003) conducted a quantitative analysis of 57 independent studies investigating the accuracy of the polygraph and concluded the following:

 

Polygraph accuracy for screening purposes is almost certainly lower than what can be achieved by specific-incident polygraph tests.

The physiological indicators measured by the polygraph can be altered by conscious efforts through cognitive or physical means.

Using the polygraph for security screening yields an unacceptable choice between too many loyal employees falsely judged deceptive and too many major security threats left undetected.

 

In sum, as concluded by the committee, the polygraph’s “accuracy in distinguishing actual or potential security violators from innocent test takers is insufficient to justify reliance on its use in employee security screening in federal agencies” (p. 6). These conclusions are consistent with the views of scholars in relevant disciplines. Responses to a survey completed by members of the Society for Psychophysiological Research and Fellows of the American Psychological Association’s Division 1 (General Psychology) indicated that the use of polygraph testing is not theoretically sound, claims of high validity for these procedures cannot be sustained, and polygraph tests can be beaten by countermeasures (Iacono & Lykken, 1997).

 

In spite of the overall conclusion that polygraph testing is not very accurate, potential alternatives to the polygraph, such as measuring brain activity through electrical and imaging studies have not yet been shown to outperform the polygraph (Committee to Review the Scientific Evidence on the Polygraph, 2003). Such alternative techniques do not show any promise of supplanting the polygraph for screening purposes in the near future. Thus, although imperfect, it is likely that the polygraph will continue to be used for employee security screening until other alternatives become available.

Honesty Tests

 

Honesty testing is a multimillion-dollar industry, especially since the use of polygraphs in employment settings has been severely curtailed and “ban-the-box” laws in some states restrict employers from asking candidates about prior criminal convictions until later in the hiring process. Written honesty tests (also known as integrity tests) fall into two major categories: overt integrity tests and personality-based measures. Overt integrity tests typically include two types of questions. One assesses attitudes toward theft and other forms of dishonesty (e.g., endorsement of common rationalizations of theft and other forms of dishonesty, beliefs about the frequency and extent of employee theft, punitiveness toward theft, perceived ease of theft). The other deals with admissions of theft and other illegal activities (e.g., dollar amount stolen in the last year, drug use, gambling). Personality-based measures are not designed as measures of honesty per se, but rather as predictors of a wide variety of counterproductive behaviors, such as substance abuse, insubordination, absenteeism, bogus workers’ compensation claims, and various forms of passive aggression. Overall, personality-based measures assess broader dispositional traits such as socialization and conscientiousness. (Conscientiousness is one of the Big Five personality traits; we discuss these in more detail in Chapter 13.) In fact, despite the clear differences in content, both overt and personality-based tests seem to have a common latent structure reflecting conscientiousness, agreeableness, and emotional stability (Berry, Sackett, & Wiemann, 2007).

 

Do honesty tests work? Overall, the answer is yes, as several reviews have documented (Ones, Viswesvaran, & Schmidt, 1993; Van Iddekinge, Roth, Raymark, & Odle-Dusseau, 2012a). However, the precise extent to which such tests predict performance—and what specific facets of performance they predict—is less clear. Ones et al. (1993) conducted a meta-analysis of 665 validity coefficients that were based on 576,460 test takers. The average validity of the tests, when used to predict supervisory ratings of performance, was .41. Results for overt and personality-based tests were similar. However, the average validity of overt tests for predicting theft per se was much lower: .13. Van Iddekinge et al. (2012a) conducted a subsequent meta-analysis that relied on fewer studies (i.e., 104 studies representing 134 independent samples) because of “concerns centered around the perceived lack of methodological rigor within this literature and a heavy reliance on unpublished data from firms that publish the integrity tests (e.g., 90% of the studies in Ones et al.’s 1993 meta-analysis)” (Van Iddekinge, Roth, Raymark, & Odle-Dusseau, 2012b, p. 543). Van Iddekinge et al.’s (2012a) updated meta-analytic results revealed the following mean observed and corrected (for unreliability in the criterion) validity coefficients: .12 and .15 for job performance, .13 and .16 for training performance, .26 and .32 for counterproductive work behaviors, and .07 and .09 for turnover.

 

The Van Iddekinge et al. (2012a) results were controversial and led to a forceful reaction on the part of test vendors (Harris et al., 2012), who concluded, “In light of Van Iddekinge et al.’s substantially smaller sample of studies, and arguable methodological decisions, we are inclined to accord more weight to Ones et al.’s findings when there is a difference in conclusion” (p. 535). In their defense, regarding the number of studies included in their meta-analyses, Van Iddekinge et al. (2012b) wrote, “After several months of correspondence … we were informed that it was no longer possible to provide us access to the additional studies. Moreover, Jones informed us that Vangent’s corporate attorneys wanted us to know that we did not have permission to use several technical reports we had obtained from another researcher because the reports had not been released into the public domain (yet apparently were provided to Ones et al., 1993, and other researchers).” Clearly, this is not the end of the discussion regarding the relative validity of honesty tests. At this point, we do not know what factors caused the different results reported by Ones et al. (1993) compared to Van Iddekinge et al. (2012a), but it seems that different study-inclusion criteria, corrections for artifacts, and second-order sampling error are not the culprits and more details on meta-analytic procedures, including coding, are necessary to address this issue (Sackett & Schmitt, 2012).

 

Although honesty tests are overall good predictors of certain performance facets, at least four key issues have yet to be resolved. First, as in the case of biodata inventories, there is a need for a greater understanding of the construct validity of integrity tests given that integrity tests are not interchangeable (i.e., scores for the same individuals on different types of integrity tests are not necessarily similar). Some investigations have sought evidence regarding the relationship between integrity tests and some broad personality traits. But there is a need to understand the relationship between integrity tests and individual characteristics more directly related to integrity tests such as object beliefs, negative life themes, and power motives (Mumford, Connelly, Helton, Strange, & Osburn, 2001). Second, women tend to score approximately .16 standard deviation unit higher than men, and job applicants aged 40 years and older tend to score .08 standard deviation unit higher than applicants younger than 40 (Ones & Viswesvaran, 1998). At this point, we do not have a clear reason for these findings. Third, many writers in the field apply the same language and logic to integrity testing as to ability testing. Yet there is an important difference: While it is possible for an individual with poor moral behavior to “go straight,” it is certainly less likely that an individual who has demonstrated a lack of intelligence will “go smart.” If they are honest about their past, therefore, reformed individuals with a criminal past may be “locked into” low scores on integrity tests (and, therefore, be subject to classification error) (Lilienfeld, Alliger, & Mitchell, 1995). Thus, the broad validation evidence that is often acceptable for cognitive ability tests may not hold up in the public policy domain for integrity tests. Fourth, there is the real threat of intentional distortion (Alliger, Lilienfeld, & Mitchell, 1996). It is quite ironic that job applicants are likely to be dishonest in completing an honesty test. For example, as mentioned earlier, McFarland and Ryan (2000) found that, when study participants who were to complete an honesty test were instructed to “answer questions in such a way as to make you look as good an applicant as possible,” scores were 1.78 standard deviation units higher than when they were instructed to “answer questions as honestly as possible.” Finally, test publishers have an undeniable conflict of interest regarding research addressing the validity of their own tests, much like we described in Chapter 8 regarding the assessment of test fairness (i.e., differential prediction). At the same time, they have legitimate concerns that “[i]t would be helpful to publishers, as well as the field, if there were mechanisms to protect the interest of testing clients in the same manner as human subjects, and more opportunities to publish or distribute the many strong validity studies in publishers’ files” (Harris et al., 2012, p. 532).

 

Given the challenges and unresolved issues, researchers are exploring alternative ways to assess integrity and other personality-based constructs (e.g., Van Iddekinge, Raymark, & Roth, 2005). One promising approach is conditional reasoning testing (Frost, Chia-Huei, & James, 2007; James et al., 2005), which focuses on how people solve what appear to be traditional inductive-reasoning problems. However, the true intent of the scenarios presented is to determine respondents’ solutions based on their implicit biases and preferences. These underlying biases usually operate below the surface of consciousness and are revealed based on the respondents’ responses. Another promising approach is to assess integrity as part of a situational judgment test (discussed in detail in Chapter 13), in which applicants are given a scenario and are asked to choose a response that is most closely aligned with what they would do (Becker, 2005). Consider the following example of an item developed by Becker (2005):

 

Your work team is in a meeting discussing how to sell a new product. Everyone seems to agree that the product should be offered to customers within the month. Your boss is all for this, and you know he does not like public disagreements. However, you have concerns because a recent report from the research department points to several potential safety problems with the product. Which of the following do you think you would most likely do?

 

Possible answers:

 

A. Try to understand why everyone else wants to offer the product to customers this month. Maybe your concerns are misplaced. [–1]

B. Voice your concerns with the product and explain why you believe the safety issues need to be addressed. [1]

C. Go along with what others want to do so that everyone feels good about the team. [–1]

D. Afterwards, talk with several other members of the team to see if they share your concerns. [0]

 

The scoring for the above item is –1 for answers A and C (i.e., worst-possible score), 0 for answer D (i.e., neutral score), and +1 for answer B (i.e., best-possible score). One advantage of using scenario-based integrity tests is that they are intended to capture specific values rather than general integrity-related traits. Thus, these types of tests may be more defensible both scientifically and legally because they are based on a more precise definition of integrity, including specific types of behaviors. A study based on samples of fast-service employees (n = 81), production workers (n = 124), and engineers (n = 56) found that validity coefficients for the integrity test (corrected for criterion unreliability) were .26 for career potential, .18 for leadership, and .24 for in-role performance (all as assessed by managers’ ratings) (Becker, 2005).
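Scoring such scenario-based items programmatically is straightforward; a sketch in Python whose key mirrors the bracketed values above (the item name is invented):

# Keyed scores for scenario-based integrity items; the key below mirrors the
# Becker (2005)-style item shown above.
ITEM_KEYS = {
    "safety_meeting": {"A": -1, "B": 1, "C": -1, "D": 0},
    # ... one key per scenario in the test
}

def score_test(responses):
    # Sum the keyed values across all answered scenarios.
    return sum(ITEM_KEYS[item][answer] for item, answer in responses.items())

print(score_test({"safety_meeting": "B"}))  # prints 1, the best-possible score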

Evaluation of Training and Experience

 

Judgmental evaluations of the previous work experience and training of job applicants, as presented on résumés and job applications, are a common part of initial screening. Sometimes evaluation is purely subjective and informal, and sometimes it is accomplished in a formal manner according to a standardized method. Evaluating job experience is not as easy as one may think because experience includes both qualitative and quantitative components that interact and accrue over time (Aguinis, O’Boyle, Gonzalez-Mulé, & Joo, 2016); hence, work experience is multidimensional and temporally dynamic (Tesluk & Jacobs, 1998). However, using experience as a predictor of future performance can pay off. Specifically, a study including more than 800 U.S. Air Force enlisted personnel indicated that ability and experience seem to have linear and noninteractive effects (Lance & Bennett, 2000). Another study that also used military personnel showed that work experience items predict performance above and beyond cognitive abilities and personality (Jerry & Borman, 2002). These findings explain why the results of a survey of more than 200 staffing professionals of the National Association of Colleges and Employers revealed that experienced hires were evaluated more highly than new graduates on most characteristics (Rynes, Orlitzky, & Bretz, 1997).

 

An empirical comparison of four methods for evaluating work experience indicated that the “behavioral consistency” method showed the highest mean validity, at .45 (McDaniel, Schmidt, & Hunter, 1988). This method requires applicants to describe their major achievements in several job-related areas. These areas are behavioral dimensions rated by supervisors as showing maximal differences between superior and minimally acceptable performers. The applicants’ achievement statements are then evaluated using anchored rating scales. The anchors are achievement descriptors whose values along a behavioral dimension have been determined reliably by subject matter experts.

 

A similar approach to the evaluation of training and experience, one most appropriate for selecting professionals, is the accomplishment record (AR) method (Hough, 1984). A comment frequently heard from professionals is “My record speaks for itself.” The AR is an objective method for evaluating those records. It is a type of biodata/maximum performance/self-report instrument that appears to tap a component of an individual’s history that is not measured by typical biographical inventories. It correlates essentially zero with aptitude test scores, honors, grades, and prior activities and interests.

 

Development of the AR begins with the collection of critical incidents to identify important dimensions of job performance. Then rating principles and scales are developed for rating an individual’s set of job-relevant achievements. The method yields (a) complete definitions of the important dimensions of the job, (b) summary principles that highlight key characteristics to look for when determining the level of achievement demonstrated by an accomplishment, (c) examples of accomplishments that job experts agree represent various levels of achievement, and (d) numerical equivalents that allow the accomplishments to be translated into quantitative indexes of achievement. When the AR was applied in a sample of 329 attorneys, the reliability of the overall performance ratings was a respectable .82, and the AR demonstrated a validity of .25. Moreover, the method appears to be fair for females, minorities, and white males.

 

What about academic qualifications? They tend not to affect managers’ hiring recommendations, as compared to work experience, and they could even have a negative effect. For candidates with poor work experience, having higher academic qualifications seems to reduce their chances of being hired (Singer & Bruhns, 1991). These findings were supported by a national survey of 3,000 employers by the U.S. Census Bureau. The most important characteristics employers said they considered in hiring were attitude, communication skills, and previous work experience. The least important were academic performance (grades), school reputation, and teacher recommendations (Applebome, 1995). Moreover, when grades are used, they tend to have adverse impact on ethnic minority applicants (Roth & Bobko, 2000).

Drug Screening

 

Drug screening tests began in the military, spread to the sports world, and now are becoming common in employment (Aguinis & Henle, 2005). In fact, about 50% of employers use some type of drug screening for all of their job applicants in the United States (Lieberman, 2017). Critics charge that such screening violates an individual’s right to privacy and that the tests are frequently inaccurate (Morgan, 1989), for example, as a result of cheating (see Box 12.2). These critics do concede, however, that employees in jobs where public safety is crucial—such as nuclear power plant operators and commercial jet pilots—should be screened for drug use. In fact, perceptions of the extent to which different jobs might involve danger to the worker, to coworkers, or to the public are strongly related to the acceptability of drug testing (Murphy, Thornton, & Prue, 1991).

 

Do the results of such tests forecast certain aspects of later job performance? In perhaps the largest reported study of its kind, the U.S. Postal Service took urine samples from 5,465 job applicants. It never used the results to make hiring decisions and did not tell local managers of the findings. When the data were examined six months to a year later, workers who had tested positive prior to employment were absent 41% more often and were fired 38% more often. There were no differences in voluntary turnover between those who tested positive and those who did not. These results held up even after adjustment for factors such as age, gender, and race. As a result, the Postal Service implemented preemployment drug testing nationwide (Wessel, 1989).

 

Is such drug screening legal? In two rulings in 1989, the Supreme Court upheld (a) the constitutionality of the government regulations that require railroad crews involved in accidents to submit to prompt urinalysis and blood tests and (b) urine tests for U.S. Customs Service (now U.S. Customs and Border Protection) employees seeking drug-enforcement posts. Overall, an employer has a legal right to ensure that employees perform their jobs competently and that no employee endangers the safety of other workers. So, if illegal drug use, on or off the job, may reduce job performance and endanger coworkers, the employer has adequate legal grounds for conducting drug tests.

 

To avoid legal challenge, consider instituting the following procedures:

 

Inform all employees and job applicants, in writing, of the company’s policy regarding drug use.

Include the drug policy and the possibility of testing in all employment contracts.

Present the program in a medical and safety context—namely, that drug screening will help to improve the health of employees and also help to ensure a safer workplace.

 

If drug screening will be used with employees as well as job applicants, tell employees in advance that drug testing will be a routine part of their employment (Angarola, 1985).

 

To enhance perceptions of fairness, employers should provide advance notice of drug tests, preserve the right to appeal, emphasize that drug testing is a means to enhance workplace safety, attempt to minimize invasiveness, and train supervisors (Konovsky & Cropanzano, 1991; Tepper, 1994). In addition, employers must understand that perceptions of drug testing fairness are affected not only by the program’s characteristics but also by employee characteristics. For example, employees who have friends who have failed a drug test are less likely to have positive views of drug testing (Aguinis & Henle, 2005).

Computer-Based Screening

 

The rapid development of computer technology over the past few years has resulted in faster microprocessors and more flexible and powerful software that can incorporate graphics and sound. These technological advances now allow organizations to conduct computer-based screening (CBS). Using the Internet, companies can conduct CBS and administer job-application forms, structured interviews (discussed later in this chapter), and other types of tests globally, 24 hours a day, 7 days a week (Jones & Dages, 2003).

 

At its simplest, CBS merely converts a screening tool from paper to an electronic format, producing what is called an electronic page turner. These types of CBS are low on interactivity and do not take full advantage of technology (Olson-Buchanan, 2002). By contrast, Nike uses interactive voice-response technology to screen applicants over the telephone, the U.S. Air Force uses computer-adaptive testing (CAT) on a regular basis (Ree & Carretta, 1998), and other organizations such as Home Depot and JCPenney use a variety of technologies for screening (Chapman & Webster, 2003; Overton, Harms, Taylor, & Zickar, 1997). CAT presents all applicants with a set of items of average difficulty and, if responses are correct, items with higher levels of difficulty. If responses are incorrect, items with lower levels of difficulty are presented. CAT uses item response theory (see Chapter 6) to estimate an applicant’s level of the underlying trait based on the relative difficulty of the items answered correctly and incorrectly. The potential value added by using computers as screening devices is obvious when one considers that implementation of CAT would be nearly impossible using traditional paper-and-pencil instruments (Olson-Buchanan, 2002).
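The adaptive logic just described can be sketched in a few lines of code. What follows is an illustrative simplification, assuming a one-parameter (Rasch) IRT model, a small hypothetical item bank described only by difficulty, and a brute-force maximum-likelihood ability estimate; operational CAT systems use richer item parameters, exposure controls, and more sophisticated estimation and stopping rules.

```python
import math
import random

def p_correct(theta, b):
    """Rasch (1PL) probability of a correct response, given ability theta
    and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses):
    """Crude maximum-likelihood ability estimate over a coarse grid.

    responses: list of (item_difficulty, answered_correctly) pairs.
    """
    grid = [x / 10.0 for x in range(-40, 41)]  # theta from -4.0 to +4.0

    def log_likelihood(theta):
        total = 0.0
        for b, correct in responses:
            p = p_correct(theta, b)
            total += math.log(p if correct else 1.0 - p)
        return total

    return max(grid, key=log_likelihood)

def run_cat(item_bank, answer_fn, n_items=10):
    """Adaptive administration: start at average difficulty, then keep
    choosing the unused item whose difficulty is closest to the current
    ability estimate."""
    theta = 0.0
    responses = []
    remaining = list(item_bank)
    for _ in range(min(n_items, len(remaining))):
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        correct = answer_fn(item)          # administer item, observe response
        responses.append((item, correct))
        theta = estimate_theta(responses)  # update ability estimate
    return theta

# Simulate an applicant whose true ability is 1.0 on the same scale.
if __name__ == "__main__":
    random.seed(0)
    bank = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
    simulate = lambda b: random.random() < p_correct(1.0, b)
    print(round(run_cat(bank, simulate), 2))
```

The key idea matches the description above: every applicant starts near average difficulty, and each subsequent item is chosen to match the current ability estimate, so different applicants see different items.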

 

There are several potential advantages of using CBS (Kantrowitz et al., 2011; Olson-Buchanan, 2002). First, administration may be easier. For example, standardization is maximized because there are no human proctors who may give different instructions to different applicants (i.e., computers give instructions consistently to all applicants). Also, responses are recorded and stored automatically, which is a practical advantage, but can also help minimize data-entry errors. Second, applicants can access the test from remote locations, thereby increasing the applicant pool. Third, computers can accommodate applicants with disabilities in a number of ways, particularly since tests can be completed from their own (possibly modified) computers. A modified computer can caption audio-based items for applicants with hearing disabilities, or it can allow applicants with limited hand movement to complete a test. Finally, preliminary evidence suggests that Web-based assessment does not exacerbate adverse impact.

 

Despite the increasing availability and potential benefits of CBS, concerns about implementation include cost and potential cheating. Moreover, some testing experts believe that high-stakes tests, such as those used to make employment decisions, cannot be administered in unproctored Internet settings (Tippins et al., 2006). CAT is able to address some of these concerns because, in contrast to static testing (i.e., all applicants receive the same items), with CAT each applicant is administered a test including potentially different items, which addresses the cheating concern. Additional challenges in implementing CBS include the relative lack of access of low-income individuals to the Internet, or what is called the digital divide (Stanton & Rogelberg, 2001).

 

Olson-Buchanan (2002) concluded that innovations in CBS have not kept pace with the progress in computer technology. This disparity was attributed to three major factors: (1) costs associated with CBS development, (2) lag in scientific guidance for addressing reliability and validity issues raised by CBS, and (3) the concern that investment in CBS may not result in tangible payoffs.

 

Fortunately, many of the concerns are being addressed by ongoing research on the use, accuracy, equivalence, and efficiency of CBS. For example, Ployhart, Weekley, Holtz, and Kemp (2003) found that proctored, Web-based testing has several benefits compared to the more traditional paper-and-pencil administration. Their study included nearly 5,000 applicants for telephone-service-representative positions who completed, among other measures, a biodata instrument. Results indicated that scores resulting from the Web-based administration had similar or better psychometric characteristics, including distributional properties, lower means, more variance, and higher internal-consistency reliabilities. Another study examined reactions to CAT and found that applicants’ reactions are positively related to their perceived performance on the test (Tonidandel, Quiñones, & Adams, 2002). Thus, changes in the item-selection algorithm that result in a larger number of items answered correctly have the potential to improve applicants’ perceptions of CAT.

 

In sum, HR specialists now have the opportunity to implement CBS in their organizations. If implemented well, CBS carries numerous advantages. In fact, the use of computers and the Internet is making testing cheaper and faster, and it may serve as a catalyst for even more widespread use of tests for employment purposes (Tippins et al., 2006). However, the degree of success in implementing CBS will depend not only on the features of the test itself but also on organizational-level variables such as the culture and climate for technological innovation (Anderson, 2003).

Employment Interviews

 

Use of the interview in selection is almost universal today (Moscoso, 2000). Perhaps this is so because in the employment context the interview serves as much more than just a selection device. The interview is a communication process, whereby the applicant learns more about the job and the organization and begins to develop some realistic expectations about both.

 

When an applicant is accepted, terms of employment typically are negotiated during an interview. If the applicant is rejected, the interviewer performs an important public relations function, for it is essential that the rejected applicant leave with a favorable impression of the organization and its employees. For example, several studies found that perceptions of the interview process and the interpersonal skills of the interviewer, as well as his or her skills in listening, recruiting, and conveying information about the company and the job the applicant would hold, affected the applicant’s evaluations of the interviewer and the company (Kohn & Dipboye, 1998; Schmitt & Coyle, 1979). However, the likelihood of accepting a job, should one be offered, was still mostly unaffected by the interviewer’s behavior (Powell, 1991).

 

As a selection device, the interview performs two vital functions: It can fill information gaps in other selection devices (e.g., regarding incomplete or questionable application blank responses; Tucker & Rowe, 1977), and it can be used to assess factors that can be measured only via face-to-face interaction (e.g., appearance, speech, poise, and interpersonal competence). Is the applicant likely to “fit in” and share values with other organizational members (Cable & Judge, 1997)? Is the applicant likely to get along with others in the organization or be a source of conflict? Where can his or her talents be used most effectively? Interview impressions and perceptions can help to answer these kinds of questions. In fact, well-designed interviews can be helpful because they allow examiners to gather information on constructs not typically assessed via other means such as empathy (Cliffordson, 2002) and personal initiative (Fay & Frese, 2001). For example, a review of 388 characteristics rated in 47 actual interview studies revealed that personality traits (e.g., responsibility, dependability, and persistence, which are all related to conscientiousness) and applied social skills (e.g., interpersonal relations, social skills, team focus, ability to work with people) are rated more often in employment interviews than any other type of construct (Huffcutt, Conway, Roth, & Stone, 2001). In addition, interviews can contribute to the prediction of job performance over and above cognitive abilities and conscientiousness (Cortina, Goldstein, Payne, Davison, & Gilliland, 2000), as well as experience (Day & Carroll, 2002).

 

Since few employers are willing to hire applicants they have never seen, it is imperative that we do all we can to make the interview as effective a selection technique as possible. Next, we consider some of the research on interviewing and offer suggestions for improving the process and outcome.

Response Distortion in the Interview

 

Distortion of interview information is probable (Weiss & Dawis, 1960), the general tendency being to upgrade rather than downgrade prior work experience. That is, interviewees tend to be affected by social desirability bias, which is a tendency to answer questions in a more socially desirable direction (i.e., to attempt to look good in the eyes of the interviewer). In addition to distorting information, applicants tend to engage in influence tactics to create a positive impression, and they typically do so by displaying self-promotion behaviors (Stevens & Kristof, 1995).

 

But will social desirability distortion be reduced if the interviewer is a computer? According to Martin and Nagao (1989), candidates tend to report their grade point averages and scholastic aptitude test scores more accurately to computers than in face-to-face interviews. Perhaps this is due to the “big brother” effect. That is, because responses are on a computer rather than on paper, they may seem more subject to instant checking and verification through other computer databases. To avoid potential embarrassment, applicants may be more likely to provide truthful responses. However, Martin and Nagao’s study also placed an important boundary condition on computer interviews: There was much greater resentment by individuals competing for high-status positions than for low-status positions when they had to respond to a computer rather than a live interviewer.

 

A more comprehensive study was conducted by Richman, Kiesler, Weisband, and Drasgow (1999). They conducted a meta-analysis synthesizing 61 studies (673 effect sizes), comparing response distortion in computer questionnaires with traditional paper-and-pencil questionnaires and face-to-face interviews. Results revealed that computer-based interviews decreased social-desirability distortion compared to face-to-face interviews, particularly when the interviews addressed highly sensitive personal behavior (e.g., use of illegal drugs). Perhaps this is so because a computer-based interview is more impersonal, removing the interviewee from the direct observation of an interviewer and from the social cues that can arouse evaluation apprehension.

 

A more subtle way to distort the interview is to engage in impression-management behaviors (Lievens & Peeters, 2008; Roulin, Bangerter, & Levashina, 2015). For example, applicants who are pleasant and compliment the interviewer are more likely to receive positive evaluations. Two specific types of impression management, ingratiation and self-promotion, seem to be most effective in influencing interviewers’ ratings favorably (Higgins & Judge, 2004). A research program involving five different experiments using real-time video coding showed that interviewers are not able to detect impression management, although they were better at detecting honest impression management (i.e., truthfully describing actual job-related abilities, accomplishments, and experiences) than deceptive impression management (i.e., embellishing job-related credentials or creating credentials that fit with the job requirements) (Roulin et al., 2015). Moreover, experienced interviewers were no better than novices. Training can help improve this situation. Specifically, such training should emphasize that deception detection improves when the interviewer focuses on story-related cues (e.g., vagueness, contradictions) instead of nonverbal cues (e.g., gaze aversions, posture changes, fidgeting) (Roulin et al., 2015).

Reliability and Validity

 

An early meta-analysis of only 10 validity coefficients that were not corrected for range restriction yielded a validity of .14 when the interview was used to predict supervisory ratings (Hunter & Hunter, 1984). Subsequent meta-analyses that did correct for range restriction and used larger samples of studies reported more encouraging results. Wiesner and Cronshaw (1988) found a mean corrected validity of .47 across 150 interview validity studies involving all types of criteria. McDaniel, Whetzel, Schmidt, and Maurer (1994) analyzed 245 coefficients derived from 86,311 individuals and found a mean corrected validity of .37 for job performance criteria. However, validities were higher when criteria were collected for research purposes (.47) than for administrative decision making (.36). Marchese and Muchinsky (1993) reported a mean corrected validity of .38 across 31 studies. A fourth review (Huffcutt & Arthur, 1994) analyzed 114 interview validity coefficients from 84 published and unpublished references, exclusively involving entry-level jobs and supervisory rating criteria. When corrected for criterion unreliability and range restriction, the mean validity across all 114 studies was .37. Finally, Schmidt and Rader (1999) meta-analyzed 40 studies of structured telephone interviews and obtained a corrected validity coefficient of .40 using performance ratings as a criterion. The results of these studies agree quite closely.
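The corrections mentioned throughout these meta-analyses rest on standard psychometric formulas. The sketch below shows the two most common ones: correction for criterion unreliability and Thorndike’s Case II correction for direct range restriction. The numbers plugged in are hypothetical, not taken from any of the cited studies, and the order and details of the corrections vary across meta-analyses.

```python
import math

def correct_for_criterion_unreliability(r_xy, r_yy):
    """Disattenuate an observed validity for criterion unreliability:
    r_c = r_xy / sqrt(r_yy)."""
    return r_xy / math.sqrt(r_yy)

def correct_for_range_restriction(r_restricted, u):
    """Thorndike Case II correction for direct range restriction on the
    predictor, where u = SD(applicants) / SD(incumbents)."""
    r = r_restricted
    return (u * r) / math.sqrt(1.0 + r ** 2 * (u ** 2 - 1.0))

# Hypothetical numbers: observed validity .25, criterion reliability .60,
# and applicants 1.5 times as variable as the restricted incumbent sample.
r_observed = 0.25
r_step1 = correct_for_criterion_unreliability(r_observed, 0.60)
r_step2 = correct_for_range_restriction(r_step1, 1.5)
print(round(r_step1, 2), round(r_step2, 2))  # roughly 0.32 and 0.46
```

Either correction raises the estimate, which is part of why the corrected interview validities reported above (.37 to .47) are noticeably higher than the early uncorrected estimate of .14.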

 

Regarding reliability, a meta-analysis of 125 interrater reliability coefficients with a total sample size of 32,428 derived from employment interviews revealed an overall mean of .68 (Huffcutt, Culbertson, & Weyhrauch, 2013). However, the 80% credibility interval, meaning that 80% of the true population coefficients fall within this interval, ranged from .42 to .94. In fact, an analysis of the level of structure of the interviews revealed that reliability was lowest when structure was low (i.e., .36), and highest when structure was high (.76). Given our discussion in Chapter 7 regarding the relation between reliability and validity, the best way to improve validity is to improve the degree of structure of the interview (discussed later in this chapter).
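As a rough illustration of what that interval implies (an inference from the reported numbers rather than a figure taken from the study itself), an 80% credibility interval of this kind is typically constructed as the mean true-score coefficient plus or minus 1.28 standard deviations of the estimated true-score distribution, which implies a spread of roughly .20 here:

```latex
\bar{\rho} \pm 1.28\, SD_{\rho}
\quad\Longrightarrow\quad
SD_{\rho} \approx \frac{.94 - .68}{1.28} \approx .20
```

A spread that wide is consistent with the moderating role of structure noted above, with mean reliability of .36 at low structure versus .76 at high structure.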

 

As Hakel (1989) noted, interviewing is a difficult cognitive and social task. Managing a smooth social exchange while simultaneously processing information about an applicant makes interviewing uniquely difficult among all managerial tasks. Research continues to focus on cognitive factors (e.g., preinterview impressions) and social factors (e.g., interviewer–interviewee similarity). As a result, we now know a great deal more about what goes on in the interview and about how to improve the process. At the very least, we should expect interviewers to be able to form opinions only about traits and characteristics that are overtly manifest in the interview (or that can be inferred from the applicant’s behavior), and not about traits and characteristics that typically would become manifest only over a period of time—traits such as creativity, dependability, and honesty. In the following subsections, we examine what is known about the interview process and about ways to enhance the effectiveness and utility of the selection interview.

Factors Affecting the Decision-Making Process

 

A large body of literature attests to the fact that the decision-making process involved in the interview is affected by several factors. Specifically, 278 studies have examined numerous aspects of the interview (Posthuma, Morgeson, & Campion, 2002). Posthuma et al. (2002) provided a useful framework to summarize and describe this large body of research. We follow this taxonomy in part and consider factors affecting the interview decision-making process in each of the following areas: (a) social/interpersonal factors (e.g., interviewer–applicant similarity), (b) cognitive factors (e.g., preinterview impressions), (c) individual differences (e.g., applicant appearance, interviewer training and experience), and (d) structure (i.e., degree of standardization of the interview process and discretion an interviewer is allowed in conducting the interview).

Social/Interpersonal Factors

 

As noted earlier, the interview is fundamentally a social and interpersonal process. As such, it is subject to influences such as interviewer–applicant similarity and verbal and nonverbal cues. We describe each of these factors next.

Interviewer–Applicant Similarity.

 

Similarity leads to attraction, attraction leads to positive affect, and positive affect can lead to higher interview ratings (Schmitt, Pulakos, Nason, & Whitney, 1996). Moreover, similarity leads to greater expectations about future performance (García, Posthuma, & Colella, 2008). Does similarity between the interviewer and the interviewee regarding race, age, and attitudes affect the interview? Lin, Dobbins, and Farh (1992) reported that ratings of African American and Latino interviewees, but not white interviewees, were higher when the interviewer was the same race as the applicant. However, Lin et al. (1992) found that the inclusion of at least one different-race interviewer in a panel eliminated the effect, and no effect was found for age similarity. Further, when an interviewer feels that an interviewee shares his or her attitudes, ratings of competence and affect are increased (Howard & Ferris, 1996). The similarity effects are not large, however, and they can be reduced or eliminated by using a structured interview and a diverse set of interviewers.

Verbal and Nonverbal Cues.

 

In terms of verbal cues, Anderson (1960) found that the applicant was more likely to be hired in interviews where the interviewer did a lot more of the talking and there was less silence. Other research has shown that the length of the interview depends much more on the quality of the applicant (interviewers take more time to decide when dealing with a high-quality applicant) and on the expected length of the interview. The longer the expected length of the interview, the longer it takes to reach a decision (Tullar, Mullins, & Caldwell, 1979).

 

Several studies have also examined the impact of nonverbal cues on impression formation and decision making in the interview. Nonverbal cues have been shown to have an impact, albeit small, on interviewer judgments (DeGroot & Motowidlo, 1999). For example, Imada and Hakel (1977) found that positive nonverbal cues (e.g., smiling, attentive posture, smaller interpersonal distance) produced consistently favorable ratings. Most important, however, nonverbal behaviors interact with other variables such as gender. Aguinis, Simonsen, and Pierce (1998) found that a man displaying direct eye contact during an interview is rated as more credible than one who does not make direct eye contact. However, a follow-up replication using exactly the same experimental conditions revealed that a woman displaying identical direct eye contact behavior was seen as coercive (Aguinis & Henle, 2001a).

 

Overall, the ability of a candidate to respond concisely, to answer questions fully, to state personal opinions when relevant, and to keep to the subject at hand appears to be more crucial than nonverbal behavior in obtaining a favorable employment decision (Parsons & Liden, 1984; Rasmussen, 1984). High levels of nonverbal behavior tend to have more positive effects than low levels only when the verbal content of the interview is good. When verbal content is poor, high levels of nonverbal behavior may result in lower ratings.

Cognitive Factors

 

The interviewer’s task is not easy because humans are limited information processors and have biases in evaluating others (Kraiger & Aguinis, 2001). However, we have a good understanding of the impact of factors such as preinterview impressions and confirmatory bias, first impressions, stereotypes, contrast effect, and information recall. Let’s review major findings regarding the way in which each of these factors affects the interview.

Preinterview Impressions and Confirmatory Bias.

 

Dipboye (1982, 1992) specified a model of self-fulfilling prophecy to explain the impact of first preinterview impressions. Both cognitive and behavioral biases mediate the effects of preinterview impressions (based on letters of reference or applications) on the evaluations of applicants. Behavioral biases occur when interviewers behave in ways that confirm their preinterview impressions of applicants (e.g., showing positive or negative regard for applicants). Cognitive biases occur if interviewers distort information to support preinterview impressions or use selective attention and recall of information. This sequence of behavioral and cognitive biases produces a self-fulfilling prophecy.

 

Consider how one applicant was described by an interviewer given positive information:

 

Alert, enthusiastic, responsible, well-educated, intelligent, can express himself well, organized, well-rounded, can converse well, hard worker, reliable, fairly experienced, and generally capable of handling himself well.

 

On the basis of negative preinterview information, the same applicant was described as follows:

 

Nervous, quick to object to the interviewer’s assumptions, and doesn’t have enough self-confidence. (Dipboye, Stramler, & Fontanelle, 1984, p. 567)

 

Content coding of employment interviews found that favorable first impressions were followed by the use of confirmatory behavior—such as indicating positive regard for the applicant, “selling” the company, and providing job information to applicants—while gathering less information from them. For their part, applicants behaved more confidently and effectively and developed better rapport with interviewers (Dougherty, Turban, & Callender, 1994). These findings support the existence of the confirmatory bias produced by first impressions.

 

Another aspect of expectancies concerns test score or biodata score information available prior to the interview. A study of 577 candidates for the position of life insurance sales agent found that interview ratings predicted the hiring decision and survival on the job best for applicants with low passing scores on the biodata test and poorest for applicants with high passing scores (Dalessio & Silverhart, 1994). Apparently, interviewers had such faith in the validity of the test scores that, if an applicant scored well, they gave little weight to the interview. When the applicant scored poorly, however, they gave more weight to performance in the interview and made better distinctions among candidates.

First Impressions.

 

An early series of studies conducted at McGill University over a 10-year period (Webster, 1964, 1982) found that early interview impressions play a dominant role in final decisions (select/reject). These early impressions establish a bias in the interviewer (not usually reversed) that colors all subsequent interviewer–applicant interaction. Early impressions were crystallized after a mean interviewing time of only four minutes!

 

In addition, the interview is primarily a search for negative information. For example, just one unfavorable impression was followed by a reject decision 90% of the time. Positive information was given much less weight in the final decision (Bolster & Springbett, 1961).

 

Consider the effect of how the applicant shakes the interviewer’s hand (Stewart, Dustin, Barrick, & Darnold, 2008). A study using 98 undergraduate students found that quality of handshake was related to the interviewer’s hiring recommendation. It seems that quality of handshake conveys the positive impression that the applicant is extraverted, even when the candidate’s physical appearance and dress are held constant. Also, in this particular study women received lower ratings for the handshake compared with men, but they did not, on average, receive lower assessments of employment suitability.

Prototypes and Stereotypes.

 

Returning to the McGill studies, perhaps the most important finding was that interviewers tend to develop their own prototype of a good applicant and proceed to accept those who match their prototype (Rowe, 1963; Webster, 1964). Later research has supported these findings. To the extent that the interviewers hold negative stereotypes of a group of applicants, and these stereotypes deviate from the perception of what is needed for the job or translate into different expectations or standards of evaluation for minorities, stereotypes may have the effect of lowering interviewers’ evaluations, even when candidates are equally qualified for the job (Arvey, 1979).

 

Similar considerations apply to gender-based stereotypes. The social psychology literature on gender-based stereotypes indicates that the traits and attributes necessary for managerial success resemble the characteristics, attitudes, and temperaments of the masculine gender role more than the feminine gender role (Aguinis & Adams, 1998). The operation of such stereotypes may explain the conclusion by Arvey and Campion (1982) that female applicants receive lower scores than male applicants.

Contrast Effects.

 

Several studies have found that, if an interviewer evaluates a candidate who is just average after evaluating three or four very unfavorable candidates in a row, the average candidate tends to be evaluated favorably. When interviewers evaluate more than one candidate at a time, they tend to use other candidates as a standard. Whether they rate a candidate favorably, then, is determined partly by others against whom the candidate is compared (Hakel, Ohnesorge, & Dunnette, 1970; Heneman, Schwab, Huett, & Ford, 1975; Landy & Bates, 1973).

 

These effects are remarkably tenacious. Wexley, Sanders, and Yukl (1973) found that, despite attempts to reduce contrast effects by means of a warning (lecture) and/or an anchoring procedure (comparison of applicants to a preset standard), subjects continued to make this error. Only an intensive workshop (which combined practical observation and rating experience with immediate feedback) led to a significant behavior change. Similar results were reported in a later study by Latham, Wexley, and Pursell (1975). In contrast to subjects in group discussion or control groups, only those who participated in the intensive workshop did not commit contrast, halo, similarity, or first impression errors six months after training.

Information Recall.

 

A practical question concerns the ability of interviewers to recall what an applicant said during an interview. Here is how this question was examined in one study (Carlson, Thayer, Mayfield, & Peterson, 1971).

 

Prior to viewing a 20-minute videotaped selection interview, 40 managers were given an interview guide, pencils, and paper and were told to perform as if they were conducting the interview. Following the interview, the managers were given a 20-question test of factual information from the interview. Some managers missed none, while others missed as many as 15 of the 20 items; the average was 10 wrong.

 

After this short interview, half the managers could not report accurately on the information produced during the interview! By contrast, managers who had been following the interview guide and taking notes were quite accurate on the test. Those who were least accurate in their recollections assumed the interview was generally favorable and rated the candidate higher in all areas and with less variability. They adopted a halo strategy. Those managers who knew the facts rated the candidate lower and recognized intraindividual differences. Hence, the more accurate interviewers used an individual-differences strategy.

 

None of the managers in this study was given an opportunity to preview an application form prior to the interview. Would that have made a difference? Other research indicates that the answer is no (Dipboye, Fontanelle, & Garner, 1984). When it comes to recalling information after the interview, there seems to be no substitute for note taking during the interview. However, the act of note taking alone does not necessarily improve the validity of the interview; interviewers need to be trained on how to take notes regarding relevant behaviors (Burnett, Fan, Motowidlo, & DeGroot, 1998). Note taking helps information recall, but it does not in itself improve the judgments based on such information (Middendorf & Macan, 2002). In addition to note taking, other memory aids include mentally reconstructing the context of the interview and retrieving information from different starting points (Mantwill, Kohnken, & Aschermann, 1995).

Individual Differences

 

A number of individual-difference variables play a role in the interview process. These refer to characteristics of both the applicant and the interviewer. Let’s review applicant characteristics first, followed by interviewer characteristics.
