Personality Correlates of Assessment Center Consensus Competency Ratings: Evidence from Russia

Svetlana Simonenko
Detech Group, Moscow, Russia
George C. Thornton III
Colorado State University, Fort Collins, CO 80523, USA
Anna Kravtcova
Saratov State University, Saratov, Russia
Alyssa M. Gibbons
Colorado State University, Fort Collins, CO, USA
Controversy has revolved around whether assessment center ratings have construct validity to measure intended dimensions of managerial performance. In contrast to much recent research on the internal structure of assessment center ratings, the present studies investigated the relationship of final competency ratings derived by consensus discussion with external questionnaire measures of personality characteristics. Expanding on previous studies showing correlations of dimension scores in relation to individual trait measures, this study investigated the relationship of complex competencies with both single personality traits and with composites of personality traits. Evidence from two samples of managers in Russia shows that final competency ratings are related to predicted composites of personality factors more consistently than to single factors. Taken together, these findings provide evidence that assessment center ratings derived by consensus discussion show construct validity in relationship with predicted composites of personality characteristics.

1. Introduction

For the past 25 years, the only major criticism of the assessment center (AC) method has been that AC ratings do not demonstrate construct validity as measures of job-related performance dimensions. Whereas much of the criticism is based on analyses of relationships among ratings within the AC method, in the present studies we investigate the relationships of final AC competency ratings with external measures of theoretically related personality characteristics.

Most recent research on AC construct validity has focused on the internal structure of assessors' ratings, and the question of whether ratings of performance dimensions compared across different exercises share sufficient common variance to be considered meaningful constructs (Lance, 2008a, 2008b). That criticism has been countered with reviews of supportive evidence (Thornton, 2012; Thornton & Rupp, 2012). More recent studies continue to show construct validity in post-exercise dimension ratings (PEDRs). Guenole, Chernyshenko, Stark, Corkerill, and Drasgow (2011, 2012) found that when assessors are certified to understand and follow a common frame of reference when assessing, meaningful variation in ratings is attributable to dimensions. In addition, research has shown that dimension loadings from factor analyses of PEDRs are equivalent across exercises, and thus it is meaningful to combine these ratings into across-exercise ratings (Guenole et al., 2012). Furthermore, Kuncel and Sackett (2013) found that when PEDRs are aggregated across as few as three exercises, dimension variance dominates over exercise variance. In addition, Putka and Hoffman (2012) found that PEDRs are a complex function of agreement in ratings of the dimensions across assessors, a stable component across exercises and dimensions, and a situationally variable component reflecting the combination of the assessee, exercise, and dimension.

By contrast, there is now considerable consensus that AC construct validity should not be established only, or primarily, by treating dimensions and exercises within an AC as if they were the 'traits' and 'methods' respectively in a multitrait–multimethod matrix (Arthur, Day, & Woehr, 2008; Howard, 1997, 2008). Although there is value in studying the internal structure of within-AC ratings, such internal analyses are equivalent to studying the items on a test. Item-level analyses provide information about how the items relate to one another, but they provide limited information about whether the items actually measure the intended construct. For that, evidence from outside the test itself is needed: for example, evidence about how the test relates as predicted to relevant criteria and to other related constructs (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; Society for Industrial and Organizational Psychology, 2003), and how variations in test scores produce variation in outcomes (Borsboom, Mellenbergh, & van Heerden, 2004).

Evidence that AC ratings predict outcome criteria is quite extensive (Gaugler, Rosenthal, Thornton & Bentson, 1987; Hardison & Sackett, 2004; Hermelin, Lievens, & Robertson, 2007; Thornton & Byham, 1982). However, to paraphrase Sackett and Tuzinski (2001), such overall predictive relationships indicate only that ACs measure some constructs that are relevant to success in organizations; they do not tell us which constructs. There is evidence that overall AC ratings are not simply measures of general cognitive ability (Dayan, Kasten, & Fox, 2002; Dilchert & Ones, 2009; Goldstein, Yusko, Braverman, Smith, & Chung, 1998; Hardison, 2005; Krause, Kersting, Heggestad, & Thornton, 2006), nor of broad personality traits such as the Big Five (Dilchert & Ones, 2009; Goffin, Rothstein, & Johnston, 1996; Hardison, 2005). These sorts of evidence address potential alternative explanations of the constructs underlying AC ratings; however, they tell us what those constructs are not, rather than what they are. If we wish to be confident that AC ratings such as leadership or communication truly reflect the candidate's attributes on those dimensions, we must examine the nomological nets of AC ratings. In light of the nature of the competencies assessed in the ACs in the organizations in Russia yielding data for the present studies (described in subsequent sections), it is particularly informative to investigate the construct validity of competency ratings in relation to personality characteristics.

Few studies have examined nomological nets of AC final dimension ratings and personality traits. Most early ACs used personality tests as an integral part of the assessment process (Thornton & Byham, 1982), but only the authors of the Management Progress Study reported the relationship of assessment ratings with personality trait measures. Bray and Grant (1966) reported that factor scores comprising sets of AC dimensions were correlated with selected scales of personality questionnaires. For example, dominance on the Edwards Personal Preference Schedule (EPPS) correlated with a number of AC factor scores, such as administrative skills and interpersonal skills. The assessment of general effectiveness, passivity, and dependency correlated with general activity and ascendancy on the Guilford-Martin Inventory of Factors (GAMIN). Bray, Campbell, and Grant (1974) reported that dominance on the EPPS and ascendancy on the GAMIN correlated with several factors in the assessments. In both reports, the preponderance of the correlations was trivial.

However, those studies do not provide independent evidence of construct validity of assessors' ratings because the personality information was a direct, integral part of the consensus discussion in those ACs. Shore, Thornton, and Shore (1990) did use personality measures that were independent of the AC ratings. They classified AC dimensions broadly into 'cognitive-style' and 'interpersonal-style' and found that, as predicted, cognitive-style dimensions were more strongly correlated with cognitive ability than were the interpersonal-style dimensions, whereas the interpersonal-style dimensions were more strongly related to personality. For example, AC ratings of amount of participation were correlated with 16PF scales of shy-bold, and submissive-dominant. Dilchert and Ones (2009) found that AC ratings of problem solving were related to cognitive ability (r = .32), but minimally related to Big Five personality traits. The opposite trend was observed for dimensions such as drive, influencing others, and consideration of others, which were unrelated to cognitive ability but more strongly related to the personality traits. In a meta-analysis of 65 studies, Meriac and Woehr (2012) found that three factors of AC dimensions correlated in different patterns with external measures: administrative dimensions correlated more strongly with general mental ability (GMA) than with personality characteristics. In addition, GMA correlated more strongly with administrative dimensions than with interpersonal and activity dimensions. Furthermore, the personality characteristic of extraversion correlated more strongly with the activity dimension than with the administrative dimension. Although these findings do provide some support for the argument that dimension ratings measure what they were intended to measure, they are still quite broad.

More specific evidence of construct validity might include proposing and testing a nomological net for each individual construct (competency) measured within the AC. For example, if the AC measures leadership, examining personality characteristics (or other external variables) that are theoretically associated with leadership would provide convergent evidence for the validity of this particular competency. Linking personality to AC competencies, however, raises important questions about scope and breadth. Narrow measures of single traits are widely believed to be more appropriate predictors of specific dimensions of job performance than broad traits such as the Big Five (Hogan & Roberts, 1996). However, AC dimensions are typically more complex than single personality trait measures; any single trait may capture only part of the variance in a complex competency such as leadership, leading to relatively small correlations between traits and dimensions (Ones & Viswesvaran, 1996). Schneider, Hough, and Dunnette (1996) recommend, instead, compiling composites or constellations of multiple narrow traits that are logically or empirically related to the performance dimension in question. Such composites of narrow traits can predict constructs of medium breadth better than either individual scales or broader trait measures (Ashton, 1998; Christiansen & Robie, 2011). Thus, one appropriate comparison standard for AC dimensions consists of such personality composites.

In the present study, we sought to provide more specific evidence that individual competency ratings made in ACs relate to externally assessed personality constructs in nomological nets that are theoretically related to each competency. Specifically, we examined final competency ratings, derived by consensus discussion, in two samples of managers in Russia. We then developed and tested a set of more specific predictions about the nomological net of each competency, considering convergent and discriminant relationships of the competencies with both narrow (i.e., individual traits) and broader (i.e., composites of traits) indicators of external constructs. In the following sections, we provide context to help the reader understand the competencies examined in this study, describe the process we followed to identify a nomological net for each competency, and then present our predictions and results for each of two studies.

1.1. Societal context in Russia

The managerial competencies in the current studies were chosen to reflect the needs of the Russian organizations in which the ACs operated. Some of these competencies may seem unfamiliar to readers accustomed to ACs in Western Europe or the US, so a little background may be helpful. To provide a framework for understanding the competencies and the correlates we studied, this section provides a description of the emerging context of Russian society and organizations. The competencies assessed in these ACs were chosen because the organizations were facing special management challenges in a changing environment.

When a person moves from a totalitarian system and welfare environment to one more driven by democracy and market demands, culture shock is possible. The person's habitual ways of thinking and acting may not work. The cultural changes may influence demands on leadership and management.

The economy in Russia in recent years has been described as 'a transitional economy' (Grachev, Rogovsky, & Rakitski, 2008). During the Soviet period, economic policy and many businesses were controlled by the state or a small group of individuals. Competition was suppressed and decision making was centralized. Starting in the 1990s, privatization of state property and industry occurred. New models of economic development emerged, including new relationships between government and small and large organizations. Competition increased among Russian organizations and with foreign ones.

Building on the earlier work of Hofstede (1980) and House et al. (2004), Grachev et al. (2008) reported Russian society to be low on assertiveness, emphasis on performance, and orientation toward the future. Its members strive to avoid uncertainty, rely on bureaucratic practices, and have high respect for authority and privileges. It has been transformed from a collectivistic to a more individualistic society. Suggesting additional changes in the future, scores on 'should be' were higher than 'as is' on the following societal culture scales: uncertainty avoidance, and orientations toward performance, future, and the human condition. Furthermore, the 'should be' rating of power distance was considerably lower than the 'as is' rating. More recently, while McCarthy, Puffer, and Darda (2010) found that the predominant leadership style of 130 entrepreneurs in the years 2003−2007 in Russia was somewhat like the transformational (vs. transactional) style becoming more prevalent in the US, Puffer and McCarthy (2011) found that managers in Russian organizations tended to rely on the more traditional Russian style of strong and authoritative leadership, and on informal institutions and personal networks. Taken together, these findings suggest Russian society and business have gone through considerable transition in the recent past, and will likely undergo additional changes in the future.

In the midst of this cultural context, Simonenko and Khrenov (2010) worked with a variety of Russian organizations to develop leadership assessment and development programs. They formulated a list of the managerial competencies determined to be important in the Russian market. They compared this list with standard competency models that had been developed at different times in the United Kingdom based on national standards of leadership and management from 1998 to 2004. They found that, despite significant similarities between these models, there are differences in the standards for success of a manager in Russia versus the West. The main differences can be found in the areas of interpersonal skills (e.g., communication skills, building relationships) and individual traits (e.g., positive thinking, self-development), which depend to a large extent on cultural specifics and the country's socioeconomic development at the specific moment in time.
Table 1. Competencies, dimensions, and behavioral descriptions
Table 1 shows the competencies, examples of their subcompetencies, and behavioral descriptions developed for one of the largest Russian petrochemical companies. The terms competency and dimension are often used interchangeably in the AC literature, but they carry somewhat different connotations. We use the term competency to refer to a set of dimensions, each of which is narrower and defined in more behavioral terms. Competencies may be considered clusters of dimensions. For example, thoroughness of execution, a competency, may be composed of, in part, the dimensions responsibility, achievement orientation, and organization. Thus, the nomological net of thoroughness of execution may be related to the personality traits of self-discipline and conscientiousness, among others. The competencies examined in the current studies are largely interpersonal and motivational in nature; thus it was appropriate to study their nomological relationships with personality characteristics.

In summary, the purpose of this research was to explore the relationship of competency ratings from ACs in a set of Russian organizations with narrow and broad personality characteristics. To the extent that competencies are complex sets of performance dimensions, it is expected that they will correlate with different sets of personality characteristics. Several approaches were taken to set forth our expectations for the relationship of competencies assessed in the operational ACs studied here with individual personality characteristics and with sets of them. We chose a widely used test to measure personality traits.

2. Study 1

2.1. Participants

Sample 1 consisted of archival data from 175 middle managers who participated in developmental ACs in five Russian and multinational organizations in the period from 2007 to 2010. The industries of the organizations in which the ACs took place included steel and mining (N = 76), retail sales (N = 24), wireless network (N = 6), candy making (N = 21), and personal care products (N = 48). All of the ACs in this study included the 15FQ+ personality assessment (Psytech International, 2010), so personality and competency information was available for all participants.

In all of the ACs in the study, the client organizations required all employees in second-level management positions to participate in the AC. All participants were given developmental feedback, and the AC ratings were used as input to succession planning processes in some organizations. Thus, the participants represented the full population of employees of interest to their organizations, and not a restricted group such as identified high potentials or self-nominated volunteers.

2.2. Assessment centers

All ACs were operated by the same consulting firm and shared a common pool of assessors. Table 2 lists the exercises and other assessment tools included in different ACs. Each AC consisted of at least three simulation exercises, a competency-based interview, and psychometric assessments including numerical and verbal ability tests and a personality questionnaire. The following types of simulation exercises were used: group discussion with assigned roles, fact finding, role play, analytical presentation, and in-tray. There was always a group discussion and a role play. Each competency was assessed with at least two exercises.
Table 2. Assessment methods in five Russian assessment centers
n – number of delegates in the assessment centers.
Y – yes, the exercise was included in the assessment centers.
N – no, the exercise was not included in the assessment centers.

All ACs were conducted by members of the same pool of 10 assessors consisting of six females and four males. Specific subsets of assessors changed for different ACs. All assessors were professionally trained consultants with at least 2 years of experience with the AC methods.

The final ratings were given during group discussions of all assessors in the integration session. A description of the AC method as practiced in many countries can be found in Thornton and Rupp (2006) and as practiced in Russia in Simonenko and Khrenov (2010) and Simonenko (2011).

Although the specific dimensions assessed in the individual ACs varied somewhat, they were combined into a common framework using six broad competency categories shown in Table 1: leadership, thoroughness of execution, strategic vision, people development and team building, openness to changes, and corporate spirit. The synthesis of the competency frameworks was conducted by the first author, who oversaw all five ACs and had extensive knowledge of Russian culture and leadership, the dimension definitions, and underlying competency models. All competency categories were present in all five ACs with the exception of people development, which was used in only four ACs. However, some competency scores were missing for some candidates because the competencies from specific frameworks in some organizations did not match the common categories listed above; these results were excluded from the data analysis. For this reason, the effective sample size for the analyses varies across competencies (see Tables 5 and 6 for exact sample sizes). Each candidate received an overall rating between 1.0 and 3.5 for each competency category aggregated across exercises, based on consensus discussion.

2.3. Personality measures

In addition to the simulation exercises, all 175 participants completed the 15FQ+ (Psytech International, 2010), a set of personality scales that are designed to measure the 16-factor model of personality proposed by Cattell (1946). The measure consists of 192 items, with 12 items for each of the 16 bipolar factor scales. See Table 3 for a list of factor names. The 15FQ+ is widely used in international applications and has been translated into numerous languages, including Russian (Psytech International, 2010). All participants completed an online Russian-language version of the measure, which was scored using the publisher's online scoring system. As a result, the data available for analysis were the participants' sten scores, which are standardized scores transformed to range from 1 to 10 with a mean of 5.5 and a standard deviation of 2. Because raw item responses were not available, we could not calculate the reliability of the scales in this sample, but the reliability of the factor scales has previously been reported as acceptable (Cronbach's α = .72–.85; Psytech International, 2010).
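The sten scale described above can be illustrated with a minimal sketch. The source does not describe the publisher's scoring procedure or norms, so the function name `to_sten`, the norm mean and standard deviation, and the rounding and clipping rules below are all assumptions used only to show the generic definition of a sten (mean 5.5, standard deviation 2, range 1 to 10).

```python
import math

def to_sten(raw, norm_mean, norm_sd):
    """Convert a raw scale score to a sten (generic definition, not the
    publisher's actual algorithm, which the source does not describe)."""
    z = (raw - norm_mean) / norm_sd          # standardize against the norm group
    sten = math.floor(5.5 + 2.0 * z + 0.5)   # rescale to mean 5.5, SD 2; round half up
    return max(1, min(10, sten))             # stens are bounded at 1 and 10

# Hypothetical norm group: mean raw score 20, SD 5.
print(to_sten(22.5, 20, 5))  # half an SD above the norm mean
```

Because stens are coarse integer bands, two participants with somewhat different raw scores can receive the same sten, which is one reason scale reliability could not be recomputed from these data.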
Table 3. Sixteen scales in the 15FQ+
Participants completed the 15FQ+ online within 1 week of their competency assessment in the AC. Most completed the personality measures before the AC exercises, but due to scheduling constraints, a small number (fewer than 10) completed them shortly afterwards. Although some of the assessors had access to participants' 15FQ+ scores for administrative purposes (e.g., ensuring that all data had been received), all assessors were instructed to make their competency ratings solely on the basis of performance in the simulation exercises. Thus, the competency ratings were kept separate from the personality measures, although it is possible that a small amount of contamination may have occurred.

2.4. Predictions about competency–personality relationships

To identify personality factors that were theoretically related to each of the AC competencies, we solicited input from a panel of four experienced assessors (one of whom is the first author of this study). Each of these judges had a minimum of 5 years' experience in AC consulting and a minimum of 2 years' experience in interpreting the 15FQ+. All had graduate training in psychology or related fields (two in social psychology; one in industrial/organizational psychology; one in education) and formal training in both AC methods and the 15FQ+. Further, all judges were involved in the administration of the ACs that provided the data for this study. Thus, they had considerable knowledge of both the 15FQ+ factors and the ways the competencies were operationalized in these ACs.

Each judge began by independently recording his or her individual predictions regarding (a) whether each of the 15FQ+ factors was relevant to each competency and, if so, (b) in which direction (i.e., which pole of the 15FQ+ scale should be positively related to the competency). The judges then met as a group to compare their predictions. Disagreements among the judges' initial predictions were present on fewer than 10 of the 96 possible pairings (16 factors × 6 competencies); these were resolved by discussion. The resulting consensus predictions are shown in Table 4. These predictions formed the basis for all analyses in Study 1; competency-trait pairs for which a relationship was predicted were considered convergent relationships and pairs for which no relationship was predicted were considered discriminant relationships.

3. Results

The relationships of competencies with individual personality scales will be presented and then the relationships of competencies with composites of personality scales. A comparison of the two sets of analyses is then presented.

Following the process of Shore et al. (1990), we began by examining the correlations between the AC competency ratings and individual factors of the 15FQ+. We calculated Pearson correlations between each competency and each personality factor, and then computed the average correlations among theoretically related (convergent) and theoretically unrelated (discriminant) factors for each competency, based on the expert judges' predictions as described in Table 4. We present these averaged convergent and discriminant correlations in Table 5. (The far right column will be described in the next section.) In general, the correlations between the AC ratings and individual personality scales were small, with an average of r = .11. For four of the six dimensions, the average convergent relationships were larger than the average discriminant relationships. However, these differences were small, there was considerable variability within the convergent and discriminant correlations, and there were several individual trait correlations in the opposite direction from what had been predicted.
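The averaging step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names, the trait labels, and the toy correlations are hypothetical, and two choices the text does not specify are assumed here, namely that convergent correlations are signed in the predicted direction and that discriminant correlations are averaged as absolute values.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_validities(trait_rs, predicted):
    """Average convergent and discriminant correlations for one competency.

    trait_rs  : dict mapping trait name -> correlation with the competency
    predicted : dict mapping trait name -> +1 or -1 (predicted pole);
                traits absent from this dict count as discriminant.
    Assumes at least one trait in each set.
    """
    # Assumption: sign each convergent r so that "as predicted" is positive.
    convergent = [trait_rs[t] * sign for t, sign in predicted.items()]
    # Assumption: discriminant r values are averaged by magnitude.
    discriminant = [abs(r) for t, r in trait_rs.items() if t not in predicted]
    return (sum(convergent) / len(convergent),
            sum(discriminant) / len(discriminant))

# Hypothetical correlations of one competency with four traits.
rs = {"A": 0.30, "B": -0.20, "C": 0.05, "D": -0.15}
conv, disc = average_validities(rs, {"A": +1, "B": -1})
print(round(conv, 2), round(disc, 2))
```

In practice, `pearson` would be applied to the competency ratings and each sten-scored trait to fill `trait_rs` before averaging.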
Table 4. Study 1 predictions of personality characteristics related to competencies
Table 5. Study 1. Correlations of assessment center competency ratings with averages of similar and different individual personality traits and with composites of similar traits
The analyses above matched individual traits with competencies that were arguably much broader and more complex; we therefore next correlated AC competencies with composites of personality measures. For example, the competency of leadership includes behaviors such as setting goals and persuading others. Although both of these sets of behaviors can be considered indicators of a broader leadership construct, they are quite dissimilar, and might be related to different aspects of personality. In fact, the judges identified eight of the 16 15FQ+ traits as having potentially meaningful relationships with leadership, implying that leadership entails a combination of multiple personality characteristics.

To allow a degree of complexity in the personality measures that would be more similar to the complexity of the competencies, we created a personality composite by summing participants' scores on each of the traits that the judges had predicted to be relevant to each competency (cf. Christiansen & Robie, 2011; Schneider et al., 1996). For example, the personality composite for thoroughness of execution was composed of the high intellectance, conscientious, concrete, self-disciplined, and tense-driven scores (cf. Table 4). It is important to note that these composites are not thought to represent latent constructs; what the elements of each composite have 'in common' is only that they are expected to relate to the competency. Rather, the composites represent proposed personality profiles of high scorers on each competency. That is, we expected that a person who scores high on thoroughness of execution would be high in intellectance, and would be conscientious, concrete, self-disciplined, and tense-driven, regardless of the correlations or lack thereof among these traits in the general population. As our goal was not to create orthogonal factors or composites, we allowed personality scales to be included in more than one composite if they were relevant to more than one competency. For example, high intellectance forms part of the composite for thoroughness of execution, strategic vision, and openness to changes because the judges identified it as relevant to all three competencies. We used a simple sum of the relevant personality scale scores to calculate scores for each participant on each composite.
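A minimal sketch of such a unit-weighted composite follows. The function name, trait labels, and sten values are hypothetical. One detail is an assumption not stated in the text: where the judges predicted the low pole of a bipolar scale, the sketch reverses the sten (11 minus the score) before summing, so that a higher composite always means a closer match to the predicted profile.

```python
def composite_score(stens, predicted):
    """Unit-weighted personality composite for one participant.

    stens     : dict mapping scale name -> sten score (1..10)
    predicted : dict mapping scale name -> +1 (high pole predicted)
                or -1 (low pole predicted)
    """
    total = 0
    for scale, direction in predicted.items():
        score = stens[scale]
        if direction < 0:
            # Assumption: reverse-key low-pole predictions on the 1..10 sten scale.
            score = 11 - score
        total += score
    return total

# Hypothetical participant and hypothetical predicted profile.
participant = {"intellectance": 8, "conscientious": 7, "abstract": 3, "warm": 5}
profile = {"intellectance": +1, "conscientious": +1, "abstract": -1}
print(composite_score(participant, profile))  # 8 + 7 + (11 - 3)
```

Scales outside the predicted profile (here, `warm`) are simply ignored, and the same scale may appear in the profiles of several competencies, consistent with composites that are not meant to be orthogonal.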
Table 6. Study 1 correlations among assessment center competency ratings and composites of personality characteristics
Bold type indicates predicted convergent validity coefficients, *p < .05.
Table 6 presents the full matrix of convergent and discriminant correlations among the competencies and personality composites. The correlations between each competency and its corresponding personality composite ranged from r = .13 to .35, and the correlations between noncorresponding competencies and personality composites ranged from r = −.05 to .37. On average, the convergent correlations (average r = .23) were higher than the discriminant correlations (average r = .15). There were three cases in which a competency correlated more highly with a personality composite other than its intended correspondent, namely strategic vision, thoroughness of execution, and openness to changes. In the latter two cases, the differences were slight (i.e., r = .37 vs. .35 and .25 vs. .23). In addition, the personality composites for thoroughness of execution and openness to changes included two of the same scales and were correlated highly (r = .81).
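The way such a matrix is read can be sketched in a few lines. The function name and the 3 × 3 matrix below are hypothetical illustrations, not the values in Table 6; rows are assumed to be competencies and columns their personality composites in the same order, so the diagonal holds the convergent coefficients.

```python
def summarize_matrix(names, r):
    """Summarize a square competency-by-composite correlation matrix.

    Returns (mean convergent r, mean discriminant r, list of competencies
    that correlate more highly with some other composite than their own).
    """
    k = len(names)
    diag = [r[i][i] for i in range(k)]                              # convergent
    off = [r[i][j] for i in range(k) for j in range(k) if i != j]   # discriminant
    flagged = [names[i] for i in range(k)
               if max(r[i][j] for j in range(k) if j != i) > r[i][i]]
    return sum(diag) / k, sum(off) / len(off), flagged

# Hypothetical values for illustration only.
names = ["leadership", "thoroughness", "strategic_vision"]
r = [[0.30, 0.10, 0.05],
     [0.20, 0.35, 0.37],
     [0.15, 0.12, 0.13]]
mean_conv, mean_disc, flagged = summarize_matrix(names, r)
print(round(mean_conv, 3), round(mean_disc, 3), flagged)
```

A flagged competency is not necessarily a failure of discriminant validity; as noted above, composites that share scales can themselves be highly correlated.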

Table 5 shows the critical comparison in columns 3 and 5. For each AC competency, the correlation of the composite is larger than the average correlation for comparable individual personality traits. The average of the former is .23, whereas the average of the latter is .13. When these two sets of findings are considered together, a contrast is apparent. The analyses of the convergent and discriminant validity of AC ratings in relation to individual personality traits suggest a lack of construct validity. By contrast, the analyses of correlations of competencies with personality composites show stronger construct validity.

4. Study 2

To extend the findings, in Study 2 we examined data from another sample of 112 top managers from one Russian company, the leader of the petrochemical industry in Russia and Eastern Europe. The managers participated in a developmental AC similar to those described in Study 1, and completed the 15FQ+ personality test. Members of the same pool of assessors conducted the ACs.

4.1. Predictions

As in Study 1, the same team of judges made predictions about the personality traits that were most relevant to each competency (see Table 7). These new composites were slightly different from those used in Study 1 because the underlying competency model for this client organization differed in subtle, but important, ways from the general competency framework of Study 1. The competencies in Study 1 were broader, as they represented a synthesis of models across several organizations; the competency model in Study 2 was more detailed and specific to a single organization. This increased specificity allowed us to link the scales of the personality questionnaire with each behavioral indicator of each competency in the framework. As a result, the personality composites for Study 2 represented a more precise match to the competencies as they were operationalized within this particular AC.
Table 7. Study 2 predictions of personality characteristics related to competencies

5. Results

As in Study 1, we analyzed relationships of competency ratings first with individual traits and then with composites of traits. The traits listed in Table 7 for a competency were expected to show convergent relationships with that competency, whereas traits not listed for a particular competency were expected to show discriminant relationships. Table 8 shows the results for individual traits. The average convergent correlation was larger than the average discriminant correlation (.17 vs. .07). This difference is larger than the difference in Study 1, and is more consistent across the AC competencies. Moreover, compared to Study 1, the average convergent correlation was larger (.17 vs. .13) and the average discriminant correlation was smaller (.07 vs. .10). This pattern suggests better construct validity at the individual trait level in these AC ratings in this individual organization than was shown in the AC ratings from several organizations in Study 1.
Table 8. Study 2 correlations of assessment center competency ratings with similar and different individual personality traits and with composites of similar traits
n = 112.
Table 9. Study 2 correlations among assessment center competency ratings and composites of personality characteristics
n = 112.
Bold type indicates predicted convergent validity coefficients, *p < .05.

Table 9 shows the correlations of the personality composites with the AC competency ratings. For every competency, the correlation of the AC rating with the comparable personality composite was larger than the correlations with the other, noncomparable composites. The pattern supporting construct validity is strongest for people development and team building and corporate spirit. Moreover, when the convergent correlations of the composites are compared to the average convergent correlations with individual traits (in the first and last columns of Table 8), it is clear that the correlations based on the composites are larger.
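
As an illustration of how a composite can be correlated with a competency rating, the following Python sketch builds a unit-weighted composite of standardized trait scores. Unit weighting is an assumption made here for simplicity, and the composites in this study may have been formed differently. All scores, variable names, and helper functions are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of scores to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def pearson(x, y):
    """Pearson correlation computed as the mean product of z-scores."""
    zx, zy = zscores(x), zscores(y)
    return mean(a * b for a, b in zip(zx, zy))


def unit_weighted_composite(*traits):
    """Sum the standardized scores of several traits, person by person."""
    standardized = [zscores(t) for t in traits]
    return [sum(person) for person in zip(*standardized)]


# Invented scores for six managers (illustration only).
trait_a = [3, 5, 6, 4, 8, 7]
trait_b = [6, 4, 7, 5, 6, 8]
rating = [2, 2, 4, 3, 4, 5]  # consensus rating on one competency

composite = unit_weighted_composite(trait_a, trait_b)
r_a = pearson(trait_a, rating)
r_b = pearson(trait_b, rating)
r_composite = pearson(composite, rating)

print(f"trait_a: {r_a:.2f}, trait_b: {r_b:.2f}, composite: {r_composite:.2f}")
```

With so few invented cases the correlations are far larger than anything observed in field data. The point of the sketch is only the mechanics of forming and correlating a composite.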

As in Study 1, the two sets of analyses suggest different conclusions. While there is only a slight difference in the convergent and discriminant correlations for individual traits (.17 vs. .07), and thus one might conclude little construct validity, there is more evidence of construct validity in the correlations of AC competency ratings with composites of personality traits: convergent correlations are considerably larger than discriminant correlations. Finally, correlations with composites are consistently larger than the average correlation with individual traits.
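
One way to see why composites can correlate more strongly than the average individual trait is the standard expression for the correlation of a unit-weighted sum of standardized variables with a criterion. This is offered as a sketch under stated assumptions, not as the analysis performed in these studies. The symbols are introduced here for illustration: $k$ is the number of traits in the composite, $\bar{r}_{xy}$ is the average correlation of those traits with the competency rating, and $\bar{r}_{xx}$ is the average intercorrelation among the traits.

```latex
r_{Cy} \;=\; \frac{\sum_{i=1}^{k} r_{x_i y}}{\sqrt{k + k(k-1)\,\bar{r}_{xx}}}
       \;=\; \frac{k\,\bar{r}_{xy}}{\sqrt{k + k(k-1)\,\bar{r}_{xx}}}

% Because \bar{r}_{xx} \le 1, the denominator is at most k, so
% r_{Cy} \ge \bar{r}_{xy} whenever \bar{r}_{xy} \ge 0.

% Worked example with assumed values k = 3 and \bar{r}_{xx} = .20,
% using \bar{r}_{xy} = .17 purely for illustration:
r_{Cy} \;=\; \frac{3(.17)}{\sqrt{3 + 3(2)(.20)}}
       \;=\; \frac{.51}{\sqrt{4.2}} \;\approx\; .25
```

The gain shrinks as the traits in a composite become more highly intercorrelated and vanishes when they are perfectly correlated.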

6. Discussion

These studies provide further evidence of the construct validity of AC ratings. In two studies, ratings of complex managerial competencies correlated with theoretically

6.1. Limitations

Certain features of the current studies must be recognized when drawing conclusions. The components of the competencies varied somewhat from company to company in Study 1. For example, the subcompetencies and behavioral indicators of a competency such as leadership differed slightly from one company to another. While this reflects a lack of standardization, the variation suggests that the results might have been even stronger if there had been more consistency in the definitions of the competencies, and then in the specification of personality correlates, as was done in Study 2.

This study investigated the average correlation of competency ratings with narrow personality scores in comparison with the correlation of the competency ratings with personality composites. A parallel analysis that might be probative in future research would be to study the average correlation between traditional dimensions and narrow personality scores in comparison with the average correlation between narrow dimensions and personality composites. Such an analysis was not possible here because assessors made ratings only on the six competencies. Guidance in rating the competencies was provided by consideration of the components and indicators of the competencies, that is, the dimensions, subcompetencies, and behavioral descriptions shown in Table 1.

There may be some undeterminable amount of confounding of knowledge of personality test scores with the final consensus AC ratings on the competencies. Some of the candidates took the online personality questionnaire prior to participation in the simulation exercises, but others did not do so until later. Some of the personality questionnaire results were available to some assessors, but not all assessors knew the meaning of the trait labels or test scores; furthermore, many assessors were not trained to interpret the personality profiles. In actual application, the personality scores were used consistently only in the process of giving feedback to candidates. The extent of confounding of personality scores and assessment ratings is probably considerably less than in ACs where reporting of scores is an integral part of the consensus discussion, as was done in most early ACs and is common in recent developmental ACs (Povah, 2011).

Another distinguishing feature of these ACs is that the assessors were experienced consultants. The results may not generalize to other ACs where assessors are higher level managers in the organizations and may have considerably less AC experience. On the other hand, the results of the present study are consistent with other studies showing construct validity of ratings provided by experienced consultants familiar with the competency model employed in the ACs (Guenole et al., 2011). Certainly, it is quite common for assessors to be experienced consultants (Hughes, Riley, Shalfrooshan, Gibbons, & Thornton, 2012; Povah, 2011), and thus the current findings are informative for practice.

The ACs studied here employed the dimension-based AC model. Two other models have been articulated in recent years, namely task-based ACs and mixed-model ACs (Jackson, Lance, & Hoffman, 2012). The method employed in the present research might be used to provide construct validity evidence for ratings emanating from ACs following other models. To be sure, there is evidence supporting the contention that variance in AC ratings is a function of factors other than dimensions. The purpose of the present research is not to investigate the relative size of dimension effects or task effects, nor to advocate the implementation of one approach. Rather, the research explores the construct validity of AC ratings derived from dimension-based ACs using the process of consensus discussion.

7. Conclusions

These studies reflect the strengths and weaknesses of field research into the AC method. They involved a unique set of data from multiple real ACs in real organizations in a country, namely Russia, heretofore not studied and quite different from Western Europe or the US. While some methodological features are not as standardized and 'clean' as one might desire, the findings are still consistent enough to allow meaningful tentative conclusions. Final AC ratings of complex managerial competencies showed construct validity in relationship with personality characteristics.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Arthur, W. Jr, Day, E. A., McNelly, T. L., & Edens, P. S. (2003). A meta-analysis of the criterion-related validity of assessment center dimensions. Personnel Psychology, 56, 125–154.

Arthur, W. Jr, Day, E. A., & Woehr, D. J. (2008). Mend it, don't end it: An alternative view of assessment center construct-related validity evidence. Industrial and Organizational Psychology: Perspectives on Science and Practice, 1, 105–111.

Ashton, M. C. (1998). Personality and job performance: The importance of narrow traits. Journal of Organizational Behavior, 19, 289–303.

Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061–1071.

Bray, D. W., Campbell, R. J., & Grant, D. L. (1974). Formative years in business: A long-term AT&T study of managerial lives. New York: Wiley.

Bray, D. W., & Grant, D. L. (1966). The assessment center in the measurement of potential for business management. Psychological Monographs, 80, 1–27.

Cattell, R. B. (1946). The description and measurement of personality. Yonkers-on-Hudson, NY: World Book.

Christiansen, N. D., & Robie, C. (2011). Further consideration of the use of narrow trait scales. Canadian Journal of Behavioural Science/Revue canadienne des sciences du comportement, 43, 183–194.

Dayan, K., Kasten, R., & Fox, S. (2002). Entry-level police candidate assessment center: Efficient tool or a hammer to kill a fly? Personnel Psychology, 55, 827–849.

Dilchert, S., & Ones, D. S. (2009). Assessment center dimensions: Individual differences correlates and meta-analytical incremental validity. International Journal of Selection and Assessment, 17, 254–270.

Gaugler, B. B., Rosenthal, D. B., Thornton, G. C. III, & Bentson, C. (1987). Meta-analysis of assessment center validity. Journal of Applied Psychology, 72, 493–511.

Goffin, R. D., Rothstein, M. G., & Johnston, N. G. (1996). Personality testing and the assessment center: Incremental validity for managerial selection. Journal of Applied Psychology, 81, 746–756.

Goldstein, H. W., Yusko, K. P., Braverman, E. P., Smith, D., & Chung, B. (1998). The role of cognitive ability in the subgroup differences and incremental validity of assessment center exercises. Personnel Psychology, 51, 357–374.

Grachev, M. V., Rogovsky, N. G., & Rakitski, B. V. (2008). Leadership and culture in Russia: The case of transitional economy. In J. S. Chhokar, F. C. Brodbeck, & R. J. House (Eds.), Culture and leadership across the world (pp. 803–831). Mahwah, NJ: Lawrence Erlbaum.

Guenole, N., Chernyshenko, O., Stark, S., Corkerill, T., & Drasgow, F. (2011). We're doing better than you might think: A large-scale demonstration of assessment centre convergent and discriminant validity. In N. Povah & G. C. Thornton (Eds.), Assessment and development centres: Strategies for global talent management (pp. 15–32). Farnham: Gower.

Guenole, N., Chernyshenko, O., Stark, S., Corkerill, T., & Drasgow, F. (2012). More than mirage: A large scale assessment centre with more dimension variance than exercise variance. Journal of Occupational and Organizational Psychology, 86, 1–17.

Hardison, C. M. (2005). Construct validity of assessment center overall ratings: An investigation of relationships with and incremental criterion validity over Big 5 personality traits and cognitive ability. Unpublished doctoral dissertation: University of Minnesota.

Hardison, C. M., & Sackett, P. R. (2004). Assessment center criterion-related validity: A meta-analytic update. Paper presented at the 19th annual conference of the Society for Industrial and Organizational Psychology, Chicago, IL.

Hermelin, E., Lievens, F., & Robertson, I. T. (2007). The validity of assessment centres for prediction of supervisory performance ratings: A meta-analysis. International Journal of Selection and Assessment, 15, 405–411.

Hofstede, G. (1980). Culture's consequences: International differences in work-related values. Beverly Hills, CA: Sage.

Hogan, J., & Roberts, B. W. (1996). Issues and non-issues in the fidelity-bandwidth trade-off. Journal of Organizational Behavior, 17, 627–637.

House, R. J., Hanges, P. J., Javidan, M., Dorfman, P. W., Gupta, V., & Globe Associates. (2004). Culture, leadership, and organizations: The GLOBE study of 62 societies. Thousand Oaks, CA: Sage.

Howard, A. (1997). A reassessment of assessment centers: Challenges for the 21st century. Journal of Social Behavior and Personality, 12, 13–52.

Howard, A. (2008). Making assessment centers work the way they are supposed to. Industrial and Organizational Psychology: Perspectives on Science and Practice, 1, 98–104.

Hughes, D., Riley, P., Shalfrooshan, A., Gibbons, A., & Thornton, G. C. III. (2012). A global survey of assessment centre practices. Godalming, Surrey, UK: The adc Group.

Jackson, D. J. R., Lance, C. E., & Hoffman, B. J. (2012). The psychology of assessment centers. New York: Routledge.

Krause, D. E., Kersting, M., Heggestad, E. D., & Thornton, G. C. III. (2006). Incremental validity of assessment center ratings over cognitive ability tests: A study at the executive management level. International Journal of Selection and Assessment, 14, 360–371.

Kuncel, N. R., & Sackett, P. R. (2013). Resolving the assessment center construct validity problem. Journal of Applied Psychology, doi:10.1037/a0034147.

Lance, C. E. (2008a). Why assessment centers do not work the way they are supposed to. Industrial and Organizational Psychology: Perspectives on Science and Practice, 1, 84–97.

Lance, C. E. (2008b). Where have we been, how did we get there, and where shall we go? Industrial and Organizational Psychology: Perspectives on Science and Practice, 1, 140–146.

McCarthy, D. J., Puffer, S. M., & Darda, S. V. (2010). Convergence in entrepreneurial leadership style: Evidence from Russia. California Management Review, 52, 1–25.

Meriac, J. P., & Woehr, D. J. (2012). Broad assessment center dimensions: A nomological network examination of validity. In D. J. R. Jackson & B. J. Hoffman, Dimension, task, and mixed-model perspectives on assessment centers. 27th Annual Conference of the Society for Industrial and Organizational Psychology, San Diego, CA.

Ones, D. S., & Viswesvaran, C. (1996). Bandwidth-fidelity dilemma in personality measurement for personnel selection. Journal of Organizational Behavior, 17, 609–626.

Povah, N. (2011). A review of recent international surveys into assessment centre practices. In N. Povah & G. C. Thornton (Eds.), Assessment and development centres: Strategies for global talent management (pp. 329–350). Farnham: Gower.

Psytech International. (2010). 15FQ+ fifteen factor questionnaire technical manual.

Puffer, S. M., & McCarthy, D. J. (2011). Two decades of Russian business and management research: An institutional theory perspective. Academy of Management Perspectives, 25, 21–36.

Putka, D. J., & Hoffman, B. J. (2012). Clarifying the contribution of assessee-, dimension-, exercise-, and assessor-related effects to reliable and unreliable variance in assessment center ratings. Journal of Applied Psychology, 98, 114–133.

Sackett, P. R., & Tuzinski, K. A. (2001). The role of dimensions and exercises in assessment center judgments. In M. London (Ed.), How people evaluate others in organizations (pp. 111–129). Mahwah, NJ: Lawrence Erlbaum.

Schneider, R. J., Hough, L. M., & Dunnette, M. D. (1996). Broadsided by broad traits: How to sink science in five dimensions or less. Journal of Organizational Behavior, 17, 639–655.

Shore, T. H., Thornton, G. C., & Shore, L. M. (1990). Construct validity of two categories of assessment center dimension ratings. Personnel Psychology, 43, 101–116.

Simonenko, S. (2011). The use of assessment and development centres in Russia. In N. Povah & G. C. Thornton (Eds.), Assessment and development centres: Strategies for global talent management (pp. 429–440). Farnham: Gower.

Simonenko, S., & Khrenov, D. (2010). Сказки и были о методах оценки персонала. [Fairy tales and true stories about assessment methods of personnel]. Moscow, RF: DeTech.

Society for Industrial and Organizational Psychology. (2003). Principles for the validation and use of personnel selection procedures (4th ed.) Bowling Green, OH: Author.

Thornton, G. C. III. (2012). Evidence that assessment center judgments measure dimensions of managerial performance. Organizational Psychology, Available at (accessed 30 September 2013).

Thornton, G. C. III, & Byham, W. C. (1982). Assessment centers and managerial performance. New York: Academic Press.

Thornton, G. C. III, & Rupp, D. R. (2006). Assessment centers in human resource management: Strategies for prediction, diagnosis, and development. Mahwah, NJ: Lawrence Erlbaum.

Thornton, G. C. III, & Rupp, D. R. (2012). Research into dimension-based assessment centers. In D. J. R. Jackson, C. E. Lance, & B. J. Hoffman (Eds.), The psychology of assessment centers (pp. 141–170). New York: Routledge.