
Title:
Assessing suitability for a problem-based learning curriculum: evaluating a new student selection instrument / Evaluer la pertinence d'un programme d'études basé sur la résolution de problèmes : évaluation d'un nouvel instrument de sélection de l'étudiant
Source:
Medical Education (Oxford, Print), 39(3): 250-257
Publisher Information:
Oxford: Blackwell, 2005.
Publication Year:
2005
Physical Description:
print, 17 ref
Original Material:
INIST-CNRS
Document Type:
Journal Article
File Description:
text
Language:
English
Author Affiliations:
Institute of Clinical Education, Peninsula Medical School, Exeter, United Kingdom
School of Medicine, Griffith University, Gold Coast, Queensland, Australia
ISSN:
0308-0110
Rights:
Copyright 2005 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Sciences of education

FRANCIS
Accession Number:
edscal.16616546
Database:
PASCAL Archive


Assessing suitability for a problem-based learning curriculum: evaluating a new student selection instrument. 

Context  A new student selection instrument has been designed to assess candidate suitability for a problem-based learning, small group curriculum. Objective  To evaluate the performance of the new teamwork selection instrument in terms of its discriminatory power, fairness, validity, reliability and acceptability among candidates. Sample  A sample of 69 volunteer candidates attending for interview formed 13 teams of 5 or 6 candidates each. Each candidate was assessed independently by 2 assessors. Candidate performance in the exercise was used for instrument evaluation purposes only. Results  The instrument demonstrated good item discrimination (item-total correlations r = 0.75-0.83, P < 0.01); the potential for good agreement between raters (63% agreement, weighted kappa = 0.38, P < 0.01); strong internal consistency reliability (Cronbach's α = 0.93), and good acceptability among candidates. No sources of assessment bias were identified on the basis of candidates' age (univariate ANOVA F = 0.43, P > 0.05), gender (unrelated samples t-test F = 1.2, P > 0.05) or socioeconomic background (univariate ANOVA F = 0.85, P > 0.05). There was no statistically significant relationship between the candidates' performance in the new exercise and their performance in the standardised formal interview (r = −0.37, P > 0.05); the instrument had limited predictive validity, and some of the measured attributes require conceptual clarification. Discussion  Statistical and conceptual analysis highlights the scope for development in the teamwork exercise. The exercise appears to be well suited to assessing candidate suitability for a problem-based learning curriculum.

Keywords: medical; undergraduate/*standards; problem-based learning/methods; student selection/trends; interpersonal relations; communication; reproducibility of results; ANOVA; education

Being able to work as part of a team has been identified as a key skill required of the next generation of medical practitioners.[1] This skill is particularly relevant in a problem-based learning (PBL) curriculum, where learning content is explored and mastered in student-centred teams. To ensure that students are able to manage the rigours of the PBL environment, and have the ability to develop the attributes that are conducive to learning in a group context, it is advantageous to assess teamwork-related attributes during student selection.[2]

Students entering a PBL curriculum require different skills and intelligences to those required for a traditional lecture-based curriculum.[2,4] These skills include adaptability, flexibility, generosity with knowledge and resources, communication, group and interpersonal skills, maturity, and situational awareness.[5] If a PBL group is dysfunctional, and these skills are not shared, some or all of its members may fail to learn.[2]

A selection instrument was developed to identify individual attributes related to performance in a team setting. It was administered during the student selection process, and evaluated for its reliability, validity, fairness and acceptability among candidates. This evaluation method draws upon educational theory derived from examination of PBL and assessment[2,7] and is shaped by the utility equation:[8]
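Reference [8] points to van der Vleuten's utility index, which is conventionally written as

U = R × V × A × E × C,

where R denotes reliability, V validity, A acceptability, E educational impact and C cost efficiency. The component symbols here are the conventional abbreviations rather than notation taken from this paper.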

This paper will not examine issues of cost effectiveness and educational impact.

Method

Prior to designing the instrument, attributes that may contribute to, or work against, effective participation in PBL teams were identified using a modified Delphi technique of 3 rounds with 4 experts in medical education.[10] Feedback was facilitated via e-mail and face-to-face meetings, and group consensus and ranking were reached in the third round. A number of measurable and observable indicators emerged and were refined into attributes deemed to be desirable in a team learning environment and behaviours considered to be non-conducive to learning in a team (referred to as 'unwanted behaviours'). Table 1 details these indicators.

Table 1 Behaviours measured during the teamwork exercise

Attributes:
- Active listening
- Summarising
- Reflection
- Contributing to and complying with group rules
- Respecting and tolerating varied opinions
- Balancing the task with discussion

Unwanted behaviours:
- Prejudice (toward other people; toward the case)
- Depreciative behaviours (demeaning people; not valuing comments)
- Egocentricity (domineering; taking over)
- Non-participation

The teamwork exercise was structured around 1 of 4 non‐clinical ethical scenarios that had been adapted from real‐life events reported by the media during the previous year. An example scenario is shown in Fig. 1. Each group was given 30 minutes to identify and explore the ethical issues, devise learning objectives or research strategies, identify hypotheses and reach conclusions. Non‐medical scenarios were chosen to avoid advantaging candidates with prior medical knowledge or work experience, and to encourage the assessors to focus on the assessment of team skills rather than knowledge. Candidates were briefed that their participation in the process, rather than their knowledge, was being assessed.

Figure 1 Example of an ethical scenario used in the teamwork exercise.

The participants were applicants to Peninsula Medical School for 2002 who had consented to undertake this additional selection exercise. Participation was voluntary, and participants were assured, both when giving consent and upon commencing the exercise, that participation or non-participation in the exercise would not contribute in any way to the outcomes of their applications. No ethical approval was sought for this study on the grounds that participation and performance in the exercise had no bearing on selection decisions.

All candidates completed the exercise on the day of their formal structured interview in November 2001, which included scoring in the attribute of insight into teamwork. Each candidate was assessed by 1 of 2 facilitator‐assessor/observer‐assessor dyads. The 4 assessors in the study were experienced group facilitators, who had also completed a short training programme about the instrument, marking scheme and rationale. All participants completed an evaluation form.

Candidates' attributes were scored by an observer on a Likert scale with the following range: 0 = not achieved, 1 = borderline, 2 = satisfactory and 3 = excellent. In addition, the 2 assessors independently awarded each candidate a global score from 0 to 4 (0 = not suitable, 4 = very suitable) based on their perceived suitability for a PBL curriculum. Summated scores ranged from 0 to 22, with a score of 15 or above representing a 'pass' in the exercise. Each of the unwanted behaviours represents an extreme and undesirable reaction in a team setting and was designed (although, at this evaluation stage, not applied) as an indicator of potential unsuitability for a PBL curriculum.
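As an illustration of this scoring scheme, the following is a minimal Python sketch; the function name and data layout are hypothetical, and it assumes the summated total combines the 6 attribute ratings with a single global score (how the 2 assessors' global scores were combined is not stated above).

```python
# Hypothetical sketch of the summated scoring described above; names and
# layout are illustrative, not taken from the study materials.
PASS_MARK = 15  # a summated score of 15 or above represents a 'pass'

def summated_score(attribute_ratings, global_score):
    """Six attribute ratings (0-3 each, max 18) plus one global score
    (0-4) give the 0-22 range reported in the paper."""
    assert len(attribute_ratings) == 6
    assert all(0 <= r <= 3 for r in attribute_ratings)
    assert 0 <= global_score <= 4
    return sum(attribute_ratings) + global_score

# A candidate rated 'satisfactory' (2) on every attribute with a global
# score of 3 totals 15 and just reaches the pass mark.
total = summated_score([2, 2, 2, 2, 2, 2], 3)
print(total, total >= PASS_MARK)  # 15 True
```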

Quantitative analyses of participants' scores for the teamwork exercise were applied to evaluate the reliability and validity of the instrument. Standard analyses of distribution, correlation and variance were performed using the Statistical Package for the Social Sciences (SPSS). In addition, qualitative analysis examined the instrument's acceptability among candidates. Dominant theme analysis of participants' feedback transcripts from open-ended questions identified regularities in the candidates' perceptions of the teamwork exercise.[11] Transcripts were read several times; emerging themes were indexed and further reviewed for comments that could be mapped to the themes.[11]

Results

The sample

Table 2 outlines the parameters of the population and sample.

Table 2 Population and sample parameters

Parameter                               Population (n = 552, 100%)    Sample (n = 69, 12.5%)
                                        n       %                     n       %
Gender
  M                                     237     43                    30      43.5
  F                                     315     57                    39      56.5
Age 20 years or under                   348     63                    68      98.6
Graduates                               155     28                    0       0
Teamwork participant offered place      N/A     N/A                   44      64
Teamwork participant accepting place    N/A     N/A                   16      23

Due to the timing of the exercise, all but 1 of the study participants were aged 20 years or under. All participants were British and resident in the UK at the time of application. Table 2 shows a 20% (68/348) school‐leaver applicant response rate.

Instrument performance

Discrimination between candidates

Table 3 outlines the central tendency and variability statistics of the total scores awarded during the teamwork exercise. The judges used the full range of scores available to them, awarding both the minimum (0) and maximum (22) values. However, the interquartile range is fairly small, and indicates limited dispersion throughout the score range. The measures of central tendency all fall within the pass range. A test of normality is significant (Kolmogorov−Smirnov = 0.26, P < 0.01), denoting a non‐normal distribution. Scores are negatively skewed, with the majority of participants gaining scores at the higher end of the continuum.

Table 3 Descriptive statistics

Total score statistic            Value
Mean                             15.3
Standard error of mean           0.57
Median                           16.0
Mode                             15.0
Interquartile range              3.0
Standard deviation               4.75
Minimum score                    0
Maximum score                    22.0
Standard error of measurement    4.0
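For readers wanting to reproduce this style of distributional check, here is a hedged Python sketch using scipy; the scores are synthetic placeholders, and note that SPSS applies a Lilliefors correction when distribution parameters are estimated from the sample, which plain kstest does not.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic, negatively skewed totals on the 0-22 scale (placeholder data).
totals = np.clip(np.round(22 - rng.gamma(2.0, 3.0, size=69)), 0, 22)

# One-sample Kolmogorov-Smirnov test against a normal distribution fitted
# to the sample; a significant result denotes a non-normal distribution.
stat, p = stats.kstest(totals, "norm", args=(totals.mean(), totals.std(ddof=1)))
print(f"KS statistic {stat:.2f}, P {p:.3f}, skewness {stats.skew(totals):.2f}")
```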

Item discrimination

All items performed in the intended way, with strong and significant correlations with the total score achieved during the teamwork exercise (Pearson correlation coefficient range 0.75–0.83, P < 0.001). This confirms that the assessed items contributed positively to the measurement of the overall construct (suitability for a PBL curriculum).

The strong item−total correlations are indicative of good internal consistency reliability. Cronbach's α was used to assess internal reliability and a near perfect coefficient of 0.93 was achieved. This suggests that candidates who scored highly or poorly on 1 attribute were likely to score in a similar way across the other attributes and in total.
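The two item statistics reported above can be computed along the following lines; this is a minimal pandas sketch on synthetic ratings (the paper does not state whether its item-total correlations were corrected for item overlap, so the uncorrected form is shown).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Synthetic ratings: 69 candidates x 6 attributes scored 0-3 (placeholder data).
scores = pd.DataFrame(rng.integers(0, 4, size=(69, 6)),
                      columns=[f"attr_{i}" for i in range(1, 7)])

def item_total_correlations(df: pd.DataFrame) -> pd.Series:
    """Correlate each item with the summated total score (uncorrected form)."""
    total = df.sum(axis=1)
    return df.apply(lambda col: col.corr(total))

def cronbachs_alpha(df: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = df.shape[1]
    return k / (k - 1) * (1 - df.var(ddof=1).sum() / df.sum(axis=1).var(ddof=1))

print(item_total_correlations(scores).round(2))
print(f"Cronbach's alpha: {cronbachs_alpha(scores):.2f}")
```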

Test bias

The candidates' performance in the teamwork exercise was uninfluenced by gender bias (unrelated samples t-test F = 1.2, P > 0.05). A Levene's test for equality of variances was not significant, suggesting that the score variance between males and females was approximately equal, as ideally it should be. Age was tested using the 1-way analysis of variance (ANOVA) technique and was found to be non-significant (F = 0.43, P > 0.05).

The candidates' socioeconomic statuses were measured using the geodemographic census‐derived database GB Profiler.[12] In a 10‐class scheme, just over 50% of the teamwork participants had classifications in the highest 3 classes. The remaining candidates (49.3%) were distributed between the lower 7 socioeconomic classifications, with just 7.2% in the lowest 5 classes. No statistically significant relationship was found between the candidates' socioeconomic status and their teamwork performance (F = 0.85, P > 0.05).
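A sketch of these bias checks in Python/scipy follows; all score arrays are synthetic placeholders rather than study data, and the three-way grouping is illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
male = rng.normal(15.3, 4.75, size=30)     # placeholder totals, 30 males
female = rng.normal(15.3, 4.75, size=39)   # placeholder totals, 39 females

# Levene's test for equality of variances, then the unrelated samples t-test.
lev_stat, lev_p = stats.levene(male, female)
t_stat, t_p = stats.ttest_ind(male, female, equal_var=lev_p > 0.05)

# One-way ANOVA across candidate groupings (three placeholder classes here).
groups = [rng.normal(15.3, 4.75, size=n) for n in (23, 23, 23)]
f_stat, f_p = stats.f_oneway(*groups)
print(f"Levene P {lev_p:.2f}; t-test P {t_p:.2f}; ANOVA F {f_stat:.2f}, P {f_p:.2f}")
```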

Correlation between instruments

On the basis of their interview scores, 44 (64%) of the candidates who volunteered to complete the teamwork exercise were offered a conditional place (a conversion rate of 36% was secured). Although these candidates succeeded at the interview screen, 1 in 4 failed to reach the predetermined minimum competency standard in the teamwork exercise. Conversely, 23 (33%) teamwork participants who had been deemed competent in the teamwork exercise were rejected on the basis of poor interview performance.

To examine the relationship between the 2 selection methods, Pearson product-moment correlation was applied to the teamwork and interview data. The null hypothesis, that there is no relationship between the candidates' teamwork and interview scores, could not be rejected in any case (P > 0.05), suggesting that both instruments have limited external (alternate-forms) reliability.

A negative correlation coefficient (r = −0.32, P > 0.05) was achieved between the candidates' teamwork score and interview score. The result was not statistically significant and may therefore be attributable to chance. However, closer inspection of the data shows that the 4 lowest scoring students in the teamwork exercise received among the highest formal interview scores. During interview all 4 students received satisfactory or excellent ratings for their response to a question on what defines an effective team player; in practice they failed to perform well in a team setting.

Interrater reliability

For the analysis of interrater reliability, the scores awarded by 1 assessor pairing only were examined. This pairing demonstrated slightly less agreement than the other pairing. It was found that the assessors independently agreed on a candidate's global performance 22 times out of 35 (63%). An analysis of interrater reliability using the weighted kappa coefficient indicates fairly low agreement between assessors (k = 0.38, P < 0.01). Inspection of the data suggests that the low coefficient can be explained by 2 trends:

1 frequent use of the satisfactory rating (60% of the scores awarded by this pairing were 'satisfactory'), which produces low variability;[13] and

2 disagreement between the assessors in the award of 'satisfactory' and 'excellent' global scores. Minimal disagreement (2 cases, 6%) was observed in the award of 'unsatisfactory'.
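The agreement statistics above can be reproduced along these lines; the sketch below uses scikit-learn's cohen_kappa_score on synthetic assessor scores, and assumes linear weights since the paper does not state the weighting scheme used.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(4)
# Synthetic global scores (0-4) for 35 candidates from one assessor pairing.
assessor_a = rng.integers(0, 5, size=35)
assessor_b = np.clip(assessor_a + rng.integers(-1, 2, size=35), 0, 4)

raw_agreement = float(np.mean(assessor_a == assessor_b))
kappa = cohen_kappa_score(assessor_a, assessor_b, weights="linear")
print(f"raw agreement {raw_agreement:.2f}, weighted kappa {kappa:.2f}")
```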

Predictive validity

In its present form the teamwork instrument has no predictive validity. There were no statistically significant relationships between performance in the teamwork exercise and (of the candidates who took up their conditional offers) performance in the summative assessments completed in the first year on the programme. The measures of performance used were 4 progress test scores, 4 written report scores based on participation in special study units, 10 personal and professional development judgements (2 of which were judgements from the student's PBL tutor), and the scores of 2 portfolio analyses. In all cases the significance level of Pearson product−moment correlations exceeded 0.05.

Acceptability among candidates

Feedback was mostly positive, with many candidates feeling the teamwork exercise was a good introduction to PBL, a good measure of the skills relevant to medicine and a PBL curriculum, and a good opportunity to get to know other candidates. Illustrative comments included:

'The idea of a team building exercise is a good one as it will establish a bond.'

'I enjoyed this process very much and found it very useful. I think it would be a very good measure of teamwork.'

'This session is a very good idea as it allows candidates to get a feel of the medical school.'

Other positive elements of the feedback included statements that the exercise was fun, stimulating, proactive and inclusive.

A total of 26% of participants commented that the process lacked structure, resources and learning support. These issues may be perceived weaknesses of PBL curricula more generally among students and teachers.[5] The analysis highlights that the acceptability of this selection instrument is also strongly related to candidates' attitudes towards PBL. Illustrative comments included:

'I personally would value at least some structured teaching, though if this proceeded along with PBL it would help it seem more relevant.'

'I think that this will be a very successful style of teaching and learning, when coupled with the more traditional styles of teaching.'

'... seems a good way to learn as a supplement to more mainstream methods.'

Acceptability of the selection instrument appeared to be lower among applicants who were uncertain or undecided regarding the perceived educational value of PBL.

The most dominant negative theme of the analysis concerned the relationship between personality and performance in a team setting. Over a third of candidates (38%) commented that the exercise might be difficult or intimidating for quiet, shy or particularly reflective candidates.

Discussion

The evaluation of the teamwork exercise shows that its performance is variable across the range of analyses. Alongside some strong elements, such as good internal consistency reliability and strong face validity, there are areas of weakness that require further work before the instrument can be considered a reliable measure of candidate suitability for a PBL curriculum. Further research is also required to fully address all the elements of the utility equation, including cost and educational impact.

In validating the teamwork exercise, its performance was compared to that of the standardised structured interview. The findings show that there is no correlation between the 2 selection instruments, with some disparity noted in the judgements made of the suitability of some candidates. The results suggest that there may be a distinction between the ability to express team‐sympathetic views and the ability to participate effectively in a team. Miller's[7] pyramid can be applied to demonstrate how the attributes measured by the interview and teamwork selection instruments differ. The interview measures cognitive understanding of what makes a good team player (in Miller's terms, the candidate 'knows' or 'knows how'). Meanwhile, the teamwork exercise measures the most sophisticated form of competence identified by Miller, that of 'does'. The interview measures depth and breadth of knowledge, while the teamwork exercise measures whether the candidate is able to put such knowledge into practice and perform effectively as a team member. The lack of consistency in the outcomes of these 2 instruments appears to be a product of the different knowledge or competency domains that each assesses.

It is reassuring that the main barrier to achieving good agreement between raters concerns the use of 'satisfactory' and 'excellent' awards. The assessors used the global scale ratings in a similar fashion, with no leniency or severity effects noted in the data. To a good degree, the judgements of the observer-assessors and facilitator-assessors were impartial, and/or the checklist approach to measuring competency encouraged objectivity. In any future applications of this exercise, a detailed qualitative statement of the performance expected at each point in the global rating scale (particularly 'satisfactory' and 'excellent') may improve agreement between assessors.

The fairness of measurement instruments can be compromised if there is an inherent differential validity effect.[16] Analysis is therefore necessary to ensure that this potential method of student selection does not systematically favour, or discriminate against, any sectors of the target population. The sample was fairly homogeneous, as all 69 assessed candidates were British non-graduates and 99% of them were aged 20 years or less. Age, gender and socioeconomic status were tested as sources of potential test bias and were found to be non-significant. However, further analysis of potential biases is necessary with a more diverse sample. None of the candidates commented upon the ethical scenario to which their group was randomly allocated, suggesting that the process, rather than the content, was of primary interest. Given that candidates may be more or less inclined to engage with different scenarios, which could introduce an additional source of bias, it will be necessary to monitor the impact of the ethical scenarios in future administrations.

The analysis highlights the fact that there are 3 aspects of the teamwork exercise that need improving. Firstly, as the majority of candidates 'passed' the teamwork exercise, its discriminatory power is relatively limited. If applied for selection purposes in its present guise, it would most likely lead to some false‐positive judgements. It is, therefore, important to revisit the standard setting process with a view to enhancing the instrument's potential to discriminate between candidates.

Secondly, there appears to be room for conceptual clarification of some of the measured attributes and unwanted behaviours, in particular in terms of defining how we might distinguish between the desirable attribute of 'reflection' and the unwanted behaviour of 'not participating'. As these are largely non-verbal and visually imperceptible forms of behaviour, the process of assessing whether or not a candidate displays these behaviours may be prone to examiner error. It is possible that conceptual indistinctness between some of the measured attributes contributed positively towards the strong internal consistency reliability coefficient. Assessors may not have been able to clearly distinguish between the different forms of behaviours, and subsequently graded candidates with greater consistency than they might have done otherwise. Indeed, this may have compounded any propensity for halo effects, which occur when assessors' perceptions of 1 attribute favourably or unfavourably influence the assessment of other attributes, making a positive contribution to internal consistency reliability.[17] In addition, the literature on team-related attributes suggests that many more teamwork measures could be included, such as inclusiveness, interpersonal and communication skills, and supportiveness.[5] The assessors may have applied the global scores in such a way as to encompass these attributes. Whether the measurement of such attributes should be made explicit and transparent might be an area for exploration.

Thirdly, candidate feedback has been invaluable in highlighting the ability of the teamwork exercise to work as an induction to PBL pedagogy. In this case, it is imperative that the exercise is more thoroughly structured, supported and resourced. This will ensure that candidates' decisions as to whether to proceed with or withdraw their applications are fully informed. Moreover, candidates' concerns that the exercise may be difficult for some personality types suggest that some preliminary team‐building tasks may be useful.

Limitations of the study

Time and resource constraints determined that the teamwork exercise was applied only during the first round of student selection with direct school‐leavers. No mature candidates were offered the opportunity to participate. Consequently, the sample was homogeneous, with limited demographic and educational diversity. It will be necessary to monitor instrument performance, and particularly the issue of bias, in future applications when the sample will have greater diversity in terms of age and educational and employment experience.

As the Peninsula Medical School undergraduate programme has just been implemented, an analysis of the predictive validity of the teamwork exercise is hindered by 2 factors. Firstly, the analysis can only be based on the teamwork candidates who were successful in their applications and subsequently took up their places, a total of 16. The size of the sample violates the assumptions of potentially useful analysis techniques, such as linear regression. Secondly, there is a lack of criterion variables on which to base the analysis. At present, the teamwork exercise (and structured panel interview) is a poor measure of performance in the first year of the programme. It is anticipated that performance over subsequent years will demonstrate a stronger relationship with the assessments made during the selection process. Knowledge of the instrument's predictive validity will add considerably to the assessment of its utility. As with the formal interview, however, the potential competitiveness and artificiality of the selection environment may mask students' true abilities, skills and attributes. Relationships between performance in the teamwork exercise at selection and, once on the programme, performance in PBL groups, may not necessarily be linear.

Conclusion

The new teamwork instrument performs modestly well against the utility equation. For PBL curricula particularly, it appears to have greater face validity than the formal interview. The exercise closely replicates the learning environment that students will experience.

With reference to Miller's pyramid,[7] the teamwork exercise appears to be a more sophisticated instrument than the formal structured interview. Comparing the outcomes of the 2 selection methods highlights the difference between knowing which attributes make a good team player (assessed with the formal interview), and actually demonstrating the skills associated with being a good team player (assessed with the teamwork exercise). There is perhaps little to choose between these 2 characteristics as selection criteria, given that teamwork skills may be learned once on the programme. However, with its capacity to test for desirable and undesirable indicators of teamwork performance, and serve as an informative induction to PBL pedagogy, the teamwork exercise appears to have good utility in the specific context of PBL curricula.

Contributors: SEC conducted the qualitative and quantitative analyses and wrote the paper. JS designed the selection instrument and the programme of evaluation, including qualitative and quantitative data collection, and contributed significantly to drafts of the manuscript.

Acknowledgements: this paper is written in memory of our dear colleague and friend Richard Farrow, who was an exemplary role model and champion for effective teamwork. In addition, the authors would like to thank the applicants to Peninsula Medical School in 2002 who consented to participate in the selection exercise. Thanks go to John McLachlan and Nick Cooper for support in the development of the instrument and to Lisa Valentine for collating admissions data.

Funding: none.

Conflicts of interest: none.

Ethical approval: no ethical approval was sought for this project.

Overview

What is already known on this subject

The ability to work as a member of a team is important to the functioning of PBL groups and is a skill that should be screened for at admission.

What this study adds

This study provides preliminary evidence that a teamwork exercise can fulfil the selection criteria of PBL curricula.

Suggestions for further research

The predictive validity of the teamwork exercise and the formal structured panel interview should be monitored over time to explore the capacity of these instruments to discriminate between candidates.

References

1 General Medical Council. Tomorrow's Doctors. London: GMC 2003.

2 Shuler C, Fincham A. To admit or not to admit? That is the question... In: Schwartz P, Mennin S, Webb G, eds. Problem-based Learning. London: Kogan Page 2001; 126-34.

3 Epstein RM, Hundert EM. Defining and assessing professional competence. JAMA 2002;287(2):226-35. DOI: 10.1001/jama.287.2.226

4 Barrows HS. The essentials of problem-based learning. J Dent Educ 1998;62:630-3.

5 Lee RMKW, Kwan CY. Overview: PBL, what is it? The use of problem-based learning in medical education. McMaster University. http://www.fhs.mcmaster.ca/mdprog/overview%5fpbl.htm. [Accessed 23 September 2002.]

6 Mathieu JE, Day DV. Assessing processes within and between organisational teams: a nuclear power plant example. In: Brannick MT, Salas E, Prince C, eds. Team Performance Assessment and Measurement: Theory, Methods and Applications. Mahwah, New Jersey: Lawrence Erlbaum Associates 1997; 173-95.

7 Miller GE. The assessment of clinical skills/competence/performance. Acad Med 1990;65(9):63-7.

8 Van der Vleuten CPM. A paradigm shift in education: how to proceed with assessment? Paper presented at 9th International Ottawa Conference on Medical Education, March 2000.

9 Albanese MA, Mitchell MA. Problem-based learning: a review of literature on its outcomes and implementation issues. Acad Med 1993;68:52-81.

10 Fischer RG. The Delphi method: a description, review and criticism. J Acad Librarianship 1978;4:64-70.

11 Ritchie R, Spencer L. Qualitative data analysis for applied policy research. In: Bryman A, Burgess RG, eds. Analysing Qualitative Data. London: Routledge 1994; 173-94.

12 Blake M, Openshaw S. GB Profiles: a user guide. [Working paper.] Leeds: University of Leeds, School of Geography 1994.

13 Lantz CA, Nebenzahl E. Behaviour and interpretation of the kappa statistic: resolution of the two paradoxes. J Clin Epidemiol 1996;49(4):431-4. DOI: 10.1016/0895-4356(95)00571-4

14 Cicchetti DV, Feinstein AR. High agreement but low kappa. II. Resolving the paradoxes. J Clin Epidemiol 1990;43(6):551-8.

15 Feinstein AR, Cicchetti DV. High agreement but low kappa. I. The problems of two paradoxes. J Clin Epidemiol 1990;43(6):543-9. DOI: 10.1016/0895-4356(90)90158-L

16 Linn RL. Selection bias: multiple meanings. J Educ Meas 1984;21:33-47.

17 Nisbett RE, Wilson TD. The halo effect: evidence for unconscious alteration of judgements. J Pers Soc Psychol 1977;35:450-6. DOI: 10.1037//0022-3514.35.6.450

By Suzanne E Chamberlain and Judy Searle
