Dr. van Der Vleuten Cees P.M
Publications Authored By Dr. van Der Vleuten Cees P.M
We discuss activity theory's theoretical background and principles, and we show how these can be applied to the cultural research practice by discussing the steps involved in a cross-cultural study that we conducted, from formulating research questions to drawing conclusions. We describe how the activity system, the unit of analysis in activity theory, can serve as an organizing principle to grasp cultural complexity. We end with reflections on the theoretical and practical use of activity theory for cultural research and note that it is not a shortcut to capture cultural complexity: it is a challenge for researchers to determine the boundaries of their study and to analyze and interpret the dynamics of the activity system.
Using a linear regression model for each station, we calculated the checklist score cut-off from the regression equation at the global scale cut-off of 2. The OSCE pass-fail standard was defined as the average of all stations' standards. To determine the reliability, the root mean square error (RMSE) was calculated. The R² coefficient and the inter-grade discrimination were calculated to assess the quality of the OSCE.
The mean total test score was 60.78. The OSCE pass-fail standard and its RMSE were 47.37 and 0.55, respectively. The R² coefficients ranged from 0.44 to 0.79. The inter-grade discrimination score varied greatly among stations.
The RMSE of the standard was very small, indicating that BRM is a reliable method of setting standards for OSCEs, with the added advantage of providing data for quality assurance.
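The standard-setting procedure described above (regress checklist score on global rating per station, read off the checklist score at a global rating of 2, then average across stations) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the example data are hypothetical, and plain ordinary least squares is assumed for the "linear regression model".

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def station_cutoff(global_ratings, checklist_scores, borderline=2):
    """Checklist score predicted by the regression line at the borderline global rating."""
    intercept, slope = fit_line(global_ratings, checklist_scores)
    return intercept + slope * borderline


def osce_standard(stations, borderline=2):
    """Pass-fail standard: the average of the per-station cut-offs."""
    return mean(station_cutoff(g, c, borderline) for g, c in stations)


# Hypothetical data: (global ratings, checklist scores) for two stations.
station_a = ([1, 2, 3, 4, 5], [30, 40, 50, 60, 70])
station_b = ([1, 2, 3, 4, 5], [35, 50, 65, 80, 95])

cut_a = station_cutoff(*station_a)
cut_b = station_cutoff(*station_b)
standard = osce_standard([station_a, station_b])
print(cut_a, cut_b, standard)
```

Because every examinee's scores enter the regression, the same fit also yields per-station diagnostics such as R², which is where the quality-assurance data mentioned in the abstract come from.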
Clin. Anat., 2013. © 2013 Wiley Periodicals, Inc.
In this discussion paper we argue that meaningfulness and appropriateness of current validity evidence can be called into question and that we need alternative strategies to assessment and validity inquiry that build on current theories of learning and performance in complex and dynamic workplace settings.
Drawing from research in various professional fields we outline key issues within the mechanisms of learning, competence and performance in the context of complex social environments and illustrate their relevance to WBA. In reviewing recent socio-cultural learning theory and research on performance and performance interpretations in work settings, we demonstrate that learning, competence (as inferred from performance) as well as performance interpretations are to be seen as inherently contextualised, and can only be understood 'in situ'. Assessment in the context of work settings may, therefore, be more usefully viewed as a socially situated interpretive act.
We propose constructivist-interpretivist approaches towards WBA in order to capture and understand contextualised learning and performance in work settings. Theoretical assumptions underlying interpretivist assessment approaches call for a validity theory that provides the theoretical framework and conceptual tools to guide the validation process in the qualitative assessment inquiry. Basic principles of rigour specific to qualitative research have been established, and they can and should be used to determine validity in interpretivist assessment approaches. If used properly, these strategies generate trustworthy evidence that is needed to develop the validity argument in WBA, allowing for in-depth and meaningful information about professional competence.
Participants' subsequent exam performance was compared with that of non-participants.
About 71% of students who performed poorly in the new exam subsequently failed a course. Attendance at the workshops made no difference to short- or long-term pass rates. Attendance at more than three follow-up small group sessions significantly improved pass rates two semesters later, and was influenced by teacher experience.
Close similarity between predictor task and target task is important for accurate prediction of failure. Consideration should be given to dose effect and class size in the prevention of failure of at-risk students, and we recommend a systemic approach to intervention/remediation programmes, involving a whole semester of mandatory, weekly small group meetings with experienced teachers.
Yet, despite participants' convergent opinions on the elements of effective remediation, significant differences were found between outcomes of students working with experienced and inexperienced teachers. The current study explores the actual practice of teachers on this remediation course, aiming to exemplify elements of our theory of remediation and explore differences between teachers.
Since it is in the classroom context that the interactions that constitute the complex process of remediation emerge, this practice-based research has focused on direct observation of classroom teaching. Nineteen hours of small group sessions were recorded and transcribed. Drawing on ethnography and sociocultural discourse analysis, selected samples of talk-in-context demonstrate how the various elements of remediation play out in practice, highlighting aspects that are most effective, and identifying differences between experienced and novice teachers.
Long-term student outcomes are strongly correlated with teacher experience (r = 0.81). Compared to inexperienced teachers, experienced teachers provide more challenging, disruptive facilitation, and take a dialogic stance that encourages more collaborative group dynamics. They are more expert at diagnosing cognitive errors, provide frequent metacognitive time-outs and make explicit links across the curriculum.
Remediation is effective in small groups where dialogue is used for collaborative knowledge construction and social regulation. This requires facilitation by experienced teachers who attend to details of both content and process, and use timely interventions to foster curiosity and the will to learn. These teachers should actively challenge students' language use, logical inconsistencies and uncertainties, problematize their assumptions, and provide a metacognitive regulatory voice that can generate attitudinal shifts and nurture the development of independent critical thinkers.
Besides maximum facilitation of learning it should improve the validity and reliability of measurements and documentation of competence development. We explored how, in a competency-based curriculum, current theories on programmatic assessment interacted with educational practice.
In a development study including evaluation, we investigated the implementation of a theory-based programme of assessment. Between April 2011 and May 2012 quantitative evaluation data were collected and used to guide group interviews that explored the experiences of students and clinical supervisors with the assessment programme. We coded the transcripts and emerging topics were organised into a list of lessons learned.
The programme mainly focuses on the integration of learning and assessment by motivating and supporting students to seek and accumulate feedback. The assessment instruments were aligned to cover predefined competencies to enable aggregation of information in a structured and meaningful way. Assessments that were designed as formative learning experiences were increasingly perceived as summative by students. Peer feedback was experienced as a valuable method for formative feedback. Social interaction and external guidance seemed to be of crucial importance to scaffold self-directed learning. Aggregating data from individual assessments into a holistic portfolio judgement required expertise and extensive training and supervision of judges.
A programme of assessment with low-stakes assessments providing simultaneously formative feedback and input for summative decisions proved not easy to implement. Careful preparation and guidance of the implementation process was crucial. Assessment for learning requires meaningful feedback with each assessment. Special attention should be paid to the quality of feedback at individual assessment moments. Comprehensive attention for faculty development and training for students is essential for the successful implementation of an assessment programme.
It explores the perspectives of patients, midwives, nurses, general practitioners, and hospital boards on gynaecological competencies and compares these with the CanMEDS framework.
Clinical expertise, reflective practice, collaboration, a holistic view, and involvement in practice management were perceived to be important competencies for gynaecological practice. Although all the competencies were covered by the CanMEDS framework, there were some mismatches between stakeholders' perceptions of the importance of some competencies and their position in the framework.
The CanMEDS framework appears to offer relevant building blocks for specialty specific postgraduate training, which should be combined with the results of an exploration of specialty specific competencies to arrive at a postgraduate curriculum that is in alignment with professional practice.
The scenarios differed in the sequencing and alignment of VPs and related educational activities, tutor involvement, number of VPs, relevance to assessment and involvement of real patients. We sought students' perceptions on the VP scenarios in focus group interviews with eight groups of 4-7 randomly selected students (n = 39). The interviews were recorded, transcribed and analysed qualitatively.
The analysis resulted in six themes reflecting students' perceptions of important features for effective curricular integration of VPs: (i) continuous and stable online access, (ii) increasing complexity, adapted to students' knowledge, (iii) VP-related workload offset by elimination of other activities, (iv) optimal sequencing (e.g. lecture, then 1 to 2 VP(s), then tutor-led small group discussion, then real patient), (v) optimal alignment of VPs and educational activities, and (vi) inclusion of VP topics in assessment.
The themes appear to offer starting points for the development of a framework to guide the curricular integration of VPs. Their impact needs to be confirmed by studies using quantitative controlled designs.
The mean relevance score of the Delphi panel (n = 19) reached 4.2 on a five-point Likert-type scale (1 = not relevant and 5 = highly relevant) in the second round, meeting predefined criteria for completing the Delphi procedure. Faculty (n = 991) from 131 medical schools in 56 countries completed MORC. Exploratory factor analysis yielded three underlying factors (motivation, capability, and external pressure) in 12 subscales with 53 items. The scale structure suggested by exploratory factor analysis was confirmed by confirmatory factor analysis. Cronbach alpha ranged from 0.67 to 0.92 for the subscales. Generalizability analysis showed that the MORC results of 5 to 16 faculty members can reliably evaluate a school's organizational readiness for change.
MORC is a valid, reliable questionnaire for measuring organizational readiness for curriculum change in medical schools. It can identify which elements in a change process require special attention so as to increase the chance of successful implementation.
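For readers unfamiliar with the subscale reliability statistic reported above, the following is a minimal sketch of how Cronbach's alpha is computed from item-level responses, using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The response data are hypothetical and are not MORC data; the function name is an assumption for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one subscale.

    responses: list of rows, one per respondent, each a list of k item scores.
    """
    k = len(responses[0])
    items = list(zip(*responses))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 4 respondents, 3 Likert-type items.
subscale = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
alpha = cronbach_alpha(subscale)
print(round(alpha, 3))
```

Alpha rises when the items covary strongly relative to their individual variances, which is why it is read as a measure of internal consistency for each subscale.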
We conducted nine focus groups (two with medical students, three with residents, four with music students) and four individual interviews (with one clinician-educator, one music educator and two doctor-musicians), for a total of 37 participants. Analysis occurred alongside and informed data collection. Themes were identified iteratively using constant comparisons.
Cultural perspectives diverged in terms of where learning should occur, what learning outcomes are desired, and how learning is best facilitated. Whereas medicine valued learning by doing, music valued learning by lesson. Whereas medical learners aimed for competence, music students aimed instead for ever-better performance. Whereas medical learners valued their teachers for their clinical skills more than for their teaching abilities, the opposite was true in music, in which teachers' instructional skills were paramount. Self-assessment challenged learners in both cultures, but medical learners viewed self-assessment as a skill they could develop, whereas music students recognised that external feedback would always be required.
This comparative analysis reveals that medicine and music make culturally distinct assumptions about teaching and learning. The contrasts between the two cultures illuminate potential vulnerabilities in the medical learning culture, including the risks inherent in its competence-focused approach and the constraints it places on its own teachers. By highlighting these vulnerabilities, we provide a stimulus for reimagining and renewing medicine's educational practices.
In total, 138 students (in the third year out of five) completed a questionnaire about goal orientation, motivation, self-efficacy, control of learning beliefs and attitudes to feedback. Individual website usage was analysed over an 8-week period. Latent class analyses were used to identify profiles of students, based on their use of different aspects of the feedback website. Differences in learning-related student characteristics between profiles were assessed using analyses of variance (ANOVAs). Individual website usage was related to OSCE performance.
In total, 132 students (95.7%) viewed the website. The number of pages viewed ranged from two to 377 (median 102). Fifty per cent of students engaged comprehensively with the feedback, 27% used it in a minimal manner, whereas a further 23% used it in a more selective way. Students who were comprehensive users of the website scored higher on the value of feedback scale, whereas students who were minimal users scored higher on extrinsic motivation. Higher performing students viewed significantly more web pages showing comparisons with peers than weaker students did. Students who just passed the assessment made least use of the feedback.
Higher performing students appeared to use the feedback more for positive affirmation than for diagnostic information. Those arguably most in need engaged least. We need to construct feedback after summative assessment in a way that will more effectively engage those students who need the most help.
Using a constructivist grounded theory approach, we conducted 12 focus groups and nine individual interviews (with a total of 50 participants) across three cultures of professional training in, respectively, music, teacher training and medicine. Constant comparative analysis for recurring themes was conducted iteratively.
Each of the three professional cultures created a distinct context for learning that influenced how feedback was handled. Despite these contextual differences, credibility and constructiveness emerged as critical constants, identified by learners across cultures as essential for feedback to be perceived as meaningful. However, the definitions of credibility and constructiveness were distinct to each professional culture and the cultures varied considerably in how effectively they supported the occurrence of feedback with these critical characteristics.
Professions define credibility and constructiveness in culturally specific ways and create contexts for learning that may either facilitate or constrain the provision of meaningful feedback. Comparison with other professional cultures may offer strategies for creating a productive feedback culture within medical education.
This raised the question: 'How did those schools overcome the barrier of uncertainty avoidance?'
Austria offered the combination of a high uncertainty avoidance score and integrated curricula in all its medical schools. Twenty-seven key change agents in four medical universities were interviewed and transcripts analysed using thematic cross-case analysis.
Initially, strict national laws and limited autonomy of schools inhibited innovation and fostered an 'excuse culture': 'It's not our fault. It is the ministry's'. A new law increasing university autonomy stimulated reforms. However, just this law would have been insufficient as many faculty still sought to avoid change. A strong need for change, supportive and continuous leadership, and visionary change agents were also deemed essential.
In societies with strong uncertainty avoidance strict legislation may enforce resistance to curriculum change. In those countries opposition by faculty can be overcome if national legislation encourages change, provided additional internal factors support the change process.
The audio-taped interviews were transcribed verbatim and analyzed, and themes were identified. We performed investigator triangulation and member checking with clinical supervisors, and we triangulated the data with those of a similar study performed prior to the implementation of WBA.
WBA results in variable learning approaches. Depending on several factors (clinical supervisors, faculty-given feedback, and assessment function), students may swing between surface, deep, and effort-and-achievement learning approaches. Students' and supervisors' orientation to the process of WBA, utilization of peer feedback, and formative rather than summative assessment facilitate successful implementation of WBA and lead to students' deeper approaches to learning. Interestingly, students and their supervisors have contradictory perceptions of WBA.
A change in culture to unify students' and supervisors' perceptions of WBA, more accommodation of formative assessment, and feedback may result in students' deeper approach to learning.
Principal component analysis on data from a lecture in statistics for PhD students (n = 56) in psychology and health sciences revealed a three-component solution, consistent with the types of load that the different items were intended to measure. This solution was confirmed by a confirmatory factor analysis of data from three lectures in statistics for different cohorts of bachelor students in the social and health sciences (ns = 171, 136, and 148), and received further support from a randomized experiment with university freshmen in the health sciences (n = 58).
A questionnaire covering preparedness for practice, intensity of the transition, social support, and burnout was used. Structural equation modelling was used for statistical analysis.
Data from a third of the population were available (32%, n = 840; 43% male, 57% female). Preparation in generic competencies received lower ratings than in medical competencies. A total of 10% met the criteria for burnout and 18% scored high on the emotional exhaustion subscale. Perceived lack of preparation in generic competencies correlated with burnout (r = 0.15, p < 0.001). No such relation was found for medical competencies. Furthermore, social support protected against burnout.
These findings illustrate the relevance of generic competencies for new hospital consultants. Furthermore, social support facilitates this intense and stressful stage within the medical career.
A longitudinal qualitative study was performed in the Netherlands. Semi-structured interviews were conducted with new consultants. The study was guided by an interpretative phenomenological approach until saturation was reached. At 3-month intervals between July 2011 and March 2012, eight novice consultants in internal medicine were interviewed three times each about their supervisory role while on call. Interviews focused on their preparation for the role in training, the actions they took to master the role, and their progression over time.
Three interrelated domains of relevant factors emerged from the data: preparedness; personal characteristics, and contextual characteristics. Preparedness referred to the extent to which new consultants were prepared by training to take full responsibility for registrars' actions while supervising them from a distance. Personal characteristics, such as coping strategies and views on supervision, guided consultants' development as supervisors. Essential to this process were contextual characteristics, especially those concerning the extent to which the consultant knew the registrar, was familiar with departmental procedures, and had access to support from colleagues.
New consultants should be prepared for their supervisory role by training and by being given a proper introduction to their workplace. The former requires progressive independence and exposure to supervisory tasks during specialty training; the latter requires an induction programme to enable new consultants to familiarise themselves with the departmental environment and the registrars they will be supervising.
All consultants registered in the Netherlands in 2007-2009 (n = 2643) and Denmark in 2007-2010 (n = 1336) received in June 2010 and April 2011, respectively, a survey about their preparation for medical and generic competencies, perceived intensity and burnout. Power analysis resulted in required sample sizes of 542. Descriptive statistics and independent t-tests were used for analysis.
Data were available for 792 new consultants in the Netherlands and 677 new consultants in Denmark. Compared to their Dutch counterparts, Danish consultants perceived specialty training and the transition as less intense, reported higher levels of preparation for generic competencies and scored lower on burnout.
The findings underscore the importance of contextual aspects in the transition and suggest that Denmark succeeds better in aligning training with practice. Regulations regarding working hours and progressive independence of trainees appear to facilitate the transition.
Research in organisational psychology has proposed a mechanism whereby feedback seeking is influenced by motives and goal orientation mediated by the perceived costs and benefits of feedback. Building on a recently published model of resident doctors' feedback-seeking behaviour, we conducted a qualitative study to explore students' feedback-seeking behaviours in the clinical workplace.
Between April and June 2011, we conducted semi-structured face-to-face interviews with veterinary medicine students in Years 5 and 6 about their feedback-seeking behaviour during clinical clerkships. In the interviews, 14 students were asked about their goals and motives for seeking feedback, the characteristics of their feedback-seeking behaviour and factors influencing that behaviour. Using template analysis, we coded the interview transcripts and iteratively reduced and displayed the data until agreement on the final template was reached.
The students described personal and interpersonal factors to explain their reasons for seeking feedback. The factors related to intentions and the characteristics of the feedback provider, and the relationship between the feedback seeker and provider. Motives relating to image and ego, particularly when students thought that feedback might have a positive effect on image and ego, influenced feedback-seeking behaviour and could induce specific behaviours related to students' orientation towards particular sources of feedback, their orientation towards particular topics for and timing of feedback, and the frequency and method of feedback-seeking behaviour.
This study shows that during clinical clerkships, students actively seek feedback according to personal and interpersonal factors. Perceived costs and benefits influenced this active feedback-seeking behaviour. These results may contribute towards the optimising and developing of meaningful educational opportunities during clerkships.
Focusing on WBA as a recent instance of innovation in PGME, we conducted semi-structured interviews to explore perceptions of the effects of WBA in a purposive sample of Dutch trainees and (lead) consultants in surgical and non-surgical specialties. Interviews conducted in 2011 with 17 participants were analysed thematically using template analysis. To support the exploration of effects outside the domain of education, the study design was informed by theory on the diffusion of innovations.
Six domains of effects of WBA were identified: sentiments (affinity with the innovation and emotions); dealing with the innovation; specialty training; teaching and learning; workload and tasks, and patient care. Users' affinity with WBA partly determined its effects on teaching and learning. Organisational support and the match between the innovation and routine practice were considered important to minimise additional workload and ensure that WBA was used for relevant rather than easily assessable training activities. Dealing with WBA stimulated attention for specialty training and placed specialty training on the agenda of clinical departments.
These outcomes are in line with theoretical notions regarding innovations in general and may be helpful in the implementation of other innovations in PGME. Given the substantial effects of innovations outside the strictly education-related domain, individuals designing and implementing innovations should consider all potential effects, including those identified in this study.
Progress testing is longitudinal assessment in that it is based on subsequent equivalent, yet different, tests. The results of these are combined to determine the growth of functional medical knowledge for each student, enabling more reliable and valid decision making about promotion to a next study phase. The longitudinal integrated assessment approach has a demonstrable positive effect on student learning behaviour by discouraging binge learning. Furthermore, it leads to more reliable decisions as well as good predictive validity for future competence or retention of knowledge. Also, because of its integration and independence of local curricula, it can be used in a multi-centre collaborative production and administration framework, reducing costs, increasing efficiency and allowing for constant benchmarking. Practicalities include the relative unfamiliarity of faculty with the concept, the fact that remediation for students with a series of poor results is time consuming, the need to embed the instrument carefully into the existing assessment programme and the importance of equating subsequent tests to minimize test-to-test variability in difficulty. Where it has been implemented collaboratively, progress testing has led to satisfaction, provided the practicalities are heeded well.
Five key attributes of guidelines for communication skill training were identified: complexity, level of detail, format and organization, type of information, and trustworthiness/validity. The desired use of these attributes is related to specific educational purposes and learners' expertise. The low complexity of current communication guidelines is appreciated, but seems at odds with the wish for more valid communication guidelines.
Which guideline characteristics are preferred by users depends on the expertise of the learners and the educational purpose of the guideline.
Communication guidelines can be improved by modifying the key attributes in line with specific educational functions and learner expertise. For example, the communication guidelines used in GP training in the Netherlands seem to offer an oversimplified model of doctor-patient communication. This model may be suited for undergraduate learning, but does not meet the validity demands of physicians in training.
However, validity evidence from those interventions has not proved entirely adequate for the practical anatomy examination, and thus, further investigation was required. In this study, the validity evidence of SRF was examined using multiple choice questions (MCQs) constructed according to different levels of Bloom's taxonomy in comparison with the traditional free response format. A group of 100 medical students registered in a gross anatomy course volunteered to be enrolled in this study. The experimental MCQ examinations were part of graded midterm and final steeplechase practical examination. Volunteer students were instructed to complete the practical examinations twice, once in each of two separate examination rooms. The two separate examinations consisted of a traditional free response format and MCQ format. Scores from the two examinations (FRF and MCQ) displayed a strong correlation, even with higher level Bloom's taxonomy questions. In conclusion, the results of this study provide empirical evidence that the SRF (MCQ) response format is a valid method and can be used as an alternative to the traditional FRF steeplechase examination.
For stringency, we focused on a subset of assessment factor-learning effect associations that featured least commonly in a baseline qualitative study. Our aims were to determine whether these uncommon associations were operational in a broader but similar population to that in which the model was initially derived.
A cross-sectional survey of 361 senior medical students at one medical school was undertaken using a purpose-made questionnaire based on a grounded theory and comprising pairs of written situational tests. In each pair, the manifestation of an assessment factor was varied. The frequencies at which learning effects were selected were compared for each item pair, using an adjusted alpha to assign significance. The frequencies at which mechanism factors were selected were calculated.
There were significant differences in the learning effect selected between the two scenarios of an item pair for 13 of this subset of 21 uncommon associations, even when a p-value of < 0.00625 was considered to indicate significance. Three mechanism factors were operational in most scenarios: agency; response efficacy, and response value.
For a subset of uncommon associations in the model, the role of most assessment factor-learning effect associations and the mechanism factors involved were supported in a broader but similar population to that in which the model was derived. Although model validation is an ongoing process, these results move the model one step closer to the stage of usefully informing interventions. Results illustrate how factors not typically included in studies of the learning effects of assessment could confound the results of interventions aimed at using assessment to influence learning.
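The significance threshold quoted above (p < 0.00625) is an adjusted alpha. The abstract does not say how the adjustment was made; one reading consistent with the number is a Bonferroni-style division of a family-wise alpha of 0.05 by 8 comparisons. The sketch below illustrates that reading only: the family size of 8, the function names, and the p-values are all assumptions for illustration, not details taken from the study.

```python
def bonferroni_threshold(family_alpha, n_comparisons):
    """Per-comparison alpha under a Bonferroni adjustment."""
    return family_alpha / n_comparisons


def flag_significant(p_values, threshold):
    """True for each p-value strictly below the adjusted threshold."""
    return [p < threshold for p in p_values]


# ASSUMPTION: 0.05 / 8 reproduces the reported 0.00625; the study may have
# derived its adjusted alpha differently.
threshold = bonferroni_threshold(0.05, 8)

# Hypothetical p-values for three item pairs.
flags = flag_significant([0.001, 0.00625, 0.03], threshold)
print(threshold, flags)
```

Note that a p-value exactly equal to the threshold is not flagged, matching the strict inequality in the abstract.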
We collected and analysed modified mini-CEX forms completed by GP trainers and trainees. Since each trainee has the same trainer for the duration of one year, we used trainer-trainee pairs as the unit of analysis. We determined for all forms the frequency of the different types of narrative comments and rated their specificity on a three-point scale: specific, moderately specific, not specific. Specificity was compared between trainee-trainer pairs.
We collected 485 completed modified mini-CEX forms from 54 trainees (mean of 8.8 forms per trainee; range 1-23; SD 5.6). Trainer feedback was more frequently provided than trainee self-reflections, and action plans were very rare. The comments were generally specific, but showed large differences between trainee-trainer pairs.
The frequency of self-reflection and action plans varied, all comments were generally specific and there were substantial and consistent differences between trainee-trainer pairs in the specificity of comments. We therefore conclude that feedback is not so much determined by the instrument as by the users. Interventions to improve the educational effects of the feedback procedure should therefore focus more on the users than on the instruments.
In one experimental condition, a tutor in the video encouraged participants to elaborate by asking elaborative questions. In a second condition, the tutor asked superficial questions. After the discussion, all participants studied a text with relevant new information. Elaborative questions had no significant effect on recall of idea units from the text, p = .39, η(2) = .01. High-ability students outperformed low-ability students, p = .04, η(2) = .07, but this effect did not interact with the experimental treatment, p = .22, η(2) = .02. Suggestions for further research are presented.
Active peer discussion in a Computer Supported Collaborative Learning (CSCL) environment is associated with positive perceptions among medical students of subjective knowledge improvement. High student activity during discussions in a CSCL environment has been shown to go together with more task-focussed discussion, reflecting higher levels of knowledge construction. However, it remains unclear whether high discussion activity influences students' decisions to revise their CAT paper. The aim of this research is to examine whether students who revise their critical appraisal papers after discussion in a CSCL environment show more task-focussed activity and discuss critical appraisal topics more intensively than students who do not revise their papers.
Forty-seven medical students, stratified in subgroups, participated in a structured asynchronous online discussion of individual written CAT papers on self-selected clinical problems. The discussion was structured by three critical appraisal topics. After the discussion, the students could revise their paper. For analysis purposes, all students' postings were blinded and analysed by the investigator, unaware of students' characteristics and whether or not the paper was revised. Postings were counted and analysed by an independent rater. Postings were assigned to outside activity, non-task-focussed activity or task-focussed activity. Additionally, postings were assigned to one of the three critical appraisal topics. Analysis results were compared by revised and unrevised papers.
Twenty-four papers (51.6%) were revised after the online discussion. The discussions of the revised papers showed significantly higher numbers of postings, more task-focussed activities, and more postings about the two critical appraisal topics: "appraisal of the selected article(s)", and "relevant conclusion regarding the clinical problem".
A CSCL environment can support medical students in the execution and critical appraisal of authentic tasks in the clinical workplace. Revision of CAT papers appears to be related to discussion activity, more specifically to high task-focussed activity on critical appraisal topics.
This Guide presents a generic, systemic framework to help identify and explore improvements in the quality and defensibility of progress test data. The framework draws on the combined experience of the Dutch consortium, an individual medical school in the United Kingdom, and the bulk of the progress test literature to date. It embeds progress testing as a quality-controlled assessment tool for improving learning, teaching and the demonstration of educational standards. The paper describes strengths, highlights constraints and explores issues for improvement. These may assist in the establishment of potential or new progress testing in medical education programmes. They can also guide the evaluation and improvement of existing programmes.
This article focuses on a teaching approach and is a translational contribution to existing literature. In line with best evidence medical education, the aim of this article is twofold: to briefly inform teachers about constructivist learning theory and elaborate on the principles of constructive, collaborative, contextual, and self-directed learning; and to provide teachers with an example of how to implement these learning principles to change the approach to teaching surface anatomy. Student evaluations of this new approach demonstrate that the application of these learning principles leads to higher student satisfaction. However, research suggests that even better results could be achieved by further adjustments in the application of contextual and self-directed learning principles. Successful implementation and guidance of peer physical examination is crucial for the described approach, but research shows that other options, like using life models, seem to work equally well. Future research on surface anatomy should focus on increasing the students' ability to apply anatomical knowledge and defining the setting in which certain teaching methods and approaches have a positive effect.
Externally regulated educational interventions, like reflection, learning portfolios, assessments and progress meetings, are increasingly used to scaffold self-regulation. The aim of this study is to explore how postgraduate trainees regulate their learning in the workplace, how external regulation promotes self-regulation and which elements facilitate or impede self-regulation and learning.
In a qualitative study with a phenomenologic approach we interviewed first- and third-year GP trainees from two universities in the Netherlands. Twenty-one verbatim transcripts were coded. Through iterative discussion the researchers agreed on the interpretation of the data and saturation was reached.
Trainees used a short and a long self-regulation loop. The short loop took one week at most and was focused on problems that were easy to resolve and needed minor learning activities. The long loop was focused on complex or recurring problems needing multiple and planned longitudinal learning activities. External assessments and formal training affected the long but not the short loop. The supervisor had a facilitating role in both loops. Self-confidence was used to gauge competence. Elements influencing self-regulation were classified into three dimensions: personal (strong motivation to become a good doctor), interpersonal (stimulation from others) and contextual (organizational and educational features).
Trainees did purposefully self-regulate their learning. Learning in the short loop may not be visible to others. Trainees should be encouraged to actively seek and use external feedback in both loops. An important question for further research is which educational interventions might be used to scaffold learning in the short loop. Investing in supervisor quality remains important, since they are close to trainee learning in both loops.
Specifically, it investigated how students' cultural backgrounds impact on SDL in PBL and how this impact affects students.
A qualitative, cross-cultural, comparative case study was conducted in three medical schools. Data were collected through 88 semi-structured, in-depth interviews with Year 1 and 3 students, tutors and key persons involved in PBL, 32 observations of Year 1 and 3 PBL tutorials, document analysis, and contextual information. The data were thematically analysed using the template analysis method. Comparisons were made among the three medical schools and between Year 1 and 3 students across and within the schools.
The cultural factors of uncertainty and tradition posed a challenge to Middle Eastern students' SDL. Hierarchy posed a challenge to Asian students and achievement impacted on both sets of non-Western students. These factors were less applicable to European students, although the latter did experience some challenges. Several contextual factors inhibited or enhanced SDL across the cases. As students grew used to PBL, SDL skills increased across the cases, albeit to different degrees.
Although cultural factors can pose a challenge to the application of PBL in non-Western settings, it appears that PBL can be applied in different cultural contexts. However, its globalisation does not postulate uniform processes and outcomes, and culturally sensitive alternatives might be developed.
Contemporary theories on learning based on a constructivist paradigm offer the following insights: acquisition of knowledge and skills should be viewed as an ongoing process of exchange between the learner and his or her environment, so-called lifelong learning. This process can neither be atomized nor separated from the context in which it occurs. Four contemporary approaches are presented as examples.
The following shift in focus for future research is proposed: beyond isolated single factor effectiveness studies toward constructivist, non-reductionistic studies integrating the context.
Future research should investigate how constructivist approaches can be used in the medical context to increase effective learning and transition of communication skills.
The COLT was adapted based on experts' comments during a meeting and interviews, followed by a Delphi procedure (Part I). It was administered to teachers from two Dutch medical schools with different traditions in student-centred education (Part II; N=646). The data were analyzed using confirmatory factor analysis and reliability analysis.
324 teachers (50.2%) completed the questionnaire. Confirmatory factor analysis did not confirm the underlying theoretical model, but an alternative model demonstrated a good fit. This led to an instrument with eighteen items reflecting three underlying factors: 'teacher centredness', 'appreciation of active learning', and 'orientation to professional practice'. We found significant differences in COLT scores between the faculty of the two medical schools.
The COLT appears to be a construct-valid tool that yields reliable scores of teachers' conceptions of learning and teaching in student-centred medical education. Two of the three factors are new and may be specific to student-centred medical education. The COLT may be a promising tool to improve faculty development.
Remediation should support emotional needs and foster cognitive and metacognitive skills for self-regulation and critical thinking. Teachers of remediation need to motivate, critique, challenge and advise their learners, applying teaching and contextual expertise in a constructivist, student-centred environment that fosters curiosity and joy for learning. Teachers of remediation can mediate these processes through embodiment of five core roles: facilitator, nurturing mentor, disciplinarian, diagnostician and modeller of desired skills, attitudes and behaviours.
Remediation of struggling medical students can be achieved through a cognitive apprenticeship within a small community of inquiry that motivates and challenges the students. This community needs teachers capable of performing a unique combination of roles that demands high levels of teaching presence and practical wisdom.
Interviews were conducted and the resulting data analysed using a qualitative, phenomenological approach. Between October 2009 and January 2010, we interviewed 22 postgraduate general practice trainees at two institutions in the Netherlands. Three researchers analysed the transcripts of the interviews.
A three-step scheme emerged from the data. Feedback as part of WBA is of greater benefit to trainees if: (i) observation and feedback are planned by the trainee and trainer; (ii) the content and delivery of the feedback are adequate, and (iii) the trainee uses the feedback to guide his or her learning by linking it to learning goals. Negative emotions reported by almost all trainees in relation to observation and feedback led to different responses. Some trainees avoided observation, whereas others overcame their apprehension and actively sought observation and feedback. Active trainers were able to help trainees overcome their fears. Four types of trainer-trainee pairs were distinguished according to their engagement in observation and feedback. External requirements set by training institutions may stimulate inactive trainers and trainees.
In line with the literature, our results emphasise the importance of the content of feedback and the way it is provided, as well as the importance of its incorporation in trainees' learning. Moreover, we highlight the step before the actual feedback itself. The way arrangements for feedback are made appears to be important to feedback in formative WBA. Finally, we outline several factors that influence the success or failure of feedback but precede the process of observation and feedback.
In a constructivist grounded theory study, we interviewed 22 early-career academic doctors about experiences they perceived as influential in their learning. Although feedback emerged as important, responses to feedback were highly variable. To better understand how feedback becomes (or fails to become) influential, we used the theoretical framework of regulatory focus to re-examine all descriptions of experiences of receiving and responding to feedback.
Feedback could be influential or non-influential, regardless of its sign (positive or negative). In circumstances in which the individual's regulatory focus was readily determined, such as in choosing a career (promotion) or preparing for a high-stakes examination (prevention), the apparent influence of feedback was consistent with the prediction of regulatory focus theory. However, we encountered many challenges in applying regulatory focus theory to real feedback scenarios, including the frequent presence of a mixed regulatory focus, the potential for regulatory focus to change over time, and the competing influences of other factors, such as the perceived credibility of the source or content of the feedback.
Regulatory focus theory offers a useful, if limited, construct for exploring learners' responses to feedback in the clinical setting. The insights and predictions it offers must be considered in light of the motivational complexity of clinical learning tasks and of other factors influencing the impact of feedback.
137 complaints (98%) yielded 46 different unprofessional behaviours grouped into 18 categories. The element 'perceived medical complications and error' occurred most commonly (n=77), followed by 'having to wait for care' and 'insufficient or unclear clarification' (n=52, n=48, respectively). The combined non-cognitive elements of professionalism (especially aspects of communication) were far more prominently discussed than cognitive issues (knowledge/skills) related to medical error. Most categories of professionalism elements were considered important by physicians but, nevertheless, were identified in patient complaints analysis. Some issues (eg, 'altruism', 'appearance', 'keeping distance/respecting boundaries with patients') were not perceived as problematic by patients and/or relatives, while mentioned by physicians. Conversely, eight categories of poor professionalism revealed from complaint analysis (eg, 'having to wait for care', 'lack of continuity of care' and 'lack of shared decision making') were not considered essential by physicians.
The vast majority of unprofessional behaviour identified related to non-cognitive, professionalism aspects of care. Complaints pertaining to unsatisfactory communication were especially noticeable. Incongruence is noted between the physicians' and the patients' perception of actual care.
A generalisability coefficient of 0.8, on a scale of 0 to 1.0, was considered to indicate good reliability for assessment purposes. Pass/fail standards were based on laparoscopic experience: novices, intermediates, and experts (>100 procedures). The pass/fail standards were investigated for the PLUS performances of 33 second-year urological residents.
Fifteen novices, twenty-three intermediates and twelve experts were included. An inter-trial reliability of >0.80 was reached with two trials for each task. Inter-rater reliability of the quality measurements was 0.79 for two judges. Pass/fail scores were determined for the novice/intermediate boundary and the intermediate/expert boundary. Pass rates for second-year residents were 63.64% and 9.09%, respectively.
The PLUS assessment is reliable for setting a certification standard for second-year urological residents that serves as a starting point for residents to proceed to the next level of laparoscopic competency.
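As a quick arithmetic check on the pass rates reported above, the following minimal sketch shows that 63.64% and 9.09% of 33 residents are consistent with 21 and 3 residents passing the two boundaries. The counts 21 and 3 are inferred from the percentages and are not stated in the abstract; the function name `pass_rate` is hypothetical.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the percentage of candidates who passed, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total, 2)


# Assumed counts (inferred, not reported): 21 and 3 of the 33 residents.
novice_intermediate = pass_rate(21, 33)
intermediate_expert = pass_rate(3, 33)
print(novice_intermediate, intermediate_expert)
```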
A fitness-for-purpose approach defining quality was adopted to develop and validate guidelines.
First, in a brainstorm, ideas were generated, followed by structured interviews with 9 international assessment experts. Then, guidelines were fine-tuned through analysis of the interviews. Finally, validation was based on expert consensus via member checking.
In total 72 guidelines were developed and in this paper the most salient guidelines are discussed. The guidelines are related and grouped per layer of the framework. Some guidelines were so generic that they are applicable in any design consideration. These are: the principle of proportionality, the principle that rationales should underpin each decision, and the requirement of expertise. Logically, many guidelines focus on practical aspects of assessment. Some guidelines were found to be clear and concrete; others were less straightforward and were phrased more as issues for contemplation.
The set of guidelines is comprehensive and not bound to a specific context or educational approach. In keeping with the fitness-for-purpose principle, the guidelines are eclectic, requiring expert judgement to use them appropriately in different contexts. Further validation studies to test practicality are required.
From an interpretative constructivist perspective, we conducted a qualitative exploratory study using semi-structured interviews with a purposive sample of 16 lead consultants in the Netherlands between August 2010 and February 2011. The study design was based on the research questions and notions from corporate business and social psychology about the roles of change managers. Interview transcripts were analysed thematically using template analysis.
The lead consultants described change processes with different stages, including cause, development of content, and the execution and evaluation of change, and used individual change strategies consisting of elements such as ideas, intentions and behaviour. Communication is necessary to the forming of a strategy and the implementation of change, but the nature of communication is influenced by the strategy in use. Lead consultants differed in their degree of awareness of the strategies they used. Factors influencing approaches to change were: knowledge, ideas and beliefs about change; level of reflection; task interpretation; personal style, and department culture.
Most lead consultants showed limited awareness of their own approaches to change. This can lead them to adopt a rigid approach, whereas the ability to adapt strategies to circumstances is considered important to effective change management. Interventions and research should be aimed at enhancing the awareness of lead consultants of approaches to change in PGME.
The English-language literature was searched in PubMed, PsycINFO, and Medline without restriction on type or date of publication. The most prominent theme identified in the literature was the function of assessment, characterized as summative or formative, and the general effect of assessment on students' learning approaches. The literature review pointed clearly to the complexity of the relationship between learning environment, students' perceptions of assessment demands, and students' approaches to learning. Many factors (extrinsic and intrinsic) were theoretically proposed to mediate students' approaches to learning in response to their assessment. However, few of these factors were researched in the published literature. Formative assessment is likely to contribute to students' deep approach to learning while summative assessment is likely to contribute to their surface approach. However, these effects are not definite and further research about the complex relationship between assessment and students' learning is required.
The purpose of this study was to explore whether the model was operational in a clinical context as a first step in this process.
Given the complexity of the model, we adopted a qualitative approach. Data from in-depth interviews with eighteen medical students were subject to content analysis. We utilised a code book developed previously using grounded theory. During analysis, we remained alert to data that might not conform to the coding framework and open to the possibility of deploying inductive coding. Ethical clearance and informed consent were obtained.
The three components of the model, i.e. assessment factors, mechanism factors and learning effects, were all evident in the clinical context. Associations between these components could all be explained by the model. Interaction with preceptors was identified as a new subcomponent of assessment factors. The model could explain the interrelationships of the three facets of this subcomponent, i.e. regular accountability, personal consequences and emotional valence of the learning environment, with previously described components of the model.
The model could be utilized to analyse and explain observations in an assessment context different to that from which it was derived. In the clinical setting, the (negative) influence of preceptors on student learning was particularly prominent. In this setting, learning effects resulted not only from the high-stakes nature of summative assessment but also from personal stakes, e.g. for esteem and agency. The results suggest that to influence student learning, consequences should accrue from assessment that are immediate, concrete and substantial. The model could have utility as a planning or diagnostic tool in practice and research settings.
The first experiences with the programme show that students think that the programme has high learning value and the assessment is sufficiently robust. Many of the commonly reported weaknesses of work-based assessment (not a good fit with the educational context, too complex, too bureaucratic and too much work) were not mentioned by the students.
The authors reanalyzed 104 previously published comparisons involving a single, problem-based medical school in the Netherlands (Maastricht University's medical school), using student attrition and study duration data from this school and the schools with which it was compared. The authors removed bias by reequalizing the comparison groups in terms of attrition and study duration.
The uncorrected data showed no differences between problem-based and conventional curricula: Mean effect sizes as expressed by Cohen d were 0.02 for medical knowledge and 0.07 for diagnostic reasoning. However, the reanalysis demonstrated medium-level effect sizes favoring the problem-based curriculum. After corrections for attrition and study duration, the mean effect size for knowledge acquisition was 0.31 and for diagnostic reasoning was 0.51.
Effects of the Maastricht problem-based curriculum were masked by differential attrition and differential exposure in the original studies. Because this school has been involved in many studies included in influential literature reviews published in the past 20 years, the authors' findings have implications for the assessment of the value of problem-based learning put forward by these reviews.
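The effect sizes above are reported as Cohen's d. As an illustration only, the sketch below computes d for two groups using the common pooled-standard-deviation form; the abstract does not say which variant the authors used, and the function name `cohens_d` and the sample scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical scores, not data from the study.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)
```

By Cohen's widely used benchmarks, values around 0.2 are considered small, around 0.5 medium, and around 0.8 large.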
Moreover, it evaluated the effect of self-assessment process on students' study strategies within a community of clinical practice.
We conducted a qualitative phenomenological study from May 2008 to December 2009. We held 37 semi-structured individual interviews with three different cohorts of undergraduate medical students until we reached data saturation. The cohorts were exposed to different contexts while experiencing their clinical years' assessment program. In the interviews, students' perceptions and interpretations of 'self-assessment practice' and 'supervisor-provided feedback' within different contexts and the resulting study strategies were explored.
The analysis of interview data with the three cohorts of students yielded three major themes: strategic practice of self-assessment, self-assessment and study strategies, and feedback and study strategies. It appears that self-assessment is not appropriate within a summative context, and its implementation requires cultural preparation. Despite education and orientation on the two major components of the self-assessment process, feedback was more effective in enhancing deeper study strategies.
This research suggests that the theoretical advantages linked to the self-assessment process are a result of its feedback component rather than the practice of self-assessment isolated from feedback. Further research exploring the effects of different contextual and personal factors on students' self-assessment is needed.
Participants were asked to reflect on experiences they considered to have been influential during their training. Constant comparative analysis for emerging themes was conducted iteratively with data collection.
A model of clinical learning emerged in which the clinical work itself is central. As they observe and participate in clinical work, learners can attend to a variety of sources of information that facilitate the interpretation of the experience and the construction of knowledge from it. These 'learning cues' include feedback, role models, clinical outcomes, patient or family responses, and comparisons with peers. The integration of a cue depends on the learner's judgement of its credibility. Certain cues, such as clinical outcomes or feedback from patients, are seen as innately credible, whereas other cues, particularly feedback from supervisors, are subjected to critical judgement.
Learners make complex judgements regarding the credibility of information about clinical performance. Credibility judgements influence the learning that arises from the clinical experience. Further understanding of how such judgements are made could guide educators in providing credible information to learners.
Kane's views on validity as represented by a series of arguments provide a useful framework from which to highlight the value of different widely used approaches to improve the quality and validity of assessment procedures.
In this paper we discuss four inferences which form part of Kane's validity theory: from observations to scores; from scores to universe scores; from universe scores to target domain, and from target domain to construct. For each of these inferences, we provide examples and descriptions of approaches and arguments that may help to support the validity inference.
As well as standard psychometric methods, a programme of assessment makes use of various other arguments, such as: item review and quality control, structuring and examiner training; probabilistic methods, saturation approaches and judgement processes, and epidemiological methods, collation, triangulation and member-checking procedures. In an assessment programme each of these can be used.
Analysis of variance examined differences between years, and regression analysis examined the relationship between deliberate practice and skill test results.
875 students participated (90%). Factor analysis yielded four factors: planning, concentration/dedication, repetition/revision, study style/self reflection. Student scores on the subscale 'planning' increased over time, while scores on the subscale 'repetition/revision' decreased. Student results on the clinical skill test correlated positively with scores on the subscales 'planning' and 'concentration/dedication' in years 1 and 3, and with scores on the subscale 'repetition/revision' in year 1.
The positive effects on test results suggest that the role of deliberate practice in medical education merits further study. The cross-sectional design is a limitation of the study; the large representative sample is a strength. The vanishing effect of repetition/revision may be attributable to inadequate feedback. Deliberate practice advocates sustained practice to address weaknesses, identified by (self-)assessment and stimulated by feedback. Further studies should use a longitudinal prospective design and extend the scope to expertise development during residency and beyond.
A competency framework was developed based on the analysis of focus group interviews with 54 recently graduated veterinarians and clients and subsequently validated in a Delphi procedure with a panel of 29 experts, representing the full range and diversity of the veterinary profession. The study resulted in an integrated competency framework for veterinary professionals, which consists of 16 competencies organized in seven domains: veterinary expertise, communication, collaboration, entrepreneurship, health and welfare, scholarship, and personal development. Training veterinarians who are able to use and integrate the seven domains in their professional practice is an important challenge for today's veterinary medical schools. The Veterinary Professional (VetPro) framework provides a sound empirical basis for the ongoing debate about the direction of veterinary education and curriculum development.
Expertise theories highlight the multistage processes involved. The transition from novice to expert is characterised by an increase in the aggregation of concepts from isolated facts, through semantic networks to illness scripts and instance scripts. The latter two stages enable the expert to recognise the problem quickly and form a quick and accurate representation of the problem in his/her working memory. The striking difference between experts and novices is not per se the possession of more explicit knowledge but the superior organisation of knowledge in the expert's brain and its pairing with multiple real experiences, enabling not only better but also more efficient problem solving. Psychometric theories focus on the validity of the assessment (does it measure what it purports to measure?) and its reliability (are the outcomes of the assessment reproducible?). Validity is currently seen as building a train of arguments of how best observations of behaviour (answering a multiple-choice question is also a behaviour) can be translated into scores and how these can be used at the end to make inferences about the construct of interest. Reliability theories can be categorised into classical test theory, generalisability theory and item response theory. All three approaches have specific advantages and disadvantages and different areas of application. Finally in the Guide, we discuss the phenomenon of assessment for learning as opposed to assessment of learning and its implications for current and future development and research.
Our purpose was to clarify the influence of context on reasoning, to build upon education theory and to generate implications for education practice.
Qualitative data about experts were gathered from two sources: think-aloud protocols reflecting concurrent thought processes that occurred while board-certified internists viewed videotape encounters, and free-text responses to queries that explicitly asked these experts to comment on the influence of selected contextual factors on their clinical reasoning processes. These data sources provided both actual performance data (think-aloud responses) and opinions on reflection (free-text answers) regarding the influence of context on reasoning. Results for each data source were analysed for emergent themes and then combined into a unified theoretical model.
Several themes emerged from our data and were broadly classified as components influencing the impact of contextual factors, mechanisms for addressing contextual factors, and consequences of contextual factors for patient care. Themes from both data sources had good overlap, indicating that experts are somewhat cognisant of the potential influences of context on their reasoning processes; notable exceptions concerned the themes of missed key findings, balancing of goals and the influence of encounter setting, which emerged in the think-aloud but not the free-text analysis.
Our unified model is consistent with the tenets of cognitive load, situated cognition and ecological psychology theories. A number of potentially modifiable influences on clinical reasoning were identified. Implications for doctor training and practice are discussed.
Three student cohorts were taught using one instructional format per subject area so that each cohort received a different instructional format for each of the three subject areas. Outcome measures (objective structured clinical examination, video quiz, written examination) were selected to determine the effect of each instructional format on the clinical reasoning of students.
Increasingly authentic instructional formats did not significantly improve clinical reasoning performance across all outcome measures and subject areas. However, the results of the video quiz showed significant differences in the anaemia subject area between students who had been instructed using the paper case and live SP-based formats (scores of 47.4 and 57.6, respectively; p = 0.01) and in the abdominal pain subject area, in which students instructed using the DVD format scored higher than students instructed using either the paper case or SP-based formats (scores of 41.6, 34.9 and 31.2, respectively; p = 0.002).
Increasing the authenticity of instructional formats does not appear to significantly improve clinical reasoning performance in a pre-clerkship course. Medical educators should balance increases in authenticity with factors such as cognitive load, subject area and learner experience when designing new instructional formats.
The interview guide was based on questionnaire results; the overall response rate for Years 1-3 was 90% (n = 875). Students report a variety of activities to improve their physical examination skills. On average, students devote 20% of self-study time to skill training, with Year 1 students practising significantly more than Year 3 students. Practice patterns shift from just-in-time learning to a longitudinal self-directed approach. Factors influencing this change are assessment methods and simulated/real patients. Learning resources used include textbooks, examination guidelines, scientific articles, the Internet, videos/DVDs and scoring forms from previous OSCEs. Students practise skills on fellow students in university rooms or at home, and they also mention family and friends as practice partners. Simulated/real patients stimulate students to practise physical examination skills, initially causing confusion and anxiety about skill performance but leading to increased feelings of competence. Difficult or enjoyable skills stimulate students to practise. The strategies students adopt to master physical examination skills outside timetabled training sessions are self-directed. OSCE assessment does have influence, but learning also takes place when there is no upcoming assessment. Simulated and real patients provide strong incentives to work on skills. Early patient contacts make students feel more prepared for clinical practice.
Using a strategic planning approach, a semi-structured open-ended questionnaire on the future of their profession was sent to 102 Dutch gynecologists. Through inductive analysis, a future perspective and its needed competencies were identified and compared to the CanMEDS framework.
The 62 responses showed content validity for the CanMEDS roles. Additionally, two roles were identified: advanced technology user and entrepreneur. Within the role Communicator, the focus will change through more active patient participation. The roles Collaborator and Manager are predicted to change in focus because of an increase in complex interdisciplinary teamwork and leadership roles.
By studying the Dutch gynecologists' perspective of the future in a strategic planning approach, two additional roles and focus areas within a contemporary competency framework were identified. The perspective of clinicians on future health care provides valuable messages on how to design future-proof curricula.
Sixty bowel cancer screening polypectomy videos were randomly chosen for analysis and were scored independently by 7 expert assessors by using DOPyS. Each parameter and the global rating were scored from 1 to 4 (scores ≥3 = competency). The scores were analyzed by using generalizability theory (G theory).
Fifty-nine of the 60 videos were assessable and scored. The majority of the assessors agreed across the pass/fail divide for the global assessment scale in 58 of 59 (98%) polyps. For G-theory analysis, 47 of the 60 videos were analyzed. G-theory analysis suggested that DOPyS is a reliable assessment tool, provided that it is used by 2 assessors to score 5 polypectomy videos all performed by 1 endoscopist. DOPyS scores obtained in this format would reflect the endoscopist's competence.
Small sample and polyp size.
This study is the first attempt to develop and validate a tool designed specifically for the assessment of technical skills in performing polypectomy. G-theory analysis suggests that DOPyS could reliably reflect an endoscopist's competence in performing polypectomy provided a requisite number of assessors and cases were used.
This has led to a broadened perspective on the types of construct assessment tries to capture, the way information from various sources is collected and collated, the role of human judgement and the variety of psychometric methods to determine the quality of the assessment. Research into the quality of assessment programmes, how assessment influences learning and teaching, new psychometric models and the role of human judgement is much needed.
We conducted an international qualitative study using focus groups and drawing on principles of grounded theory. We recruited volunteer participants from three undergraduate and two postgraduate programmes using structured self-assessment activities (e.g. portfolios). We asked learners to describe their perceptions of and experiences with formal and informal activities intended to inform self-assessment. We conducted analysis as a team using a constant comparative process.
Eighty-five learners (53 undergraduate, 32 postgraduate) participated in 10 focus groups. Two main findings emerged. Firstly, the perceived effectiveness of formal and informal assessment activities in informing self-assessment appeared to be both person- and context-specific. No curricular activities were considered to be generally effective or ineffective. However, the availability of high-quality performance data and standards was thought to increase the effectiveness of an activity in informing self-assessment. Secondly, the fostering and informing of self-assessment was believed to require credible and engaged supervisors.
Several contextual and personal conditions consistently influenced learners' perceptions of the extent to which assessment activities were useful in informing self-assessments of performance. Although learners are not guaranteed to be accurate in their perceptions of which factors influence their efforts to improve performance, their perceptions must be taken into account; assessment strategies that are perceived as providing untrustworthy information can be anticipated to have negligible impact.
Eight teachers at the Vrije Universiteit (VU) University Medical Centre in Amsterdam attended a training course on the use of the MAAS-Global instrument, which they subsequently used to assess the consultation skills of 53 GPTs in 176 videotaped consultations (102 with SPs, 74 with RPs). All consultations were randomly allocated and assessed by two teachers independently. The reliability of the ratings was estimated using generalisability theory.
It was easier to obtain acceptable reliability using RP consultations than SP consultations. Two assessors and five consultations were required to achieve minimal reliability (generalisability coefficient 0.7) with RPs, whereas three assessors and 30 consultations were needed to achieve minimal reliability with SPs.
Inter-observer and context variability in the assessment of the consultation skills of GPTs remains high. To achieve acceptable levels of reliability, large samples of observations are required in both formats, but, interestingly, RP encounters require a smaller sample than SP encounters.
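The decision-study reasoning behind statements such as "two assessors and five consultations" can be sketched as follows. This is a minimal illustration only. It assumes a fully crossed trainee x assessor x consultation design, which may differ from the design actually analysed, and the variance components are hypothetical placeholders rather than values from the study. The function names `g_coefficient` and `smallest_design` are ours.

```python
def g_coefficient(var_p, var_pr, var_pc, var_prc_e, n_raters, n_cases):
    """Relative generalisability coefficient for a crossed p x r x c design.

    var_p      variance between trainees (the signal of interest)
    var_pr     trainee-by-assessor interaction variance
    var_pc     trainee-by-consultation interaction variance
    var_prc_e  residual variance (three-way interaction plus error)
    """
    relative_error = (
        var_pr / n_raters
        + var_pc / n_cases
        + var_prc_e / (n_raters * n_cases)
    )
    return var_p / (var_p + relative_error)


def smallest_design(components, target=0.7, max_raters=5, max_cases=40):
    """For each number of assessors, find the fewest consultations that
    reach the target coefficient. Assessor counts that never reach the
    target within max_cases are left out of the result."""
    result = {}
    for n_r in range(1, max_raters + 1):
        for n_c in range(1, max_cases + 1):
            if g_coefficient(**components, n_raters=n_r, n_cases=n_c) >= target:
                result[n_r] = n_c
                break
    return result


# Hypothetical variance components, chosen only to show the mechanics.
example_components = {
    "var_p": 1.0,
    "var_pr": 0.5,
    "var_pc": 0.5,
    "var_prc_e": 1.0,
}
designs = smallest_design(example_components, target=0.7)
for n_r, n_c in designs.items():
    print(f"{n_r} assessor(s): at least {n_c} consultation(s)")
```

Larger interaction or residual variance, as might be expected with more variable encounters, pushes the required number of consultations up, which is the pattern the abstract reports when comparing the two formats.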
In each school, teachers, management and the examination board participated. Results show that the two schools use different approaches to assure assessment quality. The innovative school seems to be more aware of its own strengths and weaknesses, to have a more positive attitude towards teachers, students, and educational innovations, and to explicitly involve stakeholders (i.e., teachers, students, and the work field) in its assessments. This school also had a more explicit vision of the goal of competence-based education and could design its assessments in accordance with this goal.
Regarding the quality of feedback, the aggregated score for each of the three categories was not significantly different between the two groups, neither for the interim nor for the final assessment. Some trends that were not statistically significant but nevertheless noteworthy were observed. Feedback in the web-based group was more often unrelated to observed behaviour for several categories for both the interim and final assessment. Furthermore, most comments relating to the category 'Dealing with oneself' consisted of descriptions of a student's attendance, thereby neglecting other aspects of personal functioning. The survey identified significant differences between the groups for all questionnaire items regarding feasibility, acceptability and perceived usefulness in favour of the paper-based form. The use of a web-based instrument for professional behaviour assessment yielded a significantly higher number of comments compared to the traditional paper-based assessment. Unfortunately, the quality of the feedback obtained by the web-based instrument, as measured by several generally accepted feedback criteria, did not parallel this increase.
This study explored the pre-assessment learning effects of summative assessment in theoretical modules by examining the variables at play in a multifaceted assessment system and the relationships between them. Using a grounded theory strategy, in-depth interviews were conducted with individual medical students and analyzed qualitatively. Respondents' learning was influenced by task demands and system design. Assessment impacted on respondents' cognitive processing activities and metacognitive regulation activities. Individually, our findings confirm findings from other studies in disparate non-medical settings and identify some new factors at play in this setting. Taken together, findings from this study provide, for the first time, some insight into how a whole assessment system influences student learning over time in a medical education setting. The findings from this authentic and complex setting paint a nuanced picture of how various factors in an assessment system interact in intricate and multifaceted ways to influence student learning. A model linking the sources, mechanism and consequences of the pre-assessment learning effects of summative assessment is proposed that could help enhance the use of summative assessment as a tool to augment learning.
We searched the PubMed, EMBASE and PsycINFO databases for articles pertaining to script concordance testing (SCT). We then reviewed these articles to evaluate the construct validity of the script concordance method, following an established approach for analysing validity data from five categories: content; response process; internal structure; relations to other variables, and consequences.
Content evidence derives from clear guidelines for the creation of authentic, ill-defined scenarios. High internal consistency reliability supports the internal structure of SCT scores. As might be expected, SCT scores correlate poorly with assessments of pure factual knowledge, in which correlations for more advanced learners are lower. The validity of SCT scores is weakly supported by evidence pertaining to examinee response processes and educational consequences.
Published research generally supports the use of SCT to assess the interpretation of clinical data under conditions of uncertainty, although specifics of the validity argument vary and require verification in different contexts and for particular SCTs. Our review identifies potential areas of further validity inquiry in all five categories of evidence. In particular, future SCT research might explore the impact of the script concordance method on teaching and learning, and examine how SCTs integrate with other assessment methods within comprehensive assessment programmes.
To find empirical evidence for the factors claimed to have an influence on anatomical knowledge of students.
A literature search.
The existing literature lacks information of sufficient quantity and quality to support any of the claims, but the gathered literature did reveal some fascinating insights, which are discussed.
Anatomy education should be made as effective as possible, as nobody will deny that medical students cannot do without anatomical knowledge. Because of promising findings in the areas of teaching in context, vertical integration and assessment strategies, it is recommended that future research into anatomy education should focus on these factors.
This article presents the framework for professional behaviour (PB) that is used at Maastricht medical school, the Netherlands.
The approach to PB used in the Dutch medical schools is described with special attention to 4 years (2005-2009) of experience with PB education in the first 3 years of the 6-year undergraduate curriculum of Maastricht medical school. Future challenges are identified.
The adages 'Assessment drives learning' and 'They do not respect what you do not inspect' [Cohen JJ. 2006. Professionalism in medical education, an American perspective: From evidence to accountability. Med Educ 40, 607-617] suggest that formative and summative aspects of PB assessment can be combined within an assessment framework. Formative and summative assessments do not represent contrasting but rather complementary approaches. The Maastricht medical school framework combines the two approaches, as two sides of the same coin.
We found that students working in heterogeneous groupings interact with students with whom they do not normally interact, learn a lot more from each other because of their differences in language and academic preparedness, and become better prepared for their future professions in multicultural societies. On the other hand, we found that students segregated in the tutorials along racial lines and that status factors disempowered students and subsequently reduced their productivity. Another challenge was that academic and language diversity hindered student learning. In light of these findings, we recommend that teachers receive special diversity training to deal with heterogeneous groups and the tensions that arise. Attention should be given to creating 'the right mix' for group learning in diverse student populations. The findings demonstrate that collaborative heterogeneous learning has two sides that need to be balanced. On the positive side is the 'ideology' behind mixing diverse students, and on the negative side the 'practice' of mixing students. More research is needed to explore these variations and their efficacy in more detail.
Between January and May 2009, 14 physicians were interviewed who had commenced an attending post in internal medicine or obstetrics-gynecology between six months and two years earlier, within the Netherlands. Interviews focused on the attendings' perceptions of the transition, their socialization within the new organization, and the preparation they had received during residency training. The interview transcripts were openly coded, and through constant comparison, themes emerged. The research team discussed the results until full agreement was reached.
A conceptual framework emerged from the data, consisting of three themes interacting in a longitudinal process. The framework describes how novel disruptive elements (first theme) due to the transition from resident to attending physician are perceived and acted on (second theme), and how this directs new attendings' personal development (third theme).
The conceptual framework finds support in transition psychology and notions from organizational socialization literature. It provides insight into the transition from resident to attending physician that can inform measures to smooth the intense transition.
Results were analysed for emergent themes.
Remedial programmes for at-risk medical students should be mandatory, but should respect students' identity as repeaters. Attitude and motivation are key, and working in stable groups provides essential emotional and cognitive support. The learning environment needs to foster changes in students' ways of thinking and their development as flexible, reflective learners. These endeavours require support from honest teachers with rigorous expectations and good facilitation skills.
Successful remediation needs to challenge students' conceptions of learning, works best in groups with skilled facilitators, and must take into account a blend of cognitive and affective factors and the complex interplay between learner and environment. Given a carefully designed programme, at-risk medical students can learn to make effective and lasting changes to their approach to study, and their views of learning can come to converge with influential ideas in the education literature.
In the present study, we employed two established theories as frameworks with the purpose of assessing the extent to which different views of the same clinical encounter (a three-component, Year 2 medical student objective structured clinical examination [OSCE] station) are similar to or differ from one another.
We performed univariate comparisons between the individual items on each of the three components of the OSCE: the standardised patient (SP) checklist (patient perspective); the post-encounter form (trainee perspective), and the oral presentation rating form (faculty perspective). Confirmatory factor analysis (CFA) of the three-component station was used to assess the fit of the three-factor (three-viewpoint) model. We also compared tercile performance across these three views as a form of extreme groups analysis.
Results from the CFA yielded a measurement model with reasonable fit. Moderate correlations between the three components of the station were observed. Individual trainee performance, as measured by tercile score, varied across components of the station.
Our work builds on research in fields outside medicine, with results yielding small to moderate correlations between different perspectives (and measurements) of the same event (SP checklist, post-encounter form and oral presentation rating form). We believe obtaining multiple perspectives of the same encounter provides a more valid measure of a student's clinical performance.
To explore students' perceptions about a newly introduced integrated feedback and assessment instrument to support self-directed learning in clinical practice. Students collected feedback from clinical supervisors and wrote it on a competency-based format. This feedback was used for self-assessment, which had to be completed before the final assessment.
Four focus group discussions were conducted with second-year and final-year Midwifery students. Focus groups were audiotaped, transcribed verbatim and analysed thematically using ATLAS.ti for qualitative data analysis.
The analysis of the transcripts suggested that integrating feedback and assessment supports participation and active involvement in learning by collecting, writing, asking, reading and rereading feedback. Under the condition of training and dedicated time, these learning activities stimulate reflection and facilitate the development of strategies for improvement. The integration supports self-assessment and formative assessment but the value for summative assessment is contested. The quality of feedback and empowerment by motivated supervisors are essential to maximise the learning effects.
The integrated Midwifery Assessment and Feedback Instrument is a valuable tool for supporting formative learning and assessment in clinical practice, but its effect on students' self-directed learning depends on the feedback and support from supervisors.
We reasoned that the progress test (PT) data should be flexibly accessible in all pathways and with any available comparison data, according to the personal interest of the learner. For that purpose, a web-based tool (Progress test Feedback, the ProF system) was developed. This article presents the principles and features of the generated feedback and shows how it can be used. In addition to enhancing the feedback, the ProF database of longitudinal PT data also provides new opportunities for research on knowledge growth, and these are currently being explored.
Students who failed and then repeated first semester were required to participate in a cognitive skills programme, following a syllabus based on principles drawn from both educational experience and multi-disciplinary theory and practice. Performance of programme participants was compared to the performance of students who repeated prior to the mandatory programme.
Of the participants (n = 216), 91% passed their repeat semester, compared to 58% (n = 715) for controls (p < 0.0001). This significant effect persisted for progression through the school for the subsequent three semesters (p < 0.0005).
A mandatory programme that draws on a blend of theories and research-proven techniques can make a positive difference to the outcomes for at-risk medical students.
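The comparison of pass rates reported above (91% of 216 participants versus 58% of 715 controls) can be checked with a simple two-proportion z-test. The abstract does not state which test the authors used, so this is a sketch under that assumption, not a reproduction of their analysis. The function name `two_proportion_z_test` is ours, and the inputs are the rounded percentages from the abstract rather than exact counts.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided z-test for the difference between two independent
    proportions, using the pooled proportion for the standard error.
    Returns (z, p_value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Rounded figures taken from the abstract; exact counts are not reported.
z_stat, p_val = two_proportion_z_test(0.91, 216, 0.58, 715)
print(f"z = {z_stat:.2f}, two-sided p = {p_val:.2e}")
```

Under these assumptions the difference is far larger than its standard error, which is in line with the reported p < 0.0001. The sketch ignores any differences between the cohorts other than programme participation.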
The items within the instrument are clustered around motivational and cognitive factors based on Slavin's theoretical framework. A confirmatory factor analysis (CFA) was carried out to estimate the validity of the instrument. Furthermore, generalizability studies were conducted and alpha coefficients were computed to determine the reliability and homogeneity of each factor.
The CFA indicated that a three-factor model comprising 19 items showed a good fit with the data. Alpha coefficients per factor were high. The findings of the generalizability studies indicated that at least 9-10 student responses are needed in order to obtain reliable data at the tutorial group level.
The instrument validated in this study has the potential to provide faculty and students with diagnostic information and feedback about student behaviors that enhance and hinder tutorial group effectiveness.
Historical data from 54 Maastricht (norm-referenced) and 52 Groningen (criterion-referenced) tests were used to demonstrate huge discrepancies and variability in cut-off scores and failure rates. Subsequently, the compromise model - known as Cohen's method - was applied to the Groningen tests.
The Maastricht norm-referenced method led to a large variation in required cut-off scores (15-46%), but a stable failure rate (about 17%). The Groningen method with a conventional, pre-fixed standard of 60% led to a large variation in failure rates (17-97%). The compromise method reduced variation in required cut-off scores as well as failure rates.
Both the criterion-referenced and the norm-referenced standards used in practice have disadvantages. The proposed compromise model reduces the disadvantages of both methods and is considered more acceptable. Last but not least, compared to standard setting methods using panels, this method is affordable.
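The abstract does not spell out how the compromise cut-off is calculated. The sketch below assumes the form in which Cohen's method is commonly described: the cut-off is a fixed fraction (here 60%) of the score of the 95th-percentile student, after correcting for the score expected from guessing. The function names, the nearest-rank percentile rule and the example scores are our assumptions, not details from the source.

```python
import math


def percentile_nearest_rank(scores, pct):
    """Nearest-rank percentile: the smallest score such that at least
    pct percent of the scores are less than or equal to it."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def cohen_cutoff(scores, guess_score=0.0, fraction=0.6, reference_pct=95):
    """Compromise cut-off anchored on the best-performing students.

    guess_score is the score expected from guessing alone (assumed 0
    unless supplied); fraction and reference_pct are the assumed 60%
    and 95th percentile."""
    reference = percentile_nearest_rank(scores, reference_pct)
    return guess_score + fraction * (reference - guess_score)


def failure_rate(scores, cutoff):
    """Share of students scoring strictly below the cut-off."""
    return sum(score < cutoff for score in scores) / len(scores)


# Hypothetical percentage scores, used only to show the calculation.
example_scores = list(range(1, 101))
cutoff_no_guess = cohen_cutoff(example_scores)
cutoff_with_guess = cohen_cutoff(example_scores, guess_score=20.0)
print(cutoff_no_guess, cutoff_with_guess)
print(failure_rate(example_scores, cutoff_no_guess))
```

Because the cut-off moves with the performance of the strongest students, a harder test lowers the required score. That is one way to see why such a rule can reduce variation in both cut-off scores and failure rates relative to a fixed 60% standard.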
In addition, it is explained how research into workplace learning and assessment has impacted developments in educational practice. Finally, it is argued that the participation of teachers within the medical domain in conducting and disseminating research should be cherished, because they play a crucial role in ensuring that medical education research is applied in educational practice.
The consequences of the assessment, as well as any agreements reached, must be clearly documented. If remediation of inappropriate behaviour is unsuccessful, a consilium abeundi, i.e. a recommendation to leave the programme, should be discussed with the student. The Dutch Higher Education and Scientific Research Act (WHW) does not provide for denying students access to educational activities and exams after completing the first year. However, the new Higher Education and Research Act (WHOO), which has not yet been implemented, will provide for obligatory cessation of studies.
All ICM fellows (n = 90) were sent a questionnaire containing the following questions regarding training in professionalism (7-point Likert scale; 1 = very inadequate, 7 = very adequate): which are the elements perceived to be important in intensivists' daily practice (38 items, cat. I)? Which methods of learning and teaching are recognised (16 items, cat. II)? Which methods of teaching and learning are considered especially useful (16 items, cat. III)? Finally, the perceived quantity and quality of formal and informal learning methods, as well as the responsible organisational body, were studied. Data were analysed using SPSS 15.0.
Response was 75.5% (n = 68), mean age 34 years. Regarding Elements, scores on virtually all items were high. The factor 'striving for excellence' explained half the variance. Two other aspects, 'Teamwork' and 'Dealing with ethical dilemmas', were identified. Regarding Methods, three dimensions, 'formal curriculum', 'private and academic experiences' and 'role modelling', proved important. The factor 'formal curriculum' explained most of the variance. Regarding Usefulness, the same factors emerged, with the variance now mainly explained by the factor 'private and academic experiences'. In both categories the items 'observations in daily practice' and 'watching television programmes like ER and House' were the highest- and lowest-scoring items (5.99 and 5.81, and 2.69 and 2.49, respectively). Mean scores regarding the quantity of formal and informal teaching were 4.06 and 4.58 (range 1.841 and 1.519). For the quality of teaching, the figures were 4.22 and 4.52 (range 1.659 and 1.560, respectively). Fifty-four suggestions for improvement of teaching were documented. The need for some form of formal teaching of professionalism aspects as well as for feedback was most frequently mentioned (n = 19 and 16). The local training centres are considered and should remain pivotal for teaching professionalism issues (n = 17 and 28).
Almost all elements of professionalism were considered relevant to intensivists' daily practice. Although formal teaching methods regarding professionalism aspects are easily recognised in daily practice, learning through personal experiences and informal methods plays a quantitatively more important and more valued role. Qualitative comments, nevertheless, stress the need for providing and receiving (solicited and unsolicited) feedback, thereby requesting expansion of formal teaching methods. The local training centres (should continue to) play a major role in teaching professionalism, although an additional role for the (inter)national intensive care organisations remains.
This article will firstly clarify how professional behaviour assessment relates to other assessment methods using the framework proposed by Miller [6]. Thereafter a brief overview will be provided of the current "tool box" of methods available to assess professionalism. Data on the validity, reliability, feasibility, acceptability and educational utility of these "tools" as derived from published evidence will be reviewed. Subsequently a general overview of the way forward in the assessment of professionalism and professional behaviour will be given.
Each mode was presented in a rich and a poor version with regard to the use of different media and questions and explanations explicitly directed at clinical reasoning. Five groups of between four and nine randomly selected students (n = 27) participated in focus group interviews facilitated by a moderator using a questioning route. The interviews were videotaped, transcribed and analysed. Summary reports were approved by the students.
Ten principles of VP design emerged from the analysis. A VP should be relevant, of an appropriate level of difficulty, highly interactive, offer specific feedback, make optimal use of media, help students focus on relevant learning points, offer recapitulation of key learning points, provide an authentic web-based interface and student tasks, and contain questions and explanations tailored to the clinical reasoning process.
Students perceived the design principles identified as being conducive to their learning. Many of these principles are supported by the results of other published studies. Future studies should address the effects of these principles using quantitative controlled designs.
A study assessing students' actual knowledge of clinical anatomy revealed no relationship between students' knowledge and the school's didactic approach. Test failure rates based on absolute standards set by different groups of experts were indicative of unsatisfactory levels of anatomical knowledge, although standards differed markedly between the groups of experts. Good test performance by students seems to be related to total teaching time for anatomy, teaching in clinical context, and revisiting anatomy topics in the course of the curriculum. These factors appeared to outweigh the effects of disciplinary integration or whether the curriculum was problem-based or traditional.
They discussed what teaching skills helped them to acquire physical examination skills.
Students' opinions related to didactic skills, interpersonal and communication skills and preconditions. Students appreciated didactic skills that stimulate deep and active learning. Another significant set of findings referred to teachers' attitudes towards students. Students wanted teachers to be considerate and to take them seriously. This was reflected in student descriptions of positive behaviours, such as: 'responding to students' questions'; 'not exposing students' weaknesses in front of the group', and '[not] putting students in an embarrassing position in skill demonstrations'. They also appreciated enthusiasm in teachers. Important preconditions included: the integration of skills training with basic science teaching; linking of skills training to clinical practice; the presence of clear goals and well-structured sessions; good time management; consistency of teaching, and the appropriate personal appearance of teachers and students.
The teaching skills and behaviours that most facilitate student acquisition of physical examination skills are interpersonal and communication skills, followed by a number of didactic interventions, embedded in several preconditions. Findings related to interpersonal and communication skills are comparable with findings pertaining to the teaching roles of tutors and clinical teachers; however, the didactic skills merit separate attention as teaching skills for use in skills laboratories. The results of this study should be complemented by a study performed in a larger population and a study exploring teachers' views.
For example, advantages of real patients as an educational resource were patient-centered learning and high patient satisfaction. Disadvantages were their limited availability and the variability in learning experiences among students. Despite the considerable amount of literature we found, many gaps in knowledge about patient roles in medical education remain and should be addressed by future studies.
We carried out a multi-method case study. Twelve departments of obstetrics and gynaecology distributed the Postgraduate Hospital Educational Environment Measure (PHEEM), a reliable questionnaire measuring the clinical learning environment, among medical students. After analysis (using ANOVA and post hoc tests), 14 medical students from the highest- and lowest-scoring departments participated in semi-structured interviews. We analysed the transcribed recordings using a content analysis approach. Researchers agreed on coding and an expert group reached consensus on the themes of the analysis.
We found a significant difference between departments in PHEEM scores. The interviews indicated that department and medical student characteristics determine the clinical learning climate. For departments, 'legitimacy', 'clerkship arrangements' and 'focus on personal development' were the main themes. For medical students, 'initial initiatives', 'continuing development' and 'clerkship fatigue' were the principal themes. The amount and nature of participation played a central role in all themes.
Differences between clinical learning climates appear to be related to differing approaches to participation among departments. Participation depends on characteristics of both departments and students, and the interactions among them. The outcomes give valuable clues to how a favourable clinical learning climate is shaped.
Intervention: open-ended questionnaire. Analysis: qualitative data analysis with two coding dictionaries based on current literature. Differences between 1994 and 2003 were estimated using the Chi-square test.
Residents preferred the 'person' role both in 1994 (42%) and in 2003 (48%). The 'physician' role was significantly more important in 1994 than in 2003; the 'supervisor' role was significantly more important in 2003 than in 1994 (p<0.05). Seventy percent of the comments related to 'direct interaction' (i.e., between residents and clinical teachers), 30% to 'indirect interaction' (i.e., clinical teachers' behaviour affecting residents indirectly).
The data showed that almost half of residents' comments described 'person' role characteristics. There was a significant shift in the role ranked second, from the physician role in 1994 to the supervisor role in 2003. The findings highlighted that teachers, in order to be perceived as ideal, should adapt their behaviour to residents' learning needs.
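The Chi-square comparison between the 1994 and 2003 cohorts can be illustrated with a minimal sketch. The counts below are hypothetical placeholders, not data from the study, and the function name `chi_square_2x2` is introduced only for this illustration. The sketch computes the Pearson Chi-square statistic for a 2x2 table and compares it with 3.841, the critical value for one degree of freedom at the 0.05 level.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of cohort and role.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


# Hypothetical counts (NOT from the study): comments naming a given role
# versus all other comments, for each cohort.
comments_1994 = (30, 70)  # (role mentioned, role not mentioned)
comments_2003 = (15, 85)

statistic = chi_square_2x2(*comments_1994, *comments_2003)

# Critical value of the Chi-square distribution, df = 1, alpha = 0.05.
CRITICAL_VALUE_DF1 = 3.841
significant = statistic > CRITICAL_VALUE_DF1
print(f"chi-square = {statistic:.3f}, significant at 0.05: {significant}")
```

With real data, each role would get its own table of observed counts from the two coding dictionaries; the sketch applies no continuity correction.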
These include the use of a reference point for evaluators' judgement that represents the standard expected upon completion of the training, flexibility, a greater range of cases assessed and the use of frequency scores within feedback to identify trainees' progress over time.
A range of qualitative and quantitative data were collected and analysed from 2 consecutive cohorts of trainees in Scotland (2002-03 and 2003-04).
There is rich evidence supporting the validity, educational impact and feasibility of the LEP. In particular, a great deal of support was given by trainers for the use of a fixed reference point for judgements, despite initial concerns that this might be demotivating to trainees. Trainers were highly positive about this approach and considered it useful in identifying trainees' progress and helping to drive learning.
The LEP has been successful in combining a strong formative approach to continuous assessment with the collection of evidence on performance within the workplace that (alongside other tools within an assessment system) can contribute towards a summative decision regarding competence.
Gender and secondary school grade point average (GPA) scores were included as moderator variables. Data were analysed by stepwise multiple and logistic regression analyses.
Graduates of the PBL curriculum scored higher on self-rated competencies. Contrary to expectations, graduates of the PBL curriculum did not show more appreciation of their curriculum than graduates of the conventional curriculum and no differences were found on clinical competence. Graduates of the conventional curriculum needed less time to find a postgraduate training place. No differences were found for scientific activities such as reading scientific articles and publishing in peer-reviewed journals. Women performed better on clinical competence than did men. Grade point average did not affect any of the variables.
The results suggest that PBL affects self-rated competencies. These outcomes confirm earlier findings. However, clinical competence measures did not support this finding.
The main outcome measures involved between-school comparisons of progress test results based on different benchmarking methods.
Variations in relative school performance across different tests and year groups indicate instability and low reliability of single-point benchmarking, which is subject to distortions as a result of school-test and year group-test interaction effects. Deviations of school means from the overall mean follow an irregular, noisy pattern obscuring systematic between-school differences. The longitudinal benchmarking method results in suppression of noise and revelation of systematic differences. The pattern of a school's cumulative deviations per year group gives a credible reflection of the relative performance of year groups.
Even with highly comparable curricula, single-point benchmarking can result in distortion of the results of comparisons. If longitudinal data are available, the information contained in a school's cumulative deviations from the overall mean can be used. In such a case, the mean test score across schools is a useful benchmark for cross-institutional comparison.
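The idea of cumulative deviations from the overall mean can be sketched as follows. This is one plausible reading of the longitudinal method, not the authors' published procedure: for a single year group, each school's deviation from the cross-school mean is computed per test occasion and then summed over successive occasions. The school names, the scores, and the function name `cumulative_deviations` are hypothetical.

```python
from itertools import accumulate
from statistics import mean


def cumulative_deviations(scores):
    """Return each school's running sum of deviations from the overall mean.

    `scores` maps a school name to its mean test scores, one per test
    occasion, with all schools listed in the same occasion order.
    """
    schools = list(scores)
    n_tests = len(scores[schools[0]])
    # Benchmark per occasion: the mean test score across schools.
    overall = [mean(scores[s][t] for s in schools) for t in range(n_tests)]
    result = {}
    for school in schools:
        deviations = [scores[school][t] - overall[t] for t in range(n_tests)]
        # The running sum smooths occasion-specific noise.
        result[school] = list(accumulate(deviations))
    return result


# Hypothetical mean scores for one year group on three successive tests.
example_scores = {
    "school_a": [52.0, 55.0, 58.0],
    "school_b": [50.0, 56.0, 54.0],
    "school_c": [48.0, 54.0, 56.0],
}

cumulative = cumulative_deviations(example_scores)
for school, values in cumulative.items():
    print(school, values)
```

A single-point comparison would look at only one column of deviations; the running sum shows whether a school sits consistently above or below the benchmark across occasions.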