How to cite this article: Sepúlveda-Vildósola AC, Gaspar-López N, Reyes-Lagunes LI, Gonzalez-Cabello HJ. Reliability and validity of an instrument to evaluate integral clinical competence in medical residents. Rev Med Inst Mex Seguro Soc. 2015 Jan-Feb;53(1):30-9.
Received: March 10th 2014
Accepted: October 27th 2014
Ana Carolina Sepúlveda-Vildósola,a Nadia Gaspar-López,b Lucina Isabel Reyes-Lagunes,c Héctor Jaime Gonzalez-Cabellod
aDirección de Educación e Investigación en Salud
bMédico residente de la especialidad de Pediatría
cFacultad de Psicología, Universidad Nacional Autónoma de México
dJefatura de la División de Educación Médica
a,b,dHospital de Pediatría “Dr. Silvestre Frenk Freund”, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social
Distrito Federal, México
Communication with: Ana Carolina Sepúlveda-Vildósola
Telephone: (55) 5627 6900, extension 22306
Background: The evaluation of clinical competence in medical residents is a complex procedure. Teachers need reliable and valid instruments to evaluate clinical competence objectively. The aim of this study was to determine the reliability and validity of an instrument designed to evaluate the clinical competence of medical residents.
Methods: We designed an instrument taking into consideration every part of the clinical method, and three different levels of competence were determined for each one. The instrument was examined with regard to the clarity, pertinence and sufficiency of each clinical indicator by five expert pediatricians. The final instrument consisted of 11 indicators. Each resident was evaluated independently by three pediatricians.
Results: A total of 651 measurements were obtained from 234 residents. The instrument distinguished between extreme groups and had a Cronbach's alpha of 0.778, and the factor analysis separated two factors: clinical competence and complementary competences. No statistically significant differences were found between evaluators, either in the global evaluation or in any individual indicator.
Conclusion: The proposed instrument is valid and reliable. It may be used in formative evaluation of medical residents in clinical specialization programs.
Keywords: Evaluation; Clinical competence; Medical residents
Mexico has been slowly moving toward a competency approach to education in medicine. The objective is that the professional be able to solve health problems that our society demands, and adapt to uncertainty, change, and complex new scenarios. Health institutions that train human resources should provide processes and environments that let individuals develop the skills required to adapt to constantly changing organizations and to the needs of the society in which they live. Learning must be done in the context of real experiences that allow them to be creative in solving these problems.1
In recent years, the evaluation of clinical competency has become more important and, despite being a complex activity, it is an essential and fundamental task for feedback in the educational process and is the mechanism by which institutions and organizations can guarantee the public adequate performance by trained doctors. However, there are serious problems in the process of evaluating the clinical competency of doctors, both from a conceptual perspective and from the instruments available to do it. Most of these instruments assess one competency or, at best, two.2
The available tools can be divided into three categories: formal, structured, circumscribed, ex vivo assessments (written or oral examinations, or OSCE);3-7 formal, structured, circumscribed, in situ examinations (Mini Clinical Evaluation Exercise or mini-CEX); and informal, unstructured, summative, in situ assessments (In-training Evaluation Report). In general, the evaluation of clinical competency requires direct observation of the clinical skills of students and the use of valid and reliable instruments to make the process objective. However, the systematic literature review conducted in 2009 by Kogan et al. found that, of 32 instruments that assess the competency of residents, only half reported an evaluation of the instrument's reliability; the mini-CEX was the most used and most reliable.8,9 Despite attempts to develop objective, standardized, valid, and reliable tools,10 one of the greatest difficulties remains the subjectivity of the assessors, which has been attributed to factors such as idiosyncratic judgments in the "first impression", their frames of reference, inference levels used in the process of observation, or their clinical skills. These authors found no differences by demographic factors such as age, gender, or clinical experience.11-13 Margolis et al. recommend a greater number of assessors for each observation to improve the stability of reported tests.14 Other strategies include the use of portfolios, reviewing records, simulations and models, examination with standardized patients, or reviewing videos of performance with a patient.15-21
In this context, it is understood that there is no single way to assess clinical competencies and that forms of evaluation are complementary, since no single one comprehensively evaluates the process.
In Mexico, evaluation of clinical competency has been done in the undergraduate program of the Facultad de Medicina of the UNAM through OSCE.22 However, its limitations have been documented: the knowledge and skills of students are examined piecemeal, some of the "stations" are artificial, and the cost, time spent, and personnel involved in development and implementation are higher than in traditional tests. In graduate medical education, the clinical expertise of the resident can only be determined by observing and evaluating the care that they provide to patients during their training. Other authors found adequate reliability in the comprehensive evaluation of clinical competency through professional examination,23 or the use of written tests built with real cases in various degrees.24-26 The Hospital Infantil de México Federico Gómez began to use the mini-CEX as a tool in the evaluation of clinical competency of pediatric residents.27 In Spanish-language literature we found no objective instrument that is contextually relevant and that validly and reliably evaluates clinical performance of specialty students with a real patient.
In the Hospital de Pediatría of the Unidad Médica de Alta Especialidad of the Centro Médico Nacional Siglo XXI, an instrument for the evaluation of clinical competency (IECC) was developed. To be considered effective, the instrument must be valid and reliable. The aim of this study was to estimate the reliability and validity of the IECC in the evaluation of clinical competency of medical residents of the Hospital de Pediatría of the Centro Médico Nacional Siglo XXI, and to determine inter-observer agreement.
The study was conducted at the Unidad Médica de Alta Especialidad (UMAE), Hospital de Pediatría, Centro Médico Nacional Siglo XXI.
Phase 1. Instrument design
1. In 2012 an instrument for assessing overall clinical competency was developed through the following steps:
a) A checklist of the different steps carried out in the clinical method was made, and three proficiency levels were determined for each. The instrument was built based on what is proposed in the literature, namely from indicators mentioned in instruments used to measure clinical competency.
b) The instrument was reviewed by a group of five expert pediatricians from the hospital, each with over 20 years of clinical experience. These experts determined the representativeness of the items and the use of sensitive methods for the construction of the test. They were asked their opinions on the clarity, relevance, and sufficiency of the indicators proposed. The instrument was supplemented in accordance with their comments.
2. The instrument was composed of 11 indicators, described in Annex 1. Three competency levels were determined for each (satisfactory, unsatisfactory, and unacceptable).
Phase 2. Implementation of the instrument
Records obtained from the comprehensive clinical competency evaluation done on resident physicians who completed a specialty in UMAE Hospital de Pediatría of the CMN Siglo XXI from August 2012 to December 2013 were included. Incomplete or illegible evaluations were removed.
Each resident was evaluated by three evaluators (one from the Dirección de Educación e Investigación en Salud, one regular or deputy assessor of the specialty course, and another evaluator-teacher assistant in the specialty in which the resident would be evaluated). The professors chose the patient based on the illnesses in the university modules of each grade level. The parent/guardian of the child and, when older than 9 years, the patient were asked for voluntary cooperation in the clinical evaluation. The examination procedure was explained to residents. It consisted of an initial phase, in which they would examine and interview the patient, and a second phase in the classroom, where residents would be questioned about their reasoning and asked to explain their medical knowledge. Residents were given as much time as they needed to interview and physically examine the patient without intervention, unless the examination could injure the patient or cause complications.
Each evaluator was asked to independently score each category of the instrument. At the end, the points obtained for each indicator were added, and the final grade was assigned as described in Annex 1. The final exam score of each student was the average of the grades given by the three assessors.
Scores were entered into an SPSS version 20 database. Descriptive analysis used frequencies and percentages for qualitative variables and mean and standard deviation for quantitative variables. Item discrimination was analyzed with Student's t test for extreme groups. The validity of the instrument was assessed by factor analysis with principal component extraction and orthogonal rotation. The reliability of the instrument was determined with Cronbach's alpha coefficient. Inter-observer agreement was assessed with ANOVA.
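As an illustration of the reliability estimate named above, Cronbach's alpha can be computed directly from a matrix of scores with one row per evaluation and one column per indicator. The following is a minimal sketch in Python, not the SPSS procedure used in the study; the function name `cronbach_alpha` and the toy scores are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per evaluation,
    one column per indicator). Uses sample variances (n - 1)."""
    k = len(scores[0])  # number of indicators
    # Variance of each indicator across evaluations
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the total score (sum of indicators) across evaluations
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 5 evaluations x 4 indicators, each scored 0-2
toy_scores = [
    [2, 2, 1, 2],
    [1, 1, 1, 0],
    [2, 1, 2, 2],
    [0, 1, 0, 1],
    [1, 2, 1, 1],
]
print(round(cronbach_alpha(toy_scores), 3))
```

Alpha rises as the indicators covary more strongly relative to their individual variances; when every indicator ranks the evaluations identically, it reaches 1.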
The protocol was approved by the Local Research and Ethics Committee, with the registration number R-2012-3603-87.
651 total measurements were obtained from 234 residents with an average of 2.78 measurements per resident. 67% of evaluations (438 records) were obtained in the 2012-2013 school year and 213 records were obtained in the 2013-2014 school year.
By grade level of residents, seven evaluations were of first year residents, 135 second year, 181 third year, 60 fourth year, 123 fifth year, and 145 sixth year.
51.9% of evaluations were of residents in the specialty of Pediatrics, while the remaining 48.1% were of residents in pediatric subspecialties.
As for the scores obtained, 13.2% obtained a rating of 5, 13.4% got a rating of 6, 18.9% of 7, 18.1% of 8, 21.4% of 9, and 15% of 10. The average score was 7.66 (± 1.617).
The discriminant analysis found a statistically significant difference between the groups with lower and higher test performance (25th percentile or under and 75th or higher, respectively) (Table I).
|Table I Discriminant analysis of IECC for extreme groups|
|Evaluated aspects||t||DF||p (two-tailed)|
|IECC = instrument for evaluation of clinical competency; DF = degrees of freedom|
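The extreme-groups comparison can be sketched as follows: evaluations are split at the 25th and 75th percentiles of the total score, and the two groups are compared on one indicator with a pooled-variance Student's t statistic. This is a hedged illustration under assumed conventions (inclusive quartiles, equal-variance t test); the function name `extreme_groups_t` is hypothetical, and obtaining the p value would additionally require the t distribution, which is omitted here.

```python
from math import sqrt
from statistics import mean, quantiles, variance


def extreme_groups_t(item_scores, total_scores):
    """Student's t (pooled variance) for one indicator, comparing
    evaluations at or below the 25th percentile of the total score
    with those at or above the 75th percentile.
    Returns (t, degrees_of_freedom)."""
    q1, _, q3 = quantiles(total_scores, n=4, method="inclusive")
    low = [x for x, t in zip(item_scores, total_scores) if t <= q1]
    high = [x for x, t in zip(item_scores, total_scores) if t >= q3]
    n1, n2 = len(low), len(high)
    df = n1 + n2 - 2
    # Pooled variance; the statistic is undefined if both groups have zero variance
    pooled = ((n1 - 1) * variance(low) + (n2 - 1) * variance(high)) / df
    t_stat = (mean(high) - mean(low)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t_stat, df


# Hypothetical toy data: totals and one indicator for 9 evaluations
totals = [10, 11, 12, 13, 14, 15, 16, 17, 18]
item = [0, 1, 0, 1, 1, 1, 2, 1, 2]
t_stat, df = extreme_groups_t(item, totals)
print(t_stat, df)
```

A large positive t for an indicator means that high-scoring and low-scoring evaluations differ clearly on that indicator, which is the sense in which it discriminates.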
Reliability analysis of the 11 items yielded a Cronbach's alpha of 0.778, with a Cronbach's alpha based on standardized items of 0.818 and a Hotelling's T2 significance of 0.000.
Table II shows that none of the test items adversely affects the reliability of the test.
|Table II Item-total statistics of the IECC|
|Evaluated aspects||Scale mean if item deleted||Scale variance if item deleted||Corrected item-total correlation||Squared multiple correlation||Cronbach's alpha if item deleted|
|IECC = instrument for evaluation of clinical competency|
According to the maximum possible score for each of the categories, the sections of presentation, communication, and priority programs showed the best performance, followed by current condition, prognosis, and diagnostic complement. Areas with lower performance were treatment plan, diagnostic integration, physical examination, and theoretical foundation (Table III).
|Table III Analysis of performance of evaluated residents|
|Evaluated aspects||Maximum possible score||Mean||Standard deviation|
Factor analysis was conducted to determine the validity of the instrument, and it found that the items were clearly grouped into two main factors (Table IV).
|Table IV Factor analysis of IECC (rotated component matrix*)|
|Factor 1||Factor 2|
IECC = instrument for evaluation of clinical competency
* Extraction method: principal component analysis
a) Factor 1: Clinical competency, which includes the components of the medical act, such as obtaining information through the current condition, interview, and physical examination, as well as the components of clinical diagnostic reasoning, composed of diagnostic complement, diagnostic integration, treatment plan, theoretical foundation, and prognosis.
b) Factor 2: Complementary skills, composed of presentation, communication, and knowledge of priority programs.
Regarding agreement between observers, no statistically significant differences were found in the values assigned by each evaluator in the different test items, so there was good inter-observer agreement (Table V).
|Table V Inter-observer agreement of test assessors (one-way ANOVA)|
|Evaluated aspects||Sum of squares||DF||Mean square||F||p|
|DF = degrees of freedom|
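The quantities in the columns of Table V (sum of squares, degrees of freedom, mean square, and F) follow from a one-way ANOVA in which the evaluator is the factor. The sketch below shows how they are derived for a single indicator; it is an illustration under that assumption, the function name `one_way_anova` and the toy scores are hypothetical, and the p value (which requires the F distribution) is not computed.

```python
from statistics import mean


def one_way_anova(groups):
    """One-way ANOVA for a list of groups (here, one list of scores
    per evaluator). Returns sums of squares, degrees of freedom,
    mean squares, and the F statistic."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-groups: how far each evaluator's mean is from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: spread of scores around each evaluator's own mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical toy data: scores given by three evaluators on one indicator
evaluator_scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova(evaluator_scores))
```

An F close to 1 indicates that the evaluators' mean scores differ no more than would be expected from the spread of scores within each evaluator.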
The evaluation of comprehensive clinical competency is a primary objective of the institutions involved in the training of health professionals. It is not enough to design excellent training programs that receive accreditation by the relevant bodies; one must also prove that their application produces the desired positive impact. Therefore, continuous, rigorous, and specific assessment of the medical resident in specialization is essential.16
However, the evaluation of clinical competency is difficult to carry out, because the set of knowledge, skills, and attitudes has to be assessed integrally.11-14 Traditionally, student evaluation is done in a fragmented way, i.e., at different times and through different instruments.
At IMSS, cognitive evaluation is undertaken through multiple-choice tests, while psychomotor skills are evaluated in an open format called CEM 2, in which professors note the expertise gained by the resident without any checklist to support the measurement, so its reliability and validity are questionable.
The attitudes of the residents are assessed through the CEM 3 format, which includes aspects such as professional behavior, judgment, interpersonal relationships, discipline, fulfillment of academic activities, criticism, responsibility, and commitment, which the professor rates on a scale from 0 to 100. It should be noted that no psychometric study of these evaluation forms has ever been done.
The proposed instrument for evaluation of clinical competency attempts to integrate the main aspects of knowledge, skills, and attitudes, which, when performed in a real environment, offer results of resident performance roughly similar to how it will be in their professional life.
Given the discrimination results, the instrument is useful in evaluating the clinical competency of residents: beyond being easy to apply, it integrates the qualities required to conduct an assessment of clinical competency.
It is noteworthy that the areas of interview and physical examination obtained the lowest average scores, as they are widely covered in undergraduate medical education, and one would expect students to already have mastery of these skills. In regard to treatment plan, diagnostic integration, and theoretical foundation, which also had lower scores, these correspond to knowledge and skills that each resident must acquire for the specialty, which should be reinforced by their professors.
We believe that the main bias of the test is related to student performance, which may be affected by two main aspects: the stress of the evaluation itself, and the fact of being observed, which makes the resident more meticulous in interviewing and physically examining the patient than they necessarily are in daily practice. Other problems were related to the very process of patient care, such as phone calls to relatives, interruptions from the nursing area to apply medication or request tests from the lab, and so on.
The time of application of the test ranged from 45 to 150 minutes, with an average of 120 minutes. In this case, residents had unlimited time for the interview and physical examination. It remains to be determined whether the reliability and validity of the instrument would be affected if time were limited.
The main purpose of the evaluation was to give feedback on the educational process, in order to establish areas for improvement in the process. The integration of this evaluation instrument is not intended to replace other forms of evaluation that are done with medical residents; rather it is intended as a complementary form of evaluation, and the frequency of application should be weighed by each of the professors or institutions that adopt it.
New trends in medical education allow us to envision an educational scenario in which quality standards of education will acquire increasing relevance, with new teaching methodologies and advances in the evaluation of knowledge and acquired skills. Valid and reliable instruments would allow us to comprehensively assess the training of residents and plan necessary changes in academic programs, so that in the future the specialists graduating from the various training institutions perform better.
The proposed instrument is valid and reliable. It is proposed as a formative tool in the evaluation of medical residents in clinical specialties.
Annex 1 Clinical competency assessment instrument
|INSTITUTO MEXICANO DEL SEGURO SOCIAL
UMAE HOSPITAL DE PEDIATRIA CENTRO MEDICO NACIONAL SIGLO XXI DIRECCION GENERAL DIRECCION DE EDUCACION E INVESTIGACION EN SALUD
NAME OF RESIDENT:
|Aspects to be evaluated||SCORE|
|Levels of execution|
|Current condition||Identifies the patient's main health problem(s)||Incompletely identifies the patient's health problems||Does not identify the patient's main health problem(s)|
|Interview||Complete. Oriented to the current condition of the patient||Complete. Not oriented to the current condition of the patient||Incomplete, disorganized, unrelated to the current condition of the patient|
|Physical examination||Prepares necessary material. Examination is comprehensive, systematic, and oriented to the current condition of the patient||Incomplete material. Full examination, disorderly, or not oriented to the current condition of the patient||Does not prepare material required. Examination is incomplete, disorganized, has no relation to the current condition of the patient|
|Diagnostic complement||Requests, justifies, and correctly interprets complementary examinations according to the current condition of the patient||Incompletely requests complementary examinations according to the current condition of the patient, but interprets them correctly||Inadequately requests, justifies, or interprets complementary examinations according to the current condition of the patient|
|Diagnostic integration||Properly systematizes and integrates information to support diagnosis||Incompletely systematizes and integrates information to support diagnosis||Is unable to integrate and sustain a diagnosis|
|Identifies the elements to establish differential diagnoses||Incompletely identifies the elements to establish differential diagnoses||Does not identify the elements to establish differential diagnoses|
|Treatment plan (medical and/or surgical)||Suggests the best treatment according to the patient's current status and severity of condition||Suggests an appropriate alternative therapy according to patient's current status and severity of condition||Is unable to suggest appropriate treatment according to patient's current status and severity of condition|
|Knows clinical practice guideline of ailment in question||Incompletely knows clinical practice guideline of ailment in question||Does not know clinical practice guideline of ailment in question|
|Identifies adverse and secondary effects of recommended treatment||Incompletely identifies adverse and secondary effects of recommended treatment||Does not identify adverse and secondary effects of recommended treatment|
|Identifies interactions between medications||Incompletely identifies interactions between medications||Does not identify interactions between medications|
|Theoretical foundation||Fully knows the theoretical fundamentals of main disease and/or differential diagnoses||Incompletely knows the theoretical fundamentals of main disease and/or differential diagnoses||Doesn't know the theoretical fundamentals of main disease and/or differential diagnoses|
|Prognosis||Identifies the prognosis and possible complications of the patient's main disease||Incompletely identifies the prognosis and possible complications of the patient's main disease||Does not identify the prognosis and possible complications of the patient's main disease|
|Communication||Communicates appropriately, respectfully, and effectively with the patient, their family, and the healthcare team||Slight communication problems with the patient, their family, or the healthcare team||Serious communication problems with the patient, their family, or the healthcare team|
|Priority patient risk-prevention programs (ask any of 4 programs randomly)||Knows in full: Program VENCER II, International patient safety goals, Handwashing times, Handwashing technique||Program VENCER II, International patient safety goals, Handwashing times, Handwashing technique||Does not know: Program VENCER II, International patient safety goals, Handwashing times, Handwashing technique|
|Presentation||Presents in clean and full uniform, neat and short nails. Washes hands correctly and at appropriate times. Introduces self to patient and their family||DOES NOT APPLY||Presents with incomplete or dirty uniform, long or dirty fingernails, does not wash hands correctly and at appropriate times, or does not introduce him/herself to patient and their family|
REMARKS: In case you find that a point does not apply, please explain in the following lines and score the category as satisfactory.
FEEDBACK FOR STUDENT:__________________________________________________________
|(Name and signature) Please add the points obtained in the grid and write score|
|ON TRACK TO BE COMPETENT||19-21 POINTS||SEVEN|
|NOT COMPETENT||LESS THAN 15 POINTS||FIVE|
Conflict of interest statement: The authors have completed and submitted the form translated into Spanish for the declaration of potential conflicts of interest of the International Committee of Medical Journal Editors, and none were reported in relation to this article.