My last post was a long while ago, but I promised that the next one would address pass/fail grading. I’ve been working to get my school to reassess its preclinical grading system for two months now. My enthusiasm and the evidence presented below are apparently not enough for a change to happen, so it will take many more months of soliciting the opinions of various stakeholders at this institution before reaching a resolution. Such is the reality of educational administration.

The issue is whether medical students should be assessed throughout the entire preclinical (aka preclerkship, basic science) curriculum on a two-interval Pass/Fail basis or on a more discriminating grading scale such as Honors/High Pass/Pass/Fail.

This simple matter has hypothesized effects on a number of outcomes, including 1) psychosocial factors like stress and well-being, cooperation in the learning environment, interactions with faculty, satisfaction with education; 2) academic factors like course performance, standardized exam performance, class attendance; 3) non-curricular factors like participation in research, student organizations, leadership roles, and professional exploration activities; 4) longer term effects on clerkship performance, competing for residency positions, and resident performance; and 5) impact on outsider perceptions of the students and culture during medical school admissions. This post will examine the arguments and evidence in most of these categories.

But first, a note on historical trends nationwide. The last 10-15 years have seen two concurrent shifts. First, residencies have placed increased emphasis on USMLE Step 1 during the resident selection process. At many institutions, Step 1 score cutoffs serve as a first-pass filter, e.g., applicants below 240 in plastic surgery at Man’s Best Med School will not receive an interview invite. This practice was much less widespread a decade ago. Second, more top schools have made all of their preclinical coursework pass/fail. I reckon that these two trends are related in a positive feedback fashion. With fewer top schools having discriminating preclinical grades at all, residencies must rely on standardized indicators in order to compare applicants. With more emphasis on the standardized exam, preclinical honors grades are unnecessary for students to demonstrate strong preclinical knowledge. Consequently, more schools drop grades.

    ==School==       ==Preclinical grading system==
1.  Harvard          P/F since 1991
2.  Johns Hopkins    P/F since 2009
3.  U Penn           1 sem P/F; 2 sem H/P/F
4.  Stanford         P/F since 1968
5.  UCSF             P/F since 2001
6.  Washington U     1 yr P/F; 1 yr H/HP/P/F
7.  Yale             P/F since 1931
8.  Columbia         P/F since 2008
9.  Duke             P/F since 2011
10. U Michigan       P/F since 2005
11. U Chicago        P/F - at least before 2001
12. U Washington     1 yr P/F; 1 yr H/P/F
13. UCLA             P/F - at least before 2000
14. Vanderbilt       P/F since 2012
15. U Pittsburgh     P/F since 2012
16. Weill Cornell    P/F since 2010
17. UC San Diego     P/F since 2010
18. Northwestern     P/F - at least before 2001
19. Mount Sinai      P/F - at least before 2001
20. UT Southwestern  1 sem P/F; 3 sem A/B+/B/C/F
21. Emory            P/F since 2008
22. Baylor           P/F since 2009
23. UNC-Chapel Hill  1 yr P/F; 1 yr H/P/F
24. Case Western     P/F - at least before 1982
25. U Virginia       P/F since 2003

The sources of this information are school press releases, school websites, and SDN forums.

One caveat merits noting: A few schools continue to rank students based on internally tracked discriminating grades, e.g. Hopkins, Northwestern, Baylor, thereby negating some of the posited anti-competitive effect of changing grading systems. However, these are the exceptions rather than the rule.

For residency program directors across all specialties, preclinical grades are relatively unimportant compared to the other factors considered. According to the NRMP Program Director Survey 2012, respondents rate preclinical honors as “fairly important” (3.0 out of 5) but this ranks 31 out of 38 factors in the survey. More important factors include letters of recommendation, USMLE scores, clerkship grades, class ranking, perceived interest in the program, medical school reputation, AΩA membership, the personal statement, research experience, and interview interactions. This holds true in general across even the most competitive specialties.

Another survey of residency program directors by Green et al. published in 2009 found similar results: “grades in preclinical courses are not highly valued by program directors.” The authors offer the following explanation: “This may be because there is considerable variability in the naming and content of courses in medical schools in the preclinical curriculum, perhaps making grades difficult to interpret. The USMLE Step 1 exam is a well-understood and objective means to assess basic science knowledge, and it may act as a substitute for preclinical grades.” Dr. Green, a former associate PD for internal medicine at Northwestern and current associate dean for medical education, noted in an interview that stressing about preclinical grades is misguided: “Many students believe that their grades in the preclinical years are very important. With the exception of a course failure, preclinical grades are not important.”

Two older studies of residencies exist. Crane and Ferraro (2000) found that in emergency medicine, preclinical grades are among the least important factors, perhaps because these grades offer little insight into an applicant’s potential performance as a house officer providing patient care. Wagoner and Suriano (1999) determined that for hundreds of program directors in 14 specialties, preclinical grades were the least important of 12 selection criteria surveyed.

You might ask, if preclinical grades don’t matter much, why do we care about this issue? Students get stressed nevertheless. One reason is that any kind of discriminating evaluation system that will appear on someone’s permanent academic record provides an inflated sense of importance. A second, more important reason is that discriminating preclinical grades, when they exist, factor into class ranking and determination for AOA. The latter factors are more important to residency directors because they convey relative clerkship performance in part. However, having preclinical grades conflates what PDs value most (clinical) with what they value least (preclinical) and improperly inflates the importance of preclinical grades.

A comprehensive review of the literature indicates that pass/fail grading improves medical student well-being without compromising academic outcomes. The studies from the last two decades, when the national playing field changed the most, include:

  • A study of seven medical schools with P/F compared to 3+ interval grading in 2007
  • University of Michigan changing its second year from H/HP/P/F to P/F in 2005
  • Mayo Medical School changing from 5-interval to P/marginal P/F for first year only.
  • University of Virginia changing from 5-interval to P/F for both preclinical years.
  • Stanford, which long has had P/F, surveying internship directors about how Stanford graduates fared against graduates of schools with more discriminating grading.
  • University of Michigan changing its first year to P/F in 1992.

Positive psychosocial effects: There is strong evidence from four studies that students under a P/F regime are more satisfied with the quality of their medical education; are less likely to have burnout or consider dropping out of school; feel less stressed, emotionally exhausted, depersonalized, anxious, depressed; report a better mood, higher vitality, more positive well-being; and perceive greater group cohesion compared to their graded peers.

One interesting finding is that at UVA, the semester right before USMLE Step 1 was the only one in which there was no significant difference in anxiety or well-being. Interpreted positively, these data suggest that 1) changing to P/F can improve well-being until Step 1 preparation begins, at which point 2) learning under a P/F system does not increase the stress associated with Step 1 preparation.

A second interesting finding is that among the seven medical schools studied, differences in grading mattered more for student well-being than differences in faculty contact time, number of tests, and clinical experiences. This suggests that curricular reforms aimed at promoting well-being should focus on a simple change in grading over more complicated tweaks of academic scheduling.

Don’t underestimate the importance of psychosocial well-being. At our institution, mental health services see an increase in utilization right before exams, a pattern that grows more pronounced during the non-P/F portion of our preclinical curriculum. Students and faculty can easily overlook this factor if they have never experienced these kinds of difficulties themselves. The topic is riddled with stigma and few talk openly about it. Students at my school have taken to complaining anonymously on SDN forums about the negative effect of grading on mental health.

Maybe positive effect on life balance: At Michigan, students generally agreed that P/F grading freed up time to explore other academic talents, participate in volunteer/service activities or student organizations, spend time with family, and exercise and improve personal wellness. However, at UVA, between the graded and P/F cohorts, there was no significant difference in self-reported time utilization in non-curricular activities.

Little impact on course performance: At Michigan, there was no difference in exam scores in 8 of 11 second-year courses. Average scores improved in one course and declined in two (from 94% to 89 or 90%). The decline in one course was attributed to it being the last block and students starting studying for Step 1 a few weeks earlier than the previous year. At UVA, there was no difference in the mean of all course scores in the second year. As far as P/F in the first year goes, Mayo found a decline in anatomy exam scores but Michigan found no difference in anatomy scores. The difference could be attributable to different entering student populations. Matriculants to Michigan have higher GPAs and MCATs than those at Mayo. Academically stronger students, the hypothesis goes, do not need as much extrinsic motivation to perform well. This phenomenon could explain why a higher proportion of top 25 schools have moved to non-discriminating preclinical grading compared to all MD schools nationwide.

No impact on USMLE scores: Strong evidence from Michigan, Mayo, and UVA indicates no significant change in Step 1 scores after moving to preclinical P/F. Michigan and UVA also reported Step 2 scores, for which there was no significant difference. This is very important. First, USMLE scores are very important for residency application. Second, even though educators will rarely say they teach to boards or that the goal of the preclinical curriculum is to prepare students for Step 1, the most clinically relevant information taught in preclinical curricula is tested on Step 1 (and preclinical courses on physical diagnosis impart knowledge that’s tested on Step 2).

No impact on clerkship performance: UVA saw no change in mean clerkship grades after moving to preclinical P/F.

No impact on course participation: Course directors worry about this issue frequently, but changing to P/F at UVA resulted in no difference in self-reported class attendance. My explanation is that in an age of video-recorded/podcast lectures, attendance is largely driven by personal learning style rather than degree of motivation to learn.

No impact on residency matches: Residency directors at Michigan ranked programs into which the pre- and post-switch classes had matched as 1 = Top 15, 2 = Top 30, or 3 = Other. Between classes, there was no difference in average rating (from 1.68 to 1.70). There was also no difference in the proportion that matched into a top 15 program (from 30% to 32% of the class). At UVA, there was no significant difference in match quality between pre-switch and post-switch classes. They measured the quality of residency programs in internal medicine, family medicine, pediatrics, and general surgery by each program’s board certification pass rates.
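For readers who want to see how the two Michigan-style summary numbers are derived, here is a minimal Python sketch. The ten-student class and its ratings are invented purely for illustration (they are not the study's data), and `summarize_match` is a hypothetical helper name.

```python
from statistics import mean

def summarize_match(ratings):
    """Return (average rating, share of the class matched into a Top 15 program)."""
    top15_share = sum(1 for r in ratings if r == 1) / len(ratings)
    return mean(ratings), top15_share

# Hypothetical ratings for a ten-student class (NOT real study data):
# 1 = Top 15, 2 = Top 30, 3 = Other
ratings = [1, 1, 1, 2, 2, 2, 2, 2, 2, 3]
avg, share = summarize_match(ratings)
```

A lower average rating means stronger matches overall, so the two figures together describe both the typical match and the top of the class.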

No impact on residency performance: This study from Stanford is too complicated to explain, but you should read it if interested. Conclusion: “Graduates from a medical school with a two-interval, pass/fail system successfully matched with strong, highly-sought-after postgraduate training programs, performed in a satisfactory to superior manner, and compared favorably with their peer group.”

The following points have not been studied methodically, so they represent mostly anecdotal evidence and my speculation.

Probably positive effect on admissions: To many applicants, having preclinical grades seems like an old-fashioned practice, indicative of a school that is not progressive or innovative in medical education. That kind of negative perception prevents a school from recruiting the best students it can, which can lead to downstream effects on the environment, reputation, and performance of students during medical school. UVA found that, in post-admission surveys, 5-13% of students who declined an offer of admission did so in part due to issues with their old grading system.

Possibly positive effect on learning environment: Nowadays, educators typically assume that collaboration is better than competition in the professional school learning environment. Health care is a team activity after all. Many educational leaders posit that changing to P/F will promote cooperative learning. However, there are no studies that prove this. The hypothesis is certainly reasonable. Even when grades are criterion-referenced rather than norm-referenced (i.e. everyone could theoretically get honors one year if everyone did really well), they still factor into constructs that aim to stratify students, i.e. class rank and AOA eligibility. That system discourages students from altruistically helping classmates out, particularly those with whom one is less close friends, by distributing study resources and notes.
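To make the criterion-referenced versus norm-referenced distinction concrete, here is a minimal Python sketch. The honors cutoff of 90, the top-20% quota, and the scores are all made-up assumptions for illustration, not any school's actual policy.

```python
import math

def criterion_referenced(scores, honors_cutoff=90):
    """Everyone at or above a fixed cutoff earns honors (cutoff is hypothetical)."""
    return ["H" if s >= honors_cutoff else "P" for s in scores]

def norm_referenced(scores, honors_fraction=0.2):
    """Only the top fraction of the class earns honors (fraction is hypothetical)."""
    n_honors = math.floor(len(scores) * honors_fraction)
    # Rank student indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    honors = set(ranked[:n_honors])
    return ["H" if i in honors else "P" for i in range(len(scores))]

# A hypothetical strong class in which every student scores 92 or higher.
scores = [98, 96, 95, 94, 92]
criterion = criterion_referenced(scores)  # all five students get "H"
norm = norm_referenced(scores)            # only the top scorer gets "H"
```

Under the fixed cutoff, one student's honors never costs another student theirs; under the quota, it does. Class rank and AOA eligibility behave like the quota even when the course grades themselves do not.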

Impact on education quality: A change in grading does not necessitate any changes in course content or style. On one hand, P/F may put an onus on course faculty to provide quality education in order to justify and encourage attendance. Further, P/F permits faculty to take risks in experimenting with the curriculum. On the other hand, discriminating grades and their importance give faculty the impetus to ensure that their tests are high quality and error-free.

Impact on the importance of clerkship grades: Without discriminating preclinical grades, the clerkships become more important. Some complain that shifting weight from the relatively objective grading of the preclinical curriculum to the variable and subjective grading of the clerkships is a bad thing because it is less in the control of students who are used to studying book knowledge. The fact remains that residencies place extreme importance on subjective evaluations like letters of recommendation because they supposedly provide the most insight into how a student would perform as a resident, interacting with the health care team every day.

Impact on faculty-student interaction: Discriminating grades promote grade-grubbing behavior, e.g. arguing for points on an exam. However, grades do provide the extrinsic motivation to engage with material more deeply and encourage students to ask clarifying questions more often.


  1. I don’t see how it matters. If residency programs care more about Step1 scores, why don’t you? I’m at one of the internally ranked schools you listed, and they only do it for things like AOA. I’m sure the other schools with chapters have to compare students on some grounds too.

