M1 Assessment

SOP #:
9.4.4

Version #:
1

Implementation Date:
May 21, 2025

Last Reviewed/Update Date
N/A

Approval by ECC
May 21, 2025

Rationale

Assessment within the MD program at the Brody School of Medicine occurs in the context of programmatic curricular alignment, such that assessments are aligned with institutional and course/clerkship-level learning objectives and learning activities. This includes formative and summative assessments. Assessment tools include written examinations (MCQ, SAQ, essay), performance assessments (OSCE and simulation), assignments (reports, projects, self-reflection), and oral examinations.

Internally created, blended assessments administered at the Brody School of Medicine follow the procedures in this document to provide a secure and reliable examination environment, consistent with what is required by the National Board of Medical Examiners. This allows consistent and accurate assessment of students' fund of knowledge.


Scope

This Standard Operating Procedure (SOP) applies to M1 (first year) medical learner assessments at BSOM. It outlines the procedures for administering assessments to learners enrolled in BSOM M1 courses and contains recommendations to enable the creation of a highly reliable process. The SOP also communicates expectations for creating assessments to key stakeholders. It affects learners, staff, faculty members, and course/clerkship directors of the Brody School of Medicine who design, create, and administer assessments for M1 students.


Definitions

Accommodation: A change or adjustment from the normal curriculum or equipment format that allows an individual with a disability to access content or complete tasks to pursue a regular course of study.

ADA: The Americans with Disabilities Act, which prohibits discrimination against people with disabilities.

Assessment Architecture: a plan that includes the name of the assessment, assessment type, assessment duration, and assessment location for all assessments, including pre-determined makeup assessment dates.

Assessment Blueprint: a planning document used by the course/thread director to ensure a balanced content representation of the MCQs used on an assessment.

Assessment Platform: the application or platform used for the development and administration of multiple-choice quizzes and examinations at BSOM.

BSOM: Brody School of Medicine

CD: Course/thread Director

ECC: Executive Curriculum Committee

Exam, examination: a high-stakes assessment that contributes greater than or equal to 5% of a course grade.

Extended Quiz: a low stakes MCQ assessment that offers flexibility to be completed over a specific time frame.

Learner Identity: a module or application that works in conjunction with the assessment platform that verifies the learner’s identity.

Learning Management System: The technology used to house, plan, implement, assess, and evaluate the learning process.

MCQ: multiple-choice questions, written in the single best answer format used by the NBME.

NBME: National Board of Medical Examiners

Quiz: a low-stakes MCQ assessment that contributes 10-20% of the final course grade in the foundational phase.

Secure Review: an assessment related event that allows students to review the questions and answers on an assessment either immediately following or at a fixed time after completing the assessment.


Curriculum Overview

The M1 fall semester consists of three graded courses: Molecular Basis of Medicine, System Structure and Function, and the first half of Foundations of Doctoring 1. Final course grades are assigned for Molecular Basis of Medicine and System Structure and Function at the end of the fall semester.

The M1 spring semester consists of four graded courses: Neurologic Systems, Ethical Issues in Medicine, Foundations of Disease and Therapeutics, and the second half of Foundations of Doctoring 1. Not all courses run concurrently.

Three courses in the M1 curriculum contain longitudinal threads. System Structure and Function has three threads: Anatomy, Histology, and Physiology. Neurologic Systems has two threads: Behavioral Science and Neuroscience. Foundations of Disease and Therapeutics has three threads: Host-Microbe Interactions, Pathology and Pharmacology.


Assessment

Student assessment in the M1 integrated curriculum consists of multiple-choice question examinations, multiple-choice question quizzes, laboratory practical exams, projects, and assignments. Students receive a grade in each examination for all courses and threads assessed in that examination.


Reassessment Option

Students with an examination score below 70% in a course or thread will have one optional attempt to resolve the failure by retaking an assessment on that material. The reassessment option will be offered for each examination in the M1 fall and M1 spring. The dates for retaking the examination are listed in the curriculum calendar.

Students scoring 70% or above on the reassessment will have a grade of 70% entered for that examination for calculating the course/thread grade. Students scoring below 70% will have the higher of the original or reassessment grade entered on that examination for calculating the course/thread grade.
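
The reassessment rule above can be illustrated with a minimal sketch. The function name and parameters are hypothetical and are not part of any BSOM system; the sketch assumes scores are percentages and that only students whose original score is below 70% take the optional reassessment.

```python
PASSING_SCORE = 70  # percent, per the Reassessment Option section


def recorded_exam_grade(original, reassessment=None):
    """Return the examination grade entered for course/thread calculation.

    original: first-attempt score in percent.
    reassessment: optional retake score in percent (None if not attempted).
    """
    # No failure to resolve, or the optional retake was not taken.
    if original >= PASSING_SCORE or reassessment is None:
        return original
    # A passing retake is entered as exactly the passing score.
    if reassessment >= PASSING_SCORE:
        return PASSING_SCORE
    # Otherwise the higher of the two failing scores is entered.
    return max(original, reassessment)


print(recorded_exam_grade(62, 85))  # 70
print(recorded_exam_grade(62, 66))  # 66
print(recorded_exam_grade(62, 58))  # 62
print(recorded_exam_grade(75))      # 75
```

In this sketch a successful reassessment never raises the entered grade above 70%, and an unsuccessful one never lowers it below the original score.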


Course Grades

All pre-clerkship courses are graded as pass/fail. Grades are based both on assessment results and on required minimum attendance/participation/assignment completion as described in individual course syllabi. Assessment grades are rounded to the nearest whole number. To pass a course without threads, a student must earn a grade of at least 70% and meet the attendance/participation/assignment metrics.

Students must have a cumulative average of 70% and meet attendance/participation/assignment metrics in each thread to pass a course with threads. If a student fails one thread in a course with multiple threads, they fail the course. The pass/fail grade is reported to the registrar and listed on the transcript.
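
As an illustration of the pass/fail rules above, the following sketch evaluates a course with threads. All names are hypothetical. It assumes an unweighted average of assessment grades and half-up rounding of each assessment grade; the syllabus for each course governs the actual weighting and rounding conventions.

```python
import math

PASSING_AVERAGE = 70  # percent


def round_half_up(score):
    """Round to the nearest whole number, with .5 rounding up (assumed convention)."""
    return math.floor(score + 0.5)


def thread_passes(exam_grades, metrics_met):
    """A thread (or a course without threads) passes when the average of its
    rounded assessment grades is at least 70% and the metrics are met."""
    rounded = [round_half_up(g) for g in exam_grades]
    average = sum(rounded) / len(rounded)  # unweighted mean (assumption)
    return average >= PASSING_AVERAGE and metrics_met


def course_passes(threads):
    """threads maps thread name -> (exam_grades, metrics_met).
    Failing any single thread fails the whole course."""
    return all(thread_passes(grades, met) for grades, met in threads.values())


system_structure_and_function = {
    "Anatomy": ([72.4, 68.0, 75.5], True),
    "Histology": ([80, 71], True),
    "Physiology": ([69.4, 70.2], True),  # rounds to 69 and 70, average 69.5
}
print(course_passes(system_structure_and_function))  # False
```

Here the student passes Anatomy and Histology but fails Physiology, so the course as a whole is failed.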


Course Remediation

Aside from the exam reassessment option described above, students will not be allowed to remediate exam performance. If a student fails a course due to performance on attendance/participation/assignment metrics, any opportunity to remediate non-exam performance will be described in that course’s syllabus.


Class Rank

Class rank is calculated and reported to students at the conclusion of M3 clerkships. Class rank is reported by quartile (Quartile 1 represents the top 25% and Quartile 4 represents the bottom 25% of students in the cohort) in the Medical Student Performance Evaluation (MSPE), part of a student’s Electronic Residency Application Service (ERAS) application.


Student Academic Progression

There are two graded courses that are completed in the M1 Fall semester, Molecular Basis of Medicine and System Structure and Function. Students who pass both courses will progress to the M1 Spring semester. Students who fail one course will be recommended to reattempt the M1 year. Students who fail both courses will be recommended for dismissal. Foundations of Doctoring 1 is a longitudinal M1 course with the final grade determined and reported after the Spring semester.

There are four graded courses that are completed in the M1 Spring. Students who fail one (1) spring semester course will be recommended to reattempt the M1 year. Students who fail two (2) or more spring semester courses will be recommended for dismissal from medical school.
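
The progression rules above reduce to a small lookup, sketched below. The function name and return strings are hypothetical. The outcome for a spring semester with no failed courses is not stated in this section, so treating it as "progress" is an assumption.

```python
COURSES_COMPLETED = {"fall": 2, "spring": 4}  # graded courses completed per M1 semester


def progression_recommendation(semester, failed_courses):
    """Map the number of failed courses in a semester to the stated recommendation."""
    if semester not in COURSES_COMPLETED:
        raise ValueError("semester must be 'fall' or 'spring'")
    if not 0 <= failed_courses <= COURSES_COMPLETED[semester]:
        raise ValueError("failed_courses is out of range for this semester")
    if failed_courses == 0:
        return "progress"  # spring case is an assumption; stated only for fall
    if failed_courses == 1:
        return "reattempt M1 year"
    return "dismissal"


print(progression_recommendation("fall", 1))    # reattempt M1 year
print(progression_recommendation("spring", 3))  # dismissal
```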


Responsibilities

There are several key stakeholders involved in the M1 assessment process. Listed below are each key stakeholder, along with their expected responsibilities and time frame for completion.

Associate Dean of Medical Education

  • Ensures compliance with test development timeline.
  • Leads item writing development sessions.
  • Oversees all testing operations at BSOM and reports on irregularities on internally generated examinations to the Executive Curriculum Committee, and irregularities on NBME examinations to the NBME.
  • Ensures that the Testing Administrator is trained and operating according to established procedures of BSOM.
  • Monitors any failures in courses & clerkships and oversees remediation plans.
  • Communicates with the Division of Academic Affairs to request non-faculty proctors.

Executive Curriculum Committee

Responsible for oversight of assessment blueprint and assessment creation process.

Curriculum Committee Chair

Timeline Planning

Sets the calendar for the fall and spring terms two months prior to the start of each term. Identifies assessment days. Establishes the assessment blueprint, including types of assessments, exam/quiz dates and durations, dedicated makeup dates, question allotment, and secure review dates as appropriate.

Course/thread Director

  • Term Planning
    • Reviews assessment blueprint (working with Curriculum Committee Chairs), exam durations, question allotment, and secure review dates.
    • Responsible for communicating the assessment blueprint to all teaching faculty.
    • Ensures that the timing for questions (90 seconds per question) is applied to blended assessments and clearly communicated to learners in the syllabus.
    • Beginning of Term – by the end of the first week of each term: Send a list of graduate learners and confirm off-cycle learners enrolled in the course to the testing administrator.
    • Meet with the Testing Administrator for testing/proctor protocol training as needed.
    • Faculty and Course/thread directors are advised to only proctor if all proctoring opportunities have been exhausted.
    • Two business days prior to assessment:
      • Questions are reviewed for content and clarity at the course/thread director and departmental level before being uploaded to the assessment platform.
      • Course/thread directors identify questions to include in the assessment and submit them to the appropriate Assessment Shell.
      • Questions are tagged to USMLE content outline and course/lecture objectives.
  • Course/thread director or a designee should attend all Secure Review sessions to answer learner questions about the assessment. Faculty must rotate through all testing rooms.
  • Course/thread directors should address all clarification forms sent to them by the testing administrator post secure review.
  • No changes can be made to the approved calendar or assessment blueprint without approval of the curriculum committees. Approved changes must be communicated to the testing administrator immediately.
  • Item analysis should be performed after each exam. The following analyses should be performed and reviewed with instructors in the context of the entire exam:
    • Analysis of item difficulty
    • Analysis of item discrimination
    • Analysis of item options (point biserial)
  • Assessment statistics should not be shared until all learners have completed the assessment.
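
The 90-seconds-per-question timing rule listed under Term Planning above translates directly into an assessment duration. The sketch below is illustrative only. The function name is hypothetical, and the `time_multiplier` parameter is an assumption added to show how an extended-time accommodation might be applied; this SOP does not specify accommodation multipliers.

```python
SECONDS_PER_QUESTION = 90  # timing applied to blended assessments


def assessment_duration_minutes(num_questions, time_multiplier=1.0):
    """Return the allotted time in minutes for an MCQ assessment.

    time_multiplier is hypothetical (e.g. 1.5 for time-and-a-half).
    """
    total_seconds = num_questions * SECONDS_PER_QUESTION * time_multiplier
    return total_seconds / 60


print(assessment_duration_minutes(40))       # 60.0
print(assessment_duration_minutes(40, 1.5))  # 90.0
```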

Instructor(s)

  • Throughout the Term
    • Generates questions for assessments using best practices (Appendix A).
    • Uses item analysis and/or comments from previous administrations to improve the quality and clarity of questions prior to submission.
    • Completes instructor-generated questions in time to allow departmental and course/thread director review prior to the submission deadline of two business days before an assessment.
  • After the Assessment
    • Instructors submitting questions should attend the Secure Review sessions so that learners’ questions can be answered.
    • Reviews the item analysis to revise questions for subsequent use.

Schedule for Review

This procedure is reviewed and approved by the Curriculum Committees, including the Executive Curriculum Committee and Foundational Curriculum Committee, every three years.

The M1 Assessment Policy is posted on the OME Website by the Office of Medical Education to allow learners and teaching faculty/administration to reference at any time.


Related Policies

Assessment Administration


Applicable Laws, Regulations & Standards

LCME Standards for Accreditation of Medical Education Programs Leading to the MD Degree: Published March 2022; Standard 8, Element 2; Standard 8, Element 3; Standard 8, Element 7; Standard 9, Element 4, and Standard 9, Element 8


Appendices

Appendix A

Recommendations from the Ad-hoc Testing Committee on Question Construction

Content adapted from the NBME Item Writing Guide. Additional information can be found in that guide.

Why should learners be tested?

  • Communicate material that is important
  • Motivate learning
  • Identify learning needs
  • Assess attainment of learning objectives
  • Determine grades
  • Identify areas where instruction can be improved

What should be tested?

  • Exam content should match course/clerkship and session objectives
  • Important topics should be emphasized
  • The testing time devoted to each topic should reflect the relative importance
  • The sample of items should be representative of the instructional goals.

Recommendation #1: The committee recommends that assessments not employing essay or free text should only use single best answer formats that require test-takers to select the single best response. Avoid use of true/false (C-type, K-type, and X-type) questions.

Background: One Best Answer (A-type) questions consist of a stem (e.g., a clinical case presentation) and a lead-in question, followed by a series of choices, typically one correct answer and three to four distractors. When written correctly, incorrect options do not have to be entirely wrong, but they must be less correct than the keyed option.

Why Single Best Answer?

  • Prevents test-taker from having to guess author’s intent
  • More efficient/easier to write as incorrect options do not have to be entirely wrong
  • Same stem can be paired with different lead-ins to create item sets (diagnosis and management)

Best Practices for Item Construction

  • Items should focus on an important concept
  • Items should assess application of knowledge, not recall of an isolated fact, ideally by using clinical vignettes or analysis of data.
  • The question should be answerable without seeing the options.
  • The options should all be homogeneous and plausible.
  • Do not use true-false questions.
  • Avoid negatively phrased A-type questions, for example, "All of the following are correct except" or "Which of the following statements is not correct?"
  • Include as much of the item as possible in the stem; the stems should be long and the options short.
  • Avoid superfluous information.
  • Avoid tricky or overly complex items.
  • Write options that are grammatically consistent and logically compatible with the stem.
  • Distractors should have the same relative length as the answer.
  • Avoid use of imprecise terms: usually, frequently, often, commonly, most of the time, almost never.

Recommendation #2: Item analysis should be performed after each exam. The following analyses should be performed and reviewed in the context of the entire exam:

  • Analysis of item difficulty
  • Analysis of item discrimination
  • Analysis of item options
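
The three analyses named in Recommendation #2 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the method used by any particular assessment platform; the function names and sample data are hypothetical. Item difficulty is taken as the proportion of learners answering correctly, item discrimination as the point-biserial correlation between the item score and the total exam score, and option analysis as the proportion of learners choosing each option.

```python
from collections import Counter
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Proportion of learners answering correctly (higher means easier)."""
    return mean(item_scores)


def point_biserial(item_scores, total_scores):
    """Correlation between a 0/1 item score and the total exam score."""
    sd_total = pstdev(total_scores)
    p = mean(item_scores)
    if sd_total == 0 or p in (0, 1):
        return 0.0  # correlation is undefined; this sketch reports 0
    correct = [t for s, t in zip(item_scores, total_scores) if s == 1]
    incorrect = [t for s, t in zip(item_scores, total_scores) if s == 0]
    return (mean(correct) - mean(incorrect)) / sd_total * (p * (1 - p)) ** 0.5


def option_frequencies(responses):
    """Proportion of learners selecting each option."""
    counts = Counter(responses)
    return {opt: counts[opt] / len(responses) for opt in sorted(counts)}


# Hypothetical data for one item answered by five learners.
item_scores = [1, 1, 1, 0, 0]        # 1 = correct, 0 = incorrect
total_scores = [90, 80, 70, 60, 50]  # total exam score per learner
responses = ["A", "A", "A", "B", "C"]

print(item_difficulty(item_scores))                         # 0.6
print(round(point_biserial(item_scores, total_scores), 3))  # 0.866
print(option_frequencies(responses))
```

A distractor that no learner selects, or an item with a discrimination value near zero or below, is a candidate for revision before the item is reused.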

Avoid flaws related to testwiseness

  • Grammatical cues: one or more distractors don’t follow grammatically from the stem
  • Logical cues: a subset of the options is collectively exhaustive
  • Absolute terms: terms such as always or never are in some options
  • Long correct answer: correct answer is longer, more specific, or more complete than other options
  • Word repeats: a word or phrase is included in the stem and in the correct answer
  • Convergence strategy: the correct answer includes the most elements in common with the other options

Avoid flaws related to irrelevant difficulty

  • Options are long, complicated, or double
  • Numeric data are not stated consistently
  • Terms in the options are vague (rarely, usually)
  • Language in the options is not parallel
  • Options are in a nonlogical order
  • None of the above is used as an option
  • Stems are tricky or unnecessarily complicated
  • The answer to an item is hinged to the answer of a related item.
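
A few of the flaws listed above are mechanical enough to screen for automatically before faculty review. The sketch below is a hypothetical helper, not a feature of the assessment platform: it flags absolute terms, vague terms, "none of the above," and a keyed answer that is longer than every distractor. The word lists come from this appendix and are not exhaustive. The sample options are taken from the absolute-terms example in this appendix, and treating option E as the keyed answer is an assumption made for illustration.

```python
import re

ABSOLUTE_TERMS = {"always", "never"}
VAGUE_TERMS = {"usually", "frequently", "often", "commonly", "rarely"}


def flag_item(options, correct_index):
    """Return a list of warnings for a one-best-answer item.

    options: list of option strings.
    correct_index: position of the keyed answer in options.
    """
    warnings = []
    for i, option in enumerate(options):
        words = set(re.findall(r"[a-z]+", option.lower()))
        label = chr(ord("A") + i)
        if words & ABSOLUTE_TERMS:
            warnings.append(f"Option {label}: absolute term")
        if words & VAGUE_TERMS:
            warnings.append(f"Option {label}: vague term")
        if option.strip().lower() == "none of the above":
            warnings.append(f"Option {label}: 'none of the above'")
    # Long correct answer: keyed option is strictly longer than every distractor.
    lengths = [len(o) for o in options]
    if all(lengths[correct_index] > n
           for i, n in enumerate(lengths) if i != correct_index):
        warnings.append("Keyed answer is the longest option")
    return warnings


options = [
    "Can be treated adequately with phosphatidylcholine",
    "Could be a sequel of early parkinsonism",
    "Is never seen in patients with neurofibrillary tangles at autopsy",
    "Is never severe",
    "Possibly involves the cholinergic system",
]
print(flag_item(options, correct_index=4))
# ['Option C: absolute term', 'Option D: absolute term']
```

A screen of this kind only catches surface patterns; grammatical cues, logical cues, and convergence still require review by the course/thread director.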

Guidelines for Writing Foundational and Clinical Science Items

  • An assessment blueprint should be developed for each assessment to keep item writers focused on important topics and key content.
  • Test application of knowledge using clinical vignettes or experimental vignettes to provide context to the question being asked.
  • Focus items on common or potentially catastrophic problems; avoid “zebras.”
  • Pose decision-making tasks appropriate for the level of training.
  • Questions should focus on specific tasks that students must be able to undertake at the next stage of training, for example, identifying the most likely diagnosis, determining which additional labs are needed, or formulating the next step in management.
  • Focus on areas in which clinical reasoning mistakes are commonly made.
  • Use of vignettes ensures that the student not only knows the information but can apply it to hypothetical situations.

Writing one-best-answer items

  • Whenever possible, items should be written with a clinical stem.
  • Clinical vignettes should begin with the presenting problem of a patient, followed by the history (including duration), physical findings, results, initial treatment, and subsequent findings.
  • A vignette may include only a portion of this information, but what is included should be supplied in that order.
  • Stem lead-in should pose a clear question that the examinees should be able to answer without looking at the options.
  • Vignettes should avoid red herrings and lying patients.

Examples of types of foundational and clinical lead-ins

  • Which of the following is the most likely cause/mechanism of this effect?
  • Which of the following is the most likely causal infectious agent?
  • This patient most likely has a defect in which of the following?
  • This patient most likely has a defect in which of the following enzymes?
  • Which of the following cytokines is the most likely cause of this condition?
  • Which of the following structures is at greatest risk for damage during this procedure?
  • The most appropriate medication for this patient will have which of the following mechanisms of action?
  • Which of the following factors in the patient’s history most increased his/her risk for developing this condition?
  • Which of the following is the most likely explanation for these findings?
  • Which of the following is the most likely location of the patient’s lesion?
  • Which of the following is the most likely pathogen?
  • Which of the following findings is most likely to be increased/decreased?
  • Which of the following is the most appropriate intervention?
  • For which of the following conditions is the patient at greatest risk?
  • Which of the following is most likely to have prevented this condition?
  • Which of the following is the most appropriate next step in management to prevent morbidity/mortality/disability?
  • Which of the following is the most likely diagnosis?
  • Which of the following is the most appropriate next step in diagnosis?
  • Which of the following is most likely to confirm the diagnosis?
  • Which of the following is the most appropriate initial or next step in patient care?
  • Which of the following is the most effective management?
  • Which of the following is the most appropriate pharmacotherapy?
  • Which of the following is the first priority in caring for this patient?

Example of grammatical cues

A 60-year-old man is brought to the emergency department by the police who found him lying unconscious on the sidewalk. After ascertaining that the airway is open, the first step in management should be intravenous administration of

  A. Examination of CSF
  B. Glucose with vitamin B1 (thiamine)
  C. CT scan of the head
  D. Phenytoin
  E. Diazepam

Options A and C do not follow grammatically from the stem.

Logical Cues

Crime is

  A. Equally distributed among the social classes
  B. Overrepresented among the poor
  C. Overrepresented among the middle class and rich
  D. Primarily an indication of psychosexual maladjustment
  E. Reaching a plateau of tolerability for the nation

Options A, B, and C include all possibilities so the student knows that one must be correct.

Absolute terms

In patients with advanced dementia, Alzheimer’s type, the memory defect

  A. Can be treated adequately with phosphatidylcholine
  B. Could be a sequel of early parkinsonism
  C. Is never seen in patients with neurofibrillary tangles at autopsy
  D. Is never severe
  E. Possibly involves the cholinergic system

Options C and D can be eliminated based on the absolute terms.

Word Repeats

A 58-year-old man with a history of heavy alcohol use and previous psychiatric hospitalizations is confused and agitated. He speaks of experiencing the world as unreal. This symptom is called

  A. Depersonalization
  B. Derailment
  C. Derealization
  D. Focal memory deficit
  E. Signal anxiety

The word "unreal" appears in the stem and is echoed in the correct answer, option C (Derealization).