This is the first of several planned blog posts on the basics of research, borrowing heavily from Hulley’s “Designing Clinical Research.” Embedded within these chapter notes are examples from our experience.

Hulley starts where most of us in healthcare are already comfortable: Anatomy and Physiology. In this metaphor, Anatomy describes how research is constructed. Physiology describes how research functions.


After doing a literature review on a topic, if you haven’t found an answer to your question, then it is time to consider designing your own study. Start by completing Hulley’s study outline which can be found here. Make a copy (File → Make a Copy), and start filling it in. The details follow below.


There are several levels of description of a study.

  • Study outline: a one-page summary of the design that serves as a reminder to address all the components.
  • Study protocol: about 5-15 pages that is often used to apply for IRB approval and grant support.
  • Operations manual: collection of procedural instructions, surveys and other things to help the people carrying out the study to do it with consistency and quality.

The study outline has the following components:

  1. An appropriately focused research question.
  2. An exploration of what is already known, setting the background and significance of the study.
  3. A description of the study design, including the time frame and epidemiologic design.
  4. A description of the subjects, how they were selected and, if a sample was taken from a larger population, how that was done.
  5. A description of the variables we’ll measure: predictors, confounders and outcomes.
  6. A list of the statistical issues: hypotheses, sample size and analytic approach.

Let’s explore each in the context of our department’s study on resident feedback. All these elements will be explored in greater detail in future posts.

The Research Question

The research question starts from our curiosity, but needs to be refined into a question that is answerable with the data, time and resources available. We started with the question of “how do we improve the feedback faculty give to residents?” This question as it stands is too broad. What kind of feedback? What kind of improvement? What are we going to do to improve it? Which residents?

A commonly cited tool for refining a research question is the FINER criteria. The research question needs to be feasible, interesting, novel, ethical and relevant.

  • Feasible: Measuring improvement of resident performance in patient care outcomes would be challenging, but collecting surveys on resident satisfaction is much more achievable.
  • Interesting: Improving the feedback process should be of interest to residency programs.
  • Novel: After a review of the literature, we didn’t find anyone looking at using the milestones as a guide to giving feedback.
  • Ethical: This study cannot affect anyone adversely. In this case, changing the feedback process should not alter resident morale or success. The Institutional Review Board (IRB) reviews all study protocols for ethical violations.
  • Relevant: This study is relevant to the education of residents in training.

Ultimately the question was revised to: Can the use of a written tool structured around performance on the EM residency milestones improve perceived resident satisfaction with feedback given by attending physicians?

Background and Significance

The background and significance section should describe where this study fits with what is already known about a topic. A new study should improve upon what came before and show why this understanding is important.

Our fairly comprehensive review found that the literature on resident feedback describes characteristics of good feedback as well as barriers to delivering effective feedback. Several models already exist. Our study builds upon existing models and attempts to address the criteria for good feedback and mitigate the known barriers to giving it.


Study Design

Studies are often broken down into observational and interventional designs. Observational studies watch for associations. From these we can try to make predictions: when we see x, we predict it is followed by y.

These predictions are not perfect. They are often confounded by other factors that we may not have previously noticed. An oft-cited example is the observation that on days when ice cream sales increase, the number of patients presenting with gunshot wounds also increases. Obviously ice cream is not causing penetrating trauma. The variable we may not have measured is the high temperature on those days. Increased heat causes increased demand for ice cream as well as increased likelihood of trauma.
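To see how a confounder can manufacture an association, here is a minimal simulation sketch. All numbers are invented for illustration (they are not real sales or trauma data): daily temperature drives both `ice_cream` and `trauma`, and neither one affects the other. Comparing the overall correlation with the correlation inside a narrow temperature band shows the association largely disappearing once temperature is held roughly constant.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_days(n_days=5000, seed=0):
    """Hypothetical days: temperature influences both measured variables."""
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        temp = rng.uniform(0, 35)                      # daily high, degrees C
        ice_cream = 50 + 4 * temp + rng.gauss(0, 20)   # sales depend on heat only
        trauma = 2 + 0.2 * temp + rng.gauss(0, 2)      # trauma depends on heat only
        days.append((temp, ice_cream, trauma))
    return days


days = simulate_days()
overall_r = pearson([d[1] for d in days], [d[2] for d in days])

# Hold the confounder roughly constant: look only at days between 20 and 22 C.
band = [d for d in days if 20 <= d[0] < 22]
band_r = pearson([d[1] for d in band], [d[2] for d in band])

print(f"all days:         r = {overall_r:.2f}")
print(f"20-22 C days only: r = {band_r:.2f}")
```

Restricting to a narrow band is a crude form of stratification. It only works here because the confounder was measured, which is exactly what is often missing in a real observational study.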

With interventional trials, as the name suggests, we do more than just observe. We intervene. Each group of subjects gets a different intervention and we look for the difference in outcomes between the groups. The best-known type is the randomized controlled trial. There are, of course, interventional trials that don’t use randomization.

For example, in our feedback study we had an intervention: the written milestone-based tool. There were two groups of residents, but they were differentiated by time. In the first time frame, the residents received the usual run-of-the-mill feedback. In the second time frame, the residents used the written milestone-based tool to help guide attendings toward more directed feedback. In this before-after design, an outcome (resident satisfaction with feedback) is measured before and after an intervention (initiation of the milestone-based feedback tool).[^before-after]
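As a rough illustration of how a before-after comparison might be analyzed, the sketch below runs a permutation test on two sets of satisfaction scores. The scores are made up, the function name `permutation_test` is hypothetical, and treating the two time frames as independent groups is an assumption. This is not the analysis used in our study.

```python
import random


def average(values):
    return sum(values) / len(values)


def permutation_test(before, after, n_perm=5000, seed=0):
    """Return (observed difference in means, two-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = average(after) - average(before)
    pooled = list(before) + list(after)
    n_before = len(before)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign scores to the two time frames
        diff = average(pooled[n_before:]) - average(pooled[:n_before])
        # small tolerance guards against floating-point ties
        if abs(diff) >= abs(observed) - 1e-12:
            as_extreme += 1
    # add-one correction keeps the estimated p-value above zero
    return observed, (as_extreme + 1) / (n_perm + 1)


# Made-up 1-5 Likert satisfaction scores, for illustration only.
before_scores = [3, 2, 4, 3, 3, 2, 4, 3, 3, 2, 3, 4]
after_scores = [4, 4, 5, 3, 4, 4, 5, 4, 3, 4, 5, 4]

diff, p_value = permutation_test(before_scores, after_scores)
print(f"difference in means: {diff:.2f}, p = {p_value:.4f}")
```

A small p-value here says only that the difference is unlikely to be due to chance. Because the groups differ by time rather than by randomization, anything else that changed between the two time frames could still explain it.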

Ultimately, you should be able to summarize the research question and study design in a single sentence: this is a before-after study in R1 to R3 EM residents comparing resident satisfaction with faculty-given feedback before and after implementation of a milestone-based written feedback tool.

Study Subjects

The description of the study subjects clearly enumerates who is being included and who is being excluded. There are always trade-offs you’ll have to make. You can’t study everyone you would like. Luckily, sampling strategies[^sampling] can lighten the load.

In our feedback study, our inclusion criteria were EM faculty and residents who worked in the Rush ED within a particular 6-month time frame and were willing to consent to the study once informed about what we were trying to achieve and how. The only exclusion criterion was declining to consent to be studied.


Variables

Identifying the variables to measure up front is extremely important. In order to capture the full set of data you need, you need to know exactly what that data is. There are several types of variables:

  • Predictor or independent variable: this is the variable that is changed (in interventional trials) or observed (in observational studies) to assess its effect on an outcome. For example, in an observational study of smoking and lung cancer, smoking is the independent variable.
  • Outcome or dependent variable: this is the variable that may be changed by the predictor variable. For example, in a trial of the effectiveness of a COVID-19 vaccine, the dependent variable is the percentage of people who get sick with COVID-19.
  • Confounding variable: this is a third (often unmeasured and unknown) variable that influences both the predictor and the outcome.

In our feedback study, our independent variable is whether or not the milestone-based feedback tool was used. The dependent variable is resident satisfaction measured on a questionnaire.

Statistical issues

The first thing you need to do is state a hypothesis. This allows you to pick the sample size and calculate power (both of which we’ll cover later). Sample size is the number of subjects (measurements) we need in order to detect a statistically significant difference. The power[^power] of our study is the probability that we’ll notice a difference if our intervention truly made one.
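To make the link between hypothesis, sample size and power concrete, here is a minimal sketch using the common normal-approximation formula for comparing two group means. The function names are hypothetical, and the effect size, alpha of 0.05 and power of 0.80 are conventional placeholder values, not numbers from our feedback study.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per group for a two-sided comparison of means.

    effect_size is the expected difference divided by the standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = z.inv_cdf(power)           # value corresponding to the desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


def approx_power(effect_size, n, alpha=0.05):
    """Approximate power with n subjects per group (same normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect_size * math.sqrt(n / 2) - z_alpha)


for d in (0.2, 0.5, 0.8):
    n = n_per_group(d)
    print(f"effect size {d}: about {n} per group, power {approx_power(d, n):.2f}")
```

Under this approximation a medium effect size of 0.5 needs about 63 subjects per group, and smaller expected effects need many more. A statistician would typically refine this with a t-distribution or software suited to the actual study design.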


The functioning of research is really a series of compromises between three different truths.

The Physiology of Clinical Research according to Hulley.
  • The first is the Truth as it exists in the world. This is unattainable, but it is what we strive to reach.
  • The second is the truth in the study we designed.
  • The third is the truth of what actually happened when the study was carried out.

In our search for the Truth in the world, we realize certain compromises will have to be made. Some of those compromises are described above, such as the refinement of the study question or the selection of only a subset of the world’s population to include in your sample. These compromises result in the truth as we write it in our study design. The degree of match between the world’s Truth and our study design’s truth is external validity.

No plan ever goes as intended. When we actually do the study, we’ll have to make compromises and bend the rules of the study plan a bit. Here we have compromises from the truth of the study design to the truth of what we actually did. The degree of match between the study design’s truth and the truth of what we actually did is internal validity.

This process works in reverse when reading a study. When looking at the results of the study, you should ask yourself: how closely did the researchers follow their plan? This is an assessment of the internal validity. Also ask yourself: how closely does the study plan reflect the world’s Truth? This is the external validity.

In our feedback study, we wanted to know if feedback centered around milestones could help improve the delivery of feedback given to residents by attendings. The use of a written tool describing the milestones is already one compromise. If attending physicians misinterpret the written tool, then this tool is not a good approximation of the milestones. This is a threat to external validity. If residents only used the milestone tool half of the time during implementation of the study (instead of all the time as it was written), then we have a threat to internal validity.


Every study will have errors. Some occur randomly. Others occur because of the way the study was designed. There are two types:

  • Random error is due to chance. It is equally likely to wrongly push a result in favor of your hypothesis or against it. Though it often cannot be avoided, the best way to minimize random error is to increase your sample size.
  • Systematic error is due to faulty study design. This is bias. It wrongly pushes results consistently in one direction (toward or against your hypothesis). The best way to mitigate bias is to design your study with potential biases in mind.
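The difference between the two types of error can be sketched with a small simulation. The true mean, the noise level and the size of the bias below are all invented for illustration. Each simulated "study" estimates a mean satisfaction score, once with an unbiased measurement and once with a measurement that always reads 0.4 points too high.

```python
import random
from statistics import mean

TRUE_MEAN = 3.5   # hypothetical true average score
BIAS = 0.4        # hypothetical systematic over-reading of a flawed instrument


def estimate(n, bias=0.0, seed=1):
    """Mean of n noisy measurements from one simulated study."""
    rng = random.Random(seed)
    return mean(TRUE_MEAN + bias + rng.gauss(0, 1.0) for _ in range(n))


def average_error(n, bias=0.0, repeats=200):
    """Average absolute distance from the truth across repeated simulated studies."""
    return mean(abs(estimate(n, bias, seed=s) - TRUE_MEAN) for s in range(repeats))


for n in (10, 100, 1000):
    print(
        f"n = {n:4d}: random error only = {average_error(n):.3f}, "
        f"with bias = {average_error(n, bias=BIAS):.3f}"
    )
```

As the sample size grows, the error from chance alone shrinks toward zero, while the biased measurement settles near the size of the bias. A larger sample does not fix a flawed design.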

Errors can occur at each step in the study design and implementation process. Our job is to identify and minimize both types as best we can. However, all studies will have errors in them, just as all will have compromises. These trade-offs exist, and it’s up to us to decide if the study is still valuable enough to conduct.


  1. Hulley et al. “Designing Clinical Research,” Fourth Edition. LWW. 2013.
  2. sampling: accessed March 4, 2021.
  3. power: accessed March 4, 2021.
  4. compromises: Jagger, et al. accessed March 4, 2021 and hundreds of times prior to this.