Category Archives: Biostatistics

Relative Risk and Odds Ratio


Risk and odds seemed like the same thing to me for a long time. Since then, I have come to understand the important difference. Let’s start with Relative Risk.

Relative Risk can be addressed by asking the following question: How many times more likely is an “exposed” group to develop a disease over a certain period of time as compared to a “non-exposed” group?

Here’s the key: Relative Risk looks to the future for the effect of a particular cause, so it is used in prospective studies such as a cohort study.

Let’s compare that with the Odds Ratio. The Odds Ratio can be addressed by asking the following question: How many times greater are the odds that a diseased group was exposed to a risk factor as compared to a non-diseased group?

Here’s the key: Odds Ratio looks to the past for the cause of a particular effect, so it is used in retrospective studies such as a case-control study.

Let’s go through some examples so we can get a better picture.

Using a 2×2 contingency table (A = exposed with disease, B = non-exposed with disease, C = exposed without disease, D = non-exposed without disease), let’s first consider the following case. A group of 70 individuals decides to begin a new therapeutic drug X; however, drug X has been known to cause cancer. They are compared to a control group of 60 individuals who take a placebo instead. Question: What is the Relative Risk of developing cancer from Drug X compared to the control group? Here we need to consider whether we are looking at a case-control study or a cohort study. This is more of a cohort study, meaning the study looks to the future to see if Drug X leads to cancer. Remember that a case-control study looks to the past.


Take the number of individuals who developed cancer (disease in the table) after having been exposed to the drug (40, or A in the table) and divide that number by the total number of individuals exposed (70, or A + C). This value is the risk in the exposed group, i.e. the proportion of exposed individuals who developed the disease. We then divide this value, A/(A+C), by the risk in the non-exposed group: the number of non-exposed who developed cancer over the total number of non-exposed, B/(B+D). Therefore Relative Risk = (A/(A+C)) / (B/(B+D)).
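To make the arithmetic concrete, here is a minimal Python sketch of the calculation. The function name `relative_risk` is just an illustrative choice. The example gives A = 40 and the two group sizes (70 and 60), but it does not state how many people in the placebo group developed cancer, so B = 20 below is an assumed placeholder value and not a figure from the example.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2x2 table laid out as in this post.

    a: exposed, disease        b: non-exposed, disease
    c: exposed, no disease     d: non-exposed, no disease
    """
    risk_exposed = a / (a + c)      # proportion of the exposed who got the disease
    risk_unexposed = b / (b + d)    # proportion of the non-exposed who got the disease
    return risk_exposed / risk_unexposed


# a = 40 and the group sizes (70 on Drug X, 60 on placebo) come from the example.
# b = 20 is an ASSUMED placeholder: the text does not give this number.
a, b = 40, 20
c = 70 - a   # exposed who did not develop cancer
d = 60 - b   # non-exposed who did not develop cancer

rr = relative_risk(a, b, c, d)
print(f"Relative risk: {rr:.2f}")
```

With the assumed value for B, the exposed group comes out a bit less than twice as likely to develop cancer as the placebo group. A different B would of course change the result.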

To understand the Odds Ratio, let’s go through another, similar example. A group of 60 individuals with cancer is being evaluated to see whether they were exposed to a particular toxin X. They are compared to a group of 70 individuals who do not have cancer and are similarly being evaluated for exposure to Toxin X. It is found that 40 of the 60 individuals with cancer were indeed exposed and that 30 of the 70 individuals without cancer were also exposed to Toxin X. Let’s pause for a moment and realize that we are looking at a study that takes people who ALREADY have a disease and looks to the past to see if they were exposed to a toxin, thereby possibly drawing some association between the toxin and cancer.


You could just memorize the shortcut AD/BC. For those of you who want to understand why this works, read on. First take the diseased group (with cancer) and compare the odds of having been exposed to not having been exposed. Note here that we are NOT dividing by the total number in the group as we did for Relative Risk (i.e., it is not a proportion of the total; rather, it is a comparison between two counts, in this case exposed and not exposed). In the table this is A/B, which here is 40/20. This value is then divided by the odds of having been exposed versus not having been exposed in the non-diseased group, C/D, which here is 30/40. This comes out to (A/B)/(C/D). If we remember from basic math, dividing by a fraction is the same as multiplying by its reciprocal, so we get (A/B)*(D/C), and multiplying across gives AD/BC. With these numbers, that is (40×40)/(20×30), or about 2.67.
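The same derivation can be checked with a few lines of Python. This is only a sketch using the counts from the Toxin X example; the function names are illustrative choices. It computes the odds ratio both the long way, (A/B)/(C/D), and with the AD/BC shortcut.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio the long way: exposure odds in the diseased group
    divided by exposure odds in the non-diseased group."""
    odds_diseased = a / b        # exposed vs. not exposed among the diseased
    odds_non_diseased = c / d    # exposed vs. not exposed among the non-diseased
    return odds_diseased / odds_non_diseased


def odds_ratio_shortcut(a, b, c, d):
    """The same quantity using the cross-product shortcut AD/BC."""
    return (a * d) / (b * c)


# Counts from the Toxin X example:
# 40 of the 60 individuals with cancer were exposed, so 20 were not.
# 30 of the 70 individuals without cancer were exposed, so 40 were not.
a, b = 40, 60 - 40
c, d = 30, 70 - 30

long_way = odds_ratio(a, b, c, d)
shortcut = odds_ratio_shortcut(a, b, c, d)

print(f"Odds ratio: {long_way:.2f}")
print("Shortcut agrees:", math.isclose(long_way, shortcut))
```

Both routes give the same answer, which is the whole point of the shortcut.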

To recap:

Odds Ratio – Look to the past, Case-control study

Relative Risk – Look to the future, Cohort study

I hope this helps. Please leave a comment if there are any mistakes here or if you have any questions.


Overview: Biostatistics

Biostatistics was very confusing for me at first but I made it a point to understand it. It was easy enough to memorize the equations but I really wanted to know what they all meant, how it all came together. Getting to that place of really understanding and feeling comfortable with the material took a combination of videos (from Kaplan and from YouTube), High Yield Biostatistics by Glaser, along with the new Subject Review Series that UsmleWorld came out with. Throughout all this I was doing Biostat questions from the UsmleWorld Step 1 Qbank. I did it in this order (roughly from simplest to more challenging):

1. Kaplan Biostats Videos/YouTube Videos

2. HY Biostats

3. UsmleWorld Biostats Subject review

4. UsmleWorld Step 1 Qbank

The UW Biostat subject review was by far the one that brought it all home for me. Granted, this was probably because I had already gained some basic understanding from the videos and Glaser’s book. The subject review is nicely organized into main sections, in an order that builds on itself. I definitely recommend purchasing it. If you only pick one thing to do, I suggest that one, because honestly the Kaplan books and videos do not cover everything you need to know for potential Step 1 questions.

Here’s an example of a video that was helpful for me. Khan Academy actually has several videos on YouTube for Statistics. I would watch these during my breaks and found that the presenter clarified some things I never really understood. You might or might not like his style of teaching. Enjoy!

Here are some topics I feel are high yield for the Step 1 exam:

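  • Sensitivity (SnNout: a Sensitive test, when Negative, rules out disease)
  • Specificity (SpPin: a Specific test, when Positive, rules in disease)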
  • Positive predictive value, PPV (remember this depends on prevalence)
  • Negative predictive value, NPV (also depends on prevalence)
  • Relative Risk (remember to use in cohort studies)
  • Odds Ratio (remember to use in case-control studies)
  • Confidence Intervals
  • Setting a cutoff point on normal distributions (classic example is the fasting blood glucose cutoff for diabetes)
  • Attributable Risk
  • Number Needed to Treat
  • P value (probability of getting a result at least as extreme as the one observed, assuming the null hypothesis is true; it is NOT the probability that the null hypothesis is correct)
  • Correlation coefficient (describes a linear association; does NOT necessarily imply causation)
  • Variability or the percent of variability explained (remember to square the correlation coefficient)
  • Which test to use: chi-square, correlation, t-test, or ANOVA?
  • The biases: length-time, lead-time, confounding, selection, etc.
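As a quick illustration of the point above that PPV and NPV depend on prevalence, here is a small Python sketch. The sensitivity and specificity of 0.90 and the three prevalence values are made-up numbers chosen only for illustration, and `predictive_values` is a hypothetical helper name.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the given
    disease prevalence. All inputs are proportions between 0 and 1."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)   # P(no disease | negative test)
    return ppv, npv


# Hypothetical test: 90% sensitive and 90% specific (assumed values).
# The test itself never changes; only the prevalence does.
results = []
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    results.append((prevalence, ppv, npv))
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Running it shows PPV rising and NPV falling as prevalence goes up, even though sensitivity and specificity stay fixed.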

All that said, I am sure I left out some potential test question topics. Whatever I left out, I’m sure the UW subject review will cover. One thing I do want to cover is something I personally had difficulty understanding for the longest time and that only recently became clear: the difference between relative risk and odds ratio. Risk and odds always sounded like the same thing.

