What is the purpose of an evidence hierarchy?

The first and earliest principle of evidence-based medicine is that a hierarchy of evidence exists: not all evidence is equal. This principle became well known in the early 1990s as practising physicians learnt basic clinical epidemiology skills and began to appraise and apply evidence to their practice. Because evidence was described as a hierarchy, a pyramid became a compelling visual representation. Evidence-based healthcare practitioners encountered this pyramid when reading the literature, applying evidence or teaching students.

Various versions of the evidence pyramid have been described, but all of them placed the weaker study designs at the bottom (basic science and case series), followed by case–control and cohort studies in the middle, then randomised controlled trials (RCTs), and, at the very top, systematic reviews and meta-analyses. This depiction is intuitive and likely correct in many instances. The placement of systematic reviews at the top has been interpreted in several ways, but they were still thought of as one level in a hierarchy.1 Most versions of the pyramid clearly represented a hierarchy of internal validity (risk of bias). Some versions incorporated external validity (applicability), either by placing N-of-1 trials above RCTs (because their results are most applicable to individual patients2) or by separating internal and external validity.3

Another version, the 6S pyramid, was developed to describe the sources of evidence that evidence-based medicine (EBM) practitioners can use to answer foreground questions, showing a hierarchy ranging from studies, synopses of studies, syntheses and synopses of syntheses to summaries and systems.4 This hierarchy may imply some increase in validity and applicability, although its main purpose is to emphasise that the lower sources in the hierarchy are less preferred in practice because they require more expertise and time to identify, appraise and apply.

The traditional pyramid was at times deemed too simplistic, and the importance of leaving room for argument and counterargument about the methodological merit of different designs has been emphasised.5 Other challenges concerned the placement of systematic reviews and meta-analyses at the top of the pyramid. For instance, heterogeneity (clinical, methodological or statistical) is an inherent limitation of meta-analyses that can be minimised or explained but never eliminated.6 The methodological intricacies and dilemmas of systematic reviews can result in uncertainty and error.7 One evaluation of 163 meta-analyses demonstrated that the estimation of treatment outcomes differed substantially depending on the analytical strategy used.7 Therefore, in this perspective, we suggest two visual modifications to the pyramid to illustrate two contemporary methodological principles (figure 1). We provide the rationale and an example for each modification.
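For readers unfamiliar with how statistical heterogeneity is quantified, Cochran's Q and the I² statistic are the usual measures. The sketch below uses invented log relative risks and standard errors (not drawn from any study cited here) together with fixed-effect inverse-variance pooling:

```python
import math

# Hypothetical per-study log relative risks and standard errors (illustrative only)
log_rr = [-0.22, 0.05, -0.40, 0.10]
se = [0.15, 0.20, 0.18, 0.25]

# Inverse-variance weights and the fixed-effect pooled estimate
w = [1 / s**2 for s in se]
pooled = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)

# Cochran's Q: weighted squared deviations of each study from the pooled estimate
q = sum(wi * (y - pooled) ** 2 for wi, y in zip(w, log_rr))
df = len(log_rr) - 1

# I^2: percentage of total variability attributable to heterogeneity rather than chance
i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

print(f"Pooled RR = {math.exp(pooled):.2f}, Q = {q:.2f}, I^2 = {i_squared:.1f}%")
```

Even with I² reported, the clinical and methodological diversity underlying it cannot be removed by any calculation, which is the point made above.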


Figure 1

The proposed new evidence-based medicine pyramid. (A) The traditional pyramid. (B) Revising the pyramid: (1) lines separating the study designs become wavy (Grading of Recommendations Assessment, Development and Evaluation), (2) systematic reviews are ‘chopped off’ the pyramid. (C) The revised pyramid: systematic reviews are a lens through which evidence is viewed (applied).

In the early 2000s, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group developed a framework in which the certainty in evidence is based on numerous factors, not solely study design, which challenges the pyramid concept.8 Study design alone appears to be insufficient as a surrogate for risk of bias. Methodological limitations of a study, imprecision, inconsistency and indirectness are factors independent of study design that can affect the quality of evidence derived from any design. For example, a meta-analysis of RCTs evaluating intensive glycaemic control in non-critically ill hospitalised patients showed a non-significant reduction in mortality (relative risk 0.95, 95% CI 0.72 to 1.25).9 Allocation concealment and blinding were not adequate in most trials. The quality of this evidence is rated down owing to the methodological limitations of the trials and to imprecision (a wide CI that includes both substantial benefit and harm). Hence, despite there being five RCTs, such evidence should not be rated high in any pyramid. The quality of evidence can also be rated up. For example, we are quite certain about the benefits of hip replacement in a patient with disabling hip osteoarthritis. Although not tested in RCTs, the quality of this evidence is rated up despite the study design (non-randomised observational studies).10
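The imprecision judgement in this example can be made explicit. GRADE guidance commonly treats a relative-risk CI that spans both appreciable benefit and appreciable harm as a reason to rate down; the 0.75 and 1.25 thresholds below are conventional defaults, used here as assumptions rather than fixed rules:

```python
def is_imprecise(ci_low, ci_high, benefit=0.75, harm=1.25):
    """Flag imprecision when the CI spans both appreciable benefit and appreciable harm.

    The 0.75/1.25 thresholds are conventional GRADE defaults, not universal rules.
    """
    return ci_low <= benefit and ci_high >= harm

# The glycaemic-control meta-analysis cited in the text: RR 0.95 (95% CI 0.72 to 1.25)
print(is_imprecise(0.72, 1.25))   # the interval spans both thresholds
```

A CI of, say, 0.80 to 1.10 would not be flagged, since it excludes both appreciable benefit and appreciable harm.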

Therefore, the first modification to the pyramid is to change the straight lines separating study designs in the pyramid to wavy lines (going up and down to reflect the GRADE approach of rating up and down based on the various domains of the quality of evidence).

Another challenge to placing systematic reviews at the top of the evidence pyramid relates to the framework presented in the Journal of the American Medical Association Users' Guide on systematic reviews and meta-analysis. The Guide presented a two-step approach in which the credibility of the process of a systematic review is evaluated first (comprehensive literature search, rigorous study selection process, etc). If the systematic review is deemed sufficiently credible, a second step follows in which the certainty in evidence is evaluated using the GRADE approach.11 In other words, a meta-analysis of well-conducted RCTs at low risk of bias cannot be equated with a meta-analysis of observational studies at higher risk of bias. For example, a meta-analysis of 112 surgical case series showed that in patients with thoracic aortic transection, mortality was significantly lower with endovascular repair than with open repair or non-operative management (9%, 19% and 46%, respectively; p<0.01). Clearly, this meta-analysis should not sit at the top of the pyramid alongside a meta-analysis of RCTs. After all, the evidence still consists of non-randomised studies and is likely subject to numerous confounders.
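A comparison of proportions such as the p<0.01 result above would typically rest on a Pearson chi-square test. The sketch below uses invented group sizes of 100 patients per arm, since the raw counts are not given here, so the statistic itself is purely illustrative:

```python
# Hypothetical group sizes (100 per arm is assumed purely for illustration;
# the cited meta-analysis does not report raw counts in this text)
groups = {"endovascular": (9, 100), "open repair": (19, 100), "non-operative": (46, 100)}

deaths = [d for d, _ in groups.values()]
sizes = [n for _, n in groups.values()]
death_rate = sum(deaths) / sum(sizes)   # overall mortality under the null hypothesis

# Pearson chi-square statistic over the 3x2 table of deaths vs survivors
chi2 = 0.0
for d, n in zip(deaths, sizes):
    for observed, expected in ((d, n * death_rate), (n - d, n * (1 - death_rate))):
        chi2 += (observed - expected) ** 2 / expected

# Critical value of chi-square with df = 2 at alpha = 0.01 is 9.21
print(f"chi2 = {chi2:.2f}, p < 0.01: {chi2 > 9.21}")
```

Note that such a test says nothing about confounding: the groups in case series were not randomised, which is exactly why this evidence stays lower in the hierarchy.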

Therefore, the second modification to the pyramid is to remove systematic reviews from the top of the pyramid and use them as a lens through which other types of studies should be seen (ie, appraised and applied). The systematic review (the process of selecting the studies) and meta-analysis (the statistical aggregation that produces a single effect size) are tools to consume and apply the evidence by stakeholders.

Changing how systematic reviews and meta-analyses are perceived by stakeholders such as patients and clinicians has important implications. For example, the American Heart Association considers evidence derived from meta-analyses to have a level ‘A’ (ie, warranting the most confidence). Re-evaluation of such evidence using GRADE shows that level ‘A’ evidence could in fact be of high, moderate, low or very low quality.12 The quality of evidence drives the strength of recommendation, which is one of the last translational steps of research, most proximal to patient care.

One limitation of all ‘pyramids’ and depictions of evidence hierarchies relates to their conceptual underpinnings. The construct of internal validity may have varying definitions or be understood differently among evidence consumers. A limitation of treating systematic reviews and meta-analyses as tools for consuming evidence is that this may undermine their role in new discovery (eg, identifying a new side effect that was not demonstrated in individual studies13).

This pyramid can also be used as a teaching tool. EBM teachers can compare it with existing pyramids to explain how certainty in the evidence (also called quality of evidence) is evaluated. It can be used to teach how evidence-based practitioners can appraise and apply systematic reviews in practice, and to demonstrate the evolution in EBM thinking and the modern understanding of certainty in evidence.



1. McCarthy CM, Collins ED, Pusic AL. Where do we find the best evidence? Plast Reconstr Surg. 2008;122:1942–1947.

2. Chung KC, Swanson JA, Schmitz D, Sullivan D, Rohrich RJ. Introducing evidence-based medicine to plastic and reconstructive surgery. Plast Reconstr Surg. 2009;123:1385–1389.

3. Chung KC, Ram AN. Evidence-based medicine: the fourth revolution in American medicine? Plast Reconstr Surg. 2009;123:389–398.

4. Rohrich RJ. So you want to be better: the role of evidence-based medicine in plastic surgery. Plast Reconstr Surg. 2010;126:1395–1398.

5. Burns PB, Chung KC. Developing good clinical questions and finding the best evidence to answer those questions. Plast Reconstr Surg. 2010;126:613–618.

6. Sprague S, McKay P, Thoma A. Study design and hierarchy of evidence for surgical decision making. Clin Plast Surg. 2008;35:195–205.

7. The periodic health examination. Canadian Task Force on the Periodic Health Examination. Can Med Assoc J. 1979;121:1193–1254.

8. Sackett DL. Rules of evidence and clinical recommendations on the use of antithrombotic agents. Chest. 1989;95:2S–4S.

9. American Society of Plastic Surgeons. Available at: http://www.plasticsurgery.org/Medical_Professionals/Health_Policy_and_Advocacy/Health_Policy_Resources/Evidence-based_GuidelinesPractice_Parameters/Description_and_Development_of_Evidence-based_Practice_Guidelines/ASPS_Evidence_Rating_Scales.html. Accessed December 17, 2010.

10. Centre for Evidence Based Medicine. Available at: http://www.cebm.net. Accessed December 17, 2010.

11. Jadad AR, Moore RA, Carroll D, et al. Assessing the quality of reports of randomized clinical trials: is blinding necessary? Control Clin Trials. 1996;17:1–12.

12. Bhandari M, Richards RR, Sprague S, Schemitsch EH. The quality of reporting of randomized trials in the Journal of Bone and Joint Surgery from 1988 through 2000. J Bone Joint Surg Am. 2002;84-A:388–396.

13. Poolman RW, Struijs PA, Krips R, Sierevelt IN, Lutz KH, Bhandari M. Does a "Level I Evidence" rating imply high quality of reporting in orthopaedic randomised controlled trials? BMC Med Res Methodol. 2006;6:44.

14. Urschel JD, Goldsmith CH, Tandan VR, Miller JD. Users' guide to evidence-based surgery: how to use an article evaluating surgical interventions. Evidence-Based Surgery Working Group. Can J Surg. 2001;44:95–100.

15. Thoma A, Farrokhyar F, Bhandari M, Tandan V. Users' guide to the surgical literature. How to assess a randomized controlled trial in surgery. Can J Surg. 2004;47:200–208.

16. Bhandari M, Guyatt GH, Swiontkowski MF. User's guide to the orthopaedic literature: how to use an article about prognosis. J Bone Joint Surg Am. 2001;83-A:1555–1564.

17. Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1993;270:2598–2601.

18. Guyatt GH, Haynes RB, Jaeschke RZ, et al. Users' Guides to the Medical Literature: XXV. Evidence-based medicine: principles for applying the Users' Guides to patient care. Evidence-Based Medicine Working Group. JAMA. 2000;284:1290–1296.

19. Loiselle F, Mahabir RC, Harrop AR. Levels of evidence in plastic surgery research over 20 years. Plast Reconstr Surg. 2008;121:207e–211e.

20. McCarthy JE, Chatterjee A, McKelvey TG, Jantzen EM, Kerrigan CL. A detailed analysis of level I evidence (randomized controlled trials and meta-analyses) in five plastic surgery journals to date: 1978 to 2009. Plast Reconstr Surg. 2010;126:1774–1778.

21. Chung KC, Kalliainen LK, Spilson SV, Walters MR, Kim HM. The prevalence of negative studies with inadequate statistical power: an analysis of the plastic surgery literature. Plast Reconstr Surg. 2002;109:1–6.

22. Newman MK, Zemmel NJ, Bandak AZ, Kaplan BJ. Primary breast lymphoma in a patient with silicone breast implants: a case report and review of the literature. J Plast Reconstr Aesthet Surg. 2008;61:822–825.

23. Gaudet G, Friedberg JW, Weng A, Pinkus GS, Freedman AS. Breast lymphoma associated with breast implants: two case-reports and a review of the literature. Leuk Lymphoma. 2002;43:115–119.

24. Sahoo S, Rosen PP, Feddersen RM, Viswanatha DS, Clark DA, Chadburn A. Anaplastic large cell lymphoma arising in a silicone breast implant capsule: a case report and review of the literature. Arch Pathol Lab Med. 2003;127:e115–e118.

25. Keech JA Jr, Creech BJ. Anaplastic T-cell lymphoma in proximity to a saline-filled breast implant. Plast Reconstr Surg. 1997;100:554–555.

26. Duvic M, Moore D, Menter A, Vonderheid EC. Cutaneous T-cell lymphoma in association with silicone breast implants. J Am Acad Dermatol. 1995;32:939–942.

27. Lipworth L, Tarone RE, McLaughlin JK. Breast implants and lymphoma risk: a review of the epidemiologic evidence through 2008. Plast Reconstr Surg. 2009;123:790–793.

28. Lipworth L, Tarone RE, Friis S, et al. Cancer among Scandinavian women with cosmetic breast implants: a pooled long-term follow-up study. Int J Cancer. 2009;124:490–493.

29. Deapen DM, Hirsch EM, Brody GS. Cancer risk among Los Angeles women with cosmetic breast implants. Plast Reconstr Surg. 2007;119:1987–1992.

30. Brisson J, Holowaty EJ, Villeneuve PJ, et al. Cancer incidence in a cohort of Ontario and Quebec women having bilateral breast augmentation. Int J Cancer. 2006;118:2854–2862.

31. Pukkala E, Boice JD Jr, Hovi SL, et al. Incidence of breast and other cancers among Finnish women with cosmetic breast implants, 1970–1999. J Long Term Eff Med Implants. 2002;12:271–279.

32. Brinton LA, Lubin JH, Burich MC, Colton T, Brown SL, Hoover RN. Cancer risk at sites other than the breast following augmentation mammoplasty. Ann Epidemiol. 2001;11:248–256.

33. Denkler K. A comprehensive review of epinephrine in the finger: to do or not to do. Plast Reconstr Surg. 2001;108:114–124.

34. Thomson CJ, Lalonde DH, Denkler KA, Feicht AJ. A critical look at the evidence for and against elective epinephrine use in the finger. Plast Reconstr Surg. 2007;119:260–266.

35. Lalonde D, Bell M, Benoit P, Sparkes G, Denkler K, Chang P. A multicenter prospective study of 3,110 consecutive cases of elective epinephrine use in the fingers and hand: the Dalhousie Project clinical phase. J Hand Surg Am. 2005;30:1061–1067.

36. Chowdhry S, Seidenstricker L, Cooney DS, Hazani R, Wilhelmi BJ. Do not use epinephrine in digital blocks: myth or truth? Part II. A retrospective review of 1111 cases. Plast Reconstr Surg. 2010;126:2031–2034.

37. Wilhelmi BJ, Blackwell SJ, Miller JH, et al. Do not use epinephrine in digital blocks: myth or truth? Plast Reconstr Surg. 2001;107:393–397.

38. Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000;342:1887–1892.



Canadian Task Force on the Periodic Health Examination’s Levels of Evidence*

Level   Type of evidence
I       At least 1 RCT with proper randomization
II.1    Well designed cohort or case-control study
II.2    Time series comparisons or dramatic results from uncontrolled studies
III     Expert opinions