Evidence-Based Research: Levels of Evidence Pyramid

Introduction.

One way to organize the different types of evidence involved in evidence-based practice research is the levels of evidence pyramid. The pyramid includes a variety of evidence types and levels.

  • systematic reviews
  • critically-appraised topics
  • critically-appraised individual articles
  • randomized controlled trials
  • cohort studies
  • case-control studies, case series, and case reports
  • background information, expert opinion

Levels of evidence pyramid

The levels of evidence pyramid provides a way to visualize both the quality of evidence and the amount of evidence available. For example, systematic reviews are at the top of the pyramid, meaning they are both the highest level of evidence and the least common. As you go down the pyramid, the amount of evidence will increase as the quality of the evidence decreases.

[Figure: Levels of Evidence Pyramid]

EBM Pyramid and EBM Page Generator, copyright 2006 Trustees of Dartmouth College and Yale University. All Rights Reserved. Produced by Jan Glover, David Izzo, Karen Odato and Lei Wang.

Filtered Resources

Filtered resources appraise the quality of studies and often make recommendations for practice. The main types of filtered resources in evidence-based practice are:

  • systematic reviews
  • critically-appraised topics
  • critically-appraised individual articles

Scroll down the page to the Systematic reviews, Critically-appraised topics, and Critically-appraised individual articles sections for links to resources where you can find each of these types of filtered information.

Systematic reviews

Authors of a systematic review ask a specific clinical question, perform a comprehensive literature review, eliminate the poorly done studies, and attempt to make practice recommendations based on the well-done studies. Systematic reviews include only experimental, or quantitative, studies, and often include only randomized controlled trials.

You can find systematic reviews in these filtered databases:

  • Cochrane Database of Systematic Reviews: Cochrane systematic reviews are considered the gold standard for systematic reviews. This database contains both systematic reviews and review protocols. To find only systematic reviews, select Cochrane Reviews in the Document Type box.
  • JBI EBP Database (formerly Joanna Briggs Institute EBP Database): This database includes systematic reviews, evidence summaries, and best practice information sheets. To find only systematic reviews, click on Limits and then select Systematic Reviews in the Publication Types box. To see how to use the limit and find full text, please see our Joanna Briggs Institute Search Help page.

You can also find systematic reviews in this unfiltered database :

To learn more about finding systematic reviews, please see our guide:

  • Filtered Resources: Systematic Reviews

Critically-appraised topics

Authors of critically-appraised topics evaluate and synthesize multiple research studies. Critically-appraised topics are like short systematic reviews focused on a particular topic.

You can find critically-appraised topics in these resources:

  • Annual Reviews This collection offers comprehensive, timely collections of critical reviews written by leading scientists. To find reviews on your topic, use the search box in the upper-right corner.
  • Guideline Central This free database offers quick-reference guideline summaries. It is organized by a non-profit initiative that aims to fill the gap left by the sudden closure of AHRQ’s National Guideline Clearinghouse (NGC).
  • JBI EBP Database (formerly Joanna Briggs Institute EBP Database) To find critically-appraised topics in JBI, click on Limits and then select Evidence Summaries from the Publication Types box. To see how to use the limit and find full text, please see our Joanna Briggs Institute Search Help page.
  • National Institute for Health and Care Excellence (NICE) Evidence-based recommendations for health and care in England.

To learn more about finding critically-appraised topics, please see our guide:

  • Filtered Resources: Critically-Appraised Topics

Critically-appraised individual articles

Authors of critically-appraised individual articles evaluate and synopsize individual research studies.

You can find critically-appraised individual articles in these resources:

  • EvidenceAlerts Quality articles from over 120 clinical journals are selected by research staff and then rated for clinical relevance and interest by an international group of physicians. Note: You must create a free account to search EvidenceAlerts.
  • ACP Journal Club This journal publishes reviews of research on the care of adults and adolescents. You can either browse this journal or use the Search within this publication feature.
  • Evidence-Based Nursing This journal reviews research studies that are relevant to best nursing practice. You can either browse individual issues or use the search box in the upper-right corner.

To learn more about finding critically-appraised individual articles, please see our guide:

  • Filtered Resources: Critically-Appraised Individual Articles

Unfiltered resources

You may not always be able to find information on your topic in the filtered literature. When this happens, you'll need to search the primary or unfiltered literature. Keep in mind that with unfiltered resources, you take on the role of reviewing what you find to make sure it is valid and reliable.

Note: You can also find systematic reviews and other filtered resources in these unfiltered databases.

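The Levels of Evidence Pyramid includes unfiltered study types in this order of evidence from higher to lower:

  • randomized controlled trials
  • cohort studies
  • case-control studies, case series, and case reports

You can search for each of these types of evidence in the following databases:

  • TRIP database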

Background information & expert opinion.

Background information and expert opinions are not necessarily backed by research studies. They include point-of-care resources, textbooks, conference proceedings, etc.

  • Family Physicians Inquiries Network: Clinical Inquiries Provide the ideal answers to clinical questions using a structured search, critical appraisal, authoritative recommendations, clinical perspective, and rigorous peer review. Clinical Inquiries deliver best evidence for point-of-care use.
  • Harrison, T. R., & Fauci, A. S. (2009). Harrison's Manual of Medicine . New York: McGraw-Hill Professional. Contains the clinical portions of Harrison's Principles of Internal Medicine .
  • Lippincott manual of nursing practice (8th ed.). (2006). Philadelphia, PA: Lippincott Williams & Wilkins. Provides background information on clinical nursing practice.
  • Medscape: Drugs & Diseases An open-access, point-of-care medical reference that includes clinical information from top physicians and pharmacists in the United States and worldwide.
  • Virginia Henderson Global Nursing e-Repository An open-access repository that contains works by nurses and is sponsored by Sigma Theta Tau International, the Honor Society of Nursing. Note: This resource contains both expert opinion and evidence-based practice articles.

Walden University is a member of Adtalem Global Education, Inc. www.adtalem.com Walden University is certified to operate by SCHEV © 2024 Walden University LLC. All rights reserved.

OHSU Evidence-Based Practice Course for Interprofessional Clinical Teams

Session 2: Pre-Session Work

Hierarchy of evidence, is it a good fit for my pico, types of study designs.

Please watch the 3 videos below for more information on study design. This should take about 10 minutes. 

Overview of Research Studies - The 5 C's

Randomized Controlled Trials (RCTs)

Systematic review & Meta-analysis

Randomized Controlled Trial is a prospective, analytical, experimental study using primary data generated in the clinical environment. Individuals similar at the beginning are randomly allocated to two or more groups (treatment and control) and then followed to determine the outcome of the intervention.

Cohort Study (prospective) is a study of a group of individuals, some of whom are exposed to a variable of interest (e.g., drug or environmental exposure), in which participants are followed up over time to determine who develops the outcome of interest and whether the outcome is associated with the exposure.

Cohort Study (retrospective) is a study in which data are gathered for a cohort that was formed at some time in the past. Exposures and outcomes have already occurred at the start of the study. Researchers examine a risk factor to see whether a disease can be associated with it. Individuals are split by exposure.

Case Control Study is a study in which patients who already have a specific condition or outcome are compared with people who do not. Researchers look back in time (retrospective) to identify possible exposures. They often rely on medical records and patient recall for data collection. Individuals are split by disease.

Survey Study is an epidemiologic study that produces survey results and consists of simultaneous assessments of the health outcome, primary risk exposure, and potential confounders and effect modifiers. Two types of survey research are cross-sectional and longitudinal studies.

Cross-Sectional Study is the observation of a defined population at a single point in time or during a specific time interval to examine associations between the outcomes and exposure to interventions. Exposure and outcome are determined simultaneously. These studies often rely on data originally collected for other purposes.

Longitudinal Study follows subjects over time with continuous or repeated monitoring of risk factors or health outcomes, or both. Researchers conduct several observations of the same subjects over a period of time, sometimes lasting many years.

Before and After Study is a study in which observations are made before (pre) and after (post) the implementation of an intervention, both in a group that receives the intervention and in a control group that does not.

Case Series and Case Reports are descriptive studies that consist of collections of reports on the treatment of individual patients (case series) or a report on a single patient (case report).

Systematic Review usually focuses on a specific clinical question and conducts an extensive literature search to identify studies with sound methodology. The studies are reviewed and assessed, and the results are summarized according to the predetermined criteria of the review question.

Meta-Analysis takes a systematic review one step further by combining all the results using accepted statistical methodology.
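
The distinctions drawn in these definitions (whether participants are randomly allocated, how individuals are grouped, and the direction of follow-up) can be laid out as a small lookup. The Python sketch below is illustrative only: the class `StudyDesign`, its field names, and the helper `designs_matching` are hypothetical, and the attribute values simply restate the definitions above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StudyDesign:
    name: str
    experimental: bool         # True when participants are randomly allocated
    grouped_by: Optional[str]  # "allocation", "exposure", "disease", or None
    timing: str                # "prospective", "retrospective", or "single point"


# Attribute values restate the definitions in the text; the structure is assumed.
DESIGNS = [
    StudyDesign("Randomized controlled trial", True, "allocation", "prospective"),
    StudyDesign("Cohort study (prospective)", False, "exposure", "prospective"),
    StudyDesign("Cohort study (retrospective)", False, "exposure", "retrospective"),
    StudyDesign("Case-control study", False, "disease", "retrospective"),
    StudyDesign("Cross-sectional study", False, None, "single point"),
]


def designs_matching(grouped_by: Optional[str] = None,
                     timing: Optional[str] = None) -> List[str]:
    """Return names of designs matching the given filters (None means no filter)."""
    matches = []
    for design in DESIGNS:
        if grouped_by is not None and design.grouped_by != grouped_by:
            continue
        if timing is not None and design.timing != timing:
            continue
        matches.append(design.name)
    return matches
```

For example, `designs_matching(grouped_by="disease")` returns only the case-control study, which mirrors the "split by disease" note in its definition.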

  • Last Updated: Feb 7, 2023 12:40 PM
  • URL: https://libguides.ohsu.edu/EBP4ClinicalTeams

  • Volume 21, Issue 4
  • New evidence pyramid

  • M Hassan Murad,
  • Mouaz Alsawas,
  • Fares Alahdab (http://orcid.org/0000-0001-5481-696X)
  • Rochester, Minnesota, USA
  • Correspondence to: Dr M Hassan Murad, Evidence-based Practice Center, Mayo Clinic, Rochester, MN 55905, USA; murad.mohammad{at}mayo.edu

https://doi.org/10.1136/ebmed-2016-110401

The first and earliest principle of evidence-based medicine indicated that a hierarchy of evidence exists. Not all evidence is the same. This principle became well known in the early 1990s as practising physicians learnt basic clinical epidemiology skills and started to appraise and apply evidence to their practice. Since evidence was described as a hierarchy, a compelling rationale for a pyramid was made. Evidence-based healthcare practitioners became familiar with this pyramid when reading the literature, applying evidence or teaching students.

Various versions of the evidence pyramid have been described, but all of them focused on showing weaker study designs at the bottom (basic science and case series), followed by case–control and cohort studies in the middle, then randomised controlled trials (RCTs), and at the very top, systematic reviews and meta-analyses. This description is intuitive and likely correct in many instances. The placement of systematic reviews at the top had undergone several alterations in interpretations, but was still thought of as an item in a hierarchy. 1 Most versions of the pyramid clearly represented a hierarchy of internal validity (risk of bias). Some versions incorporated external validity (applicability) in the pyramid by either placing N-of-1 trials above RCTs (because their results are most applicable to individual patients 2 ) or by separating internal and external validity. 3

Another version (the 6S pyramid) was also developed to describe the sources of evidence that can be used by evidence-based medicine (EBM) practitioners for answering foreground questions, showing a hierarchy ranging from studies, synopses, synthesis, synopses of synthesis, summaries and systems. 4 This hierarchy may imply some sort of increasing validity and applicability although its main purpose is to emphasise that the lower sources of evidence in the hierarchy are least preferred in practice because they require more expertise and time to identify, appraise and apply.

The traditional pyramid was deemed too simplistic at times, thus the importance of leaving room for argument and counterargument for the methodological merit of different designs has been emphasised. 5 Other barriers challenged the placement of systematic reviews and meta-analyses at the top of the pyramid. For instance, heterogeneity (clinical, methodological or statistical) is an inherent limitation of meta-analyses that can be minimised or explained but never eliminated. 6 The methodological intricacies and dilemmas of systematic reviews could potentially result in uncertainty and error. 7 One evaluation of 163 meta-analyses demonstrated that the estimation of treatment outcomes differed substantially depending on the analytical strategy being used. 7 Therefore, we suggest, in this perspective, two visual modifications to the pyramid to illustrate two contemporary methodological principles ( figure 1 ). We provide the rationale and an example for each modification.

The proposed new evidence-based medicine pyramid. (A) The traditional pyramid. (B) Revising the pyramid: (1) lines separating the study designs become wavy (Grading of Recommendations Assessment, Development and Evaluation), (2) systematic reviews are ‘chopped off’ the pyramid. (C) The revised pyramid: systematic reviews are a lens through which evidence is viewed (applied).

Rationale for modification 1

In the early 2000s, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group developed a framework in which the certainty in evidence was based on numerous factors and not solely on study design, which challenges the pyramid concept. 8 Study design alone appears to be insufficient as a surrogate for risk of bias. Certain methodological limitations of a study, imprecision, inconsistency and indirectness are factors independent of study design and can affect the quality of evidence derived from any study design. For example, a meta-analysis of RCTs evaluating intensive glycaemic control in non-critically ill hospitalised patients showed a non-significant reduction in mortality (relative risk of 0.95 (95% CI 0.72 to 1.25) 9 ). Allocation concealment and blinding were not adequate in most trials. The quality of this evidence is rated down due to the methodological limitations of the trials and imprecision (wide CI that includes substantial benefit and harm). Hence, despite the fact of having five RCTs, such evidence should not be rated high in any pyramid. The quality of evidence can also be rated up. For example, we are quite certain about the benefits of hip replacement in a patient with disabling hip osteoarthritis. Although not tested in RCTs, the quality of this evidence is rated up despite the study design (non-randomised observational studies). 10
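
The reasoning in this section can be sketched in a few lines of Python. The first function checks whether a confidence interval contains the no-effect value, which is why the relative risk of 0.95 (95% CI 0.72 to 1.25) is described as non-significant. The second is a deliberately simplified rating scheme: it assumes, as in GRADE generally, that randomised evidence starts at "high" and observational evidence at "low", and it further assumes that each concern moves the rating by exactly one level. The names `ci_includes_null` and `rate_certainty` are hypothetical, and the one-level-per-concern rule is a simplification rather than the GRADE procedure itself.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def ci_includes_null(lower, upper, null_value=1.0):
    """Return True when the confidence interval contains the no-effect value."""
    return lower <= null_value <= upper


def rate_certainty(randomised, concerns=(), upgrades=0):
    """Simplified certainty rating: one level down per concern, one up per upgrade.

    Assumption: RCT evidence starts at "high" and observational evidence at "low".
    """
    start = 3 if randomised else 1
    index = start - len(concerns) + upgrades
    index = max(0, min(index, len(LEVELS) - 1))  # clamp to the four levels
    return LEVELS[index]


# Relative risk 0.95 with 95% CI 0.72 to 1.25, as quoted in the text.
non_significant = ci_includes_null(0.72, 1.25)  # True: the interval spans 1.0

# The two concerns named in the text for the glycaemic-control example.
rating = rate_certainty(True, concerns=["risk of bias", "imprecision"])
```

Under these assumptions the glycaemic-control evidence would come out as "low". The article itself says only that the evidence is rated down and should not be rated high, so the exact level here is an artefact of the simplified rule.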

Rationale for modification 2

Another challenge to the notion of having systematic reviews on the top of the evidence pyramid relates to the framework presented in the Journal of the American Medical Association User's Guide on systematic reviews and meta-analysis. The Guide presented a two-step approach in which the credibility of the process of a systematic review is evaluated first (comprehensive literature search, rigorous study selection process, etc). If the systematic review was deemed sufficiently credible, then a second step takes place in which we evaluate the certainty in evidence based on the GRADE approach. 11 In other words, a meta-analysis of well-conducted RCTs at low risk of bias cannot be equated with a meta-analysis of observational studies at higher risk of bias. For example, a meta-analysis of 112 surgical case series showed that in patients with thoracic aortic transection, the mortality rate was significantly lower in patients who underwent endovascular repair, followed by open repair and non-operative management (9%, 19% and 46%, respectively, p<0.01). Clearly, this meta-analysis should not be on top of the pyramid like a meta-analysis of RCTs. After all, the evidence still consists of non-randomised studies and is likely subject to numerous confounders.

Therefore, the second modification to the pyramid is to remove systematic reviews from the top of the pyramid and use them as a lens through which other types of studies should be seen (ie, appraised and applied). The systematic review (the process of selecting the studies) and meta-analysis (the statistical aggregation that produces a single effect size) are tools to consume and apply the evidence by stakeholders.
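
To make "the statistical aggregation that produces a single effect size" concrete, here is a minimal sketch of one common approach, inverse-variance fixed-effect pooling of relative risks on the log scale. The article does not say which pooling method any cited meta-analysis used, so the choice of method is an assumption. The function names are hypothetical and the three trial results are invented purely for illustration.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence interval


def log_rr_and_se(rr, lower, upper):
    """Convert a relative risk and its 95% CI to a log effect and standard error."""
    log_rr = math.log(rr)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return log_rr, se


def pool_fixed_effect(studies):
    """Inverse-variance fixed-effect pooling of (rr, lower, upper) tuples.

    Returns the pooled relative risk with its 95% confidence limits.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, lower, upper in studies:
        log_rr, se = log_rr_and_se(rr, lower, upper)
        weight = 1.0 / se ** 2  # more precise studies get more weight
        weighted_sum += weight * log_rr
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical trial results, invented for illustration only.
example_studies = [(0.90, 0.60, 1.35), (1.00, 0.70, 1.43), (0.95, 0.50, 1.80)]
pooled_rr, pooled_lower, pooled_upper = pool_fixed_effect(example_studies)
```

Pooling narrows the confidence interval but does nothing about the risk of bias in the underlying studies, which is the point of treating the meta-analysis as a lens and not as a level of its own.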

Implications and limitations

Changing how systematic reviews and meta-analyses are perceived by stakeholders (patients, clinicians and stakeholders) has important implications. For example, the American Heart Association considers evidence derived from meta-analyses to have a level ‘A’ (ie, warrants the most confidence). Re-evaluation of evidence using GRADE shows that level ‘A’ evidence could have been high, moderate, low or of very low quality. 12 The quality of evidence drives the strength of recommendation, which is one of the last translational steps of research, most proximal to patient care.

One of the limitations of all ‘pyramids’ and depictions of evidence hierarchy relates to the underpinning of such schemas. The construct of internal validity may have varying definitions, or be understood differently among evidence consumers. One limitation of considering systematic reviews and meta-analyses as tools to consume evidence is that it may undermine their role in new discovery (eg, identifying a new side effect that was not demonstrated in individual studies 13 ).

This pyramid can also be used as a teaching tool. EBM teachers can compare it to the existing pyramids to explain how certainty in the evidence (also called quality of evidence) is evaluated. It can be used to teach how evidence-based practitioners can appraise and apply systematic reviews in practice, and to demonstrate the evolution in EBM thinking and the modern understanding of certainty in evidence.

  • Leibovici L
  • Agoritsas T ,
  • Vandvik P ,
  • Neumann I , et al
  • ↵ Resources for Evidence-Based Practice: The 6S Pyramid. Secondary Resources for Evidence-Based Practice: The 6S Pyramid Feb 18, 2016 4:58 PM. http://hsl.mcmaster.libguides.com/ebm
  • Vandenbroucke JP
  • Berlin JA ,
  • Dechartres A ,
  • Altman DG ,
  • Trinquart L , et al
  • Guyatt GH ,
  • Vist GE , et al
  • Coburn JA ,
  • Coto-Yglesias F , et al
  • Sultan S , et al
  • Montori VM ,
  • Ioannidis JP , et al
  • Altayar O ,
  • Bennett M , et al
  • Nissen SE ,

Contributors MHM conceived the idea and drafted the manuscript. FA helped draft the manuscript and designed the new pyramid. MA and NA helped draft the manuscript.

Competing interests None declared.

Provenance and peer review Not commissioned; externally peer reviewed.

Linked Articles

  • Editorial Pyramids are guides not rules: the evolution of the evidence pyramid Terrence Shaneyfelt BMJ Evidence-Based Medicine 2016; 21 121-122 Published Online First: 12 Jul 2016. doi: 10.1136/ebmed-2016-110498
  • Perspective EBHC pyramid 5.0 for accessing preappraised evidence and guidance Brian S Alper R Brian Haynes BMJ Evidence-Based Medicine 2016; 21 123-125 Published Online First: 20 Jun 2016. doi: 10.1136/ebmed-2016-110447

Evidence Based Practice Toolkit

Levels of Evidence / Evidence Hierarchy

Levels of evidence (sometimes called hierarchy of evidence) are assigned to studies based on the research design, quality of the study, and applicability to patient care. Higher levels of evidence have less risk of bias.

Levels of Evidence (Melnyk & Fineout-Overholt 2023)

Level 1

Evidence from a systematic review or meta-analysis of all relevant RCTs (randomized controlled trials).

Level 2

Evidence from at least one well-designed RCT (e.g. large multi-site RCT).

Level 3

Level 4

Evidence from well-designed case-control or cohort studies

Level 5

Evidence from systematic reviews of descriptive and qualitative studies (meta-synthesis)

Level 6

Evidence from a single descriptive or qualitative study, EBP, EBQI and QI projects

Level 7

Evidence from the opinion of authorities, reports of expert committees, and narrative or literature reviews

*Adapted from: Melnyk, B. M., & Fineout-Overholt, E. (2023). Evidence-based practice in nursing & healthcare: A guide to best practice (5th ed.). Wolters Kluwer.

Levels of Evidence (LoBiondo-Wood & Haber 2022)

1 Systematic review or meta-analysis of RCTs (randomized controlled trials)
2 Randomized controlled trials
3 Quasi-experimental studies
4 Non-experimental studies
5 Meta-synthesis
6 Qualitative studies
7 Expert opinions: reports from expert panels and organizations, not based on research

Adapted from LoBiondo-Wood, G. & Haber, J. (2022). Nursing research: Methods and critical appraisal for evidence-based practice (10th ed.). Elsevier.
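
A numbered hierarchy like this one lends itself to a simple lookup when sorting search results. The sketch below encodes the seven LoBiondo-Wood & Haber levels as a Python dictionary. The key strings, the function names `evidence_level` and `rank_by_evidence`, and the sample studies are hypothetical choices for illustration, and the sketch ranks by design alone, without the appraisal of study quality that the text also calls for.

```python
# Levels as listed in the LoBiondo-Wood & Haber hierarchy; key wording is assumed.
EVIDENCE_LEVELS = {
    "systematic review or meta-analysis of rcts": 1,
    "randomized controlled trial": 2,
    "quasi-experimental study": 3,
    "non-experimental study": 4,
    "meta-synthesis": 5,
    "qualitative study": 6,
    "expert opinion": 7,
}


def evidence_level(design):
    """Return the level (1 = highest) for a design name, ignoring case and padding."""
    return EVIDENCE_LEVELS[design.strip().lower()]


def rank_by_evidence(studies):
    """Sort (title, design) pairs from highest to lowest level of evidence."""
    return sorted(studies, key=lambda study: evidence_level(study[1]))


# Hypothetical search results, invented for illustration only.
results = [
    ("Panel statement", "Expert opinion"),
    ("Multi-site trial", "Randomized controlled trial"),
    ("Interview synthesis", "Meta-synthesis"),
]
ranked = rank_by_evidence(results)
```

An unknown design name raises `KeyError`, which seems preferable to silently assigning a level.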

[Figure: Evidence Pyramid]

"Evidence Pyramid" is a product of Tufts University and is licensed under a BY-NC-SA 4.0 license.

Tufts' "Evidence Pyramid" is based in part on the Oxford Centre for Evidence-Based Medicine: Levels of Evidence (2009).

  • Oxford Centre for Evidence Based Medicine Glossary

Different types of clinical questions are best answered by different types of research studies.  You might not always find the highest level of evidence (i.e., systematic review or meta-analysis) to answer your question. When this happens, work your way down to the next highest level of evidence.

The list below suggests study designs best suited to answer each type of clinical question.

  • All clinical questions: systematic review, meta-analysis
  • Therapy: randomized controlled trial (RCT), meta-analysis; cohort study, case-control study, case series
  • Etiology: randomized controlled trial (RCT), meta-analysis, cohort study; case-control study, case series
  • Diagnosis: randomized controlled trial (RCT); cohort study
  • Prevention: randomized controlled trial (RCT), meta-analysis; prospective study, cohort study, case-control study, case series
  • Prognosis: cohort study; case-control study, case series
  • Meaning: qualitative study
  • Quality improvement: randomized controlled trial (RCT); qualitative study
  • Cost: economic evaluation
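
The pairings above can be captured as a small mapping from question type to candidate designs. This Python sketch is a convenience illustration, not part of the guide: the dictionary name, the function `suggest_designs`, and the decision to place the "all clinical questions" designs first are assumptions, and the order within each entry simply follows the order in which the designs are listed.

```python
# Designs per question type, in the order listed in the guide.
SUGGESTED_DESIGNS = {
    "all clinical questions": ["systematic review", "meta-analysis"],
    "therapy": ["randomized controlled trial", "meta-analysis", "cohort study",
                "case-control study", "case series"],
    "etiology": ["randomized controlled trial", "meta-analysis", "cohort study",
                 "case-control study", "case series"],
    "diagnosis": ["randomized controlled trial", "cohort study"],
    "prevention": ["randomized controlled trial", "meta-analysis",
                   "prospective study", "cohort study", "case-control study",
                   "case series"],
    "prognosis": ["cohort study", "case-control study", "case series"],
    "meaning": ["qualitative study"],
    "quality improvement": ["randomized controlled trial", "qualitative study"],
    "cost": ["economic evaluation"],
}


def suggest_designs(question_type):
    """Return designs to look for, starting with those that apply to all questions."""
    key = question_type.strip().lower()
    combined = list(SUGGESTED_DESIGNS["all clinical questions"])
    for design in SUGGESTED_DESIGNS.get(key, []):
        if design not in combined:  # avoid listing meta-analysis twice
            combined.append(design)
    return combined
```

Calling `suggest_designs("Prognosis")` yields the two synthesis designs followed by cohort study, case-control study, and case series, which matches the advice to work down to the next level when the highest level of evidence is unavailable.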

  • Updated: Aug 26, 2024 12:43 PM
  • URL: https://libguides.winona.edu/ebptoolkit

The Evidence Hierarchy

The hierarchy of evidence is a core principle of EBM. EBM hierarchies rank study types based on the strength and precision of their research methods. Different hierarchies exist for different question types, and even experts may disagree on the exact rank of information in the evidence hierarchies. Still, most agree that current, well-designed systematic reviews and meta-analyses are at the top of the pyramid, and that expert opinion and anecdotal experience are at the bottom.

The Evidence Hierarchy, Questions about Treatment

EBM hierarchy of evidence pyramid adapted from figure 2.1 in Greenhalgh How to Read a Paper 2010 ISBN: 978-1444390360

Figure adapted from Greenhalgh T. How to Read a Paper: The Basics of Evidence-Based Medicine. Hoboken: John Wiley & Sons, Incorporated; 2010. Figure 2.1 A simple hierarchy of evidence for assessing the quality of trial design in therapy studies.

Systematic Reviews

A systematic review is "a review that uses explicit, systematic methods to collate and synthesise findings of studies that address a clearly formulated question". 1

Systematic reviews are useful for:

  • confirming current practices
  • guiding decision-making
  • informing future research

Meta-analyses, while often part of systematic reviews, are not interchangeable with them. They use statistical analysis to combine data from the studies found in the systematic review process. These studies must be homogeneous enough that the data from them can be pooled together.
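
One common way to judge whether studies are "homogeneous enough" is to compute Cochran's Q and the I² statistic from the study effects and their standard errors. The guide does not name a specific test, so treat this Python sketch as an assumed illustration; the function name `heterogeneity` and its inputs are hypothetical.

```python
def heterogeneity(effects, standard_errors):
    """Return Cochran's Q and I-squared (as a percentage) for a set of studies.

    effects: per-study effect estimates on an additive scale (e.g. log relative risk)
    standard_errors: the matching standard errors
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q sums the weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    degrees_of_freedom = len(effects) - 1
    # I-squared is the share of variation beyond what chance alone would produce.
    if q > 0:
        i_squared = max(0.0, (q - degrees_of_freedom) / q) * 100
    else:
        i_squared = 0.0
    return q, i_squared
```

A low I² suggests the effects are consistent enough to pool, while a high value is a signal to explore why the studies differ before combining them.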

Limitations of systematic reviews include:

  • Because they focus on a clearly formulated question, their conclusions answer only that question and cannot be generalized
  • The synthesis is only as reliable as the primary studies the review analyzed

  • Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, Chou R. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021 Mar 29;372.
  • Garg AX, Hackam D, Tonelli M. Systematic review and meta-analysis: when one study is just not enough. Clinical Journal of the American Society of Nephrology. 2008 Jan 1;3(1):253-60.

Primary Studies

When a current, well-designed systematic review is not available, practitioners turn to primary studies to answer their questions. The best research design depends on the question type. The table below lists optimal study methodologies for common types of clinical questions.

  • Oxford Centre for Evidence-Based Medicine Working Group Oxford (OCEBM). The Oxford Levels of Evidence 2 [Internet]. University of Oxford, Oxford UK; [cited 2021 Dec 13]. Available from: https://www.cebm.ox.ac.uk/resources/levels-of-evidence/ocebm-levels-of-evidence
  • Burns PB, Rohrich RJ, Chung KC. The levels of evidence and their role in evidence-based medicine. Plast Reconstr Surg. 2011 Jul;128(1):305-310. doi: 10.1097/PRS.0b013e318219c171. PMID: 21701348; PMCID: PMC3124652.
  • ACP Journal Club Inclusion Criteria [Internet]. American College of Physicians (ACP); [updated 2013 Nov 14; cited 2021 Dec 13]. Available from: https://www.acpjournals.org/journal/aim/acpjc/inclusion-criteria
  • Last Updated: Sep 4, 2024 1:24 PM
  • URL: https://libguides.mssm.edu/ebm

Systematic Reviews

The evidence pyramid is often used to illustrate the development of evidence. At the base of the pyramid is animal research and laboratory studies – this is where ideas are first developed. As you progress up the pyramid the amount of information available decreases in volume, but increases in relevance to the clinical setting.

Meta Analysis  – systematic review that uses quantitative methods to synthesize and summarize the results.

Systematic Review – summary of the medical literature that uses explicit methods to perform a comprehensive literature search and critical appraisal of individual studies and that uses appropriate statistical techniques to combine these valid studies.

Randomized Controlled Trial – Participants are randomly allocated into an experimental group or a control group and followed over time for the variables/outcomes of interest.

Cohort Study – Involves identification of two groups (cohorts) of patients, one which received the exposure of interest, and one which did not, and following these cohorts forward for the outcome of interest.

Case Control Study – study which involves identifying patients who have the outcome of interest (cases) and patients without the same outcome (controls), and looking back to see if they had the exposure of interest.

Case Series   – report on a series of patients with an outcome of interest. No control group is involved.

  • Levels of Evidence from The Centre for Evidence-Based Medicine
  • The JBI Model of Evidence Based Healthcare
  • How to Use the Evidence: Assessment and Application of Scientific Evidence From the National Health and Medical Research Council (NHMRC) of Australia. Book must be downloaded; not available to read online.

When searching for evidence to answer clinical questions, aim to identify the highest level of available evidence. Evidence hierarchies can help you strategically identify which resources to use for finding evidence, as well as which search results are most likely to be "best".                                             

Hierarchy of Evidence. For a text-based version, see text below image.

Image source: Evidence-Based Practice: Study Design from Duke University Medical Center Library & Archives. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The hierarchy of evidence (also known as the evidence-based pyramid) is depicted as a triangular representation of the levels of evidence with the strongest evidence at the top which progresses down through evidence with decreasing strength. At the top of the pyramid are research syntheses, such as Meta-Analyses and Systematic Reviews, the strongest forms of evidence. Below research syntheses are primary research studies progressing from experimental studies, such as Randomized Controlled Trials, to observational studies, such as Cohort Studies, Case-Control Studies, Cross-Sectional Studies, Case Series, and Case Reports. Non-Human Animal Studies and Laboratory Studies occupy the lowest level of evidence at the base of the pyramid.


  • Last Updated: Aug 14, 2024 11:07 AM
  • URL: https://guides.library.ucdavis.edu/systematic-reviews


Evidence Hierarchy: What is the Best Evidence?

Now that we have learned about the different types of resources – filtered, unfiltered, and background – let’s take a look at the evidence hierarchy (also known as the levels of evidence).

The evidence hierarchy pyramid is a visual representation of the strength of different research study designs. It can be helpful to think about evidence as a pyramid – not all study designs and resource types are created equal.

Pyramid illustration of types of evidence. The base of the pyramid is background information and expert opinion. The next three levels are defined as unfiltered information. They include, moving up the pyramid, case-controlled studies, case series, and reports. The next three levels, moving up to the peak of the pyramid, are defined as filtered information. Moving up, the layers are critically-appraised individual articles and article synopses, critically-appraised topics or evidence syntheses and guidelines, and finally, at the peak, systematic reviews.

Filtered Information

At the top of the pyramid, we have filtered information – this includes systematic reviews, meta-analyses, and evidence syntheses; practice guidelines; and critically-appraised topics found in clinical resources. This type of information has used a high-quality methodology to synthesize primary resources – meaning that they have searched for available primary literature and evaluated its validity to provide answers to specific clinical questions. It is important to remember that the quality and reliability of filtered information can only be as good as the primary literature it includes.

Unfiltered Information

In the middle of the pyramid, we have unfiltered information – this is known as primary literature. These are individual research studies, both experimental and observational. A randomized controlled trial is considered the highest quality individual study design, followed by cohort studies and case-controlled studies. We will discuss these study designs in more detail later in the tutorial.

Background Information

At the base of the pyramid, we have background information and expert opinion. Background information is not typically used in making complex clinical decisions, but can be helpful in defining parts of your clinical question.

Evidence-Based Practice Copyright © by Various Authors - See Each Chapter Attribution is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Levels of Evidence and Study Design: Levels of Evidence


This is a general set of levels to aid in critically evaluating evidence. It was adapted from the model presented in the book, Evidence-Based Practice in Nursing and Healthcare: A Guide to Best Practice  (Melnyk & Fineout-Overholt, 2019). Some specialties may have adopted a slightly different and/or smaller set of levels.

Level I: Evidence from a clinical practice guideline based on systematic reviews or meta-analyses of randomized controlled trials. If this is not available, then evidence from a systematic review or meta-analysis of randomized controlled trials.

Level II: Evidence from randomized controlled studies with good design.

Level III: Evidence from controlled trials that have good design but are not randomized.

Level IV: Evidence from case-control and cohort studies with good design.

Level V: Evidence from systematic reviews of qualitative and descriptive studies.

Level VI: Evidence from qualitative and descriptive studies.

Level VII: Evidence from the opinion of authorities and/or the reports of expert committees.

Evidence Pyramid

The pyramid below is a hierarchy of evidence for quantitative studies. It shows the hierarchy of studies by study design, starting with secondary and reappraised studies, then primary studies, and finally reports and opinions, which have no study design. This pyramid is a simplified amalgamation of information presented in the book chapter “Evidence-based decision making” (Forrest et al., 2019) and the book Evidence-Based Practice in Nursing and Healthcare: A Guide to Best Practice (Melnyk & Fineout-Overholt, 2019).

Levels of evidence

Evidence Table for Nursing

Advocate Health - Midwest provides system-wide evidence-based practice resources. The Nursing Hub* has an Evidence-Based Quality Improvement (EBQI) Evidence Table, within the Evidence-Based Practice (EBP) Resource. It also includes information on evidence type, and a literature synthesis table.

*The Nursing Hub requires access to the Advocate Health - Midwest SharePoint platform.

Forrest, J. L., Miller, S. A., Miller, G. W., Elangovan, S., & Newman, M. G. (2019). Evidence-based decision making. In M. G. Newman, H. H. Takei, P. R. Klokkevold, & F. A. Carranza (Eds.),  Newman and Carranza's clinical periodontology  (13th ed., pp. 1-9.e1). Elsevier.

Melnyk, B. M., & Fineout-Overholt, E. (2019).  Evidence-based practice in nursing and healthcare: A guide to best practice  (4th ed.). Wolters Kluwer.
  • Last Updated: Dec 29, 2023 2:03 PM
  • URL: https://library.aah.org/guides/levelsofevidence


21 Hierarchy of research evidence

The hierarchy of research evidence is a framework used to categorise different types of research studies based on their methodological rigor, validity, and potential for bias. It helps researchers and healthcare professionals determine the strength of evidence when making clinical decisions or recommendations. The hierarchy generally progresses from lower levels of evidence (less rigorous) to higher levels (more rigorous). Keep in mind that this hierarchy may slightly vary depending on the field or context, but the following outline provides a common framework (lower levels to higher levels):

1. Expert opinion, editorials, and anecdotal evidence

  • At the lowest level of the hierarchy.
  • Includes opinions, editorials, and anecdotal reports without systematic research methods.
  • Provides minimal or no scientific rigor and often lacks peer review.

2. Case reports and case series

  • Descriptive accounts of individual cases or a series of cases.
  • Limited in generalisability and prone to bias.
  • Often used to generate hypotheses for further research.

3. Cross-sectional studies (descriptive studies)

  • Examines relationships among variables at a single point in time.
  • Observational and descriptive in nature, often based on surveys.
  • Limited in establishing cause-and-effect relationships due to lack of temporal sequence.

4. Case-control studies (analytical observational studies)

  • Compares individuals with a specific outcome (cases) to individuals without that outcome (controls).
  • Retrospective design, examines associations, but causation is difficult to establish.

5. Cohort studies (prospective and retrospective)

  • Follows a group of individuals (cohort) over time to assess the development of outcomes.
  • Can be prospective (forward in time) or retrospective (backward in time).
  • Provides stronger evidence for causality compared to case-control studies.

6. Randomised controlled trials (RCTs)

  • Experimental studies with random assignment of participants to different interventions.
  • Gold standard for establishing cause-and-effect relationships.
  • Involves manipulation of variables to determine the effect of interventions.

7. Systematic reviews and meta-analyses

  • Synthesise and analyse the results of multiple studies on a specific research question.
  • Provide a comprehensive overview of the available evidence.
  • Meta-analyses statistically combine results from different studies to increase statistical power.

8. Clinical practice guidelines

  • Summarise evidence-based recommendations for clinical practice.
  • Developed by expert panels based on systematic reviews and research evidence.
  • Serve as a resource for healthcare professionals to make informed decisions.

It’s important to note that the hierarchy is not strictly linear; different study designs serve different purposes and can provide valuable insights. Researchers and healthcare professionals need to consider the specific research question, study design, potential biases, and the applicability of findings to the patient population in question. While higher levels of evidence are generally considered more reliable, the entire hierarchy contributes to the body of knowledge and informs evidence-based decision-making.
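
The ordering in the outline above can be sketched in code. The following is a minimal illustration, not a validated appraisal tool: the numeric ranks simply mirror the eight steps listed (1 = lowest, 8 = highest), and the `Study` class, the design labels, and the study titles are hypothetical names introduced for this example. As the paragraph above notes, a real appraisal would also weigh study quality and fit to the question, which this sketch ignores.

```python
from dataclasses import dataclass

# Rank of each study design, following the eight-step outline above
# (1 = lowest level of evidence, 8 = highest). The label strings are
# assumptions chosen for this sketch.
HIERARCHY = {
    "expert opinion": 1,
    "case report": 2,
    "case series": 2,
    "cross-sectional study": 3,
    "case-control study": 4,
    "cohort study": 5,
    "randomised controlled trial": 6,
    "systematic review": 7,
    "meta-analysis": 7,
    "clinical practice guideline": 8,
}


@dataclass
class Study:
    title: str   # hypothetical label for a search result
    design: str  # must be one of the keys in HIERARCHY


def sort_by_level(studies):
    """Return studies ordered from highest to lowest level of evidence.

    Python's sort is stable, so studies on the same level keep their
    original relative order.
    """
    return sorted(studies, key=lambda s: HIERARCHY[s.design], reverse=True)


found = [
    Study("Study A", "cohort study"),
    Study("Study B", "expert opinion"),
    Study("Study C", "systematic review"),
]
print([s.title for s in sort_by_level(found)])  # ['Study C', 'Study A', 'Study B']
```

Sorting only tells you where to start reading; a lower-ranked design may still be the most appropriate evidence for a particular question.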

Evidence based practice, clinician supported and patient focused – the triad

The evidence-based triad in nursing and midwifery healthcare refers to the integration of three key elements: clinical expertise, patient values and preferences, and the best available external evidence. This approach emphasises the importance of combining these three components to guide clinical decision-making and improve patient outcomes.

1. Clinical expertise

Clinical expertise refers to the knowledge and skills that healthcare professionals, including nurses and midwives, develop through education, training, and experience. This expertise allows them to make informed judgments about patient care. Research emphasises that clinical expertise should not be undermined in the pursuit of evidence-based practice. Instead, it should complement the other components of the triad. Experienced clinicians can bring a deep understanding of the unique needs of individual patients, recognising subtleties and nuances that may not be captured by research evidence alone.

2. Patient values and preferences

Patient-centred care has become a cornerstone of modern healthcare. Recognising and respecting patients’ values, preferences, and individual circumstances is essential for providing effective care. The evidence-based triad encourages healthcare professionals to engage patients in shared decision-making, where the patient’s perspective is considered alongside clinical expertise and research evidence. Studies emphasise that involving patients in the decision-making process can lead to greater adherence to treatment plans, increased patient satisfaction, and improved health outcomes.

3. Best available external evidence

The third component of the evidence-based triad involves integrating the best available research evidence into clinical decision-making. This evidence is typically derived from well-designed research studies, systematic reviews, and clinical guidelines. Evidence-based practice encourages healthcare professionals to critically appraise and apply research findings to their patient care decisions. The emphasis on external evidence aims to ensure that clinical decisions are based on scientifically sound and validated information.

Synthesis and application


The synthesis of these three components (clinical expertise, patient values, and best available evidence) forms the foundation of evidence-based practice in nursing and midwifery. This approach seeks to strike a balance between scientific rigor and individualised patient care. Literature indicates that successful implementation of the evidence-based triad requires ongoing professional development, interdisciplinary collaboration, and a commitment to lifelong learning.

This approach acknowledges that evidence-based practice is not a rigid formula but a dynamic process. It requires healthcare professionals to continually update their knowledge and skills, adapt guidelines to individual patient contexts, and remain open to evolving research and patient preferences. Successful implementation of the evidence-based triad can lead to improved patient outcomes, increased patient satisfaction, and enhanced healthcare quality.

The evidence-based triad in nursing and midwifery healthcare recognises the importance of combining clinical expertise, patient values and preferences, and the best available external evidence to guide clinical decision-making. This approach acknowledges the complexities of healthcare and emphasises the need for a patient-centred and holistic approach to providing high-quality care.



Quality in Healthcare: Assessing What We Do Copyright © 2024 by The University of Queensland is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Levels of evidence in research


Level of evidence hierarchy

When carrying out a project, you might have noticed that, while searching for information, there seem to be different levels of credibility given to different types of scientific results. For example, using a systematic review as the basis for an argument is not the same as using an expert opinion. It’s almost common sense that the former will provide more reliable results than the latter, which ultimately derives from a personal opinion.

In the medical and health care area, for example, it is very important that professionals not only have access to information but also have instruments to determine which evidence is stronger and more trustworthy, building up the confidence to diagnose and treat their patients.

5 levels of evidence

Physicians, as well as scientists in other fields of study, increasingly need to know which kinds of research yield the best clinical evidence. Experts therefore decided to rank evidence to help them identify the best sources of information to answer their questions. The criteria for ranking evidence are based on the design, methodology, validity and applicability of the different types of studies. The outcome is called “levels of evidence” or “levels of evidence hierarchy”. By organizing a well-defined hierarchy of evidence, academic experts aimed to help scientists feel confident in using findings from high-ranked evidence in their own work or practice. For physicians, whose daily activity depends on available clinical evidence to support decision-making, the hierarchy shows which evidence to trust the most.

So, by now you know that research can be graded according to the evidential strength determined by different study designs. But how many grades are there? Which evidence should be high-ranked and low-ranked?

There are five levels of evidence in the hierarchy of evidence – being 1 (or in some cases A) for strong and high-quality evidence and 5 (or E) for evidence with effectiveness not established, as you can see in the pyramidal scheme below:

Level 1: (higher quality of evidence) – High-quality randomized trial or prospective study; testing of previously developed diagnostic criteria on consecutive patients; sensible costs and alternatives; values obtained from many studies with multiway sensitivity analyses; systematic review of Level I RCTs and Level I studies.

Level 2: Lesser quality RCT; prospective comparative study; retrospective study; untreated controls from an RCT; lesser quality prospective study; development of diagnostic criteria on consecutive patients; sensible costs and alternatives; values obtained from limited studies with multiway sensitivity analyses; systematic review of Level II studies or Level I studies with inconsistent results.

Level 3: Case-control study (therapeutic and prognostic studies); retrospective comparative study; study of nonconsecutive patients without consistently applied reference “gold” standard; analyses based on limited alternatives and costs and poor estimates; systematic review of Level III studies.

Level 4: Case series; case-control study (diagnostic studies); poor reference standard; analyses with no sensitivity analyses.

Level 5: (lower quality of evidence) – Expert opinion.

Levels of evidence in research hierarchy

By looking at the pyramid, you can roughly distinguish what type of research gives you the highest quality of evidence and which gives you the lowest. Basically, level 1 and level 2 are filtered information – that means an author has gathered evidence from well-designed studies, with credible results, and has produced findings and conclusions appraised by renowned experts, who consider them valid and strong enough to serve researchers and scientists. Levels 3, 4 and 5 include evidence coming from unfiltered information. Because this evidence hasn’t been appraised by experts, it might be questionable, but not necessarily false or wrong.

Examples of levels of evidence

As you move up the pyramid, you will surely find higher-quality evidence. However, you will notice there is also less research available. So, if there are no resources for you available at the top, you may have to start moving down in order to find the answers you are looking for.

  • Systematic Reviews: Exhaustive summaries of all the existing literature about a certain topic. When drafting a systematic review, authors are expected to deliver a critical assessment and evaluation of all this literature rather than a simple list. Researchers who produce systematic reviews have their own criteria to locate, assemble and evaluate a body of literature.
  • Meta-Analysis: Uses quantitative methods to synthesize a combination of results from independent studies. Normally, they function as an overview of clinical trials.
  • Critically Appraised Topic: Evaluation of several research studies.
  • Critically Appraised Article: Evaluation of individual research studies.
  • Randomized Controlled Trial: a clinical trial in which participants or subjects (people that agree to participate in the trial) are randomly divided into groups. Placebo (control) is given to one of the groups whereas the other is treated with medication. This kind of research is key to learning about a treatment’s effectiveness.
  • Cohort studies: A longitudinal study design, in which one or more samples called cohorts (individuals sharing a defining characteristic, like a disease) are exposed to an event and monitored prospectively and evaluated in predefined time intervals. They are commonly used to correlate diseases with risk factors and health outcomes.
  • Case-Control Study: Selects patients with an outcome of interest (cases) and looks for an exposure factor of interest.
  • Background Information/Expert Opinion: Information you can find in encyclopedias, textbooks and handbooks. This kind of evidence just serves as a good foundation for further research – or clinical practice – for it is usually too generalized.

Of course, it is recommended to use level A and/or 1 evidence for more accurate results but that doesn’t mean that all other study designs are unhelpful or useless. It all depends on your research question. Focusing once more on the healthcare and medical field, see how different study designs fit into particular questions, that are not necessarily located at the tip of the pyramid:

  • Questions concerning therapy: “Which is the most efficient treatment for my patient?” >> RCT | Cohort studies | Case-Control | Case Studies
  • Questions concerning diagnosis: “Which diagnostic method should I use?” >> Prospective blind comparison
  • Questions concerning prognosis: “How will the patient’s disease develop over time?” >> Cohort Studies | Case Studies
  • Questions concerning etiology: “What are the causes for this disease?” >> RCT | Cohort Studies | Case Studies
  • Questions concerning costs: “What is the most cost-effective but safe option for my patient?” >> Economic evaluation
  • Questions concerning meaning/quality of life: “What’s the quality of life of my patient going to be like?” >> Qualitative study


Evidence-Based Practice in Health


The Evidence Hierarchy: What is the "Best Evidence"?


What is "the best available evidence"?  The hierarchy of evidence is a core principle of Evidence-Based Practice (EBP) and attempts to address this question.  The evidence hierarchy allows you to take a top-down approach to locating the best evidence: you first search for a recent, well-conducted systematic review and, if that is not available, move down to the next level of evidence to answer your question.

EBP hierarchies rank study types based on the rigour (strength and precision) of their research methods.  Different hierarchies exist for different question types, and even experts may disagree on the exact rank of information in the evidence hierarchies.  The following image represents the hierarchy of evidence provided by the National Health and Medical Research Council (NHMRC). 1

Most experts agree that the higher up the hierarchy the study design is positioned, the more rigorous the methodology and hence the more likely it is that the study design can minimise the effect of bias on the results of the study.  In most evidence hierarchies, current, well-designed systematic reviews and meta-analyses are at the top of the pyramid, and expert opinion and anecdotal experience are at the bottom. 2

Systematic Reviews and Meta Analyses

Well done systematic reviews, with or without an included meta-analysis, are generally considered to provide the best evidence for all question types as they are based on the findings of multiple studies that were identified in comprehensive, systematic literature searches.  However, the position of systematic reviews at the top of the evidence hierarchy is not an absolute.  For example:

  • The process of a rigorous systematic review can take years to complete and findings can therefore be superseded by more recent evidence.
  • The methodological rigor and strength of findings must be appraised by the reader before being applied to patients.
  • A large, well conducted Randomised Controlled Trial (RCT) may provide more convincing evidence than a systematic review of smaller RCTs. 4

Primary Studies

If a current, well designed systematic review is not available, go to primary studies to answer your question. The best research design for a primary study varies depending on the question type.  The table below lists optimal study methodologies for the main types of questions.

Question Type | Optimal Study Design
Therapy (Treatment) | Randomised Controlled Trial (RCT)
Prevention | RCT or Prospective Study
Diagnosis | RCT or Cohort Study
Prognosis (Forecast) | Cohort Study and/or Case-Control Series
Etiology (Causation) | Cohort Study
Meaning | Qualitative Study

Note that the Clinical Queries filter available in some databases such as PubMed and CINAHL matches the question type to studies with appropriate research designs. When searching primary literature, look first for reports of clinical trials that used the best research designs. Remember as you search, though, that the best available evidence may not come from the optimal study type. For example, if treatment effects found in well designed cohort studies are sufficiently large and consistent, those cohort studies may provide more convincing evidence than the findings of a weaker RCT.
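
The top-down approach described in this guide (look for a systematic review first, then for the optimal primary study design for the question type) can be expressed as a small lookup. This is only a sketch under stated assumptions: the `OPTIMAL_DESIGNS` mapping is transcribed from the table above, while the function name `where_to_start` and the label strings are hypothetical and do not correspond to any database filter or library.

```python
# Optimal primary study designs per question type, transcribed from the
# table above. The lowercase label strings are assumptions for this sketch.
OPTIMAL_DESIGNS = {
    "therapy": ["randomised controlled trial"],
    "prevention": ["randomised controlled trial", "prospective study"],
    "diagnosis": ["randomised controlled trial", "cohort study"],
    "prognosis": ["cohort study", "case-control series"],
    "etiology": ["cohort study"],
    "meaning": ["qualitative study"],
}


def where_to_start(question_type, available_designs):
    """Pick the design to read first, following the top-down approach.

    Returns "systematic review" if one was found, otherwise the first
    optimal primary design that is available, otherwise None. None means
    no optimal design was found and the remaining results need to be
    appraised on their own merits.
    """
    available = set(available_designs)
    if "systematic review" in available:
        return "systematic review"
    for design in OPTIMAL_DESIGNS[question_type]:
        if design in available:
            return design
    return None


print(where_to_start("therapy", ["cohort study", "systematic review"]))
# systematic review
print(where_to_start("prognosis", ["case-control series"]))
# case-control series
print(where_to_start("therapy", ["cohort study"]))
# None
```

A result of `None` does not mean there is no usable evidence. As noted above, large and consistent effects in well designed cohort studies may be more convincing than a weaker RCT, so the remaining studies still need appraisal.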

What is a Systematic Review?

A systematic review synthesises the results from all available studies in a particular area, and provides a thorough analysis of the results, strengths and weaknesses of the collated studies.  A systematic review has several qualities:

  • It addresses a focused, clearly formulated question.
  • It uses systematic and explicit methods:

      a. to identify, select and critically appraise relevant research, and
      b. to collect and analyse data from the studies that are included in the review

Systematic reviews may or may not include a meta-analysis used to summarise and analyse the statistical results of included studies. This requires the studies to have the same outcome measure.
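
To make the idea of statistically combining results concrete, here is a minimal sketch of one common pooling technique, fixed-effect inverse-variance weighting. The guide does not say which method a given review uses, so treat the choice of technique as an assumption; the function name and the example numbers are hypothetical. Each study contributes an effect estimate on the same outcome measure, weighted by the inverse of its variance:

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Pool per-study effect estimates with inverse-variance weights.

    estimates:  effect estimates, all on the same outcome measure
    std_errors: the standard error of each estimate
    Returns (pooled_estimate, pooled_standard_error).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    # More precise studies (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Two hypothetical studies with equal precision.
pooled, pooled_se = pool_fixed_effect([0.2, 0.4], [0.1, 0.1])
print(f"{pooled:.3f} {pooled_se:.3f}")  # 0.300 0.071
```

The pooled standard error (about 0.071) is smaller than that of either study (0.1), which is how combining studies increases precision. The calculation is only meaningful when the studies measure the same outcome, as stated above.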

What is a Narrative Review?

Narrative reviews (often just called Reviews) are opinion with selective illustrations from the literature.  They do not qualify as adequate evidence to answer clinical questions.  Rather than answering a specific clinical question, they provide an overview of the research landscape on a given topic and so may be useful for background information.  Narrative reviews usually lack systematic search protocols or explicit criteria for selecting and appraising evidence and are therefore very prone to bias. 5

Filtered versus Unfiltered Information

Filtered information appraises the quality of a study and recommends its application in practice.  The critical appraisal of the individual articles has already been done for you, which is a great time saver.  Because the critical appraisal has been completed, filtered literature is appropriate to use for clinical decision-making at the point-of-care. In addition to saving time, filtered literature will often provide a more definitive answer than individual research reports.  Examples of filtered resources include Cochrane Database of Systematic Reviews, BMJ Clinical Evidence, and ACP Journal Club.

Unfiltered information consists of original research studies that have not yet been synthesized or aggregated. As such, they are more difficult to read, interpret, and apply to practice.  Examples of unfiltered resources include CINAHL, EMBASE, Medline, and PubMed. 3

The Cochrane Library

The Cochrane Collaboration is an international voluntary organization that prepares, maintains and promotes the accessibility of systematic reviews of the effects of healthcare. 

The Cochrane Library is a database from the Cochrane Collaboration that allows simultaneous searching of six EBP databases.  Cochrane Reviews are systematic reviews authored by members of the Cochrane Collaboration and available via The Cochrane Database of Systematic Reviews .  They are widely recognised as the gold standard in systematic reviews due to the rigorous methodology used. 

Abstracts of completed Cochrane Reviews are freely available through PubMed and Meta-Search engines such as TRIP database. 

National access to the Cochrane Library is provided by the Australian Government via the National Health and Medical Research Council (NHMRC).

1. National Health and Medical Research Council. (2009). [Hierarchy of Evidence] . Retrieved 2 July, 2014 from: https://www.nhmrc.gov.au/

2. Hoffman, T., Bennett, S., & Del Mar, C. (2013). Evidence-Based Practice: Across the Health Professions (2nd ed.). Chatswood, NSW: Elsevier.

3. Kendall, S. (2008). Evidence-based resources simplified. Canadian Family Physician , 54, 241-243

4. Davidson, M., & Iles, R. (2013). Evidence-based practice in therapeutic health care. In, Liamputtong, P. (ed.). Research Methods in Health: Foundations for Evidence-Based Practice (2nd ed.). South Melbourne: Oxford University Press.

5. Cook, D., Mulrow, C., & Haynes, R. (1997). Systematic reviews: synthesis of best evidence for clinical decisions. Annals of Internal Medicine , 126, 376–80.

  • Last Updated: Jul 24, 2023 4:08 PM
  • URL: https://canberra.libguides.com/evidence


Hierarchy of evidence: a framework for ranking evidence evaluating healthcare interventions

Affiliation.

  • 1 Department of Clinical Nursing, University of Adelaide, South Australia 5005.
  • PMID: 12519253
  • DOI: 10.1046/j.1365-2702.2003.00662.x

A number of hierarchies of evidence have been developed to enable different research methods to be ranked according to the validity of their findings. However, most have focused on evaluation of the effectiveness of interventions. When the evaluation of healthcare addresses its appropriateness or feasibility, then existing hierarchies are inadequate. This paper reports the development of a hierarchy for ranking of evidence evaluating healthcare interventions. The aims of this hierarchy are twofold. Firstly, it is to provide a means by which the evidence from a range of methodologically different types of research can be graded. Secondly, it is to provide a logical framework that can be used during the development of systematic review protocols to help determine the study designs which can contribute valid evidence when the evaluation extends beyond effectiveness. The proposed hierarchy was developed based on a review of literature, investigation of existing hierarchies and examination of the strengths and limitations of different research methods. The proposed hierarchy of evidence focuses on three dimensions of the evaluation: effectiveness, appropriateness and feasibility. Research that can contribute valid evidence to each is suggested. To address the varying strengths of different research designs, four levels of evidence are proposed: excellent, good, fair and poor. The strength of the proposed hierarchy is that it acknowledges the valid contribution of evidence generated by a range of different types of research. However, hierarchies only provide a guide to the strength of the available evidence and other issues such as the quality of research also have an important influence.




Hierarchy of Evidence


The "best available evidence" in the hierarchy of evidence refers to:

Systematic Reviews and Meta-Analyses

Well-conducted systematic reviews and meta-analyses are generally considered to provide the highest level of evidence for all types of questions in evidence-based practice (EBP) hierarchies. This is because:

  • Systematic reviews synthesize the results from all available studies on a particular topic through comprehensive literature searches and thorough analysis.
  • Meta-analyses combine and analyze the data from multiple studies, providing a more reliable summary of evidence than individual studies.
  • The greater the number and diversity of participants and settings included in a meta-analysis, the more convincing the evidence.

However, it is important to note that the position of systematic reviews at the top of the evidence hierarchy is not absolute. If a current, well-designed systematic review is not available, one should move down to the next level of evidence, which typically consists of primary studies such as randomized controlled trials (RCTs), cohort studies, or case-control studies, depending on the type of question being addressed. Additionally, the quality of a systematic review depends on the quality of the studies included, and a large, well-conducted RCT may provide stronger evidence than a systematic review of small or poor-quality studies. In summary, while systematic reviews and meta-analyses are generally considered the "best available evidence" in EBP hierarchies, the specific study design that provides the highest level of evidence may vary depending on the availability of high-quality systematic reviews and the type of question being addressed.
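The fallback rule described above (use a systematic review when a current, well-designed one exists, otherwise move down to the next level) can be sketched in Python. This is a minimal illustration, not a validated appraisal tool: the ordering in `HIERARCHY`, the function name `best_available`, and the `current` and `well_conducted` flags are all assumptions introduced here, and the order of primary study designs would change with the type of question.

```python
# Illustrative ordering only; the text notes that the order of primary
# study designs depends on the type of question being asked.
HIERARCHY = [
    "systematic review or meta-analysis",
    "randomized controlled trial",
    "cohort study",
    "case-control study",
]


def best_available(evidence):
    """Return (design, usable_items) for the highest level with usable evidence.

    `evidence` maps a design name to a list of dicts carrying the
    hypothetical boolean keys "current" and "well_conducted".
    """
    for design in HIERARCHY:
        usable = [
            item for item in evidence.get(design, [])
            if item["current"] and item["well_conducted"]
        ]
        if usable:
            return design, usable
    return None, []


# Hypothetical search results: the only systematic review is out of date.
evidence = {
    "systematic review or meta-analysis": [
        {"id": "SR-1", "current": False, "well_conducted": True},
    ],
    "randomized controlled trial": [
        {"id": "RCT-1", "current": True, "well_conducted": True},
    ],
}
design, items = best_available(evidence)
print(design, [item["id"] for item in items])
```

The sketch deliberately stops at the first level with usable evidence. It does not capture the further caveat that a large, well-conducted RCT may outweigh a systematic review of small or poor-quality studies, which still calls for appraisal of the individual sources.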

  • Last Updated: Sep 4, 2024 11:42 AM
  • URL: https://library.lsbu.ac.uk/literaturesearching


The hierarchy of research evidence - from well conducted meta-analysis down to small case series


Evidence-based medicine has been described as ‘the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients.’ 1 This involves evaluating the quality of the best available clinical research, by critically assessing techniques reported by researchers in their publications, and integrating this with clinical expertise. Although it has provoked controversy, the hierarchy of evidence lies at the heart of the appraisal process.  

Ranking of trial designs

The hierarchy indicates the relative weight that can be attributed to a particular study design. Generally, the higher up a methodology is ranked, the more robust it is assumed to be. At the top end lies the meta-analysis – synthesising the results of a number of similar trials to produce a result of higher statistical power. At the other end of the spectrum lie individual case reports, thought to provide the weakest level of evidence.

Several possible methods for ranking study designs have been proposed, but one of the most widely accepted is listed below. 2 Information about the individual study designs can be found elsewhere in Section 1A .

  1. Systematic reviews and meta-analyses
  2. Randomised controlled trials
  3. Cohort studies
  4. Case-control studies
  5. Cross-sectional surveys
  6. Case series and case reports

Concerns and caveats

The hierarchy is widely accepted in the medical literature, but concerns have been raised that ranking evidence by study design does not always identify the evidence most relevant to practice. Particular concerns are highlighted below.

  • Techniques lower down the ranking are not always superfluous. For example, the link between smoking and lung cancer was initially discovered via case-control studies carried out in the 1950s. 3 Although randomised controlled trials (RCTs) are considered more robust, it would in many cases be unethical to perform an RCT. For example, if studying a risk factor exposure, you would need a cohort exposed to the risk factor by chance or personal choice.
  • The hierarchy is also not absolute. A well-conducted observational study may provide more compelling evidence about a treatment than a poorly conducted RCT.
  • The hierarchy focuses largely on quantitative methodologies. However, it is again important to choose the most appropriate study design to answer the question. For example, it is often not possible to establish why individuals choose to pursue a course of action without using a qualitative technique, such as interviewing.

Alternatives to the traditional hierarchy of evidence have been suggested. For example, the GRADE system (Grading of Recommendations, Assessment, Development and Evaluation) classifies the quality of evidence based not only on the study design, but also on the potential limitations and, conversely, the positive effects found. For example, an observational study starts off as low-quality evidence. However, it can be downgraded to “very low” quality if there are clear limitations in the study design, or upgraded to “moderate” or “high” quality if it shows a large magnitude of effect or a dose-response gradient.
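The starting point and the up- and downgrading steps described above can be sketched as a small Python function. This is a simplified illustration only: the function name `grade_quality` and its parameters are hypothetical, the starting level of "high" for randomised trials is an assumption not stated in this text, and the full GRADE approach considers more factors than the three shown here.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Starting points: "low" for observational studies is stated in the text;
# "high" for randomised trials is an assumption added for this sketch.
START = {"observational": 1, "rct": 3}


def grade_quality(design, clear_limitations=False,
                  large_effect=False, dose_response=False):
    """Return a GRADE-style quality label for a body of evidence."""
    if design not in START:
        raise ValueError(f"unknown design: {design!r}")
    score = START[design]
    score -= 1 if clear_limitations else 0   # downgrade for design limitations
    score += 1 if large_effect else 0        # upgrade for large magnitude of effect
    score += 1 if dose_response else 0       # upgrade for dose-response gradient
    score = max(0, min(score, len(LEVELS) - 1))  # keep within the four levels
    return LEVELS[score]


print(grade_quality("observational"))
print(grade_quality("observational", clear_limitations=True))
print(grade_quality("observational", large_effect=True, dose_response=True))
```

Under these assumptions, an observational study with no modifiers stays at "low", drops to "very low" with clear limitations, and reaches "high" only when both upgrading factors apply.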

The GRADE system is summarised in the following table (reproduced from 4 ):

Insert diagram re the GRADE system here:

The Oxford Centre for Evidence-Based Medicine have also developed individual levels of evidence depending on the type of clinical question which needs to be answered. For example, to answer questions on how common a problem is, they define the best level of evidence to be a local and current random sample survey, with a systematic review being the second best level of evidence. The complete table of clinical question types considered, and the levels of evidence for each, can be found here . 5

  1. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn't. BMJ 1996: 312:7023
  2. Guyatt GH, Sackett DL, Sinclair JC, Hayward R, Cook DJ, Cook RJ. Users' guides to the medical literature. IX. A method for grading health care recommendations. JAMA 1995; 274:1800-4.
  3. Doll R and Hill AB. Smoking and carcinoma of the lung. BMJ 1950;2:739.
  4. Takada T, Strasberg S, Solomkin J et al. Updated Tokyo Guidelines for the management of acute cholangitis and cholecystitis. Journal of Hepato-Biliary-Pancreatic Sciences 2013;20:1-7.
  5. Oxford Centre for Evidence-Based Medicine. Levels of evidence, 2011 http://www.cebm.net/wp-content/uploads/2014/06/CEBM-Levels-of-Evidence-2.1.pdf - Accessed 8/04/17

Further reading

  • Greenhalgh T. How to Read a Paper: The Basics of Evidence Based Medicine. London: BMJ, 2001
  • Guyatt G, Rennie D et al. Users' Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice. McGraw-Hill Medical, 2008.

© Helen Barratt 2009, Saran Shantikumar 2018

  • Front Vet Sci

Levels of Evidence, Quality Assessment, and Risk of Bias: Evaluating the Internal Validity of Primary Research

Jan M. Sargeant

1 Department of Population Medicine, Ontario Veterinary College, University of Guelph, Guelph, ON, Canada

Marnie L. Brennan

2 Centre for Evidence-Based Veterinary Medicine, School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Loughborough, United Kingdom

Annette M. O'Connor

3 Department of Large Animal Clinical Sciences, College of Veterinary Medicine, Michigan State University, East Lansing, MI, United States

Clinical decisions in human and veterinary medicine should be based on the best available evidence. The results of primary research are an important component of that evidence base. Regardless of whether assessing studies for clinical case management, developing clinical practice guidelines, or performing systematic reviews, evidence from primary research should be evaluated for internal validity i.e., whether the results are free from bias (reflect the truth). Three broad approaches to evaluating internal validity are available: evaluating the potential for bias in a body of literature based on the study designs employed (levels of evidence), evaluating whether key study design features associated with the potential for bias were employed (quality assessment), and applying a judgement as to whether design elements of a study were likely to result in biased results given the specific context of the study (risk of bias assessment). The level of evidence framework for assessing internal validity assumes that internal validity can be determined based on the study design alone, and thus makes the strongest assumptions. Risk of bias assessments involve an evaluation of the potential for bias in the context of a specific study, and thus involve the least assumptions about internal validity. Quality assessment sits somewhere between the assumptions of these two. Because risk of bias assessment involves the least assumptions, this approach should be used to assess internal validity where possible. However, risk of bias instruments are not available for all study designs, some clinical questions may be addressed using multiple study designs, and some instruments that include an evaluation of internal validity also include additional components (e.g., evaluation of comprehensiveness of reporting, assessments of feasibility or an evaluation of external validity). 
Therefore, it may be necessary to embed questions related to risk of bias within existing quality assessment instruments. In this article, we overview the approaches to evaluating internal validity, highlight the current complexities, and propose ideas for approaching assessments of internal validity.

Introduction

Every day in clinical practice, veterinary professionals need to make decisions ranging from a decision as to whether (or not) to use an intervention or to apply a diagnostic test, to decisions about the overall management of complex clinical conditions. Increasingly, it is expected that clinical decisions are evidence-based. Evidence-based veterinary medicine incorporates clinician experience, client preferences, animal needs, and scientific evidence when making clinical decisions ( 1 ). In this approach, scientific evidence is obtained from relevant research. When research-based evidence does not exist, other sources of evidence, such as expert opinion may need to be used. Traditional narrative reviews provide an overview of a topic, and thus may be an attractive way of quickly acquiring knowledge for making clinical decisions. However, narrative reviews generally do not provide information on the identification and selection of the primary research being summarized (if any), the methodological quality of the studies, or the magnitude of the expected effect ( 2 , 3 ).

Formal methods have been developed to systematically identify, select, and synthesize the available evidence to assist veterinary professionals in evidence-based decision-making. These include critically appraised topics (CATs) ( 4 ), systematic review and meta-analysis (SR-MA) ( 5 – 7 ), and clinical practice guidelines ( 8 ) (see Box 1 for a short overview of these methods). These evidence synthesis approaches have different purposes which results in different processes and endpoints, but each includes an assessment of the internal validity of the research used. Critical appraisal of an individual study also includes an evaluation of internal validity, in addition to an evaluation of feasibility and generalizability ( 10 ). The evaluation of internal validity is the focus of this article. Understanding the different ways internal validity can be assessed, and the assumptions associated with these approaches, is necessary for researchers evaluating internal validity, and for veterinary professionals to assess studies for integration of evidence into practice.

Box 1. Overview of synthesis methods used in veterinary practice and research.

Systematic review, meta-analysis, and network meta-analysis: Systematic review is a structured methodology for identifying, selecting and evaluating all relevant research to address a structured question, which may relate to descriptive characteristics such as prevalence, etiology, efficacy of interventions, or diagnostic test accuracy ( 5 ). Meta-analysis is the statistical combination of results from multiple studies. For addressing questions on intervention efficacy, meta-analysis provides an overall effect size for pairwise comparisons between two intervention groups. Network meta-analysis allows an estimation of the comparative efficacy across all available intervention options ( 6 ), which may provide more relevant information for veterinary professionals when there are multiple intervention options available. However, systematic reviews with pairwise meta-analysis or network meta-analysis require that a body of research exists that can be synthesized to address a clinical question and can also be resource and time intensive to conduct. Therefore, there are many clinical questions for which formally synthesized research summaries do not exist.
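As an illustration of what "statistical combination of results" can mean for a pairwise comparison, the following Python sketch pools study effect sizes with fixed-effect inverse-variance weighting. The choice of this particular method is an assumption made for the example (the text does not name one), and the function name and the two study results are invented.

```python
import math


def fixed_effect_pool(effects, standard_errors):
    """Pool effect sizes using inverse-variance (fixed-effect) weights.

    Each study is weighted by 1 / SE^2, so more precise studies
    contribute more to the pooled estimate.
    """
    if not effects or len(effects) != len(standard_errors):
        raise ValueError("need one standard error per effect size")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Two hypothetical studies reporting the same effect measure.
pooled, se = fixed_effect_pool([0.20, 0.40], [0.10, 0.10])
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled = {pooled:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

Because both invented studies have the same standard error, the pooled estimate is their simple average, and the pooled standard error is smaller than either study's own. A random-effects model, which allows for between-study heterogeneity, would be a different and often more appropriate choice.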

Critically appraised topics: Critically appraised topics (CATs) use the same principles as systematic reviews to address clinical questions but employ a more rapid approach, particularly in relation to the screening and summation of the evidence. They were designed to be employed by clinicians as a way of rapidly gathering and interpreting evidence on clinical questions relating to specific cases ( 4 ). Therefore, there is a greater risk that research addressing the question may be missed. However, in the absence of a well conducted systematic review or meta-analysis, CATs can provide a faster evaluation of research addressing a clinical question and can be undertaken by veterinary professionals who may have fewer resources and potentially less methodological or statistical expertise, particularly if they are freely available and accessible.

Clinical practice guidelines: Veterinary professionals often are involved in the management of complex clinical conditions, where an array of questions need to be addressed, including those related to etiology, prognosis, diagnostic test accuracy, and intervention efficacy. Clinical practice guidelines are intended to assist healthcare professionals in assessing more than one aspect of case approach, including appropriate prevention, diagnosis, treatment, or clinical management of diseases, disorders, and other health conditions ( 9 ). Although there are differences in the methods among authors and institutions, the key elements of guideline development include the establishment of a multidisciplinary working group to develop the guidelines, the involvement of appropriate stakeholders, identification of the topic area, systematic searches for research evidence, assessment of the internal validity of studies comprising the evidence base, a process for drafting recommendations, and ongoing review and updating of the guidelines as new evidence becomes available ( 8 ).

Internal validity refers to the extent to which the study results reflect the true state of nature (i.e., whether the effect size estimated in a study is free from systematic error, also called bias) ( 11 ). Although there are a large number of named biases ( 12 ), for studies that assess interventions or risk factors, the biases can be categorized into three broad types of bias: selection bias, information bias, and confounding ( 13 ). Selection bias impacts the effect size if, compared to the source population, the exposure or intervention groups differ in the distribution of factors associated with the outcome at the time the study population is selected, or if differential loss to follow up between groups occurs during the study. In case-control studies, selection bias occurs if cases or controls are selected based on criteria that are related to the exposure of interest. Information bias occurs when there are errors in measuring the exposure or intervention, or the outcome, or both. Finally, confounding is a mixing of effects that occurs when a variable (the confounder) that is independently associated with both the exposure and the outcome is not properly controlled. When confounding is not controlled, the estimate of the relationship between the exposure and the outcome will be biased.
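The effect of uncontrolled confounding on an estimate can be shown with a small worked example in Python. All counts below are invented for illustration, and the function and variable names are hypothetical. Age group acts as the confounder: older animals are both more likely to be exposed and more likely to have the outcome.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Invented counts per stratum:
# (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "older animals": (40, 80, 10, 20),
    "younger animals": (2, 20, 8, 80),
}

# Risk ratio within each level of the confounder.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Crude risk ratio: add the strata together and ignore the confounder.
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

for name, rr in stratum_rr.items():
    print(f"{name}: RR = {rr:.2f}")
print(f"crude (confounder ignored): RR = {crude_rr:.2f}")
```

Within each age group the exposed and unexposed animals have the same risk, so the stratum-specific risk ratios are 1. The crude risk ratio is well above 1 only because exposure and age are mixed together, which is the bias the paragraph above describes.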

There are several terms used to describe the approaches to assessing internal validity of primary research studies, including evidence hierarchies and levels of evidence, quality assessment, and risk of bias assessment. The use of these terms may be confusing, and it is not uncommon for some of these terms to be used interchangeably ( 14 , 15 ). Also, authors may mislabel the approaches, and some evaluation tools (instruments) available for assessing internal validity may include additional components, such as those related to comprehensiveness (quality) of reporting, feasibility of applying an intervention, or external validity. Finally, some instruments may use the approach as a label for the instrument [e.g., Cochrane's risk of bias tool ( 16 ), which is an instrument that employs a risk of bias approach] and other instruments may not include the approach in the instrument name [e.g., the Jadad scale ( 17 ), which employs a quality assessment approach]. In an evaluation of the comprehensiveness of reporting in animal health systematic reviews (SRs), Sargeant et al. ( 18 ) found that a range of instruments involving all three approaches had been used for assessing the internal validity of primary research studies. Although a large number of instruments are available, the approaches within each instrument used to assess internal validity can be grouped into three broad categories: based on study design, based on the presence or absence of design features, or based on a judgement about bias in the context of the study. These categories generally correspond to levels of evidence, quality assessment, and risk of bias, respectively. Therefore, our objective was to review these approaches to assessing internal validity as distinct entities and to describe the assumptions associated with each approach. Although we provide examples of specific instruments that include an evaluation of internal validity, our focus is on the approaches, rather than the tools. 
We discuss advances in the use of these approaches to assessing internal validity in human healthcare and propose a process for veterinary medicine for selecting the approach with the least assumptions as appropriate to the clinical question, the purpose of the assessment, and the research found that addresses the question of interest. The target audience for this article is individuals who assess internal validity of studies, individuals who develop instruments that include items related to the assessment of internal validity, and those who use evidence synthesis products created by others, such as systematic reviews or clinical practice guidelines.

Evaluating Internal Validity by Study Design: Levels of Evidence

Levels of evidence is an approach to evaluating the internal validity of a body of evidence, based on the potential for bias inherent to the study designs used to address the clinical question. The concept behind levels of evidence is that there is a hierarchy of study designs, with different study designs having different potential for bias. Evidence hierarchies are applied using either the name of the design or the description of the design: readers of a study look for this information, determine the design, and assign a level of evidence. No further differentiation of methodological features or judgment is conducted.

Evidence hierarchies were initially introduced in 1979 by the Canadian Task Force on the Periodic Health Examination ( 19 ), with further development into an evidence pyramid by David Sackett in 1989 ( 20 ). A pyramid shaped figure commonly is used to illustrate the hierarchy of study designs for evaluating the efficacy of an intervention under realistic-use conditions (owned animals, as opposed to experimental settings), with the potential for bias decreasing from the base to the top of the pyramid ( Figure 1 ). Thus, study designs on the top of the pyramid represent those with inherently lower risk of bias compared to study designs lower on the hierarchy. The pyramid shape acknowledges that the quantity of research tends to decrease in the higher levels of evidence (for instance, there will be a larger volume of randomized controlled trials (RCTs) compared to SR-MA). Suggested modifications to the evidence pyramid for veterinary intervention studies include dividing RCTs into those conducted under realistic-use conditions vs. those conducted in nonrealistic-use conditions (e.g., research facility) ( 21 ), the inclusion of challenge trials (where disease outcomes are deliberately induced) below RCTs in the pyramid ( 21 , 22 ), and increasing the interpretability of the concept for students by displaying the hierarchy as a staircase rather than a pyramid ( 23 ).

Figure 1. Illustration of an evidence pyramid hierarchy for addressing intervention studies in veterinary medicine. SR, systematic review; MA, meta-analysis; RCT, randomized controlled trial.

The concept of evaluating the potential for bias in an individual study based on the study design can be extended to an evaluation of the potential for bias in a body of literature. This approach for evaluating the internal validity of a body of literature is referred to as “levels of evidence”. The approach is applied by identifying research (or other evidence) that pertains to the clinical question, determining the study design used for each of the studies, and then assigning each study to a level of evidence based on that design. For instance, a framework for levels of evidence in veterinary clinical nutrition has been proposed by Roudebush et al. ( 24 ). In this framework, level 1 evidence corresponds to at least 1 appropriately designed RCT in the target species with natural disease development, level 2 evidence would correspond to RCTs in laboratory settings with natural disease development, level 3 evidence would be obtained from non-randomized trials, deliberate disease induction trials, analytical observational studies or case series, and level 4 evidence would correspond to expert opinion, descriptive studies, studies in other species, or pathophysiological justification. Therefore, if the clinical question involves interventions, and the evidence found to address the question consists of 2 RCTs, 3 case-control studies, and 3 case series, the evidence would be designated as “level 1 evidence” because study designs with the highest evidentiary level in the available research consisted of RCTs. If all available evidence was from expert opinion, the body of research would comprise “level 4” evidence. This evidence would represent the best available evidence to inform decision-making at the time the assessment was made, although the overall level assigned would change as higher evidentiary level information becomes available.
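The assignment rule described above can be sketched in Python: each study is mapped to a level from its design, and the body of evidence takes the best (numerically lowest) level present. The level numbers follow the framework as summarized in this paragraph; the design keys and the function name `body_of_evidence_level` are hypothetical labels chosen for this example.

```python
# Levels as summarized in the text for the Roudebush et al. framework.
# The dictionary keys are illustrative labels, not terms from the framework.
DESIGN_LEVEL = {
    "rct_target_species_natural_disease": 1,
    "rct_laboratory_natural_disease": 2,
    "non_randomized_trial": 3,
    "disease_induction_trial": 3,
    "analytical_observational_study": 3,  # e.g., cohort or case-control
    "case_series": 3,
    "expert_opinion": 4,
    "descriptive_study": 4,
    "study_in_other_species": 4,
    "pathophysiological_justification": 4,
}


def body_of_evidence_level(designs):
    """Return the level of a body of evidence: the best level among its studies."""
    if not designs:
        raise ValueError("no evidence found for this question")
    return min(DESIGN_LEVEL[design] for design in designs)


# The worked example from the text: 2 RCTs, 3 case-control studies, and
# 3 case series (the RCTs are assumed to be in the target species).
found = (
    ["rct_target_species_natural_disease"] * 2
    + ["analytical_observational_study"] * 3
    + ["case_series"] * 3
)
print(body_of_evidence_level(found))
print(body_of_evidence_level(["expert_opinion", "expert_opinion"]))
```

Note that the function looks only at design labels. It has no input for how well any study was conducted, which is exactly the limitation of the levels of evidence approach.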

The levels of evidence approach may be perceived as a quick and easy approach to assessing internal validity because it requires only a knowledge of the study design employed and not the individual features of a study that may or may not be associated with the potential for bias. However, that ease of use is based on very strong assumptions: 1) that study design maps directly to bias, 2) that authors always correctly label study designs, and 3) that authors execute and report study designs appropriately. The approach also pertains to a body of evidence, implying that there are multiple comparable studies available to address the question of interest.

An important critique of levels of evidence is that the approach focuses on the study design, rather than the actual design features that were used or the context of the study. Thus, although this framework illustrates the inherent potential for bias of the different study designs, it does not provide a consideration of the methodological rigor with which any specific individual study was conducted ( 25 ). For instance, although a well-conducted cohort study may be less biased than a poorly executed RCT, this nuance is not captured by a levels of evidence approach. Additionally, levels of evidence are based on the potential for confounding and selection biases, but there is no mechanism to evaluate the potential for information bias because this is linked to the outcome and the levels of evidence approach is based on features at the study, rather than outcome, level. For instance, RCTs provide a higher level of evidence compared to observational studies because random allocation to intervention groups minimizes the potential for confounding, and case-control studies provide a lower level of evidence than cohort studies because they are more prone to selection bias. However, an RCT that used a subjectively measured outcome would be assigned a higher level of evidence than a cohort study with an objective outcome, although the observational study may have a lower risk of information bias. Finally, studies may be mislabeled in terms of their study design; there is empirical evidence that this occurs in the veterinary literature ( 26 – 28 ). For example, studies labeled as case series in veterinary medicine frequently include a component corresponding to a cohort study design ( 27 ); these studies may be assigned an inappropriately low level of evidence if individuals classifying these studies rely on authors' terminology rather than the complete design description to determine the design employed.

An additional consideration is that for questions related to aspects of clinical care other than selection of interventions, the framework and positioning of study designs included in Figure 1 may not be appropriate. Levels of evidence schema are available for other clinical questions, such as prognosis, diagnostic test accuracy, disease screening, and etiology ( 29 , 30 ).

Evaluating Internal Validity Based on Inclusion of Study Features Associated With Bias: Quality Assessment

As the name implies, quality assessment represents an evaluation of the quality of a primary research article. However, the term “quality” is difficult to define specifically in the context of evidence-based medicine, in that it does not appear to have been used consistently in the literature. The Merriam-Webster dictionary defines quality as “how good or bad something is” or “a high level of value or excellence” ( https://www.merriam-webster.com/dictionary/quality ). Quality generally is understood to be a multi-dimensional concept. While clear definitions are difficult to find in the research literature, the lay literature includes numerous treatises on the dimensions of quality. One example is the eight dimensions of quality delineated by David Garvin, which include performance, features, reliability, conformance, durability, serviceability, aesthetics, and perceived quality ( https://en.wikipedia.org/wiki/Eight_dimensions_of_quality ).

The findings from a review ( 31 ) identified that available instruments labeled as quality assessment tools varied in clarity and often involved more than just assessing internal validity. In addition to including an assessment of internal validity, quality assessment instruments also generally contain elements related to quality of reporting or an assessment of the inclusion of study features not directly related to bias, such as whether ethical approval was sought or whether the study participants were similar to those animals in the care of the individual doing the critique ( 14 , 31 – 33 ).

Quality assessment as an approach to evaluating internal validity involves an evaluation of the presence or absence of design features, i.e., a methodological checklist ( 14 , 15 ). For example, the Jadad scale ( 17 ) involves completing a checklist of whether the study was described as randomized, whether the study was described as double blind, and whether there was a description of withdrawals and dropouts, with points assigned for each category. Therefore, the Jadad scale uses a quality assessment approach to evaluating internal validity. The quality assessment approach also makes strong assumptions, although these are fewer than those used in levels of evidence assessments. Instead of mapping bias to the study design, quality assessment maps bias to a design feature, i.e., if a trial was randomized, it is assumed to be “good quality” and if the trial was not randomized the assumption is that it is “poor quality”. The same process is followed for additional study aspects, such as blinding or losses to follow-up, and an overall assessment of quality is then based on how the study 'performs' against these questions.
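
The checklist logic described above can be sketched in a few lines of Python. This is a minimal illustration only: the `TrialReport` class, its field names, and the one-point-per-item scoring are assumptions made for the example and do not reproduce the full scoring rules of the Jadad scale or any other published instrument.

```python
from dataclasses import dataclass


@dataclass
class TrialReport:
    # Hypothetical fields: each records only whether the report
    # describes the feature, not whether it was done well.
    described_as_randomized: bool
    described_as_double_blind: bool
    withdrawals_described: bool


def checklist_score(report: TrialReport) -> int:
    """Assign one point per reported design feature (simplified scoring)."""
    items = [
        report.described_as_randomized,
        report.described_as_double_blind,
        report.withdrawals_described,
    ]
    return sum(items)


# An unblinded randomized trial that reports its dropouts.
example = TrialReport(
    described_as_randomized=True,
    described_as_double_blind=False,
    withdrawals_described=True,
)
score = checklist_score(example)
```

Note that the score depends only on the presence or absence of each feature, which is exactly the assumption the quality assessment approach makes.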

Quality assessment also considers more than just confounding and selection bias as components of internal validity. The inclusion of blinding as a design feature of interest illustrates this. Blinding as a design feature is intended to reduce the potential for differential care as a source of confounding bias (blinding of caregivers) or may be intended to reduce the potential for information bias (blinding of outcome assessors). Conducting a quality assessment is more complicated and time-consuming than evaluating levels of evidence because the presence or absence of the specific design features needs to be identified and validated within the study report. However, the approach requires only that the person evaluating internal validity can identify whether (or not) a design feature was used. Therefore, this approach requires more technical expertise than the levels of evidence approach, but less than the risk of bias approach.

Evaluating Internal Validity by Making Contextualized Judgements on Potential Occurrence of Bias: Risk of Bias Assessment

Risk of bias assessments have been developed specifically for evaluating the potential for elements of the design or conduct employed within a study to lead to a biased effect size ( 34 , 35 ). The components of risk of bias assessments are selected based on empirical evidence of their association with estimates of effect sizes ( 24 , 32 ). In a risk of bias assessment, individuals evaluating a study for internal validity answer a series of signaling questions about the presence or absence of design features and then make a judgment about the potential for the use of each design feature to lead to a biased estimate in the context of the specific study. A conclusion is then reached about potential for bias based on all evaluated design features in the context of the study. Thus, a risk of bias assessment makes fewer assumptions about the link between study design and design features compared to quality assessment. For instance, a quality assessment for an RCT would include an evaluation as to whether blinding of outcome assessors occurred, whereas a risk of bias assessment would involve an evaluation not only as to whether blinding was used, but also a judgement as to whether a lack of blinding of outcome assessors would be likely to lead to a biased estimate given the context of the study and the outcome measures used. Thus, an RCT that did not include blinding of outcome assessors might be rated as poor on a quality assessment but might not be a concern in a risk of bias assessment if the outcomes were measured objectively, precluding the likelihood that the estimate would be biased by a knowledge of the intervention group when classifying the outcomes. Because of the necessity of making a judgement about the potential that bias is associated with design features in the context of a specific topic area, this approach requires the highest level of knowledge of study design and bias. 
The risk of bias approach also generally is conducted at the outcome level, rather than at the study level. For instance, an unblinded RCT of interventions to treat lameness might be considered to have a high risk of bias if the outcome was assessed by owners (a subjective outcome) but not if the outcome was assessed by force plate measurement (an objective outcome). For a level of evidence assessment, the assessment of internal validity would be high quality because the trial was an RCT. For quality assessment, the study may be considered poor quality because it was unblinded, but the overall judgement would be dependent on a number of other study design flaws identified. Finally, in a risk of bias assessment, the study would likely be low risk of bias for the objective outcome and high risk of bias for the subjective outcome if blinding was not used.
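
The contrast drawn in the lameness example can be made concrete with a short Python sketch. It is deliberately reduced to a single domain (blinding of outcome assessors); the class names, function names, and the simple rule used for the risk of bias judgement are assumptions for illustration, whereas real instruments cover several domains and rely on reviewer judgement rather than a fixed rule.

```python
from dataclasses import dataclass


@dataclass
class Study:
    design: str              # e.g. "RCT" or "cohort"
    assessors_blinded: bool  # were outcome assessors blinded?


@dataclass
class Outcome:
    name: str
    objective: bool  # True if measured objectively (e.g. force plate)


def level_of_evidence(study: Study) -> str:
    """Study-level rating that looks at the design label only."""
    return "high" if study.design == "RCT" else "lower"


def quality_item_blinding(study: Study) -> str:
    """Quality assessment: a present/absent check, ignoring context."""
    return "present" if study.assessors_blinded else "absent"


def risk_of_bias_blinding(study: Study, outcome: Outcome) -> str:
    """Outcome-level judgement: lack of blinding matters only when the
    outcome measurement could be influenced by knowing the group."""
    if study.assessors_blinded or outcome.objective:
        return "low"
    return "high"


trial = Study(design="RCT", assessors_blinded=False)
owner_score = Outcome(name="owner-assessed lameness", objective=False)
force_plate = Outcome(name="force plate measurement", objective=True)

ratings = {
    "level_of_evidence": level_of_evidence(trial),
    "quality_blinding": quality_item_blinding(trial),
    "rob_owner": risk_of_bias_blinding(trial, owner_score),
    "rob_force_plate": risk_of_bias_blinding(trial, force_plate),
}
```

The same unblinded trial receives one rating under the first two functions but two different ratings under the third, because only the risk of bias judgement takes the outcome as an input.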

Some components of a risk of bias assessment are the same as those included in a quality assessment approach (e.g., an assessment of randomization, allocation concealment, and blinding could be included in both). However, the way the assessment is done differs, with quality assessments generally involving present/absent judgements as opposed to assessments as to whether the risk of bias is likely or not. Hartling et al. ( 14 ) applied two instruments using a quality assessment approach and one instrument using a risk of bias approach to a sample of 163 trials and found that there was low correlation between quality assessment and risk of bias approaches when comparing the assessment of internal validity.

Although the critical elements for risk of bias are well described for RCTs in human healthcare and to a large extent in veterinary RCTs, these elements are not as well described for non-randomized trials and observational studies where allocation to groups is not under the control of the investigator. There are some risk of bias tools available for assessing risk of bias in non-randomized studies, such as ROBINS-I ( 36 ). However, ROBINS-I has been criticized for being challenging to use and for having low reliability, particularly amongst less experienced raters ( 37 , 38 ). A review and critique of approaches to risk of bias assessment for observational studies is available ( 39 ). It is anticipated that risk of bias tools for observational study designs, including studies related to questions of prognosis and causation, will continue to evolve as new instruments are developed and validated.

Historical Contexts and Comparisons of Internal Validity Assessment Approaches

Currently, the available approaches to assessing internal validity tend to be used for different applications. Levels of evidence have previously been used for creating evidence-based recommendations or clinical practice guidelines ( 30 , 40 , 41 ), where it is anticipated that multiple study designs may have been used to address the clinical question(s) of interest. Both quality assessment and risk of bias assessment approaches have been used as a component of systematic reviews with meta-analysis or network meta-analysis, as the intended product of these reviews is to summarize a single parameter (such as incidence or prevalence) or a summary effect size (such as a risk ratio, odds ratio, or hazard ratio) where it is desired that the estimate is unbiased. Often, that estimate is derived from studies with the same study design or a narrow range of study designs from high levels in the evidence hierarchy for the research question type. Therefore, the focus is on a specific parameter estimate based on multiple studies, rather than a descriptive summary of the evidentiary strength of those studies.

However, the different approaches are not necessarily mutually exclusive, but are nested within each other based on assumptions, and the methodology and use of the different approaches has evolved over time. As previously described, a criticism of the use of levels of evidence is that the potential for bias is based on the study design that was employed, rather than the methodological rigor of a specific study ( 42 ). For this reason, many frameworks for levels of evidence included wording such as “appropriately designed” ( 24 ) or “well designed” ( 41 ) for the study designs, although the criteria for determining whether a study was designed and executed with rigor generally is not described. A lack of transparency for the criteria for evaluating internal validity of studies within an evidence level is problematic for individuals wishing to use the results. An example of the evolution toward more transparent considerations of internal validity of individual studies within a levels of evidence framework is seen in the progression of the Australian National Health and Medical Research Council (NHMRC) system for evaluating evidence in the development of clinical practice guidelines. The designation of levels of evidence in this framework originally was based on levels of evidence, with descriptors such as “properly-designed” or “well-designed” included for each type of study design ( 40 ). A concern with this approach was that the framework was not designed to address the strength of evidence from individual studies within each evidence level ( 43 ). Therefore, the framework was modified to include the use of risk of bias evaluations of individual studies within each evidence level. The combined use of levels of evidence and risk of bias assessment of studies within each level of evidence now forms the “evidence base” component of the NHMRC's FORM framework for the development of evidence-based clinical guidelines ( 44 ).

Another example of the evolution of approaches to assessing internal validity is from the Cochrane Back review group, who conduct systematic reviews of neck and back pain. The initial methods guidelines, published in 1997, recommended that a quality assessment be performed on each included study, with each item in the quality assessment tool scored based on whether the authors reported their use ( 45 ). Updated methods guidelines were published in 2003 ( 46 ). The framework for levels of evidence in this guidance was restricted to a consideration of randomized controlled trials and non-randomized controlled clinical trials, as these were considered the study designs that potentially were appropriate to address research questions in this content area. In the updated guidelines, the recommendations for the assessment of internal validity moved to a risk of bias approach, where judgements were made on whether the characteristics of each study were likely to lead to biased study results. In the 2003 methods guidelines, levels of evidence were recommended as an approach to qualitative analysis rather than the use of “vote counting” (summing the number of studies where a positive or negative outcome was reported). The guidelines were again updated in 2009 ( 47 ). In this version, the assessment of the internal validity of individual studies explicitly employed a risk of bias approach. It was further recommended that the use of evidence levels as a component of a qualitative synthesis be replaced with a formal rating of the quality of the evidence for each of the included outcomes. It was recommended that review authors use the GRADE approach for this component. 
The GRADE approach explicitly includes a consideration of the risk of bias across all studies included in the review, as well as an assessment of the consistency of results across studies, the directness of the evidence to the review question, the precision in the effect size estimate, and the potential for publication bias ( 48 ).

The examples from the human medical literature illustrate that assessment of internal validity need not be static, and that modifications to our approach to assessing internal validity can strengthen the evidence base for clinical decision making. When developing or using tools which include an evaluation of internal validity, the assessment of internal validity should use the approach with the fewest assumptions about bias. This implies that the risk of bias approach, where context-specific judgements are made related to the potential for bias, is the preferred approach for assessing internal validity. The risk of bias approach is well developed for RCTs. Therefore, when RCTs are included in the evidence available to address a clinical question, a risk of bias assessment approach should be used. When evaluating internal validity as a component of a SR-MA, the Cochrane ROB2.0 tool ( 16 ) could be used for this purpose. Modifications to this tool have been proposed for evaluating livestock trials ( 49 – 51 ). For critical appraisal instruments for RCTs, where additional components such as feasibility and external validity are a desired component, the questions or items within the instrument that are specific to assessing internal validity still could follow a risk of bias approach by specifically requiring a judgement on the potential for bias. Similarly, the use of questions or items requiring a judgement on the potential for bias also could be used for evaluation of RCTs included in clinical practice guidelines when RCTs are present in the evidence base.

However, there are circumstances where these recommendations may not be appropriate or sufficient, such as for observational studies where risk of bias assessment instruments do not formally exist, or where a variety of study designs have been identified that answer the clinical question (particularly non-intervention type questions). When observational studies are used as evidence, individuals assessing internal validity may wish to evaluate risks of bias for each study ad hoc by considering the specific risks of bias related to selection bias, information bias, and confounding in the context of the topic area. However, this approach requires considerable methodological expertise. Alternatively, a quality assessment approach could be used to evaluate internal validity for observational studies, recognizing that more assumptions related to the potential for bias are involved. As instruments for evaluating the risk of bias for observational studies are developed and validated, these could replace ad hoc or quality assessment approaches.

For situations where the evidence base includes multiple study types, such as clinical practice guidelines, the use of levels of evidence may be useful for framing the potential for bias inherent in the studies identified to address the clinical questions. However, within each evidence level, there still is a need to evaluate the internal validity of each study. The proposed approach for situations where RCTs and observational studies are included in the evidence base was described in the preceding paragraphs. For lower levels of evidence, such as case series, textbooks and narrative reviews, and expert opinion, levels of evidence could be used to emphasize that these types of evidence have high potential for bias based on their design.

Broader Considerations

It should be noted that although this article has focused on approaches to evaluating internal validity of studies, this is only one component of the assessment of evidence. Critical appraisal, CATs, SR-MA, and clinical practice guidelines explicitly incorporate other aspects of decision-making, including a consideration of the magnitude and precision of an intervention effect or the potential clinical impact, the consistency of the research results across studies, the applicability (external validity and feasibility) of the research results, and the directness of the evidence to a clinical situation (for instance, whether the study populations are similar to those in a practice setting). However, a discussion of these components for decision-making is beyond the scope of the current study. The interested reader is referred to further details on the components used in evaluating evidence for CATs ( 4 ), for SR-MA using the GRADE approach ( 52 ), for network meta-analysis ( 53 ) and for clinical practice guidelines ( 8 , 44 ).

Author Contributions

JS drafted the manuscript. All authors contributed equally to the conceptualization of this work. All authors read and approved the final contents.

Partial funding support was obtained from the University of Guelph Research Leadership Chair (Sargeant).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

  • Open access
  • Published: 05 September 2024

Quantitative classification evaluation model for tight sandstone reservoirs based on machine learning

  • Xinglei Song 1 , 2 ,
  • Congjun Feng 1 , 2 ,
  • Teng Li 3 , 4 , 5 ,
  • Qin Zhang 6 ,
  • Xinhui Pan 1 , 2 ,
  • Mengsi Sun 7 &
  • Yanlong Ge 1 , 2  

Scientific Reports volume 14, Article number: 20712 (2024)


Tight sandstone reservoirs are a primary focus of research on the geological exploration of petroleum. However, many reservoir classification criteria are of limited applicability due to the inherent strong heterogeneity and complex micropore structure of tight sandstone reservoirs. This investigation focused on the Chang 8 tight reservoir situated in the Jiyuan region of the Ordos Basin. High-pressure mercury intrusion experiments, casting thin sections, and scanning electron microscopy experiments were conducted. Image recognition technology was used to extract the pore shape parameters of each sample. On this basis, using grey relational analysis (GRA), the analytic hierarchy process (AHP), the entropy weight method (EWM), and the comprehensive weight method, the relationship index Q1 between initial productivity and high-pressure mercury injection parameters and the relationship index Q2 between initial productivity and pore shape parameters were obtained by fitting. Then a dual-coupled comprehensive quantitative classification prediction model for tight sandstone reservoirs was developed based on pore structure and shape parameters. A quantitative classification study was conducted on the target reservoir, analyzing the correlation between reservoir quality and pore structure and shape parameters, leading to the proposal of favourable exploration areas. The research results showed that when Q1 ≥ 0.5 and Q2 ≥ 0.5, the reservoir was classified as type I. When Q1 > 0.7 and Q2 > 0.57, it was classified as type I₁, indicating a high-yield reservoir. When 0.32 < Q1 < 0.47 and 0.44 < Q2 < 0.56, it was classified as type II. When 0.1 < Q1 < 0.32 and 0.3 < Q2 < 0.44, it was classified as type III. Type I reservoirs exhibit a zigzag pattern in the northwest part of the study area. Thus, the northwest should be prioritized in actual exploration and development. 
Additionally, the initial productivity of tight sandstone reservoirs showed a positive correlation with the porosity, permeability, sorting coefficient, coefficient of variation, and median radius. Conversely, it demonstrated a negative correlation with the median pressure and displacement pressure. The perimeters of pores, their circularity, and the length of the major axis showed a positive correlation with the porosity, permeability, sorting coefficient, coefficient of variation, and median radius. On the other hand, they exhibited a negative correlation with the median pressure and displacement pressure. This study quantitatively constructed a new classification and evaluation system for tight sandstone reservoirs from the perspective of microscopic pore structure, achieving an overall model accuracy of 93.3%. This model effectively predicts and evaluates tight sandstone reservoirs. It provides new guidance for identifying favorable areas in the study region and other tight sandstone reservoirs.
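
The classification thresholds reported in the abstract can be written as a small decision function. The sketch below is an illustration only: the function name and the returned labels are assumptions, the threshold values are copied from the abstract, and index pairs that fall outside every stated range (for example Q1 between 0.47 and 0.5) are returned as "unclassified" because the abstract does not say how they are handled.

```python
def classify_reservoir(q1: float, q2: float) -> str:
    """Map the two relationship indices to a reservoir type.

    q1: index relating initial productivity to mercury injection parameters.
    q2: index relating initial productivity to pore shape parameters.
    """
    # Type I1 is the high-yield subset of type I, so it is checked first.
    if q1 > 0.7 and q2 > 0.57:
        return "I1"
    if q1 >= 0.5 and q2 >= 0.5:
        return "I"
    if 0.32 < q1 < 0.47 and 0.44 < q2 < 0.56:
        return "II"
    if 0.1 < q1 < 0.32 and 0.3 < q2 < 0.44:
        return "III"
    # Combinations not covered by the published ranges.
    return "unclassified"


samples = [(0.8, 0.6), (0.55, 0.52), (0.4, 0.5), (0.2, 0.35), (0.48, 0.5)]
labels = [classify_reservoir(q1, q2) for q1, q2 in samples]
```

Checking the narrowest category first keeps the rules mutually exclusive even though the type I₁ range lies inside the type I range.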

Introduction

With the depletion of conventional oil and gas reservoirs, tight oil reservoirs have gradually become a hot topic and a focal point for exploration and development, both domestically and internationally 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 . However, tight sandstone oil reservoirs exhibit complex reservoir characteristics, primarily manifested in their deep burial depths, wide distribution, and complex depositional processes. The reservoirs exhibit characteristics of low porosity, poor permeability, and high heterogeneity. The dominant pores are micro- and nano-scale, with narrow and dispersed throats, and are unfavorable for the migration and accumulation of oil and gas 10 , 11 , 12 , 13 , 14 , 15 . These factors necessitate considering the interdependent influences of multiple factors when classifying and evaluating tight sandstone reservoirs, which affects the accuracy of reservoir evaluation and hinders the selection of high-quality reservoirs. Therefore, the rapid and effective classification and evaluation of tight sandstone reservoirs has long been a focal point of scholarly research.

The quality of the reservoir is a key factor that determines the oil and gas production capacity. The classification and evaluation of reservoirs are central to reservoir studies and play a significant role in oilfield development. With the continuous advancement of oilfield development technologies, reservoir classification and evaluation methods have become increasingly diverse, gradually evolving from qualitative to quantitative research and from macro-parameter to micro-parameter evaluation. At present, both domestic and international scholars classify reservoirs using two main methods. The first is the traditional classification and evaluation method, which directly uses indicators such as the lithology, physical properties, pore structure, sedimentary facies, and oil and production experiments for classification. For example, Wei et al. classified the tight sandstone reservoirs of the Sha Creek Formation in the central Sichuan Basin based on the transverse relaxation (T 2 ) distribution of nuclear magnetic resonance 16 . Xu et al. studied the characteristics and controlling factors of tight sandstone using thin-section casting, scanning electron microscopy, X-ray diffraction (XRD), and spontaneous imbibition experiments 17 . Wu et al. analyzed the logging response characteristics using core data and electric imaging logging data and identified the reservoir type with the highest industrial production in the study area 18 . Zhang et al. established classification criteria for the third member of the Quan Formation based on mercury injection curves, core physical properties, and sedimentary facies characteristics 19 . Talib et al. quantitatively characterized tight oil and gas reservoirs through rock physics experiments and seismic inversion profiles 20 .

The second approach to reservoir classification involves initially choosing evaluation parameters that align with the geological conditions of the target area. Subsequently, machine learning techniques such as GRA, the AHP, the EWM, and fuzzy analysis are employed to assign weight coefficients to each evaluation parameter. Finally, the reservoir is comprehensively scored. For example, Fang et al. proposed an automatic classification and verification method for reservoir types based on k-means clustering and Bayesian discriminant theory, using core logging and logging data from coring wells, combined with physical characteristics such as reservoir deposition and diagenesis 21 . Li et al. classified the Fuyu reservoir using GRA, Q clustering analysis, and discriminant analysis 22 . Wang et al. combined AHP and EWM, used the multi-factor superposition method, and established a new reservoir classification and evaluation method 23 . Fan et al. quantified the weight of evaluation parameters’ contribution to production by combining the relationships between variables and directional good production using the GRA 24 . Niu et al. proposed a new machine learning framework (GCA-CE-MGPK) for shale reservoirs, achieving efficient and accurate multi-scale evaluation of shale reservoirs 25 . In summary, traditional classification and evaluation methods are costly, inefficient and require extensive experimental data. They are mainly suitable for specific regions, making them inadequate for large-scale reservoir evaluation and prediction. Although machine learning techniques can improve efficiency and reduce costs, their accuracy often depends on the optimization of various mathematical methods, leading to high subjectivity in some models and lower overall precision, failing to meet the practical needs of production. Moreover, previous studies have primarily focused on evaluating single factors, lacking the integration of macro and micro perspectives. 
Based on these, this study combined multiple machine learning methods to directly link actual oilfield production data with micro-scale pore shape and structure parameters, effectively integrating macro and micro parameters.

Given the significant influence of subjective factors on the classification criteria for the quantitative evaluation of conventional reservoirs, adopting a new method for reservoir evaluation is essential. This study focuses on the Chang 8 tight sandstone reservoir in the Jiyuan area of the Ordos Basin, extracting pore shape parameters from 52 rock samples. These parameters were combined with the experimental data from high-pressure mercury injection and the actual initial production capacity of the oil field. Using GRA, AHP, EWM, and the comprehensive weight method, the relationship index Q1 between initial productivity and high-pressure mercury injection parameters and the relationship index Q2 between initial productivity and pore shape parameters were obtained by fitting. Then a dual-coupled comprehensive quantitative classification prediction model for tight sandstone reservoirs was developed based on pore structure and shape parameters. A quantitative classification study was conducted on the target reservoir, analyzing the correlation between reservoir quality and pore structure and shape parameters, leading to the proposal of favourable exploration areas. This method combined the subjectivity-influenced AHP with the objectivity-influenced EWM to calculate the comprehensive weight coefficient, mitigating the impact of subjective factors and enhancing the model's accuracy. Validation results indicate that the model has an overall accuracy of 93.3%. It is therefore an effective tool for predicting and classifying tight sandstone reservoirs, and it is significant for further exploration in the study area and other similar reservoirs.
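
To show how an objective weight and a subjective weight might be merged, the following Python sketch implements the standard entropy weight calculation and one common way of combining it with expert weights. Several points are assumptions rather than details taken from this study: the input values are assumed to be non-negative and already oriented so that larger is better, the multiplicative combination rule is only one of several used in the literature, and the function names and sample numbers are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: rows are samples, columns are indicators."""
    n = len(matrix)
    m = len(matrix[0])
    diversities = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total  # share of this sample in the indicator
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(n)  # scale entropy to the range [0, 1]
        diversities.append(1.0 - entropy)
    d_total = sum(diversities)
    # Indicators whose values vary more across samples get more weight.
    return [d / d_total for d in diversities]


def combine_weights(subjective, objective):
    """Assumed rule: normalized product of subjective and objective weights."""
    products = [s * o for s, o in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]


# Hypothetical data: three samples, two indicators.
data = [[0.2, 0.1], [0.2, 0.5], [0.2, 0.9]]
ewm = entropy_weights(data)          # first indicator is constant
ahp = [0.6, 0.4]                     # placeholder expert weights
combined = combine_weights(ahp, [0.5, 0.5])
```

With equal objective weights the combined result simply reproduces the expert weights, which is a quick sanity check on the combination rule.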

Geological setting

The Ordos Basin is a large, multi-cycle, cratonic basin that formed on the crystalline basement during the Paleoproterozoic–Mesoproterozoic. The Ordos Basin, the second-largest sedimentary basin in China, has experienced five significant stages of sedimentary evolution. These stages include the middle to late Proterozoic rift valley, the early Paleozoic shallow marine platform, the late Paleozoic nearshore plain, the Mesozoic inland lake basin, and Cenozoic peripheral subsidence. This basin is known for its substantial reserves of oil and gas. The Ordos Basin extends across five provinces and regions, namely, Shaanxi, Gansu, Shanxi, Ningxia, and Inner Mongolia. Geographically, it stretches from the Yin Mountains in the north to the Qinling Mountains in the south, and from the Liupan Mountains in the west to the Lvliang Mountains in the east. The basin’s total area is 25 × 10⁴ km², with favorable areas covering 9.9 × 10⁴ km². The estimated resource volume is 6.2 × 10¹² m³, indicating significant exploration and development potential. Based on the basin’s geological nature, tectonic evolution, and structural pattern, the Ordos Basin can be divided into six primary tectonic units: the northern Shaanxi slope, the Tianhuan Depression, the western thrust fault zone, the Yimeng Uplift, the Weihebei Uplift, and the western Shanxi fold belt. The Jiyuan area, located in the central-western part of the Ordos Basin, covers a total area of 1302 km² (Fig. 1a, c). This area spans the two primary tectonic units of the northern Shaanxi slope and the Tianhuan Depression, exhibiting a gently inclined monocline structure towards the west. Since the Mesozoic, the basin has developed thick fluvio-lacustrine deposits. In the Cenozoic, rift valleys were formed around the basin due to fault subsidence. The overall geological conditions are relatively complex, posing challenges for exploration. 
However, the area is rich in oil and gas resources, indicating favourable exploration prospects 26 , 27 , 28 , 29 . The proven petroleum geological reserves in this area amount to 800 × 10⁶ t, with annual crude oil production of 700 × 10⁴ t, making it the oilfield with the largest reserves and production levels in the Ordos Basin from the Mesozoic. Existing exploration results indicate that the Chang 8 oil-bearing formation is one of the most favourable hydrocarbon accumulation zones in the Jiyuan area, with a proven favourable oil-bearing area of 1500 km².

figure 1

(a) Location of the study area (modified from Tong 29), (b) columnar diagram of the Chang 8 formation, (c) well location distribution map of the study area.

The Chang 8 reservoir is located in the lower part of the Upper Triassic Yanchang Formation. It is primarily composed of interbedded grey sandstone and dark black mudstone. The sedimentary microfacies are predominantly subaqueous distributary channels and underwater distributary bays, indicating a deposition pattern typical of a shallow-water deltaic environment (Fig. 1b). Based on thin-section identification in the study area (Fig. 2), the lithology of the Chang 8 reservoir is predominantly composed of fine-grained feldspathic sandstone, feldspathic lithic sandstone, and a small amount of feldspar sandstone. The detrital components in the study area mainly consist of quartz, feldspar, and detritus. The ranges of contents are as follows: the quartz content is 20.1% to 58.6%, with an average of 31.21%; the feldspar content is 23.56% to 57.62%, with an average of 34.43%; and the detritus content is 6.25% to 29.45%, with an average of 21.38%.

figure 2

Triangular diagram and detrital composition diagram of the study area. ( a ) Triangular classification diagram of the sandstone in the Chang 8 reservoir, ( b ) histogram of the relative content of detrital components in the Chang 8 reservoir.

Materials and statistical methods

Materials and experiments.

In this study, 52 drilling core samples were obtained from the Chang 8 reservoir in Jiyuan, Ordos Basin, all of which were fine sandstone. The samples underwent oil washing, gas permeability measurement, and porosity determination by the weight method, which yielded the reservoir’s petrophysical parameters (Table 1). The porosity of the samples ranged from 2.41% to 13.6%, with an average of 8.23%. The permeability ranged from 0.01 × 10⁻³ µm² to 1.10 × 10⁻³ µm², with an average of 0.18 × 10⁻³ µm². Subsequently, thin-section casting and scanning electron microscopy experiments were conducted, resulting in 300 photographs. Additionally, high-pressure mercury intrusion was performed on the 52 samples to obtain the micropore throat characteristic parameters.

High pressure mercury intrusion and scanning electron microscopy

High-pressure mercury intrusion experiments were used to evaluate the micropore throat characteristics of the reservoir quantitatively. This is achieved by observing the pressure changes during mercury injection into the pores, analyzing the characteristics of the capillary pressure curves, and studying the relationship between the intrusion volume of mercury and these characteristics 30 , 31 . In this experiment, the Auto Pore IV 9530 fully automated mercury porosimeter was used, with a pore diameter measurement range of 3 nm to 1100 μm. Continuous mercury injection was employed, with a volume accuracy better than 0.1 μl for both injection and withdrawal. The experimental procedure followed the national standard GB/T29171-2012, and the maximum mercury injection pressure reached 95.39 MPa.
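As a hedged illustration of how a mercury injection pressure maps to a pore-throat size, the sketch below applies the Washburn equation, r = −2γ·cos θ / P. The surface tension (0.48 N/m) and contact angle (140°) are commonly quoted textbook values for mercury and are assumptions here; the article does not state which values were used, and the function name is hypothetical.

```python
import math

def washburn_throat_radius_um(pressure_mpa, surface_tension_n_per_m=0.48,
                              contact_angle_deg=140.0):
    """Pore-throat radius (micrometres) that mercury can enter at a given pressure.

    The default surface tension and contact angle are assumed values,
    not parameters reported in the article.
    """
    if pressure_mpa <= 0:
        raise ValueError("pressure must be positive")
    pressure_pa = pressure_mpa * 1e6
    cos_theta = math.cos(math.radians(contact_angle_deg))
    radius_m = -2.0 * surface_tension_n_per_m * cos_theta / pressure_pa
    return radius_m * 1e6

# Smallest throat radius reached at the reported maximum pressure of 95.39 MPa.
r_min_um = washburn_throat_radius_um(95.39)
print(f"{r_min_um * 1000:.1f} nm")  # prints "7.7 nm" under the assumed constants
```

Under these assumed constants, the maximum injection pressure corresponds to throats a few nanometres in radius, which is consistent in order of magnitude with the stated lower limit of the instrument.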

Scanning electron microscopy (SEM) allows for high-resolution morphological observation and analysis of samples, as well as structural and compositional characterization. It also enables direct observation of the development characteristics of the micropore throats in the reservoir 32 , 33 , 34 . The experiment employed a JEOL JSM-7500F field emission scanning electron microscope, which achieves a secondary-electron image resolution of 1 nm and a magnification range of 20 to 300,000 times.

Pore parameter extraction technology

The ImageJ software, initially developed by Wayne Rasband at the National Institutes of Health in the United States, is a powerful open-source image processing system written in Java. It was initially applied in the fields of biomedical and agricultural sciences 35 . Recently, an increasing number of scholars have used it to identify and extract reservoir pores and fracture features 36 , 37 , 38 , 39 . In this study, the ImageJ software was used to process 210 scanning electron microscope images, extracting various pore parameters, including the perimeter, circularity, major axis length, aspect ratio, and solidity.
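For readers unfamiliar with these shape descriptors, the following minimal sketch computes three of them from basic pore measurements using the definitions in ImageJ's shape-descriptor measurements: circularity = 4π·area/perimeter², aspect ratio = major axis/minor axis, and solidity = area/convex-hull area. The `PoreMeasurement` container and function name are hypothetical, and the sketch assumes the measurements were already exported from the image analysis step.

```python
import math
from dataclasses import dataclass

@dataclass
class PoreMeasurement:
    """Hypothetical container for one segmented pore (lengths in µm, areas in µm²)."""
    area: float
    perimeter: float
    major_axis: float
    minor_axis: float
    convex_area: float

def shape_descriptors(pore):
    """Return circularity, aspect ratio, and solidity for one pore."""
    circularity = 4.0 * math.pi * pore.area / pore.perimeter ** 2
    aspect_ratio = pore.major_axis / pore.minor_axis
    solidity = pore.area / pore.convex_area
    return {
        "circularity": circularity,    # 1.0 for a perfect circle, lower when irregular
        "aspect_ratio": aspect_ratio,  # 1.0 for equant pores, larger when elongated
        "solidity": solidity,          # 1.0 for a convex outline, lower when indented
    }

# A perfect circle of radius 10 µm gives 1.0 for all three descriptors.
circle = PoreMeasurement(area=math.pi * 100, perimeter=2 * math.pi * 10,
                         major_axis=20.0, minor_axis=20.0,
                         convex_area=math.pi * 100)
print(shape_descriptors(circle))
```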

Statistical methodology

Grey relational analysis (GRA) addresses infinite-space problems using finite sequences. It evaluates the correlations between the factors within a system and determines the significance of each factor to the target function. This approach avoids the subjective process of manually assigning weights to factor indicators 40 . In recent years, GRA has been applied in production forecasting and development plan optimization for tight sandstone reservoirs 41 , 42 , 43 , 44 . The specific steps are as follows.

Determine the initial sequence:

where X 0 is the reference sequence, X i is the comparative sequence, i is the number of comparative sequences, m is the number of independent variables, and n is the number of samples.

Normalize the data using the extreme value method:

Calculate the gray correlation coefficient:

Obtain the gray correlation coefficient matrix:

where ρ is the resolution coefficient, which takes values between 0 and 1. A smaller resolution coefficient indicates greater differences between the correlation coefficients and stronger discriminatory power. Usually, ρ is set to 0.5.

Determine the correlation degree. Represent the correlation strength between the series using the average of the n correlation coefficients:

where \(\varepsilon_{0i}\) represents the correlation degree between the i -th comparative sequence and the reference sequence.

Determine the weights and rank the correlation degrees. Normalize the correlation degrees to obtain the weight W i of each comparative sequence:
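The GRA steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: it uses min-max (extreme value) normalization, the widely used Deng form of the correlation coefficient, (Δmin + ρ·Δmax)/(Δ + ρ·Δmax), and the function name and sample numbers are hypothetical.

```python
import numpy as np

def grey_relational_weights(reference, comparisons, rho=0.5):
    """Return (correlation degrees, normalized weights), one per comparative sequence."""
    x0 = np.asarray(reference, dtype=float)    # reference sequence, shape (n,)
    x = np.asarray(comparisons, dtype=float)   # comparative sequences, shape (m, n)

    def min_max(a):
        lo = a.min(axis=-1, keepdims=True)
        hi = a.max(axis=-1, keepdims=True)
        return (a - lo) / (hi - lo)            # assumes no sequence is constant

    delta = np.abs(min_max(x) - min_max(x0))   # absolute differences, shape (m, n)
    d_min, d_max = delta.min(), delta.max()    # assumes d_max > 0
    xi = (d_min + rho * d_max) / (delta + rho * d_max)  # correlation coefficient matrix
    degree = xi.mean(axis=1)                   # average over the n samples
    weights = degree / degree.sum()            # normalize degrees to weights
    return degree, weights

# Illustrative numbers only, not data from the study.
production = [1.0, 2.0, 3.0, 4.0]
params = [[1.0, 2.0, 3.0, 4.0],   # moves with production
          [4.0, 3.0, 2.0, 1.0]]   # moves against production
degree, weights = grey_relational_weights(production, params)
print(degree, weights)
```

Note that a parameter that falls as production rises receives a low correlation degree here. For negatively correlated indicators, one common choice is to reverse the normalization before the analysis; whether the study did so is not stated.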

AHP is a methodology that categorizes the factors within a complex problem into interconnected and prioritized levels. This approach facilitates the process of making decisions based on multiple criteria. It is primarily used to determine the weighting coefficients for comprehensive evaluations 45 , 46 , 47 . The process is as follows.

Construction of a judgment matrix: a judgment matrix is constructed to compare the importance of different factors:

where A is the matrix of pairwise comparisons, W is the weight vector, and λ max is the maximum eigenvalue.

Calculation of weights: the weight vector W is determined using the sum-product method.

Consistency check:

where n is the number of elements, I c is the consistency index, I R is the random consistency index, I cR is the consistency ratio, and \(\lambda'_{\max}\) is the average of the maximum eigenvalues.

If I cR  < 0.10, the consistency of the judgment matrix is considered acceptable.
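The AHP procedure above can be illustrated with a short sketch of the sum-product method and the consistency check. It assumes Saaty's standard random consistency index values and estimates λ max as the average of (A·W) i / W i ; the function name and the example matrix are hypothetical and are not the judgment matrix from the study.

```python
import numpy as np

# Saaty's random consistency index by matrix order (standard published values).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(judgment):
    """Return (weights, lambda_max, consistency_ratio) for a pairwise judgment matrix."""
    a = np.asarray(judgment, dtype=float)
    n = a.shape[0]
    col_normalized = a / a.sum(axis=0)    # normalize each column to sum to 1
    w = col_normalized.mean(axis=1)       # sum-product method: average across each row
    lam_max = float(np.mean(a @ w / w))   # average estimate of the maximum eigenvalue
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, lam_max, cr

# A perfectly consistent 3 x 3 example built from the weights 0.5, 0.3, 0.2.
true_w = np.array([0.5, 0.3, 0.2])
matrix = true_w[:, None] / true_w[None, :]
w, lam_max, cr = ahp_weights(matrix)
print(w, lam_max, cr)
```

For a perfectly consistent matrix the recovered weights equal the generating weights, λ max equals n, and the consistency ratio is zero; real judgment matrices are accepted when the ratio stays below 0.10.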

EWM is an objective weighting approach that comprehensively examines the underlying patterns and informational value of unprocessed data. It can determine the uncertainty in variables through entropy values, where larger information content corresponds to smaller uncertainty and smaller entropy, and vice versa. The entropy weighting method is characterized by high accuracy and strong objectivity, and many scholars have applied it to oilfield production with good results 48 , 49 . The basic steps are as follows.

Normalize the data and calculate the information entropy:

where E i is the information entropy of the i th indicator, X ij is the value of the i th indicator on the j th sample, and N is the number of samples.

Calculate the weights:

where W i is the weight of the i th indicator, E i is the information entropy of the i th indicator, and M is the number of indicators.
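The EWM steps above can be sketched as follows. The sketch assumes the standard formulation, E i = −(1/ln N) Σ p ij ln p ij with p ij the share of sample j in indicator i after min-max normalization, and W i = (1 − E i )/(M − Σ E i ); the function name and numbers are hypothetical.

```python
import numpy as np

def entropy_weights(data):
    """Return (entropies, weights) for data of shape (M indicators, N samples)."""
    x = np.asarray(data, dtype=float)
    m, n = x.shape
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    z = (x - lo) / (hi - lo)                # min-max normalization; assumes hi > lo
    p = z / z.sum(axis=1, keepdims=True)    # share of each sample within an indicator
    # Treat 0 * ln(0) as 0 so that zero shares contribute nothing.
    p_log_p = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    entropy = -p_log_p.sum(axis=1) / np.log(n)
    weights = (1.0 - entropy) / (m - entropy.sum())
    return entropy, weights

# Illustrative numbers only: the second indicator is concentrated in one sample,
# so it has lower entropy and receives the larger weight.
entropy, weights = entropy_weights([[1.0, 2.0, 3.0, 4.0],
                                    [1.0, 1.0, 1.0, 10.0]])
print(entropy, weights)
```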

Comprehensive weight coefficient

Weight coefficients can be used to classify and evaluate the reservoir quality effectively, and several methods are currently available to determine the weight coefficients. These include GRA, the expert evaluation method, Q clustering analysis and discriminant analysis, and factor analysis 50 , 51 , 52 . In this research, a comprehensive weight analysis methodology that integrated AHP and EWM was employed. The key advantage of this approach lies in its amalgamation of the subjective AHP analysis and the objective numerical analysis of EWM. This combination helps to mitigate the influence of subjective factors to a certain extent, thereby enhancing the reliability of the data.

where W iAHP is the weight coefficient obtained from the AHP method, and W iEWM is the weight coefficient obtained from the EWM method.
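The combination formula itself is not reproduced in this text. As an assumption for illustration only, one commonly used form is the normalized product, W i = W iAHP · W iEWM / Σ (W jAHP · W jEWM ), sketched below; the study may have used a different combination rule, and the function name is hypothetical.

```python
def combined_weights(w_ahp, w_ewm):
    """Combine subjective (AHP) and objective (EWM) weights by normalized product.

    The multiplicative form is an assumed, commonly used choice, not a formula
    confirmed by the article.
    """
    if len(w_ahp) != len(w_ewm):
        raise ValueError("weight vectors must have the same length")
    products = [a * e for a, e in zip(w_ahp, w_ewm)]
    total = sum(products)
    return [p / total for p in products]

# Equal AHP weights leave the EWM weights unchanged after renormalization.
print(combined_weights([0.5, 0.5], [0.2, 0.8]))
```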

Results and discussion

Evaluation parameter selection.

Tight sandstone reservoirs are influenced by deposition, tectonics, and diagenesis. These reservoirs demonstrate significant heterogeneity and an intricate distribution of micropore throats. The pore structure plays a crucial role in governing the storage and flow behaviour of the reservoir, and the shape parameters of the pores in turn govern the micropore structure of the rock formation 53 , 54 , 55 , 56 , 57 . Considering these characteristics, this study aimed to provide a quantitative characterization of the reservoir by assessing three key aspects: the pore structure, the physical properties, and the pore shape parameters. Twelve parameters were selected to establish the relationship between the initial production capacity index and the pore structure and shape parameters. The actual initial production capacity of the oilfield was used as the indicator.

Sensitivity parameter selection for pore structure characteristics

The selected 52 samples were subjected to high-pressure mercury intrusion experiments using an Auto Pore IV 9530 automatic mercury porosimeter. The sorting coefficient varied between 1.5 and 2.74, with an average of 2.10. The coefficient of variation ranged between 13.94 and 17.32, with a mean value of 15.54. With an average value of 13.86 MPa, the median pressure varied between 10.5 and 18.79 MPa. The average displacement pressure was 1.23 MPa, ranging between 0.09 and 2.57 MPa. The median radius had a mean value of 0.09 μm and varied from 0.05 to 0.15 μm. With a mean value of 84.52%, the maximum mercury saturation varied from 62.77 to 93.76%. With an average of 34.90%, the mercury withdrawal efficiency varied between 16.7 and 46.6%. Overall, the pore structure of the reservoir in the study area was poor, with uneven sorting and poor connectivity among the pore throats, indicating strong heterogeneity. Correlation analysis was conducted on the initial production and mercury intrusion parameters (Fig.  3 ), and it was found that the correlation between the production capacity and permeability and porosity was the strongest, with correlation coefficients (R 2 ) of 0.91 and 0.75, respectively. This is mainly because porosity plays a crucial role in determining the size of the pore space within a reservoir, while permeability governs its flow capacity. In the context of tight sandstone reservoirs, the reservoir quality often depends on favourable pore permeability. The sorting coefficient and coefficient of variation provide insights into the uniformity of the distribution of the pore throat sizes. Higher values of these parameters indicate an improved pore structure and increased reservoir productivity. The median radius and median pressure indicate the pore permeability of the reservoir. A larger median radius and smaller median pressure indicate a larger pore space and stronger flow capacity, resulting in a larger oil production capacity. 
Therefore, the median radius positively correlates with production, while the median pressure is inversely correlated. The displacement pressure is inversely correlated with production (R 2  = 0.65). This is because displacement pressure refers to the capillary pressure corresponding to the largest connected pore, and a higher displacement pressure means a higher capillary pressure, making it more difficult for fluid to flow through. This indicates that tight oil has poor flow capacity in the reservoir and is more difficult to accumulate and extract. In conclusion, the initial production capacity is sensitive to the porosity, permeability, sorting coefficient, coefficient of variation, median pressure, median radius, and displacement pressure.

figure 3

Relationship between initial production and porosity, permeability, sorting coefficient, coefficient of variation, median pressure, median radius, and displacement pressure.

Selection of pore-shape-sensitive parameters

A total of 210 high-resolution SEM images were captured for the 52 samples. The rock core pores were identified and extracted using ImageJ, obtaining pore shape parameters such as the perimeter, circularity, major axis length, aspect ratio, and solidity (Fig. 4, Table 2). The average values of the identified pore shape parameters for each sample were then calculated. The pore perimeters of the 52 samples varied between 40.3 and 486.2 μm, with a mean value of 250.5 μm. The circularity ranged between 0.11 and 0.96, with a mean value of 0.31. The major axis lengths of the circumscribed ellipses spanned from 42.52 to 221.19 μm, with an average of 111.67 μm. The aspect ratios ranged from 1.14 to 2.92, and the average value was 2.32. The solidity values ranged between 0.09 and 0.89, with an average of 0.67. In general, the pore shape parameters of the tight sandstone reservoirs exhibited a wide range of variation, with relatively large average perimeters, average major axis lengths of the circumscribed ellipses, aspect ratios, and solidity, and with small average circularity (Fig. 5). This indicates that the pore shapes in tight sandstone are diverse, predominantly irregular and elongated, with few circular pores. Pearson correlation analysis was conducted between the sensitive pore structure parameters selected above and the extracted pore shape parameters (Fig. 6). The correlation coefficient r always lies between −1 and 1: a value closer to 1 indicates a stronger positive relationship between the two variables, a value closer to −1 indicates a stronger negative relationship, and a value closer to 0 indicates a weak relationship. A significant and strong correlation (|r| > 0.5) was observed between the shape parameters of the pores and the mercury injection parameters. This suggests that the shape parameters of the pores play a crucial role in determining the pore structures of tight sandstone reservoirs. In general, the perimeter, circularity, and major axis length of the pores displayed a positive correlation with the porosity (Φ), permeability (K), sorting coefficient (Sp), coefficient of variation (Dr), and median radius (R50). Conversely, they exhibited a negative correlation with the median pressure (P50) and displacement pressure (Pd). On the other hand, the aspect ratio and solidity of the pores were negatively correlated with the porosity, permeability, sorting coefficient, coefficient of variation, and median radius, and positively correlated with the median pressure and displacement pressure. Among them, there was a strong positive correlation (r = 0.914) between the perimeter and porosity and a relatively strong negative correlation (r = −0.766) between the perimeter and the displacement pressure. A larger pore perimeter results in a greater contact area between the reservoir fluid and the solid, facilitating fluid infiltration and storage. Circularity was strongly positively correlated with permeability (r = 0.927) and negatively correlated with the displacement pressure (r = −0.604). This is because larger circularity indicates a closer approximation to circular pores, which typically exhibit a uniform distribution, resulting in improved connectivity and fluid flow. The major axis length was strongly positively correlated with the permeability and porosity because the major axis length of the circumscribed ellipses of pores affects the connectivity and fluid flow path within the pores. A larger major axis length indicates better connectivity between pores, resulting in a more direct fluid flow path and higher permeability. Moreover, a longer major axis length corresponds to a larger pore size and higher porosity. The aspect ratio exhibited a strong negative correlation with the permeability and sorting coefficient (r = −0.866 and r = −0.754, respectively) and a strong positive correlation with the displacement pressure (r = 0.652). As the aspect ratio increases, the pores become narrower and more uneven, resulting in longer and narrower flow channels, making fluid flow more difficult. As a result, the displacement pressure increases, the sorting coefficient decreases, and the permeability decreases. Solidity exhibited a strong negative correlation with permeability (r = −0.862) and a positive correlation with the displacement pressure (r = 0.574). As the solidity increases, the pore shape becomes more concave, and the roundness deteriorates, making fluid flow between the pores more difficult. In conclusion, the perimeter, circularity, major axis of the circumscribed ellipse, aspect ratio, and solidity of the pores are sensitive to various parameters of mercury intrusion.
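The signed coefficients reported in this section behave like Pearson's r, which ranges from −1 to 1, whereas a coefficient of determination (r²) cannot be negative. A minimal sketch of the calculation is given below; the function name and the sample values are hypothetical and are not data from the study.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# Illustrative values only: aspect ratio rising while permeability falls.
aspect_ratio = [1.5, 1.9, 2.3, 2.6, 2.9]
permeability = [0.90, 0.55, 0.30, 0.12, 0.05]
r = pearson_r(aspect_ratio, permeability)
print(r, r ** 2)  # r is negative here; r squared is always non-negative
```

The same values can be obtained for a whole parameter table at once with `np.corrcoef`, which returns the full correlation matrix.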

figure 4

Visualization of pore extraction results for rock samples. ( A ) Pore identification (sample no. 1), ( B ) pore extraction (sample no. 1), ( C ) pore identification (sample no. 10), ( D ) pore extraction (sample no. 10), ( E ) pore identification (sample no. 25), ( F ) pore extraction (sample no. 25).

figure 5

Distribution of pore shape parameters. ( a ) Distribution range of pore perimeter and major axis, ( b ) distribution range of pore circularity, solidity, and aspect ratio.

figure 6

Correlations between pore structural parameters and pore shape parameters.

Reservoir classification evaluation

Quantitative classification prediction formula.

Based on the results of the GRA, AHP, and EWM, a comprehensive quantitative classification prediction formula was constructed using the superposition principle. This formula was then used to classify and evaluate tight sandstone reservoirs.

where Q is the productivity index, a i is the dimensionless weight coefficients of various parameters, b i,N is the dimensionless normalized parameters, and n is the number of parameters.
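Read together with the definitions above, the superposition principle implies a weighted sum of the normalized parameters, Q = Σ a i · b i,N for i = 1 to n. The sketch below illustrates that reading; the function name, weights, and values are hypothetical.

```python
def productivity_index(weights, normalized_params):
    """Weighted sum of normalized parameters, Q = sum(a_i * b_i_N)."""
    if len(weights) != len(normalized_params):
        raise ValueError("weights and parameters must have the same length")
    return sum(a * b for a, b in zip(weights, normalized_params))

# Illustrative weights and normalized values only (each value lies in [0, 1]).
q = productivity_index([0.5, 0.3, 0.2], [1.0, 0.5, 0.0])
print(q)
```

Because the weights sum to 1 and each normalized parameter lies between 0 and 1, Q also lies between 0 and 1, which is what allows fixed thresholds to be used for classification.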

Determination of weight coefficients

In this study, the initial production rate, which directly reflects the reservoir quality, was taken as the fundamental sequence. Seven sensitive parameters, namely, the porosity, permeability, sorting coefficient, coefficient of variation, median pressure, median radius, and displacement pressure, were considered as sub-sequences. The principles and steps of GRA were employed to determine the weights of the parameters, thereby assessing the sensitivity of each factor to the initial production rate (Table 3). Based on the correlation degrees between the sensitive parameters and the initial productivity determined by GRA, the parameters were then compared in pairs, and values were assigned using the 9-point scale method. The judgment matrix was obtained from these pairwise comparisons of the seven sensitive parameters (Table 4). Subsequently, the weight coefficients were determined using the sum-product method within the AHP (Table 5). Formula ( 14 ) shows that the consistency ratio of the judgment matrix, I cR  = 0.093, is less than 0.1, meeting the consistency requirement. The EWM was then employed to analyse each sensitive parameter objectively, yielding objective weights. The comprehensive weight coefficients were calculated using Eq. ( 17 ) (Table 5 ). The formula relating the initial productivity to the mercury intrusion sensitivity parameters can be obtained as follows:

where Φ N is the normalized porosity, K N is the normalized permeability, S P,N is the normalized sorting coefficient, Dr, N is the normalized coefficient of variation, P 50,N is the normalized median pressure, R 50,N is the normalized median radius, and P d,N is the normalized displacement pressure.

Then, using the mercury intrusion parameter as the fundamental sequence, five sensitive parameters related to the pore shape, namely, the perimeter, circularity, major axis length, aspect ratio, and solidity, were considered sub-sequences. The correlation between the mercury intrusion parameters and the pore-shape-sensitive parameters was determined using GRA. The comprehensive weight coefficients for each mercury intrusion parameter were calculated using a combination of the AHP and the EWM (Table 6 ). Based on these weight coefficients, the correlation formulas between each mercury intrusion parameter and the pore shape parameters were obtained as follows:

Combined with Formula ( 19 ), the relationship between the initial productivity and pore shape parameters can be obtained:

where P N is the normalized perimeter, C N is the normalized circularity, M N is the normalized major axis, A N is the normalized aspect ratio, and S N is the normalized solidity.

Classification scheme and feature evaluation

Based on the index Q1, which relates initial productivity to the high-pressure mercury intrusion sensitivity parameters, and the index Q2, which relates initial productivity to the pore shape parameters, a classification and evaluation scheme for the Chang 8 tight sandstone reservoir has been determined. As depicted in Fig. 7, Q1 for type III reservoirs ranges from 0.1 to 0.31, and Q2 ranges between 0.3 and 0.44. For type II reservoirs, Q1 ranges from 0.32 to 0.47, and Q2 ranges from 0.44 to 0.56. For type I reservoirs, Q1 ≥ 0.5 and Q2 ≥ 0.5. Moreover, type I reservoirs can be further divided into type I 1 , comprising high-yield reservoirs, and type I 2 , comprising high-quality reservoirs, with Q1 > 0.7 and Q2 > 0.57 indicating type I 1 high-yield reservoirs. Type I reservoirs are considered optimal for the Chang 8 formation, with 15 out of 52 samples belonging to this type, accounting for 28.8%. This type of reservoir has favourable porosity and permeability, with an average porosity of 11.1% and an average permeability of 0.4 × 10⁻³ µm². Additionally, these reservoirs possess a low displacement pressure of 0.62 MPa, a low median pressure of 11.79 MPa, and a relatively high median radius of 0.12 µm. The reservoir exhibits good pore throat sorting, characterized by a large sorting coefficient (2.5) and coefficient of variation (16.43). The average pore perimeter of the reservoir is relatively long (360.30 µm), with good circularity (0.50) and a small aspect ratio (1.92). This indicates that the pore shape is more regular and almost circular. Type II displays moderate petrophysical characteristics, with an average porosity of 8.43% and an average permeability of 0.1 × 10⁻³ µm². Nineteen samples belong to this type, accounting for 36.54% of the dataset. Compared to type I, this reservoir type has a somewhat higher average displacement pressure and median pressure (1.11 MPa and 13.48 MPa, respectively). The median radius is lower (0.10 µm), and the average sorting coefficient and coefficient of variation are 2.41 and 16.18, respectively, indicating moderate sorting. The average pore perimeter of this reservoir type is smaller than that of type I (261.61 µm), with smaller circularity (0.26) and a larger aspect ratio (2.41). Compared to type I, the pores of type II reservoirs are more irregular and more elongated. Type III exhibits poorer petrophysical properties, with an average permeability of 0.06 × 10⁻³ µm² and an average porosity of 5.7%, significantly lower than those of type I and type II. Eighteen samples belong to this type, accounting for 34.62%. This reservoir type has an average displacement pressure of 1.89 MPa and a median pressure of 16.1 MPa, both greater than those of type II. The median radius is the smallest (0.07 µm). The average sorting coefficient and coefficient of variation are 1.81 and 14.7, respectively, indicating poor pore throat sorting. The average pore perimeter is the smallest (147.37 µm), with the poorest circularity (0.19) and the largest aspect ratio (2.56). This indicates that the pores of type III reservoirs are more elongated and slender, making them unfavorable for fluid flow and leading to poor reservoir permeability. In summary, as the reservoir quality deteriorates, the pore structure becomes increasingly worse, and the pore shapes become more complex and variable.
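The threshold scheme described above can be expressed as a small decision function. This is a sketch of the reported ranges only: the function name and the "unclassified" label are hypothetical, and because the reported ranges leave small gaps (for example, Q1 between 0.47 and 0.5), index pairs that fall outside every stated range are returned as unclassified rather than forced into a type.

```python
def classify_reservoir(q1, q2):
    """Assign a reservoir type from the indices Q1 and Q2 using the reported ranges."""
    if q1 > 0.7 and q2 > 0.57:
        return "I1"            # high-yield reservoirs
    if q1 >= 0.5 and q2 >= 0.5:
        return "I2"            # remaining type I (high-quality) reservoirs
    if 0.32 <= q1 <= 0.47 and 0.44 <= q2 <= 0.56:
        return "II"
    if 0.1 <= q1 <= 0.31 and 0.3 <= q2 <= 0.44:
        return "III"
    return "unclassified"      # outside every reported range

# Illustrative index pairs only.
for pair in [(0.8, 0.6), (0.55, 0.52), (0.4, 0.5), (0.2, 0.35), (0.48, 0.5)]:
    print(pair, classify_reservoir(*pair))
```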

figure 7

Comprehensive quantitative classification prediction model for the research area of the Chang 8 reservoir.

According to the distribution maps of the well locations and sedimentary microfacies (Figs.  1 c, 8 ), it is observed that type I reservoir wells are mostly found in the northwest of the research region, within the subaqueous distributary channels, exhibiting a zigzag pattern. Most type II reservoir wells are located in the study area's centre, mainly within the middle portions of the subaqueous distributary channel's lateral sand bodies. On the other hand, the relatively poor type III reservoir wells are scattered around the type II reservoirs, with most of them located in the marginal areas adjacent to the interdistributary bay and the edge of the channel’s lateral sand bodies. Therefore, in practical exploration and development, the high-quality reservoirs (type I) in the study area's northwest part should be prioritised.

figure 8

Planar distribution map of comprehensive quantitative classification for the research area of the Chang 8 reservoir.

Additionally, the main reason for the high productivity of type I 1 reservoirs is their higher content of dissolution pores. According to Table 7 and Fig. 9, samples 3, 15, 16, and 20 from type I reservoirs exhibit significant development of feldspar dissolution pores and intergranular pores, together with a small number of rock fragment dissolution pores. The average absolute contents of feldspar dissolution and intergranular pores are 1.2% and 5.15%, respectively. The average plane porosity (face rate) is 0.8%, higher than that of the other samples. The greater the development of feldspar dissolution and intergranular pores, the larger the flow channels and storage space they provide, which improves the reservoir’s porosity and permeability and results in high-productivity reservoirs. The pore shape parameters of samples 3, 15, 16, and 20 were compared with those of the other samples (Table 2 ). These four samples have longer pore perimeters and major axes, larger shape factor (roundness) coefficients, and relatively smaller aspect ratios and concavity. This indicates that high-productivity reservoirs (type I 1 ) have larger pore perimeters, an increased contact area between the pores and reservoir fluids, higher pore circularity, and more circular shapes favourable for fluid flow and storage. Furthermore, as shown in Fig. 8, the four high-productivity wells (JY-3, JY-15, JY-16, JY-20) are all located on the main channel of the subaqueous distributary channel. Therefore, from a macro perspective, thicker sand bodies may be another reason for their high productivity.

figure 9

Porosity structure of type I 1 reservoir. ( A ) Intergranular pores, developed dissolution pores (sample no. 3), ( B ) feldspar dissolution pores (sample no. 20), ( C ) rock fragment dissolution pores (sample no. 15), ( D ) intergranular pores, locally developed dissolution pores (sample no. 16).

Model validation

To verify the model, 15 coring wells in the Jiyuan Chang 8 reservoir were selected. High-pressure mercury intrusion tests, scanning electron microscopy, and thin-section casting experiments were conducted on the corresponding samples to extract the pore shape parameters. Next, the comprehensive indices Q1 and Q2 for reservoir categorization were determined using GRA, AHP, and EWM. Finally, the classification results were compared with the existing oil test data to assess their accuracy. As shown in Fig. 10, three wells were classified as type I reservoirs, with an average initial yield of 5.73 t/d. Six wells were classified as type II reservoirs, with an average initial yield of 3.52 t/d, lower than that of type I. One well was misclassified, deviating from the expected value. Five wells were classified as type III reservoirs, with the lowest average initial yield of 1.32 t/d. The quantitative evaluation of the comprehensive parameters matched the actual production capacity results, with a matching rate of 93.3% (14 of 15 wells). Compared to conventional models by other scholars for tight sandstone reservoirs, this model establishes a direct connection between actual oilfield production data, microscale pore shape parameters, and pore structure parameters, leading to quantitative reservoir classification evaluation 58 , 59 , 60 . It demonstrates higher and more stable classification accuracy.

figure 10

Comparative analysis of the integrated quantitative classification prediction for the Chang 8 reservoir.

Conclusions

Tight sandstone reservoirs display significant heterogeneity and intricate microscopic pore structures, which impact the accuracy of reservoir assessment. This study used scanning electron microscopy, thin-section analysis, and high-pressure mercury intrusion data. It applied image recognition technology and machine learning methods to develop a novel classification and evaluation system for tight sandstone reservoirs based on microscopic pore structures. This method requires minimal experimental data, is cost-effective, demonstrates relatively high model accuracy, and is particularly suitable for tight sandstone reservoirs. The research conclusions are as follows:

By analyzing high-pressure mercury intrusion parameters, scanning electron microscopy images, and thin sections of the study area in the Chang 8 reservoir, a comprehensive quantitative classification prediction model for tight sandstone reservoirs was established. The model was constructed using twelve sensitive parameters: porosity, permeability, sorting coefficient, coefficient of variation, median pressure, median radius, displacement pressure, pore perimeter, circularity, major axis length, aspect ratio, and solidity, the last five of which were extracted using image recognition technology.

The case study based on the comprehensive quantitative classification prediction model showed that Q1 ≥ 0.5 and Q2 ≥ 0.5 corresponded to type I reservoirs, while Q1 > 0.7 and Q2 > 0.57 corresponded to type I 1 high-yield reservoirs. When 0.32 < Q1 < 0.47 and 0.44 < Q2 < 0.56, a type II reservoir was identified. When 0.1 < Q1 < 0.32 and 0.3 < Q2 < 0.44, a type III reservoir was identified. Additionally, the presence of high-content dissolution pores, intergranular pores, and larger pore perimeters, as well as higher pore circularity, were the main factors contributing to high-yield reservoirs (type I 1 ). The model was validated, achieving an overall accuracy of 93.3%, which indicates its effectiveness in predicting the classification and evaluation of tight reservoirs.

Reservoir quality is influenced by the pore structure characteristics and shape parameters. In tight sandstone reservoirs, the productivity is positively correlated with the porosity, permeability, sorting coefficient, coefficient of variation, and median radius, but negatively correlated with the median pressure and displacement pressure. The perimeter, circularity, and major axis length of the pores are positively correlated with the porosity, permeability, sorting coefficient, coefficient of variation, and median radius, but negatively correlated with the median pressure and displacement pressure.

Type I reservoir wells were primarily found in the northwest of the research region, within the subaqueous distributary channels, exhibiting a zigzag pattern. The majority of type II reservoir wells were located in the study area's center, mostly within the middle portions of the subaqueous distributary channel’s lateral sand bodies. In contrast, the relatively inferior type III reservoir wells were dispersed among the type II reservoirs, primarily situated in the marginal zones bordering the interdistributary bay and the periphery of the channel’s lateral sand bodies. Therefore, in terms of practical exploration and development, priority should be given to the superior reservoirs (type I) in the northwestern sector of the research region.

The evaluation results of the quantitative classification of tight sandstone reservoirs using machine learning are generally consistent with previous multiparameter conventional evaluation studies. However, this approach effectively integrates macroscopic and microscopic parameters, resulting in higher model accuracy, easier operation, and lower costs. It is particularly suitable for large-scale quality assessments of tight sandstone reservoirs, offering essential guidance for further exploration in the study area and other similar reservoirs.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Abbreviations

Analytic hierarchy process

Grey relational analysis

Entropy weight method

X-ray diffraction

Scanning electron microscopy

Fine-grained lithic feldspar sandstone

Fine-grained feldspar lithic sandstone

Fine-grained feldspar sandstone

Grey correlation analysis, clustering ensemble, and the Kriging model combined with macro geological parameters

Acknowledgements

This research was sponsored by Natural Science Basic Research Plan in Shaanxi Province of China (Grant No. 2017JM4013; Grant No. 2020JQ-798).

Author information

Authors and affiliations.

State Key Laboratory of Continental Dynamics, Northwest University, Xi’an, 710069, China

Xinglei Song, Congjun Feng, Xinhui Pan & Yanlong Ge

Department of Geology, Northwest University, No. 229, Taibai North Road, Xi’an, 710069, Shaanxi, China

School of Petroleum Engineering, Xi’an Shiyou University, Xi’an, 710065, China

Engineering Research Center of Development and Management for Low to Ultra-Low Permeability Oil & Gas Reservoirs in West China, Ministry of Education, Xi’an, 710065, China

Xi’an Key Laboratory of Tight Oil (Shale Oil) Development, Xi’an, 710065, China

PetroChina Research Institute of Petroleum Exploration & Development, Beijing, 100083, People’s Republic of China

School of Petroleum Engineering and Environmental Engineering, Yan’an University, Yan’an, 716000, China

Contributions

Xinglei Song: Investigation, Formal analysis, Conceptualization, Data Curation, Writing-Original Draft; Congjun Feng: Writing-Review & Editing, Supervision, Funding acquisition, Methodology; Teng Li: Investigation, Resources, Data Curation, Writing-Review & Editing; Qin Zhang: Investigation, Resources, Data Curation; Xinhui Pan: Supervision, Project administration; Mengsi Sun: Supervision, Writing-Review & Editing, Project administration; Yanlong Ge: Investigation, Resources, Data Curation. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Congjun Feng.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Song, X., Feng, C., Li, T. et al. Quantitative classification evaluation model for tight sandstone reservoirs based on machine learning. Sci Rep 14 , 20712 (2024). https://doi.org/10.1038/s41598-024-71351-0

Received : 22 April 2024

Accepted : 27 August 2024

Published : 05 September 2024

DOI : https://doi.org/10.1038/s41598-024-71351-0

  • Tight sandstone
  • Pore structure
  • Quantitative evaluation
  • High-pressure mercury injection
  • Image recognition
  • Machine learning

Exploring Female Narratives of Sexual Intimacy and the Social Suppression of Desire

  • Published: 04 September 2024

  • Shivani Rajan (ORCID: orcid.org/0009-0008-3470-9241)
  • Swati Pathak (ORCID: orcid.org/0000-0002-1189-1457)

Exploring the construction of sexual identities by women, this research attempts to provide an experiential understanding of sexual intimacy in young adulthood through critical narrative analysis of the accounts of ten unmarried cis-gender women, located in the postmodern feminist paradigm and drawing from contemporary psychoanalytic tradition. The study highlighted the lack of discourse on female desire and pleasure (that is not fetishised, penalised, or ostracised) in the hetero-patriarchal socio-cultural fabric of India and how it manifests in the sense of shame, guilt, and self-doubt in the navigation of sexual intimacy. The social matrix, including the influence of family and partner dynamics and cultural and generational differences, was observed to play a prominent role in the evolution of individual perceptions of sexual intimacy. Analysing the narratives through a feminist lens foregrounded the predominance of male satisfaction and pleasure, the sense of obligation towards male partners, the infringement of boundaries and compromise, and the performativity in sexual experiences, thus calling attention to the female struggle of realising and practising sexual agency. The research indicates the need to critically examine the pervasive phallocentrism in the experience of sexual intimacy and the marginalisation of female sexual desire due to the suppression of female sexuality in the patriarchal hierarchy of power distribution.

Data Availability

No datasets were generated or analysed during the current study.

No funding was received for conducting this study.

Author information

Authors and affiliations.

Department of Psychology, CHRIST University, Delhi NCR, India

Shivani Rajan & Swati Pathak

Contributions

S.R. drafted the research idea, conducted the interviews with the participants, and prepared the initial draft of the manuscript. S.P., the research supervisor, directly monitored the research and reviewed the progress of changes in the manuscript.

Corresponding author

Correspondence to Swati Pathak.

Ethics declarations

Ethics approval and consent to participate.

The study has been approved by the Research Conduct Ethics Committee (RCEC) at CHRIST University. Informed consent was obtained from all individual participants included in the study.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Rajan, S., Pathak, S. Exploring Female Narratives of Sexual Intimacy and the Social Suppression of Desire. Hu Arenas (2024). https://doi.org/10.1007/s42087-024-00441-2

Received : 02 February 2024

Revised : 20 August 2024

Accepted : 21 August 2024

Published : 04 September 2024

DOI : https://doi.org/10.1007/s42087-024-00441-2

  • Female sexuality
  • Sexual intimacy

COMMENTS

  1. The Levels of Evidence and their role in Evidence-Based Medicine

    Abstract: As the name suggests, evidence-based medicine (EBM) is about finding evidence and using that evidence to make clinical decisions. A cornerstone of EBM is the hierarchical system of classifying evidence. This hierarchy is known as the levels of evidence. Physicians are encouraged to find the highest level of evidence to answer clinical questions. Several papers published in Plastic ...

  2. Levels of Evidence Pyramid

    Introduction. One way to organize the different types of evidence involved in evidence-based practice research is the levels of evidence pyramid. The pyramid includes a variety of evidence types and levels. Filtered resources: pre-evaluated in some way. Unfiltered resources: typically original research and first-person accounts.

  3. Hierarchy of evidence

    A hierarchy of evidence, comprising levels of evidence (LOEs), that is, evidence levels (ELs), is a heuristic used to rank the relative strength of results obtained from experimental research, especially medical research. There is broad agreement on the relative strength of large-scale, epidemiological studies.

  4. Hierarchy of Evidence and Study Design

    Individuals split by disease. Survey Study is an epidemiologic study that produces survey results, and will consist of simultaneous assessments of the health outcome, primary risk exposure and potential confounders and effect modifiers. Two types of survey research are cross-sectional and longitudinal studies.

  5. New evidence pyramid

    The first and earliest principle of evidence-based medicine indicated that a hierarchy of evidence exists. Not all evidence is the same. This principle became well known in the early 1990s as practising physicians learnt basic clinical epidemiology skills and started to appraise and apply evidence to their practice. Since evidence was described as a hierarchy, a compelling rationale for a ...

  6. Research Hub: Evidence Based Practice Toolkit: Levels of Evidence

    Levels of Evidence / Evidence Hierarchy Levels of evidence (sometimes called hierarchy of evidence) are assigned to studies based on the research design, quality of the study, and applicability to patient care. Higher levels of evidence have less risk of bias.

  7. Evidence Based Medicine: The Evidence Hierarchy

    The Evidence Hierarchy: The hierarchy of evidence is a core principle of EBM. EBM hierarchies rank study types based on the strength and precision of their research methods. Different hierarchies exist for different question types, and even experts may disagree on the exact rank of information in the evidence hierarchies. Still, most agree that current, well designed systematic reviews and meta ...

  8. Research Guides: Systematic Reviews: Levels of Evidence

    The hierarchy of evidence (also known as the evidence-based pyramid) is depicted as a triangular representation of the levels of evidence with the strongest evidence at the top which progresses down through evidence with decreasing strength. At the top of the pyramid are research syntheses, such as Meta-Analyses and Systematic Reviews, the ...

  9. The hierarchy of evidence: Levels and grades of recommendation

    Thus the type of study design used places the study broadly into a hierarchy of evidence from the case series up to the randomized controlled trial. There is also, however, an internal hierarchy within the overall levels of evidence and that is usually based on the study methodology and overall quality.

  10. Evidence Hierarchy: What is the Best Evidence?

    The evidence hierarchy pyramid is a visual representation of the strength of different research study designs. It can be helpful to think about evidence as a pyramid - not all study designs and resource types are created equal.

  11. Hierarchy of evidence: a framework for ranking evidence evaluating

    This paper reports the development of a hierarchy for ranking of evidence evaluating healthcare interventions. The aims of this hierarchy are twofold. Firstly, it is to provide a means by which the evidence from a range of methodologically different types of research can be graded. Secondly, it is to provide a logical framework that can be used during the development of systematic review ...

  12. Library: Levels of Evidence and Study Design: Levels of Evidence

    The pyramid below is a hierarchy of evidence for quantitative studies. It shows the hierarchy of studies by study design; starting with secondary and reappraised studies, then primary studies, and finally reports and opinions, which have no study design.

  13. Hierarchy of research evidence

    The hierarchy of research evidence is a framework used to categorise different types of research studies based on their methodological rigor, validity, and potential for bias. It helps researchers and healthcare professionals determine the strength of evidence when making clinical decisions or recommendations. The hierarchy generally progresses ...

  14. Levels of evidence in research

    There are different levels of evidence in research. Here you can read more about the evidence hierarchy and how important it is to follow it.

  15. Hierarchy of Evidence Within the Medical Literature

    The quality of evidence from medical research is partially deemed by the hierarchy of study designs. On the lowest level, the hierarchy of study designs begins with animal and translational studies and expert opinion, and then ascends to descriptive case reports or case series, followed by analytic …

  16. Hierarchy of Evidence

    EBP hierarchies rank study types based on the rigour (strength and precision) of their research methods. Different hierarchies exist for different question types, and even experts may disagree on the exact rank of information in the evidence hierarchies. The following image represents the hierarchy of evidence provided by the National Health and Medical Research Council (NHMRC).

  17. Hierarchy of evidence: a framework for ranking evidence ...

    The proposed hierarchy was developed based on a review of literature, investigation of existing hierarchies and examination of the strengths and limitations of different research methods. The proposed hierarchy of evidence focuses on three dimensions of the evaluation: effectiveness, appropriateness and feasibility.

  18. LEVELS OF EVIDENCE IN MEDICINE

    Levels of evidence allow clinicians to appreciate the quality of a particular research paper quickly. The levels are generally set out in a hierarchical order, which is based largely upon the experimental design. While there are ideal designs for studies examining the effects of interventions, risk factors for a clinical condition or diagnostic ...

  19. Randomized, Controlled Trials, Observational Studies, and the Hierarchy

    In the hierarchy of research designs, the results of randomized, controlled trials are considered to be evidence of the highest grade, whereas observational studies are viewed as having less validi...

  20. LSBU Library: Literature Searching: Hierarchy of Evidence

    Hierarchy of Evidence. The "best available evidence" in the hierarchy of evidence refers to: Systematic Reviews and Meta-Analyses. Well-conducted systematic reviews and meta-analyses are generally considered to provide the highest level of evidence for all types of questions in evidence-based practice (EBP) hierarchies. This is because:

  21. The hierarchy of research evidence

    The hierarchy of research evidence, from well-conducted meta-analysis down to small case series. Evidence-based medicine has been described as 'the conscientious, explicit and judicious use of current best evidence in making decisions about the care of ...

  22. A nurses' guide to the hierarchy of research designs and evidence

    The nursing research pyramid, or nursing research hierarchy of evidence, provides a visual and systematic depiction of forms of research from the least reliable (base) to the most reliable (apex).

  23. Levels of Evidence, Quality Assessment, and Risk of Bias: Evaluating

    Often, that estimate is derived from studies with the same study design or a narrow range of study designs from high levels in the evidence hierarchy for the research question type.

  24. Quantitative classification evaluation model for tight sandstone

    The research results showed that when Q1 ≥ 0.5 and Q2 ≥ 0.5, the reservoir was classified as type I. ... This study quantitatively constructed a new classification and evaluation system for ...

  25. Exploring Female Narratives of Sexual Intimacy and the Social

    Exploring the construction of sexual identities by women, this research attempts to provide an experiential understanding of sexual intimacy in young adulthood through critical narrative analysis of the accounts of ten unmarried cis-gender women, located in the postmodern feminist paradigm and drawing from contemporary psychoanalytic tradition. The study highlighted the lack of discourse on ...