search terms for literature review

Systematic Reviews

Designing Your Search Strategy

A well-designed search strategy is essential to the success of your systematic review. Your strategy should be specific, unbiased, and reproducible, and it will typically include subject headings along with a range of keywords and phrases for each of your concepts.

Your searches should be designed to capture as many studies as possible that meet your criteria.

Chapter 4 of the Cochrane Handbook for Systematic Reviews of Interventions provides detailed guidance for searching and study selection; see Supplement 3.8 Adapting search strategies across databases / sources for translating your search across databases.

Systematic Reviews: Constructing a Search Strategy and Searching for Evidence from the Joanna Briggs Institute provides step-by-step guidance using PubMed as an example database.

General Steps:

  • Locate previous or relevant searches
  • Identify your databases
  • Develop your search terms and design search
  • Evaluate and modify your search
  • Document your search (PRISMA-S Checklist)
  • Translate your search for other databases
  • Step by Step Systematic Review Search Checklist from MD Anderson Center Library
  • PRESS Peer Review Checklist for Search Strategies

Conduct a preliminary set of scoping searches in various databases to test out your search terms (keywords and subject headings) and locate additional terms for your concepts.

Try building a "gold set" of relevant references to help you identify search terms. Sources for this gold set may include:

  • Recommended key papers
  • Papers by known authors in the field
  • Results of preliminary searches from key databases
  • Reviewing references and "cited by" articles lists for key papers
  • Articles that have been published in authoritative journals 
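
A gold set can also be used to check a draft search: if a key paper is indexed in the database but your search does not retrieve it, a term is probably missing. The sketch below is a minimal illustration of that check; the function name `gold_set_recall` and the record IDs are hypothetical, not taken from any guide or database.

```python
def gold_set_recall(gold_ids, retrieved_ids):
    """Return (recall, missed) for a search checked against a gold set."""
    gold = set(gold_ids)
    if not gold:
        raise ValueError("gold set is empty")
    found = gold & set(retrieved_ids)
    missed = sorted(gold - found)
    return len(found) / len(gold), missed


# Hypothetical record IDs, for illustration only.
gold = ["id-001", "id-002", "id-003", "id-004"]
retrieved = ["id-002", "id-003", "id-004", "id-900"]

recall, missed = gold_set_recall(gold, retrieved)
print(f"recall = {recall:.2f}; missed = {missed}")
```

Looking at the titles, abstracts, and subject headings of the missed records is usually the quickest way to find the terms your search lacks.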

Hedges/Filters

  • PubMed Special Queries

Hedges are search strings created by experts to help you retrieve specific types of studies or topics; a hedge will filter your results by adding specific search terms, or specific combinations of search terms, to your search.  

Hedges can be good starting points but you may need to modify the search string to fit your research. Resources for hedges:

  • University of Texas, School of Public Health (study type)
  • McMaster University Health Information Research Unit
  • The InterTASC Information Specialists' Sub-Group Search Filter Resource
  • PubMed Search Strategies blog
  • PubMed Special Queries: Topic-Specific PubMed Queries; includes keyword and search strategy examples.
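
In practice a hedge is combined with your own topic search using AND. The sketch below shows one way to do that safely; `apply_hedge` is a hypothetical helper, and the two-term hedge is a placeholder fragment for illustration, not a validated filter.

```python
def apply_hedge(topic_query, hedge):
    """AND a topic search with a hedge, parenthesizing both parts."""
    return f"({topic_query}) AND ({hedge})"


# Placeholder hedge: a short fragment for illustration, not a validated filter.
example_hedge = "health disparities[TIAB] OR healthcare disparities[MH]"
topic = '"mass screening"[MH] OR screen*[TIAB]'

print(apply_hedge(topic, example_hedge))
```

Wrapping each part in parentheses matters because both parts contain OR; without them, the database may group the terms differently from what you intended.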

Example:  Health Disparities & Minority Health Search Strategies

((ethnic disparities[TIAB] OR
ethnic disparity[TIAB] OR
health disparities[TIAB] OR
health disparity[TIAB] OR
healthcare disparities[MH] OR
health care disparities[TIAB] OR
healthcare disparities[TIAB] OR
health-care disparities[TIAB] OR
health care disparity[TIAB] OR
healthcare disparity[TIAB] OR
health-care disparity[TIAB] OR
health status disparities[MH] OR
(disparities in health[TIAB]) OR
J Health Care Poor Underserved[Journal] OR
J Health Dispar Res Pract[Journal] OR
J Racial Ethn Health Disparities[Journal]) OR
(culturally competent care[MH] OR
culturally competent care[TIAB] OR
delivery of health care[MH:noexp] OR
disparities[TIAB] OR
health behavior[MH] OR
health behavior[TIAB] OR
health behaviors[TIAB] OR
health inequality[TIAB] OR
health inequalities[TIAB] OR
health inequities[TIAB] OR
health inequity[TIAB] OR
health knowledge, attitudes, practice[MH] OR
health related quality of life[TIAB] OR
health-related quality of life[TIAB] OR
health services accessibility[MH] OR
health services, indigenous[MH] OR
health services needs and demand[MH] OR
health status disparities[MH] OR
mass screening[MH] OR
mass screening[TIAB] OR
mass screenings[TIAB] OR
patient acceptance of health care[MH] OR
patient selection[MH] OR
quality of health care[MeSH Major Topic:noexp] OR
social class[MH] OR
social class[TIAB] OR
social determinants of health[MH] OR
social determinants of health[TIAB] OR
social disparities[TIAB] OR
social disparity[TIAB] OR
social factors[TIAB] OR
social inequities[TIAB] OR
social inequity[TIAB] OR
socioeconomic factor[TIAB] OR
socioeconomic factors[MH] OR
socioeconomic factors[TIAB] OR
socioeconomically disadvantaged[TIAB]) AND
(African American[TIAB] OR
African Americans[TIAB] OR
African ancestry[TIAB] OR
african continental ancestry group[MH] OR
ageism[MH] OR
AIAN[TIAB] OR
Alaska Native[TIAB] OR
Alaska Natives[TIAB] OR
american native continental ancestry group[MH] OR
apartheid[MH] OR
Asian[TIAB] OR
asian continental ancestry group[MH] OR
Asians[TIAB] OR
Black American[TIAB] OR
Black Americans[TIAB] OR
Caucasian[TIAB] OR
Caucasians[TIAB] OR
disabled[TIAB] OR
disabled persons[MH] OR
disabled persons[TIAB] OR
diverse population[TIAB] OR
diverse populations[TIAB] OR
emigrants and immigrants[MH] OR
ethnic group[TIAB] OR
ethnic groups[MH] OR
ethnic groups[TIAB] OR
ethnic inequalities[TIAB] OR
ethnic population[TIAB] OR
ethnic populations[TIAB] OR
ghetto[TIAB] OR
ghettos[TIAB] OR
health services for persons with disabilities[MH] OR
Hispanic[TIAB] OR
hispanic americans[MH] OR
Hispanics[TIAB] OR
homeless[TIAB] OR
homeless persons[MH] OR
immigrant[TIAB] OR
immigrants[TIAB] OR
Indian[TIAB] OR
Indians[TIAB] OR
indians, north american[MH] OR
inmate[TIAB] OR
inmates[TIAB] OR
jail[TIAB] OR
jail population[TIAB] OR
jail populations[TIAB] OR
Latina[TIAB] OR
Latinas[TIAB] OR
Latino[TIAB] OR
Latinos[TIAB] OR
mexican americans[MH] OR
medically underserved area[MH] OR
medically uninsured[MH] OR
minorities' health[TIAB] OR
minority group[TIAB] OR
minority groups[MH] OR
minority groups[TIAB] OR
minority health[MH] OR
minority health[TIAB] OR
minority population[TIAB] OR
minority populations[TIAB] OR
migrant worker[TIAB] OR
migrant workers[TIAB] OR
Native American[TIAB] OR
Native Americans[TIAB] OR
Native Hawaiian[TIAB] OR
Native Hawaiians[TIAB] OR
oceanic ancestry group[MH] OR
Pacific Islander[TIAB] OR
Pacific Islanders[TIAB] OR
people of color[TIAB] OR
poverty[MH] OR
poverty[TIAB] OR
poverty areas[MH] OR
poverty area[TIAB] OR
poverty areas[TIAB] OR
prisoner[TIAB] OR
prisoners[MH] OR
prisoners[TIAB] OR
race factors[MH] OR
race factors[TIAB] OR
race and ethnicity[TIAB] OR
racial and ethnic minorities[TIAB] OR
racial discrimination[TIAB] OR
racial disparities[TIAB] OR
racial disparity[TIAB] OR
racial equality[TIAB] OR
racial equity[TIAB] OR
racial inequities[TIAB] OR
racial inequity[TIAB] OR
racial prejudice[TIAB] OR
racial segregation[TIAB] OR
racism[MH] OR
refugees[MH] OR
refugees[TIAB] OR
rural health[MH] OR
rural health[TIAB] OR
rural health services[MH] OR
rural population[MH] OR
rural population[TIAB] OR
rural populations[TIAB] OR
sexism[MH] OR
slum[TIAB] OR
slums[TIAB] OR
social discrimination[MH] OR
social marginalization[MH] OR
social segregation[MH] OR
transients and migrants[MH] OR
underserved[TIAB] OR
undocumented immigrants[MH] OR
medically uninsured[MH] OR
uninsured[TIAB] OR
urban health[MH] OR
urban health services[MH] OR
urban population[MH] OR
urban population[TIAB] OR
urban populations[TIAB] OR
vulnerable population[TIAB] OR
vulnerable populations[MH] OR
vulnerable populations[TIAB] OR
working poor[MH] OR
working poor[TIAB] OR
bisexuals[TIAB] OR
bisexual[TIAB] OR
bigender[TIAB] OR
disorders of sex development[MH] OR
disorders of sex development[TIAB] OR
female homosexuality[TIAB] OR
gay[TIAB] OR
gays[TIAB] OR
gender change[TIAB] OR
gender confirmation[TIAB] OR
gender disorder[TIAB] OR
gender disorders[TIAB] OR
gender dysphoria[TIAB] OR
gender diverse[TIAB] OR
gender-diverse[TIAB] OR
gender diversity[TIAB] OR
gender identity[MH] OR
gender identity[TIAB] OR
gender minorities[TIAB] OR
gender non conforming[TIAB] OR
gender non-conforming[TIAB] OR
gender orientation[TIAB] OR
genderqueer[TIAB] OR
gender reassignment[TIAB] OR
gender surgery[TIAB] OR
GLBT[TIAB] OR
GLBTQ[TIAB] OR
health services for transgender persons[MH] OR
homophile[TIAB] OR
homophilia[TIAB] OR
homosexual[TIAB] OR
homosexuality[MH] OR
homosexuality, female[MH] OR
homosexuality, male[MH] OR
homosexuals[TIAB] OR
intersex[TIAB] OR
lesbian[TIAB] OR
lesbianism[TIAB] OR
lesbians[TIAB] OR
LGBBTQ[TIAB] OR
LGBT[TIAB] OR
LGBTI[TIAB] OR
LGBTQ[TIAB] OR
LGBTQI[TIAB] OR
LGBTQIA[TIAB] OR
men having sex with men[TIAB] OR
men who have sex with men[TIAB] OR
men who have sex with other men[TIAB] OR
nonheterosexual[TIAB] OR
non-heterosexual[TIAB] OR
non heterosexuals[TIAB] OR
nonheterosexuals[TIAB] OR
pansexual[TIAB] OR
polysexual[TIAB] OR
queer[All Fields] OR
same sex[TIAB] OR
sexual and gender disorders[MH] OR
sexual and gender minorities[MH] OR
sex change[TIAB] OR
sex reassignment[TIAB] OR
sex reassignment procedures[MH] OR
sex reassignment surgery[MH] OR
sex reassignment surgery[TIAB] OR
sexual diversity[TIAB] OR
sexual minorities[TIAB] OR
sexual minority[TIAB] OR
sexual orientation[TIAB] OR
transgender*[TIAB] OR
transgender persons[MH] OR
transsexual*[TIAB] OR
transman[TIAB] OR
trans men[TIAB] OR
transmen[TIAB] OR
transsexualism[MH] OR
transsexualism[TIAB] OR
transwoman[TIAB] OR
trans women[TIAB] OR
transwomen[TIAB] OR
two spirit[TIAB] OR
two-spirit[TIAB] OR
women who have sex with women[TIAB]))
  • Subject Headings
  • Keywords Vs. Subject Headings
  • Locating Subject Headings
  • Medical Subject Headings (MeSH)
  • Keyword & Subject Headings Logic Grid

You can use your PICOTS concepts as preliminary search terms. The important terms in this question:

In adults, is screening for depression and feedback of results to providers more effective than no screening and feedback in improving outcomes of major depression in primary care settings?

...might include:

Major depression

Primary Care

(From Lackey, M. (2013). Systematic reviews: Searching the literature [PowerPoint slides]. Retrieved from http://guides.lib.unc.edu/ld.php?content_id=258919 )

Your search will include both keywords and subject headings. Controlled vocabulary systems, such as the Medical Subject Headings (MeSH) or Library of Congress Subject Headings (LCSH), use pre-set terms that are used to tag resources on similar subjects. See boxes below for more information on finding and using subject terms.

Not all databases will have subject heading searching and for those that do, the subject heading categories may differ between databases. This is because databases classify articles using different criteria.

Using the keywords from our example, here are some MeSH terms for:

Adults: Adult (A person having attained full growth or maturity. Adults are of 19 through 44 years of age. For a person between 19 and 24 years of age, YOUNG ADULT is available.)

Screening: Mass Screening (Organized periodic procedures performed on large groups of people for the purpose of detecting disease.)

Major depression: Depressive Disorder, Major (Marked depression appearing in the involution period and characterized by hallucinations, delusions, paranoia, and agitation.)
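
A common pattern is to build one block per concept that ORs the subject heading with its keyword equivalents, then AND the blocks together. The sketch below assumes PubMed field tags ([MH] for MeSH, [TIAB] for title/abstract); `concept_block` is a hypothetical helper and the keyword choices are illustrative only.

```python
def concept_block(mesh_terms, keywords):
    """OR together MeSH headings ([MH]) and title/abstract keywords ([TIAB])."""
    parts = [f'"{term}"[MH]' for term in mesh_terms]
    for kw in keywords:
        # Quote multi-word keywords so they are searched as phrases.
        parts.append(f'"{kw}"[TIAB]' if " " in kw else f"{kw}[TIAB]")
    return "(" + " OR ".join(parts) + ")"


depression = concept_block(["Depressive Disorder, Major"], ["major depression"])
screening = concept_block(["Mass Screening"], ["screen*"])

print(f"{screening} AND {depression}")
```

Searching the heading and the keywords together helps catch both indexed records and records that have not yet been assigned subject headings.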

Here is a LCSH subject term for:

Depression: Depression, mental (Dejection; Depression, Unipolar; Depressive disorder; Depressive psychoses; Melancholia; Mental depression; Unipolar depression)

keywords vs subjects chart

  • Most EBSCO databases have a tool to help you discover subject terms. See Academic Search Complete > Subject Terms and Academic Search Complete > Subject Terms: Thesaurus
  • Most ProQuest databases have a tool to help you discover subject terms. See PsycInfo > Thesaurus
  • When you find a useful article, look at the article's Subject Headings (or Subject or Subject Terms), and record them as possible terms to use in a subject term search.

Here is an example of the subject terms listed for a systematic review found in PsycINFO, "Primary care screening for and treatment of depression in pregnant and postpartum women: Evidence report and systematic review for the US Preventive Services Task Force" (2016).

MeSH are standardized terms that describe the main concepts of PubMed/MEDLINE articles. Searching with MeSH can increase the precision of your search by providing a consistent way to retrieve articles that may use different terminology or spelling variations.

Note: new articles will not have MeSH terms; the indexing process may take up to a few weeks for newly ingested articles. 

Use the MeSH database to locate and build a search using MeSH.

To search the MeSH database:

  • Search for 1 concept at a time.
  • If you do not see a relevant MeSH in the results, search again with a synonym or related term.
  • Click on the MeSH term to view the complete record, subheadings, and broader and narrower terms.

Build a search from the results list or from the MeSH term record to specify subheadings.

  • Select the box next to the MeSH term or subheadings that you wish to search and click Add to Search Builder.
  • You may need to switch AND to OR, depending on how you would like to combine terms.
  • Repeat the above steps to add additional MeSH terms. When your search is ready, click Search PubMed.

Logic Grid with Keywords and Index Terms or Subject Headings from Systematic Reviews: Constructing a Search Strategy and Searching for Evidence.

Bhuiyan, M. U., Stiboy, E., Hassan, M. Z., Chan, M., Islam, M. S., Haider, N., Jaffe, A., & Homaira, N. (2021). Epidemiology of COVID-19 infection in young children under five years: A systematic review and meta-analysis. Vaccine, 39(4), 667–677. https://doi.org/10.1016/j.vaccine.2020.11.078

Scopus

#1 ( TITLE-ABS-KEY ( "2019 nCoV" OR 2019ncov OR "2019-nCoV" OR "2019 novel coronavirus" OR "Novel coronavirus 2019" OR "COVID 19" OR "COVID-19" OR "COVID19" OR "Wuhan coronavirus" OR "Wuhan pneumonia" OR "SARS CoV-2" OR "SARS-Cov-2" ) AND TITLE-ABS-KEY ( children OR child* OR infant OR pediatric OR paediatric OR adolescent ) )

Web of Science

#1 TS=("2019 nCoV") OR TS=(2019ncov) OR TS=("2019-nCoV") OR TS=("2019 novel coronavirus") OR TS=("Novel coronavirus 2019") OR TS=("COVID 19") OR TS=("COVID-19") OR TS=(COVID19) OR TS=("Wuhan coronavirus") OR TS=("Wuhan pneumonia") OR TS=("SARS CoV-2") OR TS=("SARS-Cov-2")
Indexes=SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH, BKCI-S, BKCI-SSH, ESCI, CCR-EXPANDED, IC Timespan=All years

#2 TS=(infant) OR TS=(child) OR TS=(children) OR TS=(adolescent) OR TS=(paediatric) OR TS=(pediatric)
Indexes=SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH, BKCI-S, BKCI-SSH, ESCI, CCR-EXPANDED, IC Timespan=All years

#3 #2 AND #1
Indexes=SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH, BKCI-S, BKCI-SSH, ESCI, CCR-EXPANDED, IC Timespan=All years

PubMed

(((((((((((("2019 nCoV"[Title/Abstract] OR "2019ncov"[Title/Abstract]) OR "2019-nCoV"[Title/Abstract]) OR "2019 novel coronavirus"[Title/Abstract]) OR "Novel coronavirus 2019"[Title/Abstract]) OR "COVID 19"[Title/Abstract]) OR "COVID-19"[Title/Abstract]) OR "COVID19"[Title/Abstract]) OR "Wuhan coronavirus"[Title/Abstract]) OR "SARS CoV-2"[Title/Abstract]) OR "SARS-Cov-2"[Title/Abstract]) AND (((((((((((infant[Title/Abstract] OR "infant"[MeSH Terms]) OR child[Title/Abstract]) OR "child"[MeSH Terms]) OR children[Title/Abstract]) OR "child"[MeSH Terms]) OR adolescent[Title/Abstract]) OR "adolescent"[MeSH Terms]) OR paediatric[Title/Abstract]) OR "pediatrics"[MeSH Terms]) OR pediatric[Title/Abstract]) OR "pediatrics"[MeSH Terms])) AND "humans"[MeSH Terms]) AND (((((((((((("2019 nCoV"[Title/Abstract] OR "2019ncov"[Title/Abstract]) OR "2019-nCoV"[Title/Abstract]) OR "2019 novel coronavirus"[Title/Abstract]) OR "Novel coronavirus 2019"[Title/Abstract]) OR "COVID 19"[Title/Abstract]) OR "COVID-19"[Title/Abstract]) OR "COVID19"[Title/Abstract]) OR "Wuhan coronavirus"[Title/Abstract]) OR "SARS CoV-2"[Title/Abstract]) OR "SARS-Cov-2"[Title/Abstract]) AND (((((((((((infant[Title/Abstract] OR "infant"[MeSH Terms]) OR child[Title/Abstract]) OR "child"[MeSH Terms]) OR children[Title/Abstract]) OR "child"[MeSH Terms]) OR adolescent[Title/Abstract]) OR "adolescent"[MeSH Terms]) OR paediatric[Title/Abstract]) OR "pediatrics"[MeSH Terms]) OR pediatric[Title/Abstract]) OR "pediatrics"[MeSH Terms])) AND "humans"[MeSH Terms])
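
The strategies above express the same keyword list in three syntaxes: TITLE-ABS-KEY (Scopus), TS= (Web of Science), and [Title/Abstract] (PubMed). The sketch below shows how one term list could be rendered in each style. The function names are hypothetical, the term list is abbreviated for illustration, and only keyword fields are handled; subject headings such as MeSH still need to be mapped by hand for each database.

```python
# Abbreviated term list, for illustration only.
TERMS = ['"COVID-19"', '"SARS-CoV-2"', "COVID19"]


def to_scopus(terms):
    """Render terms as a Scopus title/abstract/keyword search."""
    return "TITLE-ABS-KEY ( " + " OR ".join(terms) + " )"


def to_web_of_science(terms):
    """Render terms as Web of Science topic (TS=) searches."""
    return " OR ".join(f"TS=({t})" for t in terms)


def to_pubmed(terms):
    """Render terms as PubMed title/abstract searches."""
    return "(" + " OR ".join(f"{t}[Title/Abstract]" for t in terms) + ")"


for render in (to_scopus, to_web_of_science, to_pubmed):
    print(render(TERMS))
```

Keeping a single master list of terms and generating each database's version from it makes it easier to keep the translations consistent when you revise the search.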

  • Boolean Logic: AND, OR, NOT
  • Phrase Searching " "
  • Truncation *
  • Proximity Searching

AND, OR, NOT

Join together search terms in a logical manner.

AND - narrows searches, used to join dissimilar terms

OR - broadens searches, used to join similar terms

NOT - removes results containing specified keywords
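
One way to picture the three operators is to treat each search term as the set of records that contain it. The sketch below uses Python sets for this; the record IDs are made up for illustration.

```python
# Hypothetical record IDs matching each term, for illustration only.
major_depression = {1, 2, 3, 4}
primary_care = {3, 4, 5}
suicide = {4, 6}

both = major_depression & primary_care    # AND: records matching both terms
either = major_depression | primary_care  # OR: records matching either term
without = major_depression - suicide      # NOT: drop records matching the second term

print(sorted(both), sorted(either), sorted(without))
```

AND can only shrink a result set and OR can only grow it, which is why similar terms are joined with OR and different concepts with AND.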

#1 "major depression" AND "primary care"

#2 screen* OR feedback

#3 (screen* OR feedback) AND "major depression" AND "primary care"

"major depression" NOT suicide

" " To search for specific phrases, enclose them in quotation marks. The database will search for those words together in that order.

"primary care"

"major depression"

Truncate a word in order to search for different forms of the same word. Many databases use the asterisk * as the truncation symbol.

Add the truncation symbol to the word screen* to search for screen, screens, screening, etc.

Be careful with truncation, because a short stem can retrieve words you did not intend. If you add the truncation symbol to the word minor*, the database will search for minor, minors, minority, minorities, etc.
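
To preview what a truncated term might pick up, you can test the stem against a list of candidate words. The sketch below models a trailing asterisk as "the stem followed by any word characters"; `truncation_matches` and the vocabulary list are hypothetical, and real databases may apply their own limits.

```python
import re


def truncation_matches(stem, words):
    """Return the words a trailing-asterisk search on `stem` would match."""
    pattern = re.compile(rf"{re.escape(stem)}\w*", re.IGNORECASE)
    return [w for w in words if pattern.fullmatch(w)]


# Illustrative vocabulary only.
vocabulary = ["screen", "screens", "screening", "screenplay",
              "minor", "minority", "minorities"]

print(truncation_matches("screen", vocabulary))
print(truncation_matches("minor", vocabulary))
```

Here screen* also matches "screenplay", which shows how truncation can bring in unrelated results.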

Not all databases support proximity searching. You can use these strategies in ProQuest databases such as Sociological Abstracts.

pre/# is used to search for terms in proximity to each other in a specific order; # is replaced with the number of words permitted between the search terms.

Sample Search: parent* pre/2 educational (within 2 words & in order)

  • This would retrieve articles with no more than two words between parent* and educational (in this order), e.g. "Parent practices and educational achievement" OR "Parents on Educational Attainment" OR "Parental Values, Educational Attainment" etc.

w/# is used to search for terms in proximity to each other in any order; # is replaced with the number of words permitted between the search terms.

Sample Search: parent* w/3 educational (within 3 words & in any order)

  • This would retrieve articles with no more than three words between parent* and educational (in any order), e.g. "Educational practices of parents" OR "Educational attainments of Latino parents"
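
The two proximity operators described above can be sketched as a small matcher over the words of a title. This is an illustration of the logic only, with hypothetical function names; each database tokenizes text and counts words in its own way.

```python
import re


def _positions(tokens, term):
    """Indexes of tokens matching term; a trailing * matches any word ending."""
    if term.endswith("*"):
        stem = term[:-1].lower()
        return [i for i, t in enumerate(tokens) if t.startswith(stem)]
    return [i for i, t in enumerate(tokens) if t == term.lower()]


def proximity_match(text, first, second, max_between, ordered):
    """True if the terms occur with at most max_between words between them."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for i in _positions(tokens, first):
        for j in _positions(tokens, second):
            # Ordered search requires `first` before `second`.
            gap = j - i - 1 if ordered else abs(j - i) - 1
            if 0 <= gap <= max_between:
                return True
    return False


title = "Educational practices of parents"
print(proximity_match(title, "parent*", "educational", 2, ordered=True))
print(proximity_match(title, "parent*", "educational", 3, ordered=False))
```

The first call models parent* pre/2 educational and the second models parent* w/3 educational, so the same title can fail the ordered search but satisfy the unordered one.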
  • Last Updated: Jun 18, 2024 4:31 PM
  • URL: https://libguides.ucmerced.edu/systematic-reviews

University of California, Merced

Literature Review

How to search effectively.

The Literature searching interactive tutorial includes self-paced, guided activities to assist you in developing effective search skills.

1. Identify search words

Analyse your research topic or question.

  • What are the main ideas?
  • What concepts or theories have you already covered?
  • Write down your main ideas, synonyms, related words and phrases.
  • If you're looking for specific types of research, use these suggested terms: qualitative, quantitative, methodology, review, survey, test, trend (and more).
  • Be aware of UK and US spelling variations. E.g. organisation OR organization, ageing OR aging.
  • Interactive Keyword Builder
  • Identifying effective keywords

2. Connect your search words

Find results with one or more search words.

Use OR between words that mean the same thing.

E.g.  adolescent  OR  teenager

This search will find results with either (or both) of the search words.

Find results with two search words

Use AND between words which represent the main ideas in the question.

E.g. adolescent AND “physical activity”

This will find results with both of the search words.

Exclude search words

Use NOT to exclude words that you don’t want in your search results.

E.g. (adolescent OR teenager) NOT “young adult”

3. Use search tricks

Search for different word endings.

Truncation *

The asterisk symbol * will help you search for different word endings.

E.g. teen* will find results with the words: teen, teens, teenager, teenagers

Specific truncation symbols will vary. Check the 'Help' section of the database you are searching.

Search for common phrases

Phrase searching “...........”

Double quotation marks help you search for common phrases and make your results more relevant.

E.g. “physical activity” will find results with the words physical activity together as a phrase.

Search for spelling variations within related terms

Wildcards ?

Wildcard symbols allow you to search for spelling variations within the same or related terms.

E.g. wom?n will find results with women OR woman

Specific wild card symbols will vary. Check the 'Help' section of the database you are searching.
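
A wildcard search can be previewed in the same way as truncation. The sketch below assumes that ? stands for exactly one character and * for any word ending; `wildcard_to_regex` is a hypothetical helper, and some databases interpret these symbols differently.

```python
import re


def wildcard_to_regex(term):
    """Compile a search term where ? is one character and * is any ending."""
    pieces = []
    for ch in term:
        if ch == "?":
            pieces.append(r"\w")
        elif ch == "*":
            pieces.append(r"\w*")
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces), re.IGNORECASE)


pattern = wildcard_to_regex("wom?n")
print([w for w in ["woman", "women", "womn", "womaen"] if pattern.fullmatch(w)])
```

The same helper covers UK and US spelling pairs that differ by one letter, for example organi?ation for organisation and organization.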

Search terms within specific ranges of each other

Proximity  w/#

Proximity searching allows you to specify where your search terms will appear in relation to each other.

E.g.  pain w/10 morphine will search for pain within ten words of morphine

Specific proximity symbols will vary. Check the 'Help' section of the database you are searching.

4. Improve your search results

All library databases are different and you can't always search and refine in the same way. Try to be consistent when transferring your search in the library databases you have chosen.

Narrow and refine your search results by:

  • year of publication or date range (for recent or historical research)
  • document or source type (e.g. article, review or book)
  • subject or keyword (for relevance). Try repeating your search using the 'subject' headings or 'keywords' field to focus your search
  • searching in particular fields, i.e. citation and abstract. Explore the available dropdown menus to change the fields to be searched.

When searching, remember to:

Adapt your search and keep trying.

Searching for information is a process and you won't always get it right the first time. Improve your results by changing your search and trying again until you're happy with what you have found.

Keep track of your searches

Keeping track of searches saves time as you can rerun them, store references, and set up regular alerts for new research relevant to your topic.

Most library databases allow you to register with a personal account. Look for a 'log in', 'sign in' or 'register' button to get started.

  • Literature review search tracker (Excel spreadsheet)
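
If you prefer to keep your own log, a plain CSV file that opens in Excel is enough. The sketch below is one possible layout; the field names are assumptions chosen for illustration and are not the columns of the tracker linked above.

```python
import csv
from dataclasses import asdict, dataclass, fields


@dataclass
class SearchRecord:
    run_date: str   # date the search was run, e.g. "2024-05-28"
    database: str   # database or platform searched
    query: str      # the exact search string used
    results: int    # number of results returned
    notes: str = ""


def append_record(path, record):
    """Append one search to a CSV log, writing a header row if the file is new."""
    names = [f.name for f in fields(SearchRecord)]
    try:
        # Mode "x" fails if the file already exists, so the header is written once.
        with open(path, "x", newline="", encoding="utf-8") as handle:
            csv.DictWriter(handle, fieldnames=names).writeheader()
    except FileExistsError:
        pass
    with open(path, "a", newline="", encoding="utf-8") as handle:
        csv.DictWriter(handle, fieldnames=names).writerow(asdict(record))
```

For example, append_record("search_log.csv", SearchRecord("2024-05-28", "Scopus", 'adolescent AND "physical activity"', 120)) adds one row; the file name and values here are placeholders. Recording the exact query string lets you rerun or report the search later.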

Manage your references

There are free and subscription reference management programs available on the web or to download on your computer.

  • EndNote - The University has a license for EndNote. It is available to all students and staff, although it is recommended for postgraduates and academic staff.
  • Zotero - Free software recommended for undergraduate students.
  • Last Updated: May 28, 2024 9:42 AM
  • URL: https://uow.libguides.com/literaturereview

On the lands that we study, we walk, and we live, we acknowledge and respect the traditional custodians and cultural knowledge holders of these lands.

How to Write a Literature Review | Guide, Examples, & Templates

Published on January 2, 2023 by Shona McCombes. Revised on September 11, 2023.

What is a literature review? A literature review is a survey of scholarly sources on a specific topic. It provides an overview of current knowledge, allowing you to identify relevant theories, methods, and gaps in the existing research that you can later apply to your paper, thesis, or dissertation topic.

There are five key steps to writing a literature review:

  • Search for relevant literature
  • Evaluate sources
  • Identify themes, debates, and gaps
  • Outline the structure
  • Write your literature review

A good literature review doesn’t just summarize sources: it analyzes, synthesizes, and critically evaluates to give a clear picture of the state of knowledge on the subject.

Table of contents

  • What is the purpose of a literature review
  • Examples of literature reviews
  • Step 1 – Search for relevant literature
  • Step 2 – Evaluate and select sources
  • Step 3 – Identify themes, debates, and gaps
  • Step 4 – Outline your literature review’s structure
  • Step 5 – Write your literature review
  • Free lecture slides
  • Other interesting articles
  • Frequently asked questions

What is the purpose of a literature review?

When you write a thesis, dissertation, or research paper, you will likely have to conduct a literature review to situate your research within existing knowledge. The literature review gives you a chance to:

  • Demonstrate your familiarity with the topic and its scholarly context
  • Develop a theoretical framework and methodology for your research
  • Position your work in relation to other researchers and theorists
  • Show how your research addresses a gap or contributes to a debate
  • Evaluate the current state of research and demonstrate your knowledge of the scholarly debates around your topic.

Writing literature reviews is a particularly important skill if you want to apply for graduate school or pursue a career in research. We’ve written a step-by-step guide that you can follow below.

Examples of literature reviews

Writing literature reviews can be quite challenging! A good starting point could be to look at some examples, depending on what kind of literature review you’d like to write.

  • Example literature review #1: “Why Do People Migrate? A Review of the Theoretical Literature” (Theoretical literature review about the development of economic migration theory from the 1950s to today.)
  • Example literature review #2: “Literature review as a research methodology: An overview and guidelines” (Methodological literature review about interdisciplinary knowledge acquisition and production.)
  • Example literature review #3: “The Use of Technology in English Language Learning: A Literature Review” (Thematic literature review about the effects of technology on language acquisition.)
  • Example literature review #4: “Learners’ Listening Comprehension Difficulties in English Language Learning: A Literature Review” (Chronological literature review about how the concept of listening skills has changed over time.)

You can also check out our templates with literature review examples and sample outlines at the links below.

Download Word doc Download Google doc

Step 1 – Search for relevant literature

Before you begin searching for literature, you need a clearly defined topic.

If you are writing the literature review section of a dissertation or research paper, you will search for literature related to your research problem and questions.

Make a list of keywords

Start by creating a list of keywords related to your research question. Include each of the key concepts or variables you’re interested in, and list any synonyms and related terms. You can add to this list as you discover new keywords in the process of your literature search.

  • Social media, Facebook, Instagram, Twitter, Snapchat, TikTok
  • Body image, self-perception, self-esteem, mental health
  • Generation Z, teenagers, adolescents, youth

Search for relevant sources

Use your keywords to begin searching for sources. Some useful databases to search for journals and articles include:

  • Your university’s library catalogue
  • Google Scholar
  • Project Muse (humanities and social sciences)
  • Medline (life sciences and biomedicine)
  • EconLit (economics)
  • Inspec (physics, engineering and computer science)

You can also use Boolean operators to help narrow down your search.
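
A keyword list like the example above maps directly onto a Boolean search: synonyms within one concept are joined with OR, and the concepts are joined with AND. The sketch below shows this with a hypothetical `build_query` helper and a shortened version of the example list; check each database's help pages for its exact syntax.

```python
def build_query(concepts):
    """OR the synonyms within each concept, then AND the concepts together."""
    blocks = []
    for synonyms in concepts:
        # Quote multi-word terms so they are searched as phrases.
        quoted = [f'"{s}"' if " " in s else s for s in synonyms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


concepts = [
    ["social media", "Instagram", "Snapchat"],
    ["body image", "self-esteem"],
    ["teenagers", "adolescents"],
]

print(build_query(concepts))
```

Adding a synonym to a concept broadens the search, while adding a whole new concept narrows it.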

Make sure to read the abstract to find out whether an article is relevant to your question. When you find a useful book or article, you can check the bibliography to find other relevant sources.

Step 2 – Evaluate and select sources

You likely won’t be able to read absolutely everything that has been written on your topic, so it will be necessary to evaluate which sources are most relevant to your research question.

For each publication, ask yourself:

  • What question or problem is the author addressing?
  • What are the key concepts and how are they defined?
  • What are the key theories, models, and methods?
  • Does the research use established frameworks or take an innovative approach?
  • What are the results and conclusions of the study?
  • How does the publication relate to other literature in the field? Does it confirm, add to, or challenge established knowledge?
  • What are the strengths and weaknesses of the research?

Make sure the sources you use are credible, and make sure you read any landmark studies and major theories in your field of research.

You can use our template to summarize and evaluate sources you’re thinking about using. Click on either button below to download.

Take notes and cite your sources

As you read, you should also begin the writing process. Take notes that you can later incorporate into the text of your literature review.

It is important to keep track of your sources with citations to avoid plagiarism. It can be helpful to make an annotated bibliography, where you compile full citation information and write a paragraph of summary and analysis for each source. This helps you remember what you read and saves time later in the process.

Step 3 – Identify themes, debates, and gaps

To begin organizing your literature review’s argument and structure, be sure you understand the connections and relationships between the sources you’ve read. Based on your reading and notes, you can look for:

  • Trends and patterns (in theory, method or results): do certain approaches become more or less popular over time?
  • Themes: what questions or concepts recur across the literature?
  • Debates, conflicts and contradictions: where do sources disagree?
  • Pivotal publications: are there any influential theories or studies that changed the direction of the field?
  • Gaps: what is missing from the literature? Are there weaknesses that need to be addressed?

This step will help you work out the structure of your literature review and (if applicable) show how your own research will contribute to existing knowledge.

  • Most research has focused on young women.
  • There is an increasing interest in the visual aspects of social media.
  • But there is still a lack of robust research on highly visual platforms like Instagram and Snapchat—this is a gap that you could address in your own research.

There are various approaches to organizing the body of a literature review. Depending on the length of your literature review, you can combine several of these strategies (for example, your overall structure might be thematic, but each theme is discussed chronologically).

Chronological

The simplest approach is to trace the development of the topic over time. However, if you choose this strategy, be careful to avoid simply listing and summarizing sources in order.

Try to analyze patterns, turning points and key debates that have shaped the direction of the field. Give your interpretation of how and why certain developments occurred.

Thematic

If you have found some recurring central themes, you can organize your literature review into subsections that address different aspects of the topic.

For example, if you are reviewing literature about inequalities in migrant health outcomes, key themes might include healthcare policy, language barriers, cultural attitudes, legal status, and economic access.

Methodological

If you draw your sources from different disciplines or fields that use a variety of research methods, you might want to compare the results and conclusions that emerge from different approaches. For example:

  • Look at what results have emerged in qualitative versus quantitative research
  • Discuss how the topic has been approached by empirical versus theoretical scholarship
  • Divide the literature into sociological, historical, and cultural sources

Theoretical

A literature review is often the foundation for a theoretical framework. You can use it to discuss various theories, models, and definitions of key concepts.

You might argue for the relevance of a specific theoretical approach, or combine various theoretical concepts to create a framework for your research.

Like any other academic text, your literature review should have an introduction, a main body, and a conclusion. What you include in each depends on the objective of your literature review.

The introduction should clearly establish the focus and purpose of the literature review.

Depending on the length of your literature review, you might want to divide the body into subsections. You can use a subheading for each theme, time period, or methodological approach.

As you write, you can follow these tips:

  • Summarize and synthesize: give an overview of the main points of each source and combine them into a coherent whole
  • Analyze and interpret: don’t just paraphrase other researchers — add your own interpretations where possible, discussing the significance of findings in relation to the literature as a whole
  • Critically evaluate: mention the strengths and weaknesses of your sources
  • Write in well-structured paragraphs: use transition words and topic sentences to draw connections, comparisons and contrasts

In the conclusion, you should summarize the key findings you have taken from the literature and emphasize their significance.

When you’ve finished writing and revising your literature review, don’t forget to proofread thoroughly before submitting.

A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question.

It is often written as part of a thesis, dissertation, or research paper, in order to situate your work in relation to existing knowledge.

There are several reasons to conduct a literature review at the beginning of a research project:

  • To familiarize yourself with the current state of knowledge on your topic
  • To ensure that you’re not just repeating what others have already done
  • To identify gaps in knowledge and unresolved problems that your research can address
  • To develop your theoretical framework and methodology
  • To provide an overview of the key findings and debates on the topic

Writing the literature review shows your reader how your work relates to existing research and what new insights it will contribute.

The literature review usually comes near the beginning of your thesis or dissertation. After the introduction, it grounds your research in a scholarly field and leads directly to your theoretical framework or methodology.

A literature review is a survey of credible sources on a topic, often used in dissertations, theses, and research papers. Literature reviews give an overview of knowledge on a subject, helping you identify relevant theories and methods, as well as gaps in existing research. Literature reviews are set up similarly to other academic texts, with an introduction, a main body, and a conclusion.

An annotated bibliography is a list of source references that has a short description (called an annotation) for each of the sources. It is often assigned as part of the research process for a paper.

McCombes, S. (2023, September 11). How to Write a Literature Review | Guide, Examples, & Templates. Scribbr. Retrieved August 21, 2024, from https://www.scribbr.com/dissertation/literature-review/


Searching PubMed: Literature Reviews

Created by health science librarians.

Section Objective

The content in the Literature Review section defines the literature review purpose and process, explains using the PICO format to ask a clear research question, and demonstrates how to evaluate and modify search results to improve the accuracy of the retrieval.

A literature review seeks to identify, analyze and summarize the published research literature about a specific topic.  Literature reviews are assigned as course projects; included as the introductory part of master's and PhD theses; and are conducted before undertaking any new scientific research project.

The purpose of a literature review is to establish what is currently known about a specific topic and to evaluate the strength of the evidence upon which that knowledge is based. A review of a clinical topic may identify implications for clinical practice. Literature reviews also identify areas of a topic that need further research.

A systematic review is a literature review that follows a rigorous process to find all of the research conducted on a topic and then critically appraises the research methods of the highest quality reports. These reviews track and report their search and appraisal methods in addition to providing a summary of the knowledge established by the appraised research.

The UNC Writing Center provides a nice summary of what to consider when writing a literature review for a class assignment. The online book, Doing a literature review in health and social care : a practical guide (2010), is a good resource for more information on this topic.

The quality of the search process determines the quality of any literature review. Anyone undertaking a literature review on a new topic would benefit from meeting with a librarian to discuss search strategies. A consultation with a librarian is strongly recommended for anyone undertaking a systematic review.

Use the email form on our Ask a Librarian page to arrange a meeting with a librarian.

The first step to a successful literature review search is to state your research question as clearly as possible.

It is important to:

  • be as specific as possible
  • include all aspects of your question

Clinical and social science questions often have these aspects (PICO):

  • People/population/problem  (What are the characteristics of the population?  What is the condition or disease?)
  • Intervention (What do you want to do with this patient?  e.g. treat, diagnose)
  • Comparisons [not always included]  (What is the alternative to this intervention?  e.g. placebo, different drug, surgery)
  • Outcomes  (What are the relevant outcomes?  e.g. morbidity, death, complications)

If the PICO model does not fit your question, try other ways to make sure you articulate all parts of your question. Asking yourself Who, What, Why, and How may help.

Example Question:  Is acupuncture as effective a therapy as triptans in the treatment of adult migraine?

Note that this question fits the PICO model.

  • Population: Adults with migraines
  • Intervention: Acupuncture
  • Comparison: Triptans/tryptamines
  • Outcome: Fewer headache days, fewer migraines

A literature review search is an iterative process. Your goal is to find all of the articles that are pertinent to your subject. Successful searching requires you to think about the complexity of language. You need to match the words you use in your search to the words used by article authors and database indexers. A thorough PubMed search must identify the author words likely to be in the title and abstract or the indexer's selected MeSH (Medical Subject Heading) Terms.

Start by doing a preliminary search using the words from the key parts of your research question.

Step #1: Initial Search

Enter the key concepts from your research question combined with the Boolean operator AND. PubMed does automatically combine your terms with AND. However, it can be easier to modify your search if you start by including the Boolean operators.

migraine AND acupuncture AND tryptamines

The search retrieves a number of relevant article records, but probably not everything on the topic.

Step #2: Evaluate Results

Use the Display Settings drop down in the upper left hand corner of the results page to change to Abstract display.

Review the results and move articles that are directly related to your topic to the Clipboard.

Go to the Clipboard to examine the language in the articles that are directly related to your topic.

  • look for words in the titles and abstracts of these pertinent articles that differ from the words you used
  • look for relevant MeSH terms in the list linked at the bottom of each article

The following two articles were selected from the search results and placed on the Clipboard.

   

Here are word differences to consider:

  • Initial search used acupuncture. MeSH Terms use Acupuncture Therapy.
  • Initial search used migraine. Related MeSH Terms are Migraine without Aura and Migraine Disorders.
  • Initial search used tryptamines. Article title uses sumatriptan. Related MeSH Terms are Sumatriptan and Tryptamines.
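
The comparison above can be sketched in a few lines of Python. This is only an illustration, not a PubMed feature: the function name `new_terms` is hypothetical, and the term list is typed in by hand from the MeSH Terms and titles of the articles on the Clipboard.

```python
def new_terms(search_terms, record_terms):
    """Return terms found in relevant records that the search did not use.

    Comparison ignores case; results keep first-seen order without repeats.
    """
    used = {term.lower() for term in search_terms}
    seen = set()
    found = []
    for term in record_terms:
        key = term.lower()
        if key not in used and key not in seen:
            seen.add(key)
            found.append(term)
    return found


# Words from the initial search.
search_terms = ["migraine", "acupuncture", "tryptamines"]

# Terms copied by hand from the relevant records (illustrative input).
record_terms = [
    "Acupuncture Therapy",
    "Migraine Disorders",
    "Migraine without Aura",
    "Sumatriptan",
    "Tryptamines",
    "Migraine Disorders",
]

print(new_terms(search_terms, record_terms))
# ['Acupuncture Therapy', 'Migraine Disorders', 'Migraine without Aura', 'Sumatriptan']
```

Each term in the output is a candidate synonym to consider for the matching concept in the revised search.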

With this knowledge you can reformulate your search to expand your retrieval, adding synonyms for each concept.

Step #3: Revise Search

Use the Boolean OR operator to group synonyms together and use parentheses around the OR groups so they will be searched properly. See the image below to review the difference between Boolean OR / Boolean AND.

Here is what the new search looks like:

(migraine OR migraine disorders) AND (acupuncture OR acupuncture therapy) AND (tryptamines OR sumatriptan)
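
If you keep your synonyms in a list per concept, the grouping rule (OR within a concept, AND between concepts, parentheses around each OR group) can be applied mechanically. The sketch below is a minimal illustration; the function names `or_group` and `build_query` are made up for this example and are not part of PubMed.

```python
def or_group(synonyms):
    """Join the synonyms for one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(synonyms) + ")"


def build_query(concepts):
    """Combine one OR group per concept with AND."""
    return " AND ".join(or_group(synonyms) for synonyms in concepts)


# One list of synonyms per concept, taken from the example search above.
concepts = [
    ["migraine", "migraine disorders"],
    ["acupuncture", "acupuncture therapy"],
    ["tryptamines", "sumatriptan"],
]

print(build_query(concepts))
# (migraine OR migraine disorders) AND (acupuncture OR acupuncture therapy) AND (tryptamines OR sumatriptan)
```

Keeping the synonyms in lists makes it easy to add a term to one concept and regenerate the whole search string without misplacing a parenthesis.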

  • Search Worksheet Example: Acupuncture vs. Triptans for Migraine
  • Search Worksheet

Figure: Venn diagram with all segments highlighted (combining search words with Boolean OR).
  • Last Updated: Aug 13, 2024 11:48 AM
  • URL: https://guides.lib.unc.edu/search-pubmed
  • Open access
  • Published: 14 August 2018

Defining the process to literature searching in systematic reviews: a literature review of guidance and supporting studies

  • Chris Cooper   ORCID: orcid.org/0000-0003-0864-5607 1 ,
  • Andrew Booth 2 ,
  • Jo Varley-Campbell 1 ,
  • Nicky Britten 3 &
  • Ruth Garside 4  

BMC Medical Research Methodology volume 18, Article number: 85 (2018)

Systematic literature searching is recognised as a critical component of the systematic review process. It involves a systematic search for studies and aims for a transparent report of study identification, leaving readers clear about what was done to identify studies, and how the findings of the review are situated in the relevant evidence.

Information specialists and review teams appear to work from a shared and tacit model of the literature search process. How this tacit model has developed and evolved is unclear, and it has not been explicitly examined before.

The purpose of this review is to determine if a shared model of the literature searching process can be detected across systematic review guidance documents and, if so, how this process is reported in the guidance and supported by published studies.

A literature review.

Two types of literature were reviewed: guidance and published studies. Nine guidance documents were identified, including the Cochrane and Campbell Handbooks. Published studies were identified through ‘pearl growing’, citation chasing, a search of PubMed using the systematic review methods filter, and the authors’ topic knowledge.

The relevant sections within each guidance document were then read and re-read, with the aim of determining key methodological stages. Methodological stages were identified and defined. This data was reviewed to identify agreements and areas of unique guidance between guidance documents. Consensus across multiple guidance documents was used to inform selection of ‘key stages’ in the process of literature searching.

Eight key stages were determined relating specifically to literature searching in systematic reviews. They were: who should literature search, aims and purpose of literature searching, preparation, the search strategy, searching databases, supplementary searching, managing references and reporting the search process.

Conclusions

Eight key stages to the process of literature searching in systematic reviews were identified. These key stages are consistently reported in the nine guidance documents, suggesting consensus on the key stages of literature searching, and therefore the process of literature searching as a whole, in systematic reviews. Further research to determine the suitability of using the same process of literature searching for all types of systematic review is indicated.

Systematic literature searching is recognised as a critical component of the systematic review process. It involves a systematic search for studies and aims for a transparent report of study identification, leaving review stakeholders clear about what was done to identify studies, and how the findings of the review are situated in the relevant evidence.

Information specialists and review teams appear to work from a shared and tacit model of the literature search process. How this tacit model has developed and evolved is unclear, and it has not been explicitly examined before. This is in contrast to the information science literature, which has developed information processing models as an explicit basis for dialogue and empirical testing. Without an explicit model, research in the process of systematic literature searching will remain immature and potentially uneven, and the development of shared information models will be assumed but never articulated.

One way of developing such a conceptual model is by formally examining the implicit “programme theory” as embodied in key methodological texts. The aim of this review is therefore to determine if a shared model of the literature searching process in systematic reviews can be detected across guidance documents and, if so, how this process is reported and supported.

Identifying guidance

Key texts (henceforth referred to as “guidance”) were identified based upon their accessibility to, and prominence within, United Kingdom systematic reviewing practice. The United Kingdom occupies a prominent position in the science of health information retrieval, as quantified by such objective measures as the authorship of papers, the number of Cochrane groups based in the UK, membership and leadership of groups such as the Cochrane Information Retrieval Methods Group, the HTA-I Information Specialists’ Group and historic association with such centres as the UK Cochrane Centre, the NHS Centre for Reviews and Dissemination, the Centre for Evidence Based Medicine and the National Institute for Clinical Excellence (NICE). Coupled with the linguistic dominance of English within medical and health science and the science of systematic reviews more generally, this offers a justification for a purposive sample that favours UK, European and Australian guidance documents.

Nine guidance documents were identified. These documents provide guidance for different types of reviews, namely: reviews of interventions, reviews of health technologies, reviews of qualitative research studies, reviews of social science topics, and reviews to inform guidance.

Whilst these guidance documents occasionally offer additional guidance on other types of systematic reviews, we have focused on the core and stated aims of these documents as they relate to literature searching. Table  1 sets out: the guidance document, the version audited, their core stated focus, and a bibliographical pointer to the main guidance relating to literature searching.

Once a list of key guidance documents was determined, it was checked by six senior information professionals based in the UK for relevance to current literature searching in systematic reviews.

Identifying supporting studies

In addition to identifying guidance, the authors sought to populate an evidence base of supporting studies (henceforth referred to as “studies”) that contribute to existing search practice. Studies were first identified by the authors from their knowledge on this topic area and, subsequently, through systematic citation chasing key studies (‘pearls’ [ 1 ]) located within each key stage of the search process. These studies are identified in Additional file  1 : Appendix Table 1. Citation chasing was conducted by analysing the bibliography of references for each study (backwards citation chasing) and through Google Scholar (forward citation chasing). A search of PubMed using the systematic review methods filter was undertaken in August 2017 (see Additional file 1 ). The search terms used were: (literature search*[Title/Abstract]) AND sysrev_methods[sb] and 586 results were returned. These results were sifted for relevance to the key stages in Fig.  1 by CC.
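
To make the two directions of citation chasing concrete, the following is a minimal sketch over a toy citation map. It is not the authors' procedure or tooling: the study labels and the `references` dictionary are invented for illustration, and in practice the forward direction would come from a citation index such as Google Scholar rather than a local dictionary.

```python
def backward_chase(pearl, references):
    """Backward citation chasing: studies listed in the pearl's own bibliography."""
    return set(references.get(pearl, ()))


def forward_chase(pearl, references):
    """Forward citation chasing: studies whose bibliographies cite the pearl."""
    return {study for study, cited in references.items() if pearl in cited}


# Hypothetical citation map: each study maps to the studies it cites.
references = {
    "pearl": ["A", "B"],
    "C": ["pearl", "A"],
    "D": ["B"],
    "E": ["pearl"],
}

candidates = backward_chase("pearl", references) | forward_chase("pearl", references)
print(sorted(candidates))
# ['A', 'B', 'C', 'E']
```

Study D is not picked up because it neither cites the pearl nor is cited by it; candidates found this way still need to be sifted for relevance.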

Figure 1. The key stages of literature search guidance as identified from nine key texts

Extracting the data

To reveal the implicit process of literature searching within each guidance document, the relevant sections (chapters) on literature searching were read and re-read, with the aim of determining key methodological stages. We defined a key methodological stage as a distinct step in the overall process for which specific guidance is reported, and action is taken, that collectively would result in a completed literature search.

The chapter or section sub-heading for each methodological stage was extracted into a table using the exact language as reported in each guidance document. The lead author (CC) then read and re-read these data, and the paragraphs of the document to which the headings referred, summarising section details. This table was then reviewed, using comparison and contrast to identify agreements and areas of unique guidance. Consensus across multiple guidelines was used to inform selection of ‘key stages’ in the process of literature searching.

Having determined the key stages to literature searching, we then read and re-read the sections relating to literature searching again, extracting specific detail relating to the methodological process of literature searching within each key stage. Again, the guidance was then read and re-read, first on a document-by-document-basis and, secondly, across all the documents above, to identify both commonalities and areas of unique guidance.

Results and discussion

Our findings.

We were able to identify consensus across the guidance on literature searching for systematic reviews suggesting a shared implicit model within the information retrieval community. Whilst the structure of the guidance varies between documents, the same key stages are reported, even where the core focus of each document is different. We were able to identify specific areas of unique guidance, where a document reported guidance not summarised in other documents, together with areas of consensus across guidance.

Unique guidance

Only one document provided guidance on the topic of when to stop searching [ 2 ]. This guidance from 2005 anticipates a topic of increasing importance with the current interest in time-limited (i.e. “rapid”) reviews. Quality assurance (or peer review) of literature searches was only covered in two guidance documents [ 3 , 4 ]. This topic has emerged as increasingly important as indicated by the development of the PRESS instrument [ 5 ]. Text mining was discussed in four guidance documents [ 4 , 6 , 7 , 8 ] where the automation of some manual review work may offer efficiencies in literature searching [ 8 ].

Agreement between guidance: Defining the key stages of literature searching

Where there was agreement on the process, we determined that this constituted a key stage in the process of literature searching to inform systematic reviews.

From the guidance, we determined eight key stages that relate specifically to literature searching in systematic reviews. These are summarised at Fig. 1 . The data extraction table to inform Fig. 1 is reported in Table  2 . Table 2 reports the areas of common agreement and it demonstrates that the language used to describe key stages and processes varies significantly between guidance documents.

For each key stage, we set out the specific guidance, followed by discussion on how this guidance is situated within the wider literature.

Key stage one: Deciding who should undertake the literature search

The guidance.

Eight documents provided guidance on who should undertake literature searching in systematic reviews [ 2 , 4 , 6 , 7 , 8 , 9 , 10 , 11 ]. The guidance affirms that people with relevant expertise of literature searching should ‘ideally’ be included within the review team [ 6 ]. Information specialists (or information scientists), librarians or trial search co-ordinators (TSCs) are indicated as appropriate researchers in six guidance documents [ 2 , 7 , 8 , 9 , 10 , 11 ].

How the guidance corresponds to the published studies

The guidance is consistent with studies that call for the involvement of information specialists and librarians in systematic reviews [ 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 ] and which demonstrate how their training as ‘expert searchers’ and ‘analysers and organisers of data’ can be put to good use [ 13 ] in a variety of roles [ 12 , 16 , 20 , 21 , 24 , 25 , 26 ]. These arguments make sense in the context of the aims and purposes of literature searching in systematic reviews, explored below. The need for ‘thorough’ and ‘replicable’ literature searches was fundamental to the guidance and recurs in key stage two. Studies have found poor reporting, and a lack of replicable literature searches, to be a weakness in systematic reviews [ 17 , 18 , 27 , 28 ] and they argue that involvement of information specialists/ librarians would be associated with better reporting and better quality literature searching. Indeed, Meert et al. [ 29 ] demonstrated that involving a librarian as a co-author to a systematic review correlated with a higher score in the literature searching component of a systematic review [ 29 ]. As ‘new styles’ of rapid and scoping reviews emerge, where decisions on how to search are more iterative and creative, a clear role is made here too [ 30 ].

Knowing where to search for studies was noted as important in the guidance, with no agreement as to the appropriate number of databases to be searched [ 2 , 6 ]. Database (and resource selection more broadly) is acknowledged as a relevant key skill of information specialists and librarians [ 12 , 15 , 16 , 31 ].

Whilst arguments for including information specialists and librarians in the process of systematic review might be considered self-evident, Koffel and Rethlefsen [ 31 ] have questioned if the necessary involvement is actually happening [ 31 ].

Key stage two: Determining the aim and purpose of a literature search

The aim: Five of the nine guidance documents use adjectives such as ‘thorough’, ‘comprehensive’, ‘transparent’ and ‘reproducible’ to define the aim of literature searching [ 6 , 7 , 8 , 9 , 10 ]. Analogous phrases were present in a further three guidance documents, namely: ‘to identify the best available evidence’ [ 4 ] or ‘the aim of the literature search is not to retrieve everything. It is to retrieve everything of relevance’ [ 2 ] or ‘A systematic literature search aims to identify all publications relevant to the particular research question’ [ 3 ]. The Joanna Briggs Institute reviewers’ manual was the only guidance document where a clear statement on the aim of literature searching could not be identified. The purpose of literature searching was defined in three guidance documents, namely to minimise bias in the resultant review [ 6 , 8 , 10 ]. Accordingly, eight of nine documents clearly asserted that thorough and comprehensive literature searches are required as a potential mechanism for minimising bias.

The need for thorough and comprehensive literature searches appears as uniform within the eight guidance documents that describe approaches to literature searching in systematic reviews of effectiveness. Reviews of effectiveness (of intervention or cost), accuracy and prognosis, require thorough and comprehensive literature searches to transparently produce a reliable estimate of intervention effect. The belief that all relevant studies have been ‘comprehensively’ identified, and that this process has been ‘transparently’ reported, increases confidence in the estimate of effect and the conclusions that can be drawn [ 32 ]. The supporting literature exploring the need for comprehensive literature searches focuses almost exclusively on reviews of intervention effectiveness and meta-analysis. Different ‘styles’ of review may have different standards however; the alternative, offered by purposive sampling, has been suggested in the specific context of qualitative evidence syntheses [ 33 ].

What is a comprehensive literature search?

Whilst the guidance calls for thorough and comprehensive literature searches, it lacks clarity on what constitutes a thorough and comprehensive literature search, beyond the implication that all of the literature search methods in Table 2 should be used to identify studies. Egger et al. [ 34 ], in an empirical study evaluating the importance of comprehensive literature searches for trials in systematic reviews, defined a comprehensive search for trials as:

a search not restricted to English language;

where Cochrane CENTRAL or at least two other electronic databases had been searched (such as MEDLINE or EMBASE); and

at least one of the following search methods has been used to identify unpublished trials: searches for (i) conference abstracts, (ii) theses, (iii) trials registers; and (iv) contacts with experts in the field [ 34 ].
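
Read as a checklist, the three criteria above can be expressed as a simple predicate. The sketch below is one possible encoding of the definition as quoted here, not code from Egger et al.; the `SearchReport` fields and the function name `is_comprehensive` are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Search methods for unpublished trials named in the definition above.
UNPUBLISHED_METHODS = {
    "conference abstracts",
    "theses",
    "trials registers",
    "contacts with experts",
}


@dataclass
class SearchReport:
    """Hypothetical summary of what a review reports about its search."""
    restricted_to_english: bool
    searched_central: bool
    other_databases: set = field(default_factory=set)
    supplementary_methods: set = field(default_factory=set)


def is_comprehensive(report):
    """Apply the three criteria: language, databases, unpublished trials."""
    language_ok = not report.restricted_to_english
    databases_ok = report.searched_central or len(report.other_databases) >= 2
    unpublished_ok = bool(report.supplementary_methods & UNPUBLISHED_METHODS)
    return language_ok and databases_ok and unpublished_ok


report = SearchReport(
    restricted_to_english=False,
    searched_central=False,
    other_databases={"MEDLINE", "EMBASE"},
    supplementary_methods={"theses"},
)
print(is_comprehensive(report))
# True
```

A search fails this check if any one criterion is unmet, for example if it is restricted to English even though several databases and trials registers were searched.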

Tricco et al. (2008) used a similar threshold of bibliographic database searching AND a supplementary search method in a review when examining the risk of bias in systematic reviews. Their criteria were: one database (limited using the Cochrane Highly Sensitive Search Strategy (HSSS)) and handsearching [ 35 ].

Together with the guidance, this would suggest that comprehensive literature searching requires the use of BOTH bibliographic database searching AND supplementary search methods.

Comprehensiveness in literature searching, in the sense of how much searching should be undertaken, remains unclear. Egger et al. recommend that ‘investigators should consider the type of literature search and degree of comprehension that is appropriate for the review in question, taking into account budget and time constraints’ [ 34 ]. This view tallies with the Cochrane Handbook, which stipulates clearly that study identification should be undertaken ‘within resource limits’ [ 9 ]. This would suggest that the limits to comprehensiveness are recognised, but it raises questions on how this is decided and reported [ 36 ].

What is the point of comprehensive literature searching?

The purpose of thorough and comprehensive literature searches is to avoid missing key studies and to minimize bias [ 6 , 8 , 10 , 34 , 37 , 38 , 39 ] since a systematic review based only on published (or easily accessible) studies may have an exaggerated effect size [ 35 ]. Felson (1992) sets out potential biases that could affect the estimate of effect in a meta-analysis [ 40 ] and Tricco et al. summarize the evidence concerning bias and confounding in systematic reviews [ 35 ]. Egger et al. point to non-publication of studies, publication bias, language bias and MEDLINE bias, as key biases [ 34 , 35 , 40 , 41 , 42 , 43 , 44 , 45 , 46 ]. Comprehensive searches are not the sole factor to mitigate these biases but their contribution is thought to be significant [ 2 , 32 , 34 ]. Fehrmann (2011) suggests that confidence in the search results increases where the search process is described in detail and where standard comprehensive search techniques have been applied [ 32 ].

Does comprehensive literature searching work?

Egger et al., and other study authors, have demonstrated a change in the estimate of intervention effectiveness where relevant studies were excluded from meta-analysis [ 34 , 47 ]. This would suggest that missing studies in literature searching alters the reliability of effectiveness estimates. This is an argument for comprehensive literature searching. Conversely, Egger et al. found that ‘comprehensive’ searches still missed studies and that comprehensive searches could, in fact, introduce bias into a review rather than preventing it, through the identification of low quality studies then being included in the meta-analysis [ 34 ]. Studies query if identifying and including low quality or grey literature studies changes the estimate of effect [ 43 , 48 ] and question if time is better invested updating systematic reviews rather than searching for unpublished studies [ 49 ], or mapping studies for review as opposed to aiming for high sensitivity in literature searching [ 50 ].

Aim and purpose beyond reviews of effectiveness

The need for comprehensive literature searches is less certain in reviews of qualitative studies, and for reviews where a comprehensive identification of studies is difficult to achieve (for example, in public health) [ 33 , 51 , 52 , 53 , 54 , 55 ]. Literature searching for qualitative studies, and in public health topics, typically generates a greater number of studies to sift than in reviews of effectiveness [ 39 ] and demonstrating the ‘value’ of studies identified or missed is harder [ 56 ], since the study data do not typically support meta-analysis. Nussbaumer-Streit et al. (2016) have registered a review protocol to assess whether abbreviated literature searches (as opposed to comprehensive literature searches) have an impact on conclusions across multiple bodies of evidence, not only on effect estimates [ 57 ], which may develop this understanding. It may be that decision makers and users of systematic reviews are willing to trade the certainty from a comprehensive literature search and systematic review in exchange for different approaches to evidence synthesis [ 58 ], and that comprehensive literature searches are not necessarily a marker of literature search quality, as previously thought [ 36 ]. Different approaches to literature searching [ 37 , 38 , 59 , 60 , 61 , 62 ] and developing the concept of when to stop searching are important areas for further study [ 36 , 59 ].

The study by Nussbaumer-Streit et al. has been published since the submission of this literature review [ 63 ]. Nussbaumer-Streit et al. (2018) conclude that abbreviated literature searches are viable options for rapid evidence syntheses, if decision-makers are willing to trade the certainty from a comprehensive literature search and systematic review, but that decision-making which demands detailed scrutiny should still be based on comprehensive literature searches [ 63 ].

Key stage three: Preparing for the literature search

Six documents provided guidance on preparing for a literature search [ 2 , 3 , 6 , 7 , 9 , 10 ]. The Cochrane Handbook clearly stated that Cochrane authors (i.e. researchers) should seek advice from a trial search co-ordinator (i.e. a person with specific skills in literature searching) ‘before’ starting a literature search [ 9 ].

Two key tasks were perceptible in preparing for a literature search [ 2 , 6 , 7 , 10 , 11 ]. First, to determine if there are any existing or on-going reviews, or if a new review is justified [ 6 , 11 ]; and, secondly, to develop an initial literature search strategy to estimate the volume of relevant literature (and quality of a small sample of relevant studies [ 10 ]) and indicate the resources required for literature searching and the review of the studies that follows [ 7 , 10 ].

Three documents summarised guidance on where to search to determine if a new review was justified [ 2 , 6 , 11 ]. These focused on searching databases of systematic reviews (The Cochrane Database of Systematic Reviews (CDSR) and the Database of Abstracts of Reviews of Effects (DARE)), institutional registries (including PROSPERO), and MEDLINE [ 6 , 11 ]. It is worth noting, however, that as of 2015, DARE and NHS EED are no longer being updated and so the relevance of these resources will diminish over time [ 64 ]. One guidance document, ‘Systematic reviews in the Social Sciences’, noted, however, that databases are not the only source of information and that unpublished reports, conference proceedings and grey literature may also be required, depending on the nature of the review question [ 2 ].

Two documents reported clearly that this preparation (or ‘scoping’) exercise should be undertaken before the actual search strategy is developed [ 7 , 10 ].

The guidance offers the best available source on preparing the literature search, since published studies do not typically report how their scoping informed the development of their search strategies, nor how their search approaches were developed. Text mining has been proposed as a technique to develop search strategies in the scoping stages of a review although this work is still exploratory [ 65 ]. ‘Clustering documents’ and word frequency analysis have also been tested to identify search terms and studies for review [ 66 , 67 ]. Preparing for literature searches and scoping constitutes an area for future research.
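
As a minimal illustration of the word frequency analysis mentioned above, the following Python sketch counts the most frequent non-trivial words in the titles of a few records already known to be relevant. The sample titles, the short stop-word list and the function name `candidate_terms` are hypothetical and are not taken from the studies cited [ 66 , 67 ]; a real analysis would probably use a fuller stop-word list and a much larger set of records.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept deliberately short for illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "on", "with"}

def candidate_terms(titles, top_n=5):
    """Return the top_n most frequent words across titles, ignoring stop words."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z]+", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Hypothetical titles of records already known to be relevant.
titles = [
    "Exercise therapy for chronic low back pain",
    "Yoga and exercise in adults with back pain",
    "Chronic back pain: a trial of exercise",
]

for term, count in candidate_terms(titles, top_n=3):
    print(term, count)
```

Frequent words found this way are only candidates: the searcher would still need to judge which of them belong in the strategy and which subject headings correspond to them.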

Key stage four: Designing the search strategy

The Population, Intervention, Comparator, Outcome (PICO) structure was the commonly reported structure promoted to design a literature search strategy. Five documents suggested that the eligibility criteria or review question will determine which concepts of PICO will be populated to develop the search strategy [ 1 , 4 , 7 , 8 , 9 ]. The NICE handbook promoted multiple structures, namely PICO, SPICE (Setting, Perspective, Intervention, Comparison, Evaluation) and multi-stranded approaches [ 4 ].

With the exclusion of The Joanna Briggs Institute reviewers’ manual, the guidance offered detail on selecting key search terms, synonyms, Boolean language, selecting database indexing terms and combining search terms. The CEE handbook suggested that ‘search terms may be compiled with the help of the commissioning organisation and stakeholders’ [ 10 ].
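
The conceptual structure described in the guidance can be sketched in a few lines of Python: synonyms within a concept are combined with OR, and the concepts are then combined with AND. The concept names, example terms and the helper `build_query` are hypothetical. The output uses generic Boolean syntax rather than the syntax of any particular database host, so field tags, truncation symbols and subject headings would still need to be adapted for each host.

```python
def build_query(concepts):
    """Combine terms with OR within a concept and AND between concepts."""
    blocks = []
    for terms in concepts.values():
        # Quote multi-word terms so they are treated as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)

# Hypothetical PICO concepts. Only population and intervention are populated,
# since the review question determines which PICO concepts are searched.
pico = {
    "population": ["low back pain", "backache"],
    "intervention": ["exercise", "physical therapy"],
}

print(build_query(pico))
```

Keeping each concept as a separate list makes it easy to record which terms were tested and later dropped, which some guidance asks reviewers to report.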

The use of limits, such as language or date limits, was discussed in all documents [ 2 , 3 , 4 , 6 , 7 , 8 , 9 , 10 , 11 ].

Search strategy structure

The guidance typically relates to reviews of intervention effectiveness so PICO – with its focus on intervention and comparator - is the dominant model used to structure literature search strategies [ 68 ]. PICOs – where the S denotes study design - is also commonly used in effectiveness reviews [ 6 , 68 ]. As the NICE handbook notes, alternative models to structure literature search strategies have been developed and tested. Booth provides an overview on formulating questions for evidence based practice [ 69 ] and has developed a number of alternatives to the PICO structure, namely: BeHEMoTh (Behaviour of interest; Health context; Exclusions; Models or Theories) for use when systematically identifying theory [ 55 ]; SPICE (Setting, Perspective, Intervention, Comparison, Evaluation) for identification of social science and evaluation studies [ 69 ] and, working with Cooke and colleagues, SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type) [ 70 ]. SPIDER has been compared to PICO and PICOs in a study by Methley et al. [ 68 ].

The NICE handbook also suggests the use of multi-stranded approaches to developing literature search strategies [ 4 ]. Glanville developed this idea in a study by Whiting et al. [ 71 ] and a worked example of this approach is included in the development of a search filter by Cooper et al. [ 72 ].

Writing search strategies: Conceptual and objective approaches

Hausner et al. [ 73 ] provide guidance on writing literature search strategies, delineating between conceptually and objectively derived approaches. The conceptual approach, advocated by and explained in the guidance documents, relies on the expertise of the literature searcher to identify key search terms and then develop key terms to include synonyms and controlled syntax. Hausner and colleagues set out the objective approach [ 73 ] and describe what may be done to validate it [ 74 ].

The use of limits

The guidance documents offer direction on the use of limits within a literature search. Limits can be used to focus literature searching to specific study designs or by other markers (such as by date) which limits the number of studies returned by a literature search. The use of limits should be described and the implications explored [ 34 ] since limiting literature searching can introduce bias (explored above). Craven et al. have suggested the use of a supporting narrative to explain decisions made in the process of developing literature searches and this advice would usefully capture decisions on the use of search limits [ 75 ].

Key stage five: Determining the process of literature searching and deciding where to search (bibliographic database searching)

Table 2 summarises the process of literature searching as reported in each guidance document. Searching bibliographic databases was consistently reported as the ‘first step’ to literature searching in all nine guidance documents.

Three documents reported specific guidance on where to search, in each case specific to the type of review their guidance informed, and as a minimum requirement [ 4 , 9 , 11 ]. Seven of the key guidance documents suggest that the selection of bibliographic databases depends on the topic of review [ 2 , 3 , 4 , 6 , 7 , 8 , 10 ], with two documents noting the absence of an agreed standard on what constitutes an acceptable number of databases searched [ 2 , 6 ].

The guidance documents summarise ‘how to’ search bibliographic databases in detail and this guidance is further contextualised above in terms of developing the search strategy. The documents provide guidance on selecting bibliographic databases, in some cases stating acceptable minima (e.g. The Cochrane Handbook states Cochrane CENTRAL, MEDLINE and EMBASE), and in other cases simply listing bibliographic databases available to search. Studies have explored the value in searching specific bibliographic databases, with Wright et al. (2015) noting the contribution of CINAHL in identifying qualitative studies [ 76 ], Beckles et al. (2013) questioning the contribution of CINAHL to identifying clinical studies for guideline development [ 77 ], and Cooper et al. (2015) exploring the role of UK-focused bibliographic databases to identify UK-relevant studies [ 78 ]. The host of the database (e.g. OVID or ProQuest) has been shown to alter the search returns offered. Younger and Boddy report differing search returns from the same database (AMED) but where the ‘host’ was different [ 79 ].

The average number of bibliographic databases searched in systematic reviews has risen in the period 1994–2014 (from 1 to 4) [ 80 ] but there remains (as attested to by the guidance) no consensus on what constitutes an acceptable number of databases searched [ 48 ]. This is perhaps because thinking about the number of databases searched is the wrong question: researchers should be focused on which databases were searched and why, and which databases were not searched and why. The discussion should re-orientate to the differential value of sources but researchers need to think about how to report this in studies to allow findings to be generalised. Bethel (2017) has proposed ‘search summaries’, completed by the literature searcher, to record where included studies were identified, whether from databases (and which databases specifically) or supplementary search methods [ 81 ]. Search summaries document both yield and accuracy of searches, which could prospectively inform resource use and decisions to search or not to search specific databases in topic areas. The prospective use of such data presupposes, however, that past searches are a potential predictor of future search performance (i.e. that each topic is to be considered representative and not unique). In offering a body of practice, these data would be of greater practical use than current studies which are considered as little more than individual case studies [ 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 ].
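
A search summary of the kind described above could be tabulated along the following lines. This Python sketch is illustrative only: the source names are real but the record identifiers, the counts and the function `summarise` are hypothetical. Yield and precision are computed under simple assumed definitions (the number of included studies a source found, and that number divided by the records the source retrieved), which may differ from those used by Bethel [ 81 ].

```python
def summarise(retrieved, included):
    """For each source, report records retrieved, included studies found,
    included studies found only in that source, and precision."""
    summary = {}
    for source, records in retrieved.items():
        found = records & included
        # Included studies that were found by any other source.
        elsewhere = set().union(
            *(r for s, r in retrieved.items() if s != source)
        ) & included
        summary[source] = {
            "retrieved": len(records),
            "included_found": len(found),
            "unique": len(found - elsewhere),
            "precision": len(found) / len(records) if records else 0.0,
        }
    return summary

# Hypothetical record identifiers retrieved by each source.
retrieved = {
    "MEDLINE": {"r1", "r2", "r3", "r4"},
    "CINAHL": {"r3", "r5"},
    "citation chasing": {"r6"},
}
included = {"r1", "r3", "r6"}  # hypothetical studies included in the review

for source, row in summarise(retrieved, included).items():
    print(source, row)
```

The `unique` column is the one that speaks most directly to the question of which sources to search and why, since it shows what would have been missed had a source been omitted.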

When to search databases is another question posed in the literature. Beyer et al. [ 91 ] report that databases can be prioritised for literature searching which, whilst not addressing the question of which databases to search, may at least bring clarity as to which databases to search first [ 91 ]. Paradoxically, this links to studies that suggest PubMed should be searched in addition to MEDLINE (OVID interface) since this improves the currency of systematic reviews [ 92 , 93 ]. Cooper et al. (2017) have tested the idea of database searching not as a primary search method (as suggested in the guidance) but as a supplementary search method in order to manage the volume of studies identified for an environmental effectiveness systematic review. Their case study compared the effectiveness of database searching versus a protocol using supplementary search methods and found that the latter identified more relevant studies for review than searching bibliographic databases [ 94 ].

Key stage six: Determining the process of literature searching and deciding where to search (supplementary search methods)

Table 2 also summarises the process of literature searching which follows bibliographic database searching. As Table 2 sets out, guidance that supplementary literature search methods should be used in systematic reviews recurs across documents, but the order in which these methods are used, and the extent to which they are used, varies. We noted inconsistency in the labelling of supplementary search methods between guidance documents.

Rather than focus on the guidance on how to use the methods (which has been summarised in a recent review [ 95 ]), we focus on the aim or purpose of supplementary search methods.

The Cochrane Handbook reported that ‘efforts’ to identify unpublished studies should be made [ 9 ]. Four guidance documents [ 2 , 3 , 6 , 9 ] acknowledged that searching beyond bibliographic databases was necessary since ‘databases are not the only source of literature’ [ 2 ]. Only one document reported any guidance on determining when to use supplementary methods. The IQWiG handbook reported that the use of handsearching (in their example) could be determined on a ‘case-by-case basis’ which implies that the use of these methods is optional rather than mandatory. This is in contrast to the guidance (above) on bibliographic database searching.

The issue for supplementary search methods is similar in many ways to the issue of searching bibliographic databases: demonstrating value. The purpose and contribution of supplementary search methods in systematic reviews is increasingly acknowledged [ 37 , 61 , 62 , 96 , 97 , 98 , 99 , 100 , 101 ] but the value of these search methods in identifying studies and data remains unclear. In a recently published review, Cooper et al. (2017) reviewed the literature on supplementary search methods looking to determine the advantages, disadvantages and resource implications of using supplementary search methods [ 95 ]. This review also summarises the key guidance and empirical studies and seeks to address the question of when to use these search methods and when not to [ 95 ]. The guidance is limited in this regard and, as Table 2 demonstrates, offers conflicting advice on the order of searching, and the extent to which these search methods should be used in systematic reviews.

Key stage seven: Managing the references

Five of the documents provided guidance on managing references, for example downloading, de-duplicating and managing the output of literature searches [ 2 , 4 , 6 , 8 , 10 ]. This guidance typically itemised available bibliographic management tools rather than offering guidance on how to use them specifically [ 2 , 4 , 6 , 8 ]. The CEE handbook provided guidance on importing data where no direct export option is available (e.g. web-searching) [ 10 ].

The literature on using bibliographic management tools is not large relative to the number of ‘how to’ videos on platforms such as YouTube (see for example [ 102 ]). These YouTube videos confirm the overall lack of ‘how to’ guidance identified in this study and offer useful instruction on managing references. Bramer et al. set out methods for de-duplicating data and reviewing references in Endnote [ 103 , 104 ] and Gall tests the direct search function within Endnote to access databases such as PubMed, finding a number of limitations [ 105 ]. Coar et al. and Ahmed et al. consider the role of the free-source tool, Zotero [ 106 , 107 ]. Managing references is a key administrative function in the process of review particularly for documenting searches in PRISMA guidance.
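
De-duplication is normally carried out inside a reference manager, but the underlying logic can be sketched in Python. The example below is a simplification and does not reproduce the method of Bramer et al. [ 103 , 104 ]. It assumes hypothetical records held as dictionaries with `title` and `doi` fields, treats two records as duplicates when either their DOIs or their normalised titles match, and keeps the first occurrence.

```python
import re

def normalise_title(title):
    """Lower-case a title and strip everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", title.lower())

def deduplicate(records):
    """Return records with duplicates removed, keeping the first occurrence."""
    seen_keys = set()
    unique = []
    for rec in records:
        keys = {("title", normalise_title(rec["title"]))}
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        if keys & seen_keys:
            continue  # matches an earlier record on DOI or title
        seen_keys |= keys
        unique.append(rec)
    return unique

# Hypothetical records; the titles and DOIs are invented for illustration.
records = [
    {"title": "Exercise for back pain", "doi": "10.1000/example.1"},
    {"title": "Exercise for Back Pain.", "doi": None},
    {"title": "Exercise for back pain (reprint)", "doi": "10.1000/EXAMPLE.1"},
    {"title": "Yoga for back pain", "doi": "10.1000/example.2"},
]

print(len(deduplicate(records)))
```

Exact matching on normalised fields is conservative: it will miss duplicates whose titles differ by more than case and punctuation, so a manual check of near matches would still be needed, and the number of duplicates removed should be recorded for reporting.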

Key stage eight: Documenting the search

The Cochrane Handbook was the only guidance document to recommend a specific reporting guideline: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [ 9 ]. Six documents provided guidance on reporting the process of literature searching with specific criteria to report [ 3 , 4 , 6 , 8 , 9 , 10 ]. There was consensus on reporting: the databases searched (and the host searched by), the search strategies used, and any use of limits (e.g. date, language, search filters (The CRD handbook called for these limits to be justified [ 6 ])). Three guidance documents reported that the number of studies identified should be recorded [ 3 , 6 , 10 ]. The number of duplicates identified [ 10 ], the screening decisions [ 3 ], a comprehensive list of grey literature sources searched (and full detail for other supplementary search methods) [ 8 ], and an annotation of search terms tested but not used [ 4 ] were identified as unique items in four documents.

The Cochrane Handbook was the only guidance document to note that the full search strategies for each database should be included in the Additional file 1 of the review [ 9 ].

All guidance documents should ultimately deliver completed systematic reviews that fulfil the requirements of the PRISMA reporting guidelines [ 108 ]. The guidance broadly requires the reporting of data that corresponds with the requirements of the PRISMA statement although documents typically ask for diverse and additional items [ 108 ]. In 2008, Sampson et al. observed a lack of consensus on reporting search methods in systematic reviews [ 109 ] and this remains the case as of 2017, as evidenced in the guidance documents, and in spite of the publication of the PRISMA guidelines in 2009 [ 110 ]. It is unclear why the collective guidance does not more explicitly endorse adherence to the PRISMA guidance.

Reporting of literature searching is a key area in systematic reviews since it sets out clearly what was done and how the conclusions of the review can be believed [ 52 , 109 ]. Despite strong endorsement in the guidance documents, specifically supported in PRISMA guidance, and other related reporting standards too (such as ENTREQ for qualitative evidence synthesis, STROBE for reviews of observational studies), authors still highlight the prevalence of poor standards of literature search reporting [ 31 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 ]. To explore issues experienced by authors in reporting literature searches, and look at uptake of PRISMA, Rader et al. [ 120 ] surveyed over 260 review authors to determine common problems and their work summarises the practical aspects of reporting literature searching [ 120 ]. Atkinson et al. [ 121 ] have also analysed reporting standards for literature searching, summarising recommendations and gaps for reporting search strategies [ 121 ].

One area that is less well covered by the guidance, but nevertheless appears in this literature, is the quality appraisal or peer review of literature search strategies. The PRESS checklist is the most prominent and it aims to develop evidence-based guidelines for the peer review of electronic search strategies [ 5 , 122 , 123 ]. A corresponding guideline for documentation of supplementary search methods does not yet exist although this idea is currently being explored.

How the reporting of the literature searching process corresponds to critical appraisal tools is an area for further research. In the survey undertaken by Rader et al. (2014), 86% of survey respondents (153/178) identified a need for further guidance on what aspects of the literature search process to report [ 120 ]. The PRISMA statement offers a brief summary of what to report but little practical guidance on how to report it [ 108 ]. Critical appraisal tools for systematic reviews, such as AMSTAR 2 (Shea et al. [ 124 ]) and ROBIS (Whiting et al. [ 125 ]), can usefully be read alongside PRISMA guidance, since they offer greater detail on how the reporting of the literature search will be appraised and, therefore, they offer a proxy on what to report [ 124 , 125 ]. Further research in the form of a study which undertakes a comparison between PRISMA and quality appraisal checklists for systematic reviews would seem to begin addressing the call, identified by Rader et al., for further guidance on what to report [ 120 ].

Limitations

Other handbooks exist.

A potential limitation of this literature review is the focus on guidance produced in Europe (the UK specifically) and Australia. We justify the decision for our selection of the nine guidance documents reviewed in this literature review in section “ Identifying guidance ”. In brief, these nine guidance documents were selected as the most relevant health care guidance that inform UK systematic reviewing practice, given that the UK occupies a prominent position in the science of health information retrieval. We acknowledge the existence of other guidance documents, such as those from North America (e.g. the Agency for Healthcare Research and Quality (AHRQ) [ 126 ], The Institute of Medicine [ 127 ] and the guidance and resources produced by the Canadian Agency for Drugs and Technologies in Health (CADTH) [ 128 ]). We comment further on this directly below.

The handbooks are potentially linked to one another

What is not clear is the extent to which the guidance documents inter-relate or provide guidance uniquely. The Cochrane Handbook, first published in 1994, is notably a key source of reference in guidance and systematic reviews beyond Cochrane reviews. It is not clear to what extent broadening the sample of guidance handbooks to include North American handbooks, and guidance handbooks from other relevant countries too, would alter the findings of this literature review or develop further support for the process model. Since we cannot be clear, we raise this as a potential limitation of this literature review. On our initial review of a sample of North American, and other, guidance documents (before selecting the guidance documents considered in this review), however, we do not consider that the inclusion of these further handbooks would alter significantly the findings of this literature review.

This is a literature review

A further limitation of this review was that the review of published studies is not a systematic review of the evidence for each key stage. It is possible that other relevant studies could help contribute to the exploration and development of the key stages identified in this review.

This literature review would appear to demonstrate the existence of a shared model of the literature searching process in systematic reviews. We call this model ‘the conventional approach’, since it appears to be common convention in nine different guidance documents.

The findings reported above reveal eight key stages in the process of literature searching for systematic reviews. These key stages are consistently reported in the nine guidance documents which suggests consensus on the key stages of literature searching, and therefore the process of literature searching as a whole, in systematic reviews.

In Table 2 , we demonstrate consensus regarding the application of literature search methods. All guidance documents distinguish between primary and supplementary search methods. Bibliographic database searching is consistently the first method of literature searching referenced in each guidance document. Whilst the guidance uniformly supports the use of supplementary search methods, there is little evidence for a consistent process with diverse guidance across documents. This may reflect differences in the core focus across each document, linked to differences in identifying effectiveness studies or qualitative studies, for instance.

Eight of the nine guidance documents reported on the aims of literature searching. The shared understanding was that literature searching should be thorough and comprehensive in its aim and that this process should be reported transparently so that it could be reproduced. Whilst only three documents explicitly link this understanding to minimising bias, it is clear that comprehensive literature searching is implicitly linked to ‘not missing relevant studies’ which is approximately the same point.

Defining the key stages in this review helps categorise the scholarship available, and it prioritises areas for development or further study. The supporting studies on preparing for literature searching (key stage three, ‘preparation’) were, for example, comparatively few, and yet this key stage represents a decisive moment in literature searching for systematic reviews. It is where search strategy structure is determined, search terms are chosen or discarded, and the resources to be searched are selected. Information specialists, librarians and researchers, are well placed to develop these and other areas within the key stages we identify.

This review calls for further research to determine the suitability of using the conventional approach. The publication dates of the guidance documents which underpin the conventional approach may raise questions as to whether the process which they each report remains valid for current systematic literature searching. In addition, it may be useful to test whether it is desirable to use the same process model of literature searching for qualitative evidence synthesis as that for reviews of intervention effectiveness, which this literature review demonstrates is presently recommended best practice.

Abbreviations

BeHEMoTh: Behaviour of interest; Health context; Exclusions; Models or Theories

CDSR: Cochrane Database of Systematic Reviews

CENTRAL: The Cochrane Central Register of Controlled Trials

DARE: Database of Abstracts of Reviews of Effects

ENTREQ: Enhancing transparency in reporting the synthesis of qualitative research

IQWiG: Institute for Quality and Efficiency in Healthcare

NICE: National Institute for Clinical Excellence

PICO: Population, Intervention, Comparator, Outcome

PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

SPICE: Setting, Perspective, Intervention, Comparison, Evaluation

SPIDER: Sample, Phenomenon of Interest, Design, Evaluation, Research type

STROBE: STrengthening the Reporting of OBservational studies in Epidemiology

TSC: Trial Search Co-ordinators

Booth A. Unpacking your literature search toolbox: on search styles and tactics. Health Information & Libraries Journal. 2008;25(4):313–7.

Petticrew M, Roberts H. Systematic reviews in the social sciences: a practical guide. Oxford: Blackwell Publishing Ltd; 2006.

Institute for Quality and Efficiency in Health Care (IQWiG). IQWiG Methods Resources. 7 Information retrieval. 2014. Available from: https://www.ncbi.nlm.nih.gov/books/NBK385787/ .

NICE: National Institute for Health and Care Excellence. Developing NICE guidelines: the manual 2014. Available from: https://www.nice.org.uk/media/default/about/what-we-do/our-programmes/developing-nice-guidelines-the-manual.pdf .

Sampson M, McGowan J, Lefebvre C, Moher D, Grimshaw J. Peer Review of Electronic Search Strategies: PRESS; 2008.

Centre for Reviews & Dissemination. Systematic reviews – CRD’s guidance for undertaking reviews in healthcare. York: Centre for Reviews and Dissemination, University of York; 2009.

EUnetHTA: European Network for Health Technology Assessment. Process of information retrieval for systematic reviews and health technology assessments on clinical effectiveness. 2016. Available from: http://www.eunethta.eu/sites/default/files/Guideline_Information_Retrieval_V1-1.pdf .

Kugley S, Wade A, Thomas J, Mahood Q, Jørgensen AMK, Hammerstrøm K, Sathe N. Searching for studies: a guide to information retrieval for Campbell systematic reviews. Oslo: Campbell Collaboration. 2017; Available from: https://www.campbellcollaboration.org/library/searching-for-studies-information-retrieval-guide-campbell-reviews.html

Lefebvre C, Manheimer E, Glanville J. Chapter 6: searching for studies. In: Higgins JPT, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions; 2011.

Collaboration for Environmental Evidence. Guidelines for Systematic Review and Evidence Synthesis in Environmental Management. Environmental Evidence; 2013. Available from: http://www.environmentalevidence.org/wp-content/uploads/2017/01/Review-guidelines-version-4.2-final-update.pdf .

The Joanna Briggs Institute. Joanna Briggs Institute reviewers’ manual. 2014 ed. The Joanna Briggs Institute; 2014. Available from: https://joannabriggs.org/assets/docs/sumari/ReviewersManual-2014.pdf

Beverley CA, Booth A, Bath PA. The role of the information specialist in the systematic review process: a health information case study. Health Inf Libr J. 2003;20(2):65–74.

Harris MR. The librarian's roles in the systematic review process: a case study. Journal of the Medical Library Association. 2005;93(1):81–7.

Koffel JB. Use of recommended search strategies in systematic reviews and the impact of librarian involvement: a cross-sectional survey of recent authors. PLoS One. 2015;10(5):e0125931.

Li L, Tian J, Tian H, Moher D, Liang F, Jiang T, et al. Network meta-analyses could be improved by searching more sources and by involving a librarian. J Clin Epidemiol. 2014;67(9):1001–7.

McGowan J, Sampson M. Systematic reviews need systematic searchers. J Med Libr Assoc. 2005;93(1):74–80.

Rethlefsen ML, Farrell AM, Osterhaus Trzasko LC, Brigham TJ. Librarian co-authors correlated with higher quality reported search strategies in general internal medicine systematic reviews. J Clin Epidemiol. 2015;68(6):617–26.

Weller AC. Mounting evidence that librarians are essential for comprehensive literature searches for meta-analyses and Cochrane reports. J Med Libr Assoc. 2004;92(2):163–4.

Swinkels A, Briddon J, Hall J. Two physiotherapists, one librarian and a systematic literature review: collaboration in action. Health Info Libr J. 2006;23(4):248–56.

Foster M. An overview of the role of librarians in systematic reviews: from expert search to project manager. EAHIL. 2015;11(3):3–7.

Lawson L. OPERATING OUTSIDE LIBRARY WALLS 2004.

Vassar M, Yerokhin V, Sinnett PM, Weiher M, Muckelrath H, Carr B, et al. Database selection in systematic reviews: an insight through clinical neurology. Health Inf Libr J. 2017;34(2):156–64.

Townsend WA, Anderson PF, Ginier EC, MacEachern MP, Saylor KM, Shipman BL, et al. A competency framework for librarians involved in systematic reviews. Journal of the Medical Library Association : JMLA. 2017;105(3):268–75.

Cooper ID, Crum JA. New activities and changing roles of health sciences librarians: a systematic review, 1990-2012. Journal of the Medical Library Association : JMLA. 2013;101(4):268–77.

Crum JA, Cooper ID. Emerging roles for biomedical librarians: a survey of current practice, challenges, and changes. Journal of the Medical Library Association : JMLA. 2013;101(4):278–86.

Dudden RF, Protzko SL. The systematic review team: contributions of the health sciences librarian. Med Ref Serv Q. 2011;30(3):301–15.

Golder S, Loke Y, McIntosh HM. Poor reporting and inadequate searches were apparent in systematic reviews of adverse effects. J Clin Epidemiol. 2008;61(5):440–8.

Maggio LA, Tannery NH, Kanter SL. Reproducibility of literature search reporting in medical education reviews. Academic medicine : journal of the Association of American Medical Colleges. 2011;86(8):1049–54.

Meert D, Torabi N, Costella J. Impact of librarians on reporting of the literature searching component of pediatric systematic reviews. Journal of the Medical Library Association : JMLA. 2016;104(4):267–77.

Morris M, Boruff JT, Gore GC. Scoping reviews: establishing the role of the librarian. Journal of the Medical Library Association : JMLA. 2016;104(4):346–54.

Koffel JB, Rethlefsen ML. Reproducibility of search strategies is poor in systematic reviews published in high-impact pediatrics, cardiology and surgery journals: a cross-sectional study. PLoS One. 2016;11(9):e0163309.

Fehrmann P, Thomas J. Comprehensive computer searches and reporting in systematic reviews. Research Synthesis Methods. 2011;2(1):15–32.

Booth A. Searching for qualitative research for inclusion in systematic reviews: a structured methodological review. Systematic Reviews. 2016;5(1):74.

Egger M, Juni P, Bartlett C, Holenstein F, Sterne J. How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study. Health technology assessment (Winchester, England). 2003;7(1):1–76.

Tricco AC, Tetzlaff J, Sampson M, Fergusson D, Cogo E, Horsley T, et al. Few systematic reviews exist documenting the extent of bias: a systematic review. J Clin Epidemiol. 2008;61(5):422–34.

Booth A. How much searching is enough? Comprehensive versus optimal retrieval for technology assessments. Int J Technol Assess Health Care. 2010;26(4):431–5.

Papaioannou D, Sutton A, Carroll C, Booth A, Wong R. Literature searching for social science systematic reviews: consideration of a range of search techniques. Health Inf Libr J. 2010;27(2):114–22.

Petticrew M. Time to rethink the systematic review catechism? Moving from ‘what works’ to ‘what happens’. Systematic Reviews. 2015;4(1):36.

Betrán AP, Say L, Gülmezoglu AM, Allen T, Hampson L. Effectiveness of different databases in identifying studies for systematic reviews: experience from the WHO systematic review of maternal morbidity and mortality. BMC Med Res Methodol. 2005;5

Felson DT. Bias in meta-analytic research. J Clin Epidemiol. 1992;45(8):885–92.

Franco A, Malhotra N, Simonovits G. Publication bias in the social sciences: unlocking the file drawer. Science. 2014;345(6203):1502–5.

Hartling L, Featherstone R, Nuspl M, Shave K, Dryden DM, Vandermeer B. Grey literature in systematic reviews: a cross-sectional study of the contribution of non-English reports, unpublished studies and dissertations to the results of meta-analyses in child-relevant reviews. BMC Med Res Methodol. 2017;17(1):64.

Schmucker CM, Blümle A, Schell LK, Schwarzer G, Oeller P, Cabrera L, et al. Systematic review finds that study data not published in full text articles have unclear impact on meta-analyses results in medical research. PLoS One. 2017;12(4):e0176210.

Egger M, Zellweger-Zahner T, Schneider M, Junker C, Lengeler C, Antes G. Language bias in randomised controlled trials published in English and German. Lancet (London, England). 1997;350(9074):326–9.

Moher D, Pham B, Lawson ML, Klassen TP. The inclusion of reports of randomised trials published in languages other than English in systematic reviews. Health technology assessment (Winchester, England). 2003;7(41):1–90.

Pham B, Klassen TP, Lawson ML, Moher D. Language of publication restrictions in systematic reviews gave different results depending on whether the intervention was conventional or complementary. J Clin Epidemiol. 2005;58(8):769–76.

Mills EJ, Kanters S, Thorlund K, Chaimani A, Veroniki A-A, Ioannidis JPA. The effects of excluding treatments from network meta-analyses: survey. BMJ : British Medical Journal. 2013;347

Hartling L, Featherstone R, Nuspl M, Shave K, Dryden DM, Vandermeer B. The contribution of databases to the results of systematic reviews: a cross-sectional study. BMC Med Res Methodol. 2016;16(1):127.

van Driel ML, De Sutter A, De Maeseneer J, Christiaens T. Searching for unpublished trials in Cochrane reviews may not be worth the effort. J Clin Epidemiol. 2009;62(8):838–44.e3.

Buchberger B, Krabbe L, Lux B, Mattivi JT. Evidence mapping for decision making: feasibility versus accuracy - when to abandon high sensitivity in electronic searches. German medical science : GMS e-journal. 2016;14:Doc09.

Lorenc T, Pearson M, Jamal F, Cooper C, Garside R. The role of systematic reviews of qualitative evidence in evaluating interventions: a case study. Research Synthesis Methods. 2012;3(1):1–10.

Gough D. Weight of evidence: a framework for the appraisal of the quality and relevance of evidence. Res Pap Educ. 2007;22(2):213–28.

Barroso J, Gollop CJ, Sandelowski M, Meynell J, Pearce PF, Collins LJ. The challenges of searching for and retrieving qualitative studies. West J Nurs Res. 2003;25(2):153–78.

Britten N, Garside R, Pope C, Frost J, Cooper C. Asking more of qualitative synthesis: a response to Sally Thorne. Qual Health Res. 2017;27(9):1370–6.

Booth A, Carroll C. Systematic searching for theory to inform systematic reviews: is it feasible? Is it desirable? Health Info Libr J. 2015;32(3):220–35.

Kwon Y, Powelson SE, Wong H, Ghali WA, Conly JM. An assessment of the efficacy of searching in biomedical databases beyond MEDLINE in identifying studies for a systematic review on ward closures as an infection control intervention to control outbreaks. Syst Rev. 2014;3:135.

Nussbaumer-Streit B, Klerings I, Wagner G, Titscher V, Gartlehner G. Assessing the validity of abbreviated literature searches for rapid reviews: protocol of a non-inferiority and meta-epidemiologic study. Systematic Reviews. 2016;5:197.

Wagner G, Nussbaumer-Streit B, Greimel J, Ciapponi A, Gartlehner G. Trading certainty for speed - how much uncertainty are decisionmakers and guideline developers willing to accept when using rapid reviews: an international survey. BMC Med Res Methodol. 2017;17(1):121.

Ogilvie D, Hamilton V, Egan M, Petticrew M. Systematic reviews of health effects of social interventions: 1. Finding the evidence: how far should you go? J Epidemiol Community Health. 2005;59(9):804–8.

Royle P, Milne R. Literature searching for randomized controlled trials used in Cochrane reviews: rapid versus exhaustive searches. Int J Technol Assess Health Care. 2003;19(4):591–603.

Pearson M, Moxham T, Ashton K. Effectiveness of search strategies for qualitative research about barriers and facilitators of program delivery. Eval Health Prof. 2011;34(3):297–308.

Levay P, Raynor M, Tuvey D. The Contributions of MEDLINE, Other Bibliographic Databases and Various Search Techniques to NICE Public Health Guidance. 2015. 2015;10(1):19.

Nussbaumer-Streit B, Klerings I, Wagner G, Heise TL, Dobrescu AI, Armijo-Olivo S, et al. Abbreviated literature searches were viable alternatives to comprehensive searches: a meta-epidemiological study. J Clin Epidemiol. 2018;102:1–11.

Briscoe S, Cooper C, Glanville J, Lefebvre C. The loss of the NHS EED and DARE databases and the effect on evidence synthesis and evaluation. Res Synth Methods. 2017;8(3):256–7.

Stansfield C, O'Mara-Eves A, Thomas J. Text mining for search term development in systematic reviewing: A discussion of some methods and challenges. Research Synthesis Methods.n/a-n/a.

Petrova M, Sutcliffe P, Fulford KW, Dale J. Search terms and a validated brief search filter to retrieve publications on health-related values in Medline: a word frequency analysis study. Journal of the American Medical Informatics Association : JAMIA. 2012;19(3):479–88.

Stansfield C, Thomas J, Kavanagh J. 'Clustering' documents automatically to support scoping reviews of research: a case study. Res Synth Methods. 2013;4(3):230–41.

Methley AM, Campbell S, Chew-Graham C, McNally R, Cheraghi-Sohi S. PICO, PICOS and SPIDER: a comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews. BMC Health Serv Res. 2014;14:579.

Booth A. Clear and present questions: formulating questions for evidence based practice. Library Hi Tech. 2006;24(3):355–68.

Cooke A, Smith D, Booth A. Beyond PICO: the SPIDER tool for qualitative evidence synthesis. Qual Health Res. 2012;22(10):1435–43.

Whiting P, Westwood M, Bojke L, Palmer S, Richardson G, Cooper J, et al. Clinical effectiveness and cost-effectiveness of tests for the diagnosis and investigation of urinary tract infection in children: a systematic review and economic model. Health technology assessment (Winchester, England). 2006;10(36):iii-iv, xi-xiii, 1–154.

Cooper C, Levay P, Lorenc T, Craig GM. A population search filter for hard-to-reach populations increased search efficiency for a systematic review. J Clin Epidemiol. 2014;67(5):554–9.

Hausner E, Waffenschmidt S, Kaiser T, Simon M. Routine development of objectively derived search strategies. Systematic Reviews. 2012;1(1):19.

Hausner E, Guddat C, Hermanns T, Lampert U, Waffenschmidt S. Prospective comparison of search strategies for systematic reviews: an objective approach yielded higher sensitivity than a conceptual one. J Clin Epidemiol. 2016;77:118–24.

Craven J, Levay P. Recording database searches for systematic reviews - what is the value of adding a narrative to peer-review checklists? A case study of nice interventional procedures guidance. Evid Based Libr Inf Pract. 2011;6(4):72–87.

Wright K, Golder S, Lewis-Light K. What value is the CINAHL database when searching for systematic reviews of qualitative studies? Syst Rev. 2015;4:104.

Beckles Z, Glover S, Ashe J, Stockton S, Boynton J, Lai R, et al. Searching CINAHL did not add value to clinical questions posed in NICE guidelines. J Clin Epidemiol. 2013;66(9):1051–7.

Cooper C, Rogers M, Bethel A, Briscoe S, Lowe J. A mapping review of the literature on UK-focused health and social care databases. Health Inf Libr J. 2015;32(1):5–22.

Younger P, Boddy K. When is a search not a search? A comparison of searching the AMED complementary health database via EBSCOhost, OVID and DIALOG. Health Inf Libr J. 2009;26(2):126–35.

Lam MT, McDiarmid M. Increasing number of databases searched in systematic reviews and meta-analyses between 1994 and 2014. Journal of the Medical Library Association : JMLA. 2016;104(4):284–9.

Bethel A, editor Search summary tables for systematic reviews: results and findings. HLC Conference 2017a.

Aagaard T, Lund H, Juhl C. Optimizing literature search in systematic reviews - are MEDLINE, EMBASE and CENTRAL enough for identifying effect studies within the area of musculoskeletal disorders? BMC Med Res Methodol. 2016;16(1):161.

Adams CE, Frederick K. An investigation of the adequacy of MEDLINE searches for randomized controlled trials (RCTs) of the effects of mental health care. Psychol Med. 1994;24(3):741–8.

Kelly L, St Pierre-Hansen N. So many databases, such little clarity: searching the literature for the topic aboriginal. Canadian family physician Medecin de famille canadien. 2008;54(11):1572–3.

Lawrence DW. What is lost when searching only one literature database for articles relevant to injury prevention and safety promotion? Injury Prevention. 2008;14(6):401–4.

Lemeshow AR, Blum RE, Berlin JA, Stoto MA, Colditz GA. Searching one or two databases was insufficient for meta-analysis of observational studies. J Clin Epidemiol. 2005;58(9):867–73.

Sampson M, Barrowman NJ, Moher D, Klassen TP, Pham B, Platt R, et al. Should meta-analysts search Embase in addition to Medline? J Clin Epidemiol. 2003;56(10):943–55.

Stevinson C, Lawlor DA. Searching multiple databases for systematic reviews: added value or diminishing returns? Complementary Therapies in Medicine. 2004;12(4):228–32.

Suarez-Almazor ME, Belseck E, Homik J, Dorgan M, Ramos-Remus C. Identifying clinical trials in the medical literature with electronic databases: MEDLINE alone is not enough. Control Clin Trials. 2000;21(5):476–87.

Taylor B, Wylie E, Dempster M, Donnelly M. Systematically retrieving research: a case study evaluating seven databases. Res Soc Work Pract. 2007;17(6):697–706.

Beyer FR, Wright K. Can we prioritise which databases to search? A case study using a systematic review of frozen shoulder management. Health Info Libr J. 2013;30(1):49–58.

Duffy S, de Kock S, Misso K, Noake C, Ross J, Stirk L. Supplementary searches of PubMed to improve currency of MEDLINE and MEDLINE in-process searches via Ovid. Journal of the Medical Library Association : JMLA. 2016;104(4):309–12.

Katchamart W, Faulkner A, Feldman B, Tomlinson G, Bombardier C. PubMed had a higher sensitivity than Ovid-MEDLINE in the search for systematic reviews. J Clin Epidemiol. 2011;64(7):805–7.

Cooper C, Lovell R, Husk K, Booth A, Garside R. Supplementary search methods were more effective and offered better value than bibliographic database searching: a case study from public health and environmental enhancement (in Press). Research Synthesis Methods. 2017;

Cooper C, Booth A, Britten N, Garside R. A comparison of results of empirical studies of supplementary search techniques and recommendations in review methodology handbooks: a methodological review (in press). BMC Systematic Reviews. 2017.

Greenhalgh T, Peacock R. Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources. BMJ (Clinical research ed). 2005;331(7524):1064–5.

Hinde S, Spackman E. Bidirectional citation searching to completion: an exploration of literature searching methods. PharmacoEconomics. 2015;33(1):5–11.

Levay P, Ainsworth N, Kettle R, Morgan A. Identifying evidence for public health guidance: a comparison of citation searching with web of science and Google scholar. Res Synth Methods. 2016;7(1):34–45.

McManus RJ, Wilson S, Delaney BC, Fitzmaurice DA, Hyde CJ, Tobias RS, et al. Review of the usefulness of contacting other experts when conducting a literature search for systematic reviews. BMJ (Clinical research ed). 1998;317(7172):1562–3.

Westphal A, Kriston L, Holzel LP, Harter M, von Wolff A. Efficiency and contribution of strategies for finding randomized controlled trials: a case study from a systematic review on therapeutic interventions of chronic depression. Journal of public health research. 2014;3(2):177.

Matthews EJ, Edwards AG, Barker J, Bloor M, Covey J, Hood K, et al. Efficient literature searching in diffuse topics: lessons from a systematic review of research on communicating risk to patients in primary care. Health Libr Rev. 1999;16(2):112–20.

Bethel A. Endnote Training (YouTube Videos) 2017b [Available from: http://medicine.exeter.ac.uk/esmi/workstreams/informationscience/is_resources,_guidance_&_advice/].

Bramer WM, Giustini D, de Jonge GB, Holland L, Bekhuis T. De-duplication of database search results for systematic reviews in EndNote. Journal of the Medical Library Association : JMLA. 2016;104(3):240–3.

Bramer WM, Milic J, Mast F. Reviewing retrieved references for inclusion in systematic reviews using EndNote. Journal of the Medical Library Association : JMLA. 2017;105(1):84–7.

Gall C, Brahmi FA. Retrieval comparison of EndNote to search MEDLINE (Ovid and PubMed) versus searching them directly. Medical reference services quarterly. 2004;23(3):25–32.

Ahmed KK, Al Dhubaib BE. Zotero: a bibliographic assistant to researcher. J Pharmacol Pharmacother. 2011;2(4):303–5.

Coar JT, Sewell JP. Zotero: harnessing the power of a personal bibliographic manager. Nurse Educ. 2010;35(5):205–7.

Moher D, Liberati A, Tetzlaff J, Altman DG, The PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.

Sampson M, McGowan J, Tetzlaff J, Cogo E, Moher D. No consensus exists on search reporting methods for systematic reviews. J Clin Epidemiol. 2008;61(8):748–54.

Toews LC. Compliance of systematic reviews in veterinary journals with preferred reporting items for systematic reviews and meta-analysis (PRISMA) literature search reporting guidelines. Journal of the Medical Library Association : JMLA. 2017;105(3):233–9.

Booth A. "brimful of STARLITE": toward standards for reporting literature searches. Journal of the Medical Library Association : JMLA. 2006;94(4):421–9. e205

Faggion CM Jr, Wu YC, Tu YK, Wasiak J. Quality of search strategies reported in systematic reviews published in stereotactic radiosurgery. Br J Radiol. 2016;89(1062):20150878.

Mullins MM, DeLuca JB, Crepaz N, Lyles CM. Reporting quality of search methods in systematic reviews of HIV behavioral interventions (2000–2010): are the searches clearly explained, systematic and reproducible? Research Synthesis Methods. 2014;5(2):116–30.

Yoshii A, Plaut DA, McGraw KA, Anderson MJ, Wellik KE. Analysis of the reporting of search strategies in Cochrane systematic reviews. Journal of the Medical Library Association : JMLA. 2009;97(1):21–9.

Bigna JJ, Um LN, Nansseu JR. A comparison of quality of abstracts of systematic reviews including meta-analysis of randomized controlled trials in high-impact general medicine journals before and after the publication of PRISMA extension for abstracts: a systematic review and meta-analysis. Syst Rev. 2016;5(1):174.

Akhigbe T, Zolnourian A, Bulters D. Compliance of systematic reviews articles in brain arteriovenous malformation with PRISMA statement guidelines: review of literature. Journal of clinical neuroscience : official journal of the Neurosurgical Society of Australasia. 2017;39:45–8.

Tao KM, Li XQ, Zhou QH, Moher D, Ling CQ, Yu WF. From QUOROM to PRISMA: a survey of high-impact medical journals' instructions to authors and a review of systematic reviews in anesthesia literature. PLoS One. 2011;6(11):e27611.

Wasiak J, Tyack Z, Ware R, Goodwin N, Faggion CM Jr. Poor methodological quality and reporting standards of systematic reviews in burn care management. International Wound Journal. 2016.

Tam WW, Lo KK, Khalechelvam P. Endorsement of PRISMA statement and quality of systematic reviews and meta-analyses published in nursing journals: a cross-sectional study. BMJ Open. 2017;7(2):e013905.

Rader T, Mann M, Stansfield C, Cooper C, Sampson M. Methods for documenting systematic review searches: a discussion of common issues. Res Synth Methods. 2014;5(2):98–115.

Atkinson KM, Koenka AC, Sanchez CE, Moshontz H, Cooper H. Reporting standards for literature searches and report inclusion criteria: making research syntheses more transparent and easy to replicate. Res Synth Methods. 2015;6(1):87–95.

McGowan J, Sampson M, Salzwedel DM, Cogo E, Foerster V, Lefebvre C. PRESS peer review of electronic search strategies: 2015 guideline statement. J Clin Epidemiol. 2016;75:40–6.

Sampson M, McGowan J, Cogo E, Grimshaw J, Moher D, Lefebvre C. An evidence-based practice guideline for the peer review of electronic search strategies. J Clin Epidemiol. 2009;62(9):944–52.

Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ (Clinical research ed). 2017;358.

Whiting P, Savović J, Higgins JPT, Caldwell DM, Reeves BC, Shea B, et al. ROBIS: a new tool to assess risk of bias in systematic reviews was developed. J Clin Epidemiol. 2016;69:225–34.

Relevo R, Balshem H. Finding evidence for comparing medical interventions: AHRQ and the effective health care program. J Clin Epidemiol. 2011;64(11):1168–77.

Institute of Medicine. Standards for Systematic Reviews 2011 [Available from: http://www.nationalacademies.org/hmd/Reports/2011/Finding-What-Works-in-Health-Care-Standards-for-Systematic-Reviews/Standards.aspx].

CADTH: Resources 2018.

Acknowledgements

CC acknowledges the supervision offered by Professor Chris Hyde.

This publication forms a part of CC’s PhD. CC’s PhD was funded through the National Institute for Health Research (NIHR) Health Technology Assessment (HTA) Programme (Project Number 16/54/11). The open access fee for this publication was paid for by Exeter Medical School.

RG and NB were partially supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care South West Peninsula.

The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

Author information

Authors and affiliations.

Institute of Health Research, University of Exeter Medical School, Exeter, UK

Chris Cooper & Jo Varley-Campbell

HEDS, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK

Andrew Booth

Nicky Britten

European Centre for Environment and Human Health, University of Exeter Medical School, Truro, UK

Ruth Garside

Contributions

CC conceived the idea for this study and wrote the first draft of the manuscript. CC discussed this publication in PhD supervision with AB and separately with JVC. CC revised the publication with input and comments from AB, JVC, RG and NB. All authors revised the manuscript prior to submission. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Chris Cooper.

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1: Appendix tables and PubMed search strategy. Key studies used for pearl growing per key stage, working data extraction tables and the PubMed search strategy. (DOCX 30 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Cooper, C., Booth, A., Varley-Campbell, J. et al. Defining the process to literature searching in systematic reviews: a literature review of guidance and supporting studies. BMC Med Res Methodol 18, 85 (2018). https://doi.org/10.1186/s12874-018-0545-3

Received: 20 September 2017

Accepted: 06 August 2018

Published: 14 August 2018

DOI: https://doi.org/10.1186/s12874-018-0545-3

  • Literature Search Process
  • Citation Chasing
  • Tacit Models
  • Unique Guidance
  • Information Specialists

BMC Medical Research Methodology

ISSN: 1471-2288

Charles Sturt University

Literature Review: Developing a search strategy

  • Traditional or narrative literature reviews
  • Scoping Reviews
  • Systematic literature reviews
  • Annotated bibliography
  • Keeping up to date with literature
  • Finding a thesis
  • Evaluating sources and critical appraisal of literature
  • Managing and analysing your literature
  • Further reading and resources

From research question to search strategy

Keeping a record of your search activity

Good search practice could involve keeping a search diary or document detailing your search activities (Phelps et al. 2007, pp. 128-149), so that you can keep track of effective search terms and help others reproduce your steps and get the same results.

This record could be a document, table or spreadsheet with:

  • The names of the sources you searched and the provider you accessed them through, e.g. Medline (Ovid), Web of Science (Thomson Reuters). You should also include any other literature sources you used.
  • How you searched (keyword and/or subject headings)
  • Which search terms you used (which words and phrases)
  • Any search techniques you employed (truncation, adjacency, etc.)
  • How you combined your search terms (AND/OR). Check out the Database Help guide for more tips on Boolean searching.
  • The number of search results from each source and each strategy used. This can provide the evidence you need to demonstrate a gap in the literature and to confirm the importance of your research question.

You will run a number of searches as your initial search evolves. As your thesis, discussions and argument develop, you will search for further evidence and support from the literature. Include each of these searches in your search record.
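
If you keep your search record as a spreadsheet, the items listed above map naturally onto columns. The following is a minimal Python sketch of that idea, not a prescribed format: the field names, the `SearchRecord` class, the `write_log` helper and the example row (including its result count) are all invented for illustration.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SearchRecord:
    """One row of a search diary (field names are illustrative only)."""
    search_date: str    # date the search was run, ISO format
    source: str         # database or other literature source
    provider: str       # platform the source was accessed through
    search_type: str    # keyword, subject heading, or both
    search_string: str  # exact terms, techniques and AND/OR combinations
    results: int        # number of records returned


def write_log(records, handle):
    """Write the records to an open text handle as CSV with a header row."""
    names = [f.name for f in fields(SearchRecord)]
    writer = csv.DictWriter(handle, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


# Placeholder entry: the date, search string and count are made up.
log = [
    SearchRecord("2024-08-01", "Medline", "Ovid", "keyword",
                 '"aged care" AND cranberr*', 12),
]

buffer = io.StringIO()
write_log(log, buffer)
print(buffer.getvalue())
```

Adding one row per search, including the searches you run later as your argument develops, keeps the whole search history in a form that others can rerun.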

A search planner may help you to organise your thoughts before you conduct your search. If you have any problems organising your thoughts before, during or after searching, please contact your Library Faculty Team for individual help.

  • Literature search - a librarian's handout to introduce tools, terms and techniques. Created by Elsevier librarian, Katy Kavanagh Web, this document outlines tools, terms and techniques to think about when conducting a literature search.
  • Search planner

Literature search cycle

Diagram text description

This diagram illustrates the literature search cycle as a circle divided into quarters, each paired with a rectangle of guidance. Top left quarter: identify main concepts, with the rectangle describing how to do this by identifying controlled vocabulary terms, synonyms, keywords and spelling. Top right quarter: select library resources to search, with the rectangle listing the library catalogue, relevant journal articles and other resources. Bottom right quarter: search resources, with the rectangle suggesting Boolean searching, proximity searching and truncated searching techniques. Bottom left quarter: review and refine results, with the rectangle suggesting that you evaluate results, rethink keywords and create alerts.

Have a search framework

Search frameworks are mnemonics which can help you focus your research question. They are also useful in helping you to identify the concepts and terms you will use in your literature search.

PICO is a search framework commonly used in the health sciences to focus clinical questions.  As an example, you work in an aged care facility and are interested in whether cranberry juice might help reduce the common occurrence of urinary tract infections.  The PICO framework would look like this:

Population/Patient/Problem

  People living in aged care facilities

Intervention

  Cranberry juice

Comparison

  No cranberry juice (status quo)

Outcome

  Prevention of UTIs

Now that the issue has been broken up to its elements, it is easier to turn it into an answerable research question: “Does cranberry juice help reduce urinary tract infections in people living in aged care facilities?”

Other frameworks may be helpful, depending on your question and your field of interest. PICO can be adapted to PICOT (which adds Time), PICOS (which adds Study design), or PICOC (which adds Context).

For qualitative questions you could use:

  • SPIDER: Sample, Phenomenon of Interest, Design, Evaluation, Research type

For questions about causes or risk:

  • PEO: Population, Exposure, Outcomes

For evaluations of interventions or policies:

  • SPICE: Setting, Population or Perspective, Intervention, Comparison, Evaluation, or
  • ECLIPSE: Expectation, Client group, Location, Impact, Professionals, SErvice

See the University of Notre Dame Australia’s examples of some of these frameworks. 

You can also try some PICO examples in the National Library of Medicine's PubMed training site: Using PICO to frame clinical questions.

If you use the elements of your search framework to combine terms, you may find you have narrowed the search too much and will struggle to find relevant studies. Try using only the most critical elements from the mnemonic for concepts to search. For example, in a PICO search, you would sometimes exclude the O (outcome) terms in your search strategy as the outcomes may come from combining the other terms. If the C (comparison) is the status quo, you wouldn't use those terms either. Try to avoid concepts that have vague or broad meanings, such as benefits or health effects.
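
To make the advice above concrete, here is a small Python sketch that turns the cranberry juice example into a Boolean search string. It assumes a generic syntax (double quotes for phrases, `*` for truncation, OR within a concept, AND between concepts) that will need adapting to each database; the synonyms and the helper names `or_group` and `build_query` are invented for illustration. Following the guidance, only the population and intervention concepts are searched, and the comparison and outcome terms are left out.

```python
def or_group(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts, include):
    """AND together the OR-groups of the selected concepts only."""
    return " AND ".join(or_group(concepts[key]) for key in include)


# Illustrative synonyms only; a real search would also use subject headings.
pico = {
    "population": ["aged care", "nursing home", "residential care"],
    "intervention": ["cranberr*"],
    "comparison": [],  # status quo, so there is nothing to search for
    "outcome": ["urinary tract infection", "UTI"],
}

# Search only the most critical elements: population and intervention.
query = build_query(pico, include=["population", "intervention"])
print(query)
# ("aged care" OR "nursing home" OR "residential care") AND (cranberr*)
```

If this search returns too many irrelevant results, adding "outcome" to the `include` list narrows it by one more AND-ed concept, which is exactly the trade-off described above.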

Ask for help and advice!

Contact Your Faculty Team Librarian

Faculty librarians are here to provide assistance to students, researchers and academic staff by providing expert searching advice, research and curriculum support.

  • Faculty of Arts & Education team
  • Faculty of Business, Justice & Behavioural Science team
  • Faculty of Science team

  • Last Updated: Aug 11, 2024 4:07 PM
  • URL: https://libguides.csu.edu.au/review

Charles Sturt University is an Australian University, TEQSA Provider Identification: PRV12018. CRICOS Provider: 00005F.

Literature Searching

In this guide.

  • Introduction
  • Steps for searching the literature in PubMed
  • Step 1 - Formulate a search question
  • Step 2- Identify primary concepts and gather synonyms
  • Step 3 - Locate subject headings (MeSH)
  • Step 4 - Combine concepts using Boolean operators
  • Step 5 - Refine search terms and search in PubMed
  • Step 6 - Apply limits

Students and researchers in the health sciences are often required to conduct literature searches for a number of reasons, including identifying appropriate studies and methods to include in a literature review manuscript. Understanding the basics of database searching can allow you to effectively and efficiently find the information you need. This guide takes you through the process of developing an advanced, robust literature search in PubMed.

While the guide is based on searching in the PubMed database, the strategies can be applied appropriately to other databases, such as Embase, CINAHL, PsycINFO, etc. (see search syntax for more information on search translation). For more information on searching in other databases, attend Lane's Literature Reviews Beyond PubMed: Crafting Effective Searches in Other Databases course.

If you're interested in conducting a systematic review, please visit Introduction to Systematic Reviews.

Research Services

  • Literature Searches
  • Data Service
  • Research Metrics Service
  • Authoring and Publishing Support

Literature searches and reviews of a systematic nature are a value-added service available to current members of the Stanford Medicine community. To qualify for the service, the project lead and point-of-contact to Lane must be a Stanford affiliate.

Lane Medical Library staff can help you create your search strategy, collaborate on systematic reviews and other knowledge syntheses, and provide guidance on documentation, processes, and tools. Reviews or projects the team can help you with include, but are not limited to:

  • Systematic Review
  • Meta-Analysis
  • Literature Review for Grant Application
  • Thesis or Dissertation
  • Course Assignment
  • Scoping Review
  • Book Chapter
  • Thesis Report

If you are interested in collaborating with a Lane Medical Librarian on a review project, please submit a literature search request.

Data management and sharing is a component of Open Science, which aims to make scientific research more transparent and accessible. Proper data management and sharing benefit you as an individual and the research community as a whole. Lane's data service provides: 

  • Best practices related to data management and sharing
  • Assistance in complying with requirements related to the management and sharing of research data (e.g. from a publisher or funder)
  • Consultations related to research data management, data security, data publishing, data curation, and long-term preservation
  • Workshops and classes related to best practices in data management and sharing

For more information, visit the Data Management and Sharing guide .

Research metrics measure the impact of a scholar, article, book, journal, or research institution. Metrics can be informed through different approaches, such as citation counts, that can add to a broader understanding of impact. Lane's research metrics service assists with:

  • Understanding, measuring and using bibliometrics and other statistical analysis of publications 
  • Using alternative metrics (also known as altmetrics) 
  • Common tools for assessing research impact

For more information on research metrics, visit the Research Impact guide.

Lane librarians can help you with authoring and publishing your manuscript, including:

  • Learning about unique author identifiers and how to create an ORCID ID
  • Finding journals relevant to your areas of research/interest for possible submission
  • Finding journal impact factors
  • Finding instructions for authors
  • Exploring alternate publishing models, including open access journals
  • Verifying citations

We also provide discounted Open Access author processing fees. For more information, visit our Understanding Open Access guide.  

  • Last Updated: Jul 10, 2024 2:17 PM
  • URL: https://laneguides.stanford.edu/LitSearch


Literature Reviews & Search Strategies

  • Defining the Literature Review
  • Types of Literature Reviews
  • Choosing Databases

Overview of Search Strategies


  • Organizing Your Literature
  • Books: Research Design & Scholarly Writing
  • Recommended Tutorials

There are many ways to find literature for your review, and we recommend that you use a combination of strategies - keeping in mind that you're going to be searching multiple times in a variety of ways, using different databases and resources. Searching the literature is not a straightforward, linear process - it's iterative (translation: you'll search multiple times, modifying your strategies as you go, and sometimes it'll be frustrating). 

  • Known Item Searching
  • Citation Jumping

Some form of a keyword search is the way most of us get at scholarly articles in a database - it's a great approach! Make sure you're familiar with these librarian strategies to get the most out of your searches.

Figuring out the best keywords for your research topic/question is a process - you'll start with one or a few words and then shift, adapt, and expand them as you start finding sources that describe the topic using other words. Your search terms are the bridge between known topics and the unknowns of your research question - so sometimes one specific word will be enough, sometimes you'll need several different words to describe a concept AND you'll need to connect that concept to a second (and/or third) concept.

The number and specificity of your search terms depend on your topic and the scope of your literature review.

Connect Keywords Using Boolean

Boolean Connector: Purpose

  • AND: Connects different concepts (keywords). Narrows down the number of results.
  • OR: Connects synonyms. Expands the number of results.
  • NOT: Excludes a concept. Use with extreme caution (even librarians don't use this one much).

Make the Database Work More

Truncation

...uses the asterisk (*) to end a word at its core, allowing you to retrieve many more documents containing variations of the search term. Example: educat* will find educate, educates, education, educators, educating and more.

Phrase Searching

...is when you put quotation marks around two or more words, so that the database looks for those words in that exact order. Examples: "higher education," "public health" and "pharmaceutical industry."

Controlled Vocabulary

... is when you use the terms the database uses to describe what each article is about as search terms. Searching using controlled vocabularies is a great way to get at everything on a topic in a database.  

Databases and search engines are probably going to bring back a lot of results - more than a human can realistically go through. Instead of trying to manually read and sort them all, use the filters in each database to remove the stuff you wouldn't use anyway (ie it's outside the scope of your project).

To make sure you're consistent between searches and databases, write down the filters you're using.
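One way to keep that record is a simple search log. The sketch below is only an illustration: the column names and the example row are assumptions, not a required format, so adapt them to whatever your project needs to report.

```python
import csv
import io

# Hypothetical column names for a search log; adjust to your own needs.
FIELDS = ["date", "database", "search_string", "filters", "results"]

def format_search_log(rows):
    """Return the logged searches as CSV text, one row per search run."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Illustrative entry only; the numbers are made up.
searches = [
    {
        "date": "2024-01-15",
        "database": "PubMed",
        "search_string": "educat*",
        "filters": "English; last 5 years",
        "results": 120,
    },
]

print(format_search_log(searches))
```

Keeping the filters in their own column makes it easy to see whether the same limits were applied in every database.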

A Few Filters to Try

Filter How it's Useful
This is a filter you'll see in MCPHS Smart Search and the Catalog. It's a way to narrow your results to things immediately available through the MCPHS collections; when removed, you'll also see previews of results that we'll get for you through Interlibrary Loan. If you're doing graduate-level or more advanced work, you'll want to remove any location limiters, because relevance to your topic is more important.
You may want to limit the search results you're seeing based on when they were published. For example, evidence-based medicine often involves looking at research from the last five years, while a project taking a historical perspective will want to include work going further back in time.
The Library collects resources in English, but various databases index articles (including their abstracts) in a variety of languages. Consider limiting your search results to just those published in languages that you can read research in. (Note: your professor/advisor may have additional language restrictions, so if you're including research in multiple languages make sure that works for them too.)
Literature reviews usually rely rather narrowly on various scholarly or academic sources, rather than the full spectrum of sources available to you in the world. Consider limiting your results to Academic or Scholarly articles.
Sometimes you just need a systematic review, empirical study, or some other form of research. Subject specific databases will almost always offer a way to narrow down your results by methodology (article type). When the filter isn't available you can add your method of choice as an additional keyword!
These are just a start! When you get to a new database, pause and take a look around. Figure out how the database can do some of the work for you - you can always turn off a filter if you don't like what it does.

Once you know you have a good article, there are a lot of useful parts to it - far beyond the content.

Not sure where to start? Try course readings and other required materials.

Useful Parts of a Good Article

Useful Part Explanation
Look at the author-generated keywords, the database subject headings, the title, abstract and introduction for words that may be great additional/alternative search terms. You don't have to know everything about a topic before you start searching - let what you find introduce you to the language of the field.
Author(s): If they've written one article on this topic, they may have written more. Click on the author names to see what else they have in the database, or use their names (individually) as a search term elsewhere.
Journal: They may have published other articles on your topic; sometimes there's even a special issue wholly focused on a single topic. Consider browsing or searching within a specific publication. Oftentimes you'll end up searching in the journal's website.
Instruments: If authors have already created and validated an instrument (survey, test, or measure), consider whether you can use/adapt it for your own work. Look for details in the methods section, an original citation in the references, and/or a copy in the appendix.
Experts on this topic have gathered and evaluated these sources, so make sure you look through them for potential sources for your own work.

Ways to Use Citations

By using the references from the end of an article you'll move backwards and laterally in time to connected literature in the field. This is a great way to find other relevant articles as well as foundational research in the field.
By using a citation searching database (e.g. Scopus or Google Scholar) you can move forward and laterally in time to connect to newer literature in the field. This is a great way to find more relevant articles in a field as well as get a sense of how significant the article you're starting from is to the field as a whole.
  • Interactive Tutorial: Searching Cited and Citing Practice starting your search at an article and using the references to gather additional sources.

Older sources feed into the found article as references, and the found article is cited by more recent publications.
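The two directions of citation searching can be sketched with a tiny citation map. Everything here is made up for illustration: the paper names are placeholders, and a real citation database would supply the links.

```python
# Hypothetical citation data: each paper maps to the papers it cites.
references = {
    "found_article": ["older_1", "older_2"],
    "newer_1": ["found_article", "older_1"],
    "newer_2": ["found_article"],
}

def backward(paper):
    """Move back in time: the sources listed in the paper's reference list."""
    return sorted(references.get(paper, []))

def forward(paper):
    """Move forward in time: later papers whose reference lists include this paper."""
    return sorted(p for p, cited in references.items() if paper in cited)

print(backward("found_article"))  # ['older_1', 'older_2']
print(forward("found_article"))   # ['newer_1', 'newer_2']
```

Backward searching only needs the article itself, while forward searching needs a database that tracks who cites whom.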

Your search results don't have to be frozen in the moment you search! There are a few things you can set up to keep your search going automatically.

Alerts What it Does for You Example
  Receive an email each time a new issue is published. This is a great way to read the most current research being published in leading journals in your field. Or consider following journals on social media, via an RSS feed, or app like If This Then That - these tools are great ways to stay up on the new research out there.      Set up an account with the to receive eAlerts.  
You can create a personal account in most databases so that once you've fine-tuned your search terms and filters on a topic, you can easily rerun the search manually (going back to the database), or set it so the database automatically runs the search on a schedule and emails you any results. Make those tools work for you!  In PubMed, set up to start saving your searching and creating alerts.
Similar to the database alerts, you can tell Google Scholar to email you whenever certain words or phrases (including authors, institutions, methods, keywords, etc.) appear in new search results.  In settings, and then enter the word/phrase you want them to email you about. 

Searching using subject headings is a comprehensive search strategy that requires some planning and topic knowledge. Work through this PubMed tutorial for an introduction to this important approach to searching.

tutorial on PubMed Subject Search: How it Works

Through these videos and the accompanying PDF, you'll see an example of starting with a potential research question and developing search terms through brainstorming and keyword searching.

  • Slidedeck: Keywords and Advanced Search PowerPoint slides to accompany the two demonstration videos on developing keywords from a question, and doing an advanced search.
  • Last Updated: Jun 14, 2023 11:18 AM
  • URL: https://mcphs.libguides.com/litreviews

Researching for your literature review: Develop a search strategy


Identify key terms and concepts

Start developing a search strategy by identifying the key words and concepts within your research question. 

For example: How do students view inclusive educational practices in schools?

Treat each component as a separate concept (there are usually between 2-4 concepts).

For each concept list the key words derived from your research question, as well as any other relevant terms or synonyms that you have found in your preliminary searches. Also consider singular and plural forms of words, variant spellings, acronyms and relevant index terms (subject headings).  

Students: pupil, young people, children
View: perception, perspective
Inclusive educational practices: special education needs, SEN, belonging, disability
Schools: education, primary, high school, secondary

As part of the process of developing a search strategy, it is recommended that you keep a master list of search terms for each key concept. This will make it easier when it comes to translating your search strategy across multiple database platforms. 

Concept map template for documenting search terms

Combine search terms and concepts

Boolean operators are used to combine the different concepts in your topic to form a search strategy. The main operators used to connect your terms are AND and OR. See an explanation below:

  • Link keywords related to a single concept with OR
  • Linking with OR broadens a search (increases the number of results) by searching for any of the alternative keywords

Example: perspective  OR attitude

  • Link different concepts with AND
  • Linking with AND narrows a search (reduces the number of results) by retrieving only those records that include all of your specified keywords

Example: inclusive education  AND student perspective

  • Using NOT narrows a search by excluding certain search terms
  • Most searches do not require the use of the NOT operator

Example: education  NOT higher education  will retrieve all results that include the word education  but don’t contain the phrase  higher education .
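The effect of the three operators can be pictured as set operations on record numbers. This is a minimal sketch with an invented index (the record IDs are illustrative only); real databases do the same kind of combining behind the scenes.

```python
# Hypothetical index: which record IDs mention each term (illustrative data only).
index = {
    "perspective": {1, 2, 5},
    "attitude": {2, 3},
    "inclusive education": {1, 3, 4},
    "higher education": {4},
}

# OR: union of alternative keywords for one concept, so the result set grows.
views = index["perspective"] | index["attitude"]

# AND: intersection of different concepts, so the result set shrinks.
both = views & index["inclusive education"]

# NOT: set difference; record 4 is dropped even if it is otherwise relevant.
without_higher = index["inclusive education"] - index["higher education"]

print(sorted(views))           # [1, 2, 3, 5]
print(sorted(both))            # [1, 3]
print(sorted(without_higher))  # [1, 3]
```

The NOT line shows why the operator is rarely needed: any record that mentions the excluded term disappears, whatever else it covers.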

See the website for venn diagrams demonstrating the function of AND/OR/NOT:

Combine the search terms using Boolean

Advanced search operators - truncation and wildcards

Use symbols to retrieve word variations:

The truncation symbol is commonly an asterisk * and is added at the end of a word.

  • The asterisk applied to the root of a word captures other endings to that root word making it useful for retrieving singular, plural and other variations of a keyword.

Example: educat* will retrieve education, educators, educational, etc.

Note: If you don't want to retrieve all possible variations, an easy alternative is to utilise the OR operator instead e.g. education OR educational.

The wildcard symbols include the question mark ? and hash #. They replace zero, one or more characters in the middle of a word.

Example: wom#n finds woman or women, behavio?r finds behaviour or behavior.

The symbols may vary in different databases - See the Database search tips guide for details or check the Help link in any database.
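To see what these symbols do, it can help to translate them into regular expressions. The sketch below assumes one common convention (* for any ending, ? for zero or one character, # for exactly one character); since symbols vary between databases, treat the mapping as an assumption rather than a rule.

```python
import re

def term_to_regex(term):
    """Translate a search term containing *, ? or # into a regular expression.

    Assumed convention (databases differ): * = any ending,
    ? = zero or one character, # = exactly one character.
    """
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w?")
        elif ch == "#":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    # Word boundaries keep the match to whole words.
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)

def matches(term, text):
    """True if the term, with its symbols expanded, occurs in the text."""
    return term_to_regex(term).search(text) is not None

print(matches("educat*", "Educators in primary schools"))  # True
print(matches("wom#n", "a study of women"))                # True
print(matches("behavio?r", "behaviour change"))            # True
print(matches("behavio?r", "behaviours"))                  # False
```

The last line is a reminder that a wildcard in the middle of a word does not also cover plural endings; that would need truncation as well.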

Phrase searching

Use quotes to keep word order when searching for phrases.

For phrase searching, place two or more words in "inverted commas" or "quote marks".

Example: “inclusive education”

In some databases, words may be searched separately if the quote marks are not used. In other databases, word order may be maintained without the need for quote marks.

See the Database search tips for details on phrase searching in key databases, or check the Help link in any database.

Subject headings (index terms)

Identify appropriate subject headings (index terms).

Many databases use subject headings to index content. These are selected from a controlled list and describe what the article is about. 

A comprehensive search strategy is often best achieved by using a combination of keywords and subject headings where possible.

In-depth knowledge of subject headings is not required for users to benefit from improved search performance using them in their searches.

Advantages of subject searching:

  • Helps locate articles that use synonyms, variant spellings, plurals
  • Search terms don’t have to appear in the title or abstract

Note: Subject headings are often unique to a particular database, so you will need to look for appropriate subject headings in each database you intend to use.

Subject headings are not available for every topic, and it is best to only select them if they relate closely to your area of interest.

Create a gold set

It is useful to build a ‘sample set’ or ‘gold set’ of relevant references before you develop your search strategy.

Sources for a 'gold set' may include:

  • key papers recommended by subject experts or supervisors
  • citation searching - looking at a reference list to see who has been cited, or using a citation database (eg. Scopus, Web of Science) to see who has cited a known relevant article
  • results of preliminary scoping searches.

The papers in your 'gold set' can then be used to help you identify relevant search terms

  • Look up your 'sample set' articles in a database that you will use for your literature review. For the articles indexed in the database, look at the records to see what keywords and/or subject headings are listed.

The 'gold set' will also provide a means of testing your search strategy

  • When an indexed article is not retrieved, your search strategy can be revised in order to include it (see what concepts or keywords can be incorporated into your search strategy so that the article is retrieved).
  • If your search strategy is retrieving a lot of irrelevant results, look at the irrelevant records to determine why they are being retrieved. What keywords or subject headings are causing them to appear? Can you change these without losing any relevant articles from your results?
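Testing a search against a gold set can be as simple as comparing two lists of identifiers. The following is a rough sketch with made-up IDs; it assumes every gold set article is indexed in the database being searched.

```python
def check_against_gold_set(retrieved_ids, gold_ids):
    """Return the share of gold set articles retrieved and the ones missed."""
    retrieved = set(retrieved_ids)
    gold = set(gold_ids)
    found = gold & retrieved
    missed = gold - retrieved
    recall = len(found) / len(gold) if gold else 0.0
    return recall, sorted(missed)

# Hypothetical identifiers, for illustration only.
gold_set = ["PMID:111", "PMID:222", "PMID:333", "PMID:444"]
search_results = ["PMID:111", "PMID:333", "PMID:444", "PMID:999"]

recall, missed = check_against_gold_set(search_results, gold_set)
print(recall)  # 0.75
print(missed)  # ['PMID:222']
```

Each missed article is a prompt to look at its record and see which keyword or subject heading would have retrieved it.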

Example search strategy

An example of a search strategy incorporating all three concepts that could be applied to different databases is shown below:

screenshot of search strategy entered into a database Advanced search screen

The above search strategy in a nested format (for use in a single search box) would look like:

(student* OR pupil* OR "young people" OR learner*) AND (perception* OR experience OR voice OR perspective*) AND (inclusi* OR "special education" OR belonging OR disabilit*)
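If you keep a master list of terms for each concept, the nested format can be generated from it rather than typed by hand. This is a minimal sketch: the helper names are hypothetical, and it only handles the basic pattern of OR within a concept and AND between concepts.

```python
def format_term(term):
    """Quote multi-word phrases so the database keeps the word order."""
    return f'"{term}"' if " " in term else term

def build_search(concepts):
    """Join terms within a concept with OR, then join the concepts with AND."""
    groups = [
        "(" + " OR ".join(format_term(t) for t in terms) + ")"
        for terms in concepts
    ]
    return " AND ".join(groups)

# Master list of terms, one inner list per concept (taken from the example above).
concepts = [
    ["student*", "pupil*", "young people", "learner*"],
    ["perception*", "experience", "voice", "perspective*"],
    ["inclusi*", "special education", "belonging", "disabilit*"],
]

print(build_search(concepts))
```

Running this prints the same nested string shown above. Adding a synonym then means editing one list, which also helps when translating the strategy for another database.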



How to undertake a literature search: a step-by-step guide

Affiliation.

  • 1 Literature Search Specialist, Library and Archive Service, Royal College of Nursing, London.
  • PMID: 32279549
  • DOI: 10.12968/bjon.2020.29.7.431

Undertaking a literature search can be a daunting prospect. Breaking the exercise down into smaller steps will make the process more manageable. This article suggests 10 steps that will help readers complete this task, from identifying key concepts to choosing databases for the search and saving the results and search strategy. It discusses each of the steps in a little more detail, with examples and suggestions on where to get help. This structured approach will help readers obtain a more focused set of results and, ultimately, save time and effort.

Keywords: Databases; Literature review; Literature search; Reference management software; Research questions; Search strategy.



Systematic Review

  • Systematic reviews

Being systematic


Searching literature systematically is useful for all types of literature reviews!

However, if you are writing a systematic literature review the search needs to be particularly well planned and structured to ensure it is:

  • comprehensive
  • transparent

These qualities help minimise bias and keep the review methodologically sound.

To achieve the above goals, you will need to:

  • create a search strategy and ensure it is reviewed by your research group
  • document each stage of your literature searching
  • report each stage of quality appraisal 

Identify the key concepts in your research question

The first step in developing your search strategy is identifying the key concepts your research question covers.

  • A preliminary search is often done to understand the topic and to refine your research question. 

Identify search terms

Use an iterative process to identify useful search terms for conducting your search. 

  • Brainstorm keywords and phrases that can describe each concept you have identified in your research question.
  • Create a table to record these keywords
  • Select your keywords carefully
  • Check against inclusion/exclusion criteria
  • Repeated testing is required to create a robust search strategy for a systematic review
  • Run your search on your primary database and evaluate the first page of records to see how suitable your search is
  • Identify reasons for irrelevant results and adjust your keywords accordingly 
  • Consider whether it would be useful to use broader or narrower terms for your concepts
  • Identify keywords in relevant results that you could add to your search to retrieve more relevant resources

Using a concept map or a mind map may help you clarify concepts and the relationships between or within concepts. Watch these YouTube videos for some ideas: 

  • How to make a concept map  (by Lucidchart)
  • Make sense of this mess world - mind maps  (by Sheng Huang)

Example keywords table:

Research question: What is the relationship between adverse childhood experiences and depression in mothers during the perinatal period? 

Concept 1 (adverse childhood experiences): ACE, childhood trauma
Concept 2 (perinatal depression): postpartum depression, postnatal depression, maternal mental health, maternal psychological distress
Concept 3 (mothers): women

Revise your strategy/search terms until:

  • the results match your research question
  • you are confident you will find all the relevant literature on your topic

See Creating search strings for information on how to enter your search terms into databases. 

Example search string (using Scopus's Advanced search option) for the terms in the above table:

(TITLE-ABS-KEY("adverse childhood experienc*" OR ACE OR "childhood trauma") AND TITLE-ABS-KEY("perinatal depress*" OR "postpartum depress*" OR "postnatal depress*" OR "maternal mental health" OR "maternal psychological distress") AND TITLE-ABS-KEY(mother* OR women*))

See Subject headings for information on adding these database-specific terms to your search terms.

Systematic reviewers usually use several databases to search for literature. This ensures that the searching is comprehensive and biases are minimised. 

Use both subject-specific and multidisciplinary databases to find resources relevant to your research question:

  • Subject-specific databases: in-depth coverage of literature specific to a research field.
  • Multi-disciplinary databases: literature from many research fields - help you find resources from disciplines you may not have considered.

Check for databases in your subject area via the Databases tab > Find by subject on the library homepage .

Find the  key databases that are often used for systematic reviews in this guide. 

Run test searches to determine how useful each database is. You can consult your Liaison Librarians to finalise the list of databases for your review.

Recommendations:

For all systematic reviews we recommend using Scopus , a high-quality, multidisciplinary database:

  • Scopus is an abstract and citation database with links to full text on publisher websites or in other databases.
  • Scopus indexes a curated collection of high quality journals along with books and conference proceedings.
  • Research outputs are across a range of fields - science, technology, medicine, social science, arts and humanities.

For systematic reviews within the health/biomedical field, we recommend including Medline as one of the databases for your review:

MEDLINE  (via Ebsco, via Ovid, via PubMed)

  • Medline is the National Library of Medicine’s (NLM) article citation database.
  • Medline is hosted individually on a variety of platforms (EBSCO, OVID) and comprises the majority of PubMed.
  • Articles in Medline are indexed using MeSH headings. See Subject headings for more information on MeSH.

Note: PubMed contains all of Medline and additional citations, e.g. books, manuscripts, citations that predate Medline.

To ensure your search is comprehensive you may need to search beyond academic databases when conducting a systematic review, particularly to find grey literature  (literature not published commercially and outside traditional academic sources such as journals).

Google Scholar

Google Scholar contains academic resources across disciplines and source types. These come from academic publishers, professional societies, online repositories, universities and web sites.

Use Google Scholar

  • as an additional tool to locate relevant publications not included in high-level academic databases
  • for finding grey literature such as postgraduate theses and conference proceedings

You can limit your search to a type of website by using site:ac.nz or site:edu

Note that Google Scholar searches are not as replicable or transparent as academic database searches, and may find large numbers of results.

Other sources of grey literature

  • Grey literature checklist  (health related grey literature)
  • OpenGrey  
  • Public health Ontario guide to appraising grey literature
  • Institutional Repository for Information Sharing (IRIS)
  • Google search: use it for finding government reports, policies, theses, etc. You can limit your search to a particular type of website by including site:govt.nz, site:.gov, site:.ac.nz or site:.edu in your search

Watch our Finding grey literature  video (3.49 mins) online.

  • Last Updated: Aug 8, 2024 4:20 PM
  • URL: https://aut.ac.nz.libguides.com/systematic_reviews


Literature searching explained

Develop a search strategy.

A search strategy is an organised structure of key terms used to search a database. The search strategy combines the key concepts of your search question in order to retrieve accurate results.

Your search strategy will account for all:

  • possible search terms;
  • keywords and phrases;
  • truncated and wildcard variations of search terms; and
  • subject headings (where applicable).

Each database works differently so you need to adapt your search strategy for each database. You may wish to develop a number of separate search strategies if your research covers several different areas.

It is a good idea to test your strategies and refine them after you have reviewed the search results.

How a search strategy looks in practice

Take a look at this example literature search in PsycINFO (PDF) about self-esteem.

The example shows the subject heading and keyword searches that have been carried out for each concept within our research question and how they have been combined using Boolean operators. It also shows where keyword techniques like truncation, wildcards and adjacency searching have been used.

Search strategy techniques

The next sections show some techniques you can use to develop your search strategy.

Skip straight to:

  • Choosing search terms
  • Searching with keywords
  • Searching for exact phrases
  • Using truncated and wildcard searches
  • Searching with subject headings
  • Using Boolean logic
  • Citation searching

Choose search terms.

Concepts can be expressed in different ways eg “self-esteem” might be referred to as “self-worth”. Your aim is to consider each of your concepts and come up with a list of the different ways they could be expressed.

To find alternative keywords or phrases for your concepts try the following:

  • Use a thesaurus to identify synonyms.
  • Search for your concepts on a search engine like Google Scholar, scanning the results for alternative words and phrases.
  • Examine relevant abstracts or articles for alternative words, phrases and subject headings (if the database uses subject headings).

When you've done this, you should have lists of words and phrases for each concept as in this completed PICO model (PDF) or this example concept map (PDF).

Watch this short video on how to plan a search and generate good search terms:

As you search and scan articles and abstracts, you may discover different key terms to enhance your search strategy.

Using truncation and wildcards can save you time and effort by finding alternative keywords.

Search with keywords

Keywords are free text words and phrases. Database search strategies use a combination of free text and subject headings (where applicable).

A keyword search usually looks for your search terms in the title and abstract of a reference. You may wish to search in title fields only if you want a small number of specific results.

Some databases will find the exact word or phrase, so make sure your spelling is accurate or you will miss references.

Search for the exact phrase

If you want words to appear next to each other in an exact phrase, use quotation marks, eg “self-esteem”.

Phrase searching decreases the number of results you get and makes your results more relevant. Most databases allow you to search for phrases, but check the database guide if you are unsure.

Truncation and wildcard searches

You can use truncated and wildcard searches to find variations of your search term. Truncation is useful for finding singular and plural forms of words and variant endings.

Many databases use an asterisk (*) as their truncation symbol. Check the database help section if you are not sure which symbol to use. For example, “therap*” will find therapy, therapies, therapist or therapists.

A wildcard finds variant spellings of words. Use it to search for a single character, or no character. Check the database help section to see which symbol to use as a wildcard.

Wildcards are useful for finding British and American spellings, for example: “behavio?r” in Medline will find both behaviour and behavior.

There are sometimes different symbols to find a variable single character. For example, in the Medline database, “wom#n” will find woman and also women.
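To see what these symbols expand to, the following minimal sketch translates a search term into a regular expression and tests it against a small word list. It assumes the Ovid-style meanings described above (`*` for truncation, `?` for zero or one character, `#` for exactly one character); the function names and the word list are hypothetical, and real databases may treat the symbols differently.

```python
import re

def to_regex(term: str) -> re.Pattern:
    """Translate a search term with truncation/wildcard symbols into a regex.

    Assumed symbol meanings (check your database's help pages):
      *  any number of trailing characters (truncation)
      ?  zero or one character (optional wildcard)
      #  exactly one character (mandatory wildcard)
    """
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w?")
        elif ch == "#":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(term: str, words: list[str]) -> list[str]:
    """Return the words that the search term would retrieve."""
    pattern = to_regex(term)
    return [w for w in words if pattern.fullmatch(w)]

# Hypothetical vocabulary used only for illustration.
vocabulary = ["therapy", "therapies", "therapist", "therapists", "thermal",
              "behaviour", "behavior", "woman", "women", "womn"]

print(matches("therap*", vocabulary))
print(matches("behavio?r", vocabulary))
print(matches("wom#n", vocabulary))
```

Note that "womn" is not retrieved by `wom#n`, because `#` requires a character to be present, whereas `?` also accepts none.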

Use adjacency searching for more accurate results

You can specify how close two words appear together in your search strategy. This can make your results more relevant; generally the closer two words appear to each other, the closer the relationship is between them.

Commands for adjacency searching differ among databases, so make sure you consult database guides.

In OvidSP databases (like Medline), searching for “physician ADJ3 relationship” will find physician and relationship with up to two words between them, in any order. This finds more papers than the exact phrase "physician relationship".

Using this adjacency retrieves papers with phrases like "physician patient relationship", "patient physician relationship", "relationship of the physician to the patient" and so on.
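The behaviour can be illustrated with a small sketch. It assumes that ADJ3 means the two words sit at most three word positions apart (up to two words between them) and that every word is counted; `within_n_words` is a hypothetical helper, not part of any database.

```python
import re

def within_n_words(text: str, first: str, second: str, n: int) -> bool:
    """Return True if `first` and `second` occur at most n word positions apart, in any order."""
    words = re.findall(r"[a-z]+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= n for i in first_positions for j in second_positions)

phrases = [
    "physician patient relationship",
    "patient physician relationship",
    "relationship of the physician to the patient",
    "the physician wrote at length about the relationship",
]

for phrase in phrases:
    print(phrase, "->", within_n_words(phrase, "physician", "relationship", 3))
```

The first three phrases satisfy the proximity condition; in the last one the two words are six positions apart, so it would not be retrieved.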

Database subject headings are controlled vocabulary terms that a database uses to describe what an article is about.

Using appropriate subject headings enhances your search and will help you to find more results on your topic. This is because subject headings find articles according to their subject, even if the article does not use your chosen key words.

You should combine both subject headings and keywords in your search strategy for each of the concepts you identify. This is particularly important if you are undertaking a systematic review or an in-depth piece of work.

Subject headings may vary between databases, so you need to investigate each database separately to find the subject headings they use. For example, for Medline you can use MeSH (Medical Subject Headings) and for Embase you can use the EMTREE thesaurus.

SEARCH TIP: In Ovid databases, search for a known key paper by title, select the "complete reference" button to see which subject headings the database indexers have given that article, and consider adding relevant ones to your own search strategy.

Use Boolean logic to combine search terms

Boolean operators (AND, OR and NOT) allow you to try different combinations of search terms or subject headings.

Databases often show Boolean operators as buttons or drop-down menus that you can click to combine your search terms or results.

The main Boolean operators are:

OR is used to find articles that mention either of the topics you search for.

AND is used to find articles that mention both of the searched topics.

NOT excludes a search term or concept. It should be used with caution as you may inadvertently exclude relevant references.

For example, searching for “self-esteem NOT eating disorders” finds articles that mention self-esteem but removes any articles that mention eating disorders.
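The three operators behave like set operations on the records each search returns. The sketch below uses made-up record IDs purely for illustration.

```python
# Hypothetical record IDs returned by two separate keyword searches.
self_esteem = {101, 102, 103, 104}
eating_disorders = {103, 104, 105}

either = self_esteem | eating_disorders    # OR: records mentioning either topic
both = self_esteem & eating_disorders      # AND: records mentioning both topics
excluded = self_esteem - eating_disorders  # NOT: self-esteem records without eating disorders

print(sorted(either))
print(sorted(both))
print(sorted(excluded))
```

Records 103 and 104 discuss self-esteem but are dropped by NOT because they also mention eating disorders, which is why the operator should be used with caution.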

Citation searching is a method to find articles that have been cited by other publications.

Use citation searching (or cited reference searching) to:

  • find out whether articles have been cited by other authors
  • find more recent papers on the same or similar subject
  • discover how a known idea or innovation has been confirmed, applied, improved, extended, or corrected
  • help make your literature review more comprehensive.

You can use cited reference searching in:

  • OvidSP databases
  • Google Scholar
  • Web of Science

Cited reference searching can complement your literature search. However, be careful not to rely only on papers that have been cited. A robust literature search is also needed to limit publication bias.

  • Indian J Anaesth
  • v.60(9); 2016 Sep

Literature search for research planning and identification of research problem

Anju Grewal

Department of Anaesthesiology, Dayanand Medical College and Hospital, Ludhiana, Punjab, India

Hanish Kataria

1 Department of Surgery, Government Medical College and Hospital, Chandigarh, India

2 Department of Cardiac Anaesthesia, All India Institute of Medical Sciences, New Delhi, India

Literature search is a key step in performing good authentic research. It helps in formulating a research question and planning the study. The available published data are enormous; therefore, choosing the articles relevant to the study in question is an art. The search can be time-consuming and tiring, and it can lead to disinterest or even abandonment midway if it is not carried out in a step-wise manner. Various databases are available for performing a literature search. This article primarily focuses on how to formulate a research question and on the various types and sources of literature search, which will help make your search specific and time-saving.

INTRODUCTION

Literature search is a systematic and well-organised search of already published data to identify a breadth of good quality references on a specific topic.[ 1 ] The reasons for conducting a literature search are numerous and include gathering information for evidence-based guidelines, serving as a step in the research method and forming part of academic assessment.[ 2 ] However, the main purpose of a thorough literature search is to formulate a research question by evaluating the available literature with an eye on gaps still amenable to further research.

A research problem[ 3 ] is typically a topic of interest and of some familiarity to the researcher. It needs to be channelised by focussing on information yet to be explored. Once we have narrowed down the problem, seeking and analysing existing literature may further refine the research approach.

A research hypothesis[ 4 ] is a carefully constructed statement of how you expect the research to turn out. It is one of the most important tools for answering the research question. It should be apt, contain the necessary components, and raise a question that can be tested and investigated.

The literature search can be exhaustive and time-consuming, but there are some simple steps which can help you plan and manage the process. The most important are formulating the research questions and planning your search.

FORMULATING THE RESEARCH QUESTION

Literature search is done to identify appropriate methodology, study design, the population sampled and sampling methods, methods of measuring concepts and techniques of analysis. It also helps in determining extraneous variables affecting the outcome and identifying faults or lacunae that could be avoided.

Formulating a well-focused question is a critical step for facilitating good clinical research.[ 5 ] There can be general questions or patient-oriented questions that arise from clinical issues. Patient-oriented questions can involve the effect of therapy or disease or examine advantage versus disadvantage for a group of patients.[ 6 ]

For example, we want to evaluate the effect of a particular drug (e.g., dexmedetomidine) for procedural sedation in day care surgery patients. While formulating a research question, one should consider certain criteria, referred to as the ‘FINER’ (F-Feasible, I-Interesting, N-Novel, E-Ethical, R-Relevant) criteria.[ 5 ] The idea should be interesting and relevant to clinical research. It should confirm, refute or add information to existing research work. One should also keep in mind the patient population under study and the resources available in a given set-up. The entire research process should also conform to the ethical principles of research.

The patient or study population, intervention, comparison or control arm, primary outcome, timing of measurement of outcome (PICOT) is a well-known approach for framing a leading research question.[ 7 , 8 ] Dividing the question into its key components makes it easier to search. In this case scenario:

  • Patients (P) – What is the important group of patients? for example, day care surgery
  • Intervention (I) – What is the important intervention? for example, intravenous dexmedetomidine
  • Comparison (C) – What is the important intervention of comparison? for example, intravenous ketamine
  • Outcome (O) – What is the effect of intervention? for example, analgesic efficacy, procedural awareness, drug side effects
  • Time (T) – Time interval for measuring the outcome: Hourly for first 4 h then 4 hourly till 24 h post-procedure.
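The PICOT components above can be turned into a draft search string by joining alternative terms for one concept with OR and joining the concepts with AND. This is only a sketch: the helper names are hypothetical, "ambulatory surgery" is an illustrative synonym that does not come from the article, and the outcome and time components are left out here on the assumption that they are applied later when screening results.

```python
def or_group(terms):
    """Join alternative terms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

def build_query(concepts):
    """Join the concept groups with AND, in the order they were given."""
    return " AND ".join(or_group(terms) for terms in concepts.values())

# Concepts taken from the PICOT example; the synonym list is illustrative only.
picot = {
    "patients": ["day care surgery", "ambulatory surgery"],
    "intervention": ["dexmedetomidine"],
    "comparison": ["ketamine"],
}

query = build_query(picot)
print(query)
```

The resulting string is a starting point that still needs to be adapted to the syntax of each database.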

Multiple questions can be formulated from a patient's problems and concerns. A well-focused question should be chosen for research according to its significance for patient interest and its relevance to our knowledge. Good research questions address the lacunae in the available literature with the aim of influencing clinical practice constructively. In India, outcome research and relevant resources (for example, electronic database systems and hospital information systems) are limited. Even when these are available, data about existing resources are not widely accessible.[ 9 ]

TYPES OF MEDICAL LITERATURE

(Further details in chapter ‘Types of studies and research design’ in this issue).

Primary literature

Primary sources are the authentic publication of an expert's new evidence, conclusions and proposals (case reports, clinical trials, etc) and are usually published in a peer-reviewed journal. Preliminary reports, congress papers and preprints also constitute primary literature.[ 2 ]

Secondary literature

Secondary sources are systematic review articles or meta-analyses in which material derived from primary source literature is interpreted and evaluated.[ 2 ]

Tertiary literature

Tertiary literature consists of collections that compile information from primary or secondary literature (e.g., reference books).[ 2 ]

METHODS OF LITERATURE SEARCH

There are various methods of literature search that are used alone or in combination [ Table 1 ]. For the past few decades, searching local and national libraries for books, journals, etc., was the usual practice, and physical literature exploration is still an important component of any systematic review search process.[ 10 , 11 ] With the advancement of technology, the Internet is now the gateway to the maze of vast medical literature.[ 12 ] Conducting a literature review involves web-based search engines, e.g., Google, Google Scholar, etc., [ Table 2 ], or using various electronic research databases to identify materials that describe the research topic or are closely related to it.[ 13 , 14 ]

Table 1: Methods of literature search [table image not reproduced]

Table 2: Web based methods of literature search [table image not reproduced]

The various databases available for literature search include databases for original published articles in journals [ Table 2 ] and evidence-based databases for integrated information available as systematic reviews and abstracts [ Table 3 ].[ 12 , 14 ] Most of these are not freely available to the individual user. PubMed ( http://www.ncbi.nlm.nih.gov/pubmed/ ) has been the largest available resource since 1996; however, a large number of sources now provide free access to literature in the biomedical field.[ 15 ] More than 26 million citations from Medline, life science journals and online books are included in PubMed. Links to the full-text material are included in citations from PubMed Central and publisher web sites.[ 16 ] The choice of databases depends on the subject of interest and potential coverage by the different databases. The Education Resources Information Centre is a free online digital library of education research and information sponsored by the Institute of Education Sciences of the U.S. Department of Education, available at http://eric.ed.gov/ . No single database covers all the medical literature, so several different databases need to be searched. At a minimum, PubMed or Medline, Embase and the Cochrane Central Register of Controlled Trials need to be searched. When searching these databases, emphasis should be given to meta-analyses, systematic reviews, randomised controlled trials and landmark studies.

Table 3: Electronic source of Evidence-Based Database [table image not reproduced]

The time allocated to the search needs attention, as exploring and selecting data are early steps in the research method and research conducted as part of academic assessment has narrow timeframes.[ 17 ] In the Indian scenario, limited outcome research and limited accessibility of data lead to a less thorough knowledge of the nature of the research problem. This results in the formulation of an inappropriate research question and increases the time spent on the literature search.

TYPES OF SEARCH

The type of search can be described in different forms according to the subject of interest. Choosing the right type increases the chances of retrieving relevant information from a search.

Translating research question to keywords

Keywords are the cornerstone of an effective search, because the search returns results based on any of the words specified. Synonyms/alternate terms should be considered to elicit further information, e.g., barbiturates in place of thiopentone. Spellings should also be taken into account, e.g., anesthesia in place of anaesthesia (American and British). Most databases use a controlled vocabulary to establish common search terms (or keywords). Some of these alternative keywords can be looked up in the database thesaurus.[ 4 ] Another strategy is combining keywords with Boolean operators. It is important to keep a note of the keywords and methods used in exploring the literature, as these will need to be described later in the design of the search process.

‘Medical Subject Heading (MeSH) is the National Library of Medicine's controlled hierarchical vocabulary that is used for indexing articles in PubMed, with more specific terms organised underneath more general terms’.[ 17 ] This provides a reliable way to retrieve citations that use different terminology for identical ideas, as it indexes articles based on content. Two features of PubMed that can increase yield of specific articles are ‘Automatic term mapping’ and ‘automatic term explosion’.[ 4 ]

For example, if the search keyword is heart attack, this term will be matched against the MeSH translation table and then exploded into its various subheadings. This helps to construct the search by adding and selecting MeSH subheadings and families of MeSH terms by use of hyperlinks.[ 4 ]

We can set limits to a clinical trial for retrieving higher level of evidence (i.e., randomised controlled clinical trial). Furthermore, one can browse through the link entitled ‘Related Articles’. This PubMed feature searches for similar citations using an intricate algorithm that scans titles, abstracts and MeSH terms.[ 4 ]
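As a rough illustration of combining a MeSH heading, a free-text phrase and a publication-type limit, the sketch below builds a PubMed query string and the corresponding NCBI E-utilities ESearch URL. It relies on PubMed's `[MeSH Terms]`, `[tiab]` and `[pt]` field tags; the helper name `pubmed_term` and the choice of terms are assumptions made for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_term(mesh_heading, free_text, publication_type=None):
    """Combine a MeSH heading with a title/abstract phrase, optionally limited by publication type."""
    term = f'("{mesh_heading}"[MeSH Terms] OR "{free_text}"[tiab])'
    if publication_type:
        term += f' AND "{publication_type}"[pt]'
    return term

# "heart attack" is the keyword from the text; "myocardial infarction" is its MeSH heading.
term = pubmed_term("myocardial infarction", "heart attack", "randomized controlled trial")
url = ESEARCH + "?" + urlencode({"db": "pubmed", "term": term, "retmax": 20})

print(term)
print(url)
```

Pasting the printed term into the PubMed search box should behave like the query built through the MeSH hyperlinks and filters, although the result counts will change over time.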

Phrase search

This will return only pages that contain the words typed in the phrase, in that exact order and with no words in between them.

Boolean operators

AND, OR and NOT are the three Boolean operators named after the mathematician George Boole.[ 18 ] Combining two words using ‘AND’ will fetch articles that mention both the words. Using ‘OR’ will widen the search and fetch more articles that mention either subject. Using ‘NOT’ to combine words will fetch articles containing the first word but not the second, thus narrowing the search.

Filters can also be used to refine the search, for example, article types, text availability, language, age, sex and journal categories.

Overall, the recommendations for methodology of literature search can be as below (Creswell)[ 19 ]

  • Identify keywords and use them to search articles from library and internet resources as described above
  • Search several databases to search articles related to your topic
  • Use thesaurus to identify terms to locate your articles
  • Find an article that is similar to your topic; then look at the terms used to describe it, and use them for your search
  • Use databases that provide full-text articles (free through academic libraries, Internet or for a fee) as much as possible so that you can save time searching for your articles
  • If you are examining a topic for the first time and unaware of the research on it, start with broad syntheses of the literature, such as overviews, summaries of the literature on your topic or review articles
  • Start with the most recent issues of the journals, and look for studies about your topic and then work backward in time. Follow-up on references at the end of the articles for more sources to examine
  • Refer to books on a single topic by a single author or group of authors or books that contain chapters written by different authors
  • Next look for recent conference papers. Often, conference papers report the latest research developments. Contact authors of pertinent studies. Write or phone them, asking if they know of studies related to your area of interest
  • The easy access and ability to capture entire articles from the web make it attractive. However, check these articles carefully for authenticity and quality and be cautious about whether they represent systematic research.

The whole process of literature search[ 20 ] is summarised in Figure 1 .

Figure 1: Process of literature search [figure image not reproduced]

A literature search not only provides an opportunity to learn more about a given topic but also gives insight into how the topic was studied by previous analysts. It helps to interpret ideas, detect shortcomings and recognise opportunities. In short, a systematic and well-organised search may help in designing novel research.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.

Harvey Cushing/John Hay Whitney Medical Library

YSN Doctoral Programs: Steps in Conducting a Literature Review

What is a literature review?

A literature review is an integrated analysis (not just a summary) of scholarly writings and other relevant evidence related directly to your research question.  That is, it represents a synthesis of the evidence that provides background information on your topic and shows an association between the evidence and your research question.

A literature review may be a stand-alone work or the introduction to a larger research paper, depending on the assignment.  Rely heavily on the guidelines your instructor has given you.

Why is it important?

A literature review is important because it:

  • Explains the background of research on a topic.
  • Demonstrates why a topic is significant to a subject area.
  • Discovers relationships between research studies/ideas.
  • Identifies major themes, concepts, and researchers on a topic.
  • Identifies critical gaps and points of disagreement.
  • Discusses further research questions that logically come out of the previous studies.

1. Choose a topic. Define your research question.

Your literature review should be guided by your central research question.  The literature represents background and research developments related to a specific research question, interpreted and analyzed by you in a synthesized way.

  • Make sure your research question is not too broad or too narrow.  Is it manageable?
  • Begin writing down terms that are related to your question. These will be useful for searches later.
  • If you have the opportunity, discuss your topic with your professor and your classmates.

2. Decide on the scope of your review

How many studies do you need to look at? How comprehensive should it be? How many years should it cover? 

  • This may depend on your assignment.  How many sources does the assignment require?

3. Select the databases you will use to conduct your searches.

Make a list of the databases you will search. 

Where to find databases:

  • use the tabs on this guide
  • Find other databases in the Nursing Information Resources web page
  • More on the Medical Library web page
  • ... and more on the Yale University Library web page

4. Conduct your searches to find the evidence. Keep track of your searches.

  • Use the key words in your question, as well as synonyms for those words, as terms in your search. Use the database tutorials for help.
  • Save the searches in the databases. This saves time when you want to redo, or modify, the searches. It is also helpful to use as a guide if the searches are not finding any useful results.
  • Review the abstracts of research studies carefully. This will save you time.
  • Use the bibliographies and references of research studies you find to locate others.
  • Check with your professor, or a subject expert in the field, if you are missing any key works in the field.
  • Ask your librarian for help at any time.
  • Use a citation manager, such as EndNote as the repository for your citations. See the EndNote tutorials for help.
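Keeping track of searches can be as simple as a running log with one row per search. The sketch below writes such a log in CSV form; the field names are a suggestion, and the dates, databases and hit counts are made-up placeholders rather than real search results.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

@dataclass
class SearchRecord:
    run_on: str     # date the search was run (ISO format)
    database: str   # where the search was run
    query: str      # exact search string, so the search can be repeated
    hits: int       # number of results returned

def write_log(records, handle):
    """Write the search records as CSV rows to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(SearchRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))

# Placeholder entries for illustration only.
log = [
    SearchRecord("2024-06-20", "PubMed", '"self-esteem" OR "self-worth"', 1200),
    SearchRecord("2024-06-20", "CINAHL", '"self-esteem" OR "self-worth"', 450),
]

buffer = io.StringIO(newline="")
write_log(log, buffer)
print(buffer.getvalue())
```

In practice the handle would be a file opened with `open("search_log.csv", "w", newline="")`, a hypothetical file name, so that the log persists between sessions.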

Review the literature

Some questions to help you analyze the research:

  • What was the research question of the study you are reviewing? What were the authors trying to discover?
  • Was the research funded by a source that could influence the findings?
  • What were the research methodologies? Analyze its literature review, the samples and variables used, the results, and the conclusions.
  • Does the research seem to be complete? Could it have been conducted more soundly? What further questions does it raise?
  • If there are conflicting studies, why do you think that is?
  • How are the authors viewed in the field? Has this study been cited? If so, how has it been analyzed?

Tips: 

  • Review the abstracts carefully.  
  • Keep careful notes so that you may track your thought processes during the research process.
  • Create a matrix of the studies for easy analysis, and synthesis, across all of the studies.
  • Last Updated: Jun 20, 2024 9:08 AM
  • URL: https://guides.library.yale.edu/YSNDoctoral

Literature Searching

Literature Searching vs. Literature Review

You may hear the terms literature search and literature review used interchangeably. In general, a literature search is the process of seeking out and identifying the existing literature related to a topic or question of interest, while a literature review is the organized synthesis of the information found in the existing literature.

In research, a literature search is typically the first step of a literature review. The search identifies relevant existing studies and articles, and the review is the end result of analyzing, synthesizing, and organizing the information found in the search.

When writing a research paper, the literature review gives you a chance to:

  • Demonstrate your familiarity with the topic and its scholarly context
  • Develop a theoretical framework and methodology for your research
  • Show how your research addresses a knowledge gap or contributes to a debate
  • Evaluate the current state of research and demonstrate your knowledge of the scholarly debates around your topic. 

References/Additional Resources

  Baker, J. D. (2016). The Purpose, Process, and Methods of Writing a Literature Review. AORN Journal, 103(3), 265–269.

  Patrick, L. J., & Munro, S. (2004). The Literature Review: Demystifying the Literature Search. The Diabetes Educator, 30(1), 30–38. 

  • Open access
  • Published: 21 August 2024

Evaluating the effectiveness of large language models in abstract screening: a comparative analysis

  • Michael Li 1, Jianping Sun 2 & Xianming Tan 1,3 (ORCID: orcid.org/0000-0002-5478-2269)

Systematic Reviews volume 13, Article number: 219 (2024)

This study aimed to evaluate the performance of large language models (LLMs) in the task of abstract screening in systematic review and meta-analysis studies, exploring their effectiveness, efficiency, and potential integration into existing human expert-based workflows.

We developed automation scripts in Python to interact with the APIs of several LLM tools, including ChatGPT v4.0, ChatGPT v3.5, Google PaLM 2, and Meta Llama 2, as well as more recent tools, including ChatGPT v4.0 turbo, ChatGPT v3.5 turbo, Google Gemini 1.0 pro, Meta Llama 3, and Claude 3. This study focused on three databases of abstracts and used them as benchmarks to evaluate the performance of these LLM tools in terms of sensitivity, specificity, and overall accuracy. The results of the LLM tools were compared to human-curated inclusion decisions, the gold standard for systematic review and meta-analysis studies.
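For readers unfamiliar with these three metrics, the following sketch shows how they are conventionally computed when model decisions are compared with human inclusion decisions. It is not the authors' script; the function name and the toy decision lists are hypothetical.

```python
def screening_metrics(human, model):
    """Compare model include/exclude decisions with human decisions (True = include)."""
    pairs = list(zip(human, model))
    tp = sum(h and m for h, m in pairs)              # included by both
    tn = sum((not h) and (not m) for h, m in pairs)  # excluded by both
    fp = sum((not h) and m for h, m in pairs)        # model includes, human excludes
    fn = sum(h and (not m) for h, m in pairs)        # model excludes, human includes
    return {
        "sensitivity": tp / (tp + fn),   # share of relevant abstracts the model kept
        "specificity": tn / (tn + fp),   # share of irrelevant abstracts the model rejected
        "accuracy": (tp + tn) / len(pairs),
    }

# Toy decisions for ten abstracts; not data from the study.
human = [True, True, True, True, False, False, False, False, False, False]
model = [True, True, True, False, False, False, False, False, True, False]

metrics = screening_metrics(human, model)
print(metrics)
```

In abstract screening, sensitivity usually matters most, because a relevant study excluded at this stage is unlikely to be recovered later.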

Different LLM tools had varying abilities in abstract screening. ChatGPT v4.0 demonstrated remarkable performance, with balanced sensitivity and specificity, and overall accuracy consistently reaching or exceeding 90%, indicating a high potential for LLMs in abstract screening tasks. The study found that LLMs could provide reliable results with minimal human effort and thus serve as a cost-effective and efficient alternative to traditional abstract screening methods.

While LLM tools are not yet ready to completely replace human experts in abstract screening, they show great promise in revolutionizing the process. They can serve as autonomous AI reviewers, contribute to collaborative workflows with human experts, and integrate with hybrid approaches to develop custom tools for increased efficiency. As technology continues to advance, LLMs are poised to play an increasingly important role in abstract screening, reshaping the workflow of systematic review and meta-analysis studies.

Introduction

A systematic review is a rigorous form of research that collates and synthesizes all existing evidence on a specific research question [ 1 ]. It stands as a cornerstone not just in medical research but across diverse academic disciplines. Unlike traditional literature reviews, systematic reviews follow a comprehensive and standardized process, such as the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guideline [ 2 ], designed to minimize bias and ensure reproducibility. Therefore, these reviews are recognized as among the highest levels of evidence in evidence-based research [ 3 ] and play a pivotal role in shaping clinical guidelines and healthcare policies and in informing medical decisions [ 4 ].

Commencing with a well-articulated research question, a typical systematic review launches an exhaustive search strategy that sweeps through databases, such as PubMed and Embase, supplemented by additional sources such as clinical trial registries and pertinent article reference lists, with the aim of capturing all relevant studies and mitigating bias. Predetermined inclusion and exclusion criteria guide the subsequent screening and selection of studies, encompassing factors like study design, patient demographics, and intervention types. Reviewers, working independently, appraise each study’s eligibility, reconciling disagreements through discussions or third-party reviews. Data extraction and synthesis follow, either through meta-analysis or narrative synthesis, depending on the heterogeneity of the selected studies.

The practice of conducting systematic reviews has gained substantial popularity with considerable demand within the academic community. A notable reference [ 5 ] to this trend found that approximately 75 trials and potentially 11 systematic reviews are disseminated daily, based on data from around 2010. Moreover, a query for “systematic review” in the Google Scholar database yields approximately 17,000 entries that have been published within the year 2023 alone, which translates into about 51 systematic reviews per day, as observed on November 26, 2023. This expanding volume of literature underscores the critical role that systematic reviews play in the consolidation of research findings across various fields of study.
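The per-day figure can be reproduced with a short calculation. The sketch assumes the count was divided by the number of days elapsed in 2023 up to the observation date; the text does not state how the figure was derived, so this is an inference.

```python
from datetime import date

entries = 17_000                 # approximate Google Scholar count quoted above
observed = date(2023, 11, 26)    # observation date quoted above

# Day of the year for the observation date, i.e. days elapsed in 2023.
days_elapsed = observed.timetuple().tm_yday
per_day = entries / days_elapsed

print(days_elapsed, f"{per_day:.1f}")
```

Under that assumption the rate comes out at roughly 51.5 entries per day, consistent with the "about 51" reported.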

Despite their pivotal role, executing systematic reviews remains a formidable task, largely because of abstract screening, a key phase that can be overwhelmingly time-consuming due to the volume of abstracts involved. For example, in Polanin et al. [ 6 ], the authors reported that the research staff screened 29,846 abstracts independently (14,923 unique citations were double screened) over the course of 189 days. In addition, the variability in reporting standards, use of jargon, and varied study designs can further complicate the abstract screening process [ 7 ]. Moreover, the repetitive nature of the task, combined with cognitive fatigue, can lead to human errors [ 8 , 9 ]. Recent advancements in machine learning (ML) and deep learning propose possible solutions to these challenges. However, traditional ML models, while promising, require domain-specific training, a time-consuming process that often demands manual labeling of datasets [ 10 ].

Advancements in natural language processing (NLP) and artificial intelligence (AI) are opening doors to address challenges in systematic reviews. Large language models (LLMs) like ChatGPT [ 11 ], PaLM [ 12 ], Llama [ 13 ], and Claude [ 14 ] are capturing the research community’s attention. Their collective potential, especially their capability to operate without exhaustive domain-specific training, makes them prime candidates for revolutionizing the systematic review process.

While each of the aforementioned AI tools brings its unique capabilities to the table, the fundamental question remains: How do they stack up, both individually and collectively, in the realm of abstract screening, against the human expert-based process? In pursuit of answers, this research seeks to thoroughly investigate the potential of ChatGPT, Google PaLM, Llama, and Claude in automating the crucial abstract screening phase integral to systematic reviews. Our goal is to rigorously compare the performance of these advanced AI-driven methods with existing machine learning (ML)-based approaches. In doing so, we aim to develop AI strategies that masterfully balance efficiency and accuracy with minimal human intervention, ultimately transforming systematic review practice across disciplines.

The use of NLP for abstract screening is established [ 10 , 15 ]. However, the application of LLMs specifically for this task is a nascent field [ 16 , 17 ]. This emerging area offers significant potential to improve efficiency and accuracy. Our study aims to fill this gap by providing a comprehensive analysis of LLM capabilities in abstract screening, laying the groundwork for future research and application. This is particularly relevant considering the rapid evolution of this technology, highlighting its potential to streamline systematic reviews now and in the future.

The remainder of this paper is structured to provide a comprehensive exploration of our topic. We begin with an in-depth examination of existing methods for abstract screening, including both manual and NLP-based approaches, laying the groundwork for understanding the current state of the field. We then introduce the use of large language model (LLM) tools for abstract screening, detailing our experimental design to meticulously evaluate their performance in this context. Subsequent sections present our empirical findings and results, shedding light on the capabilities and limitations of the AI tools in question. Finally, we engage in a thoughtful discussion, reflecting on the implications of our findings and considering the future trajectory of abstract screening in systematic reviews.

Existing approaches to abstract screening in systematic reviews

In the vast realm of systematic reviews, the critical task of abstract screening serves as a foundational step in curating the highest quality of evidence [ 2 ]. However, this process often presents significant challenges because it involves sifting through large volumes of literature to identify studies that align with predefined criteria. Over time, various methodologies, ranging from manual evaluations to sophisticated AI-driven techniques, have been proposed to address the complexities of this task. In this section, we describe the existing approaches, their operational mechanisms, and their associated advantages and disadvantages.

Manual approach

Historically, the process of abstract screening was firmly rooted in manual evaluations. In this conventional approach, individual reviewers would scrutinize each abstract against predefined criteria [ 1 ]. The meticulous nature of this method required that multiple experts independently evaluate the literature to ensure both reliability and reduced biases [ 8 ]. While the depth of human expertise brought about nuanced understanding, the manual nature of this method made it both time-consuming and, at times, prone to human error [ 6 , 9 ].

NLP-based approach

As technology evolved, the field witnessed the incorporation of natural language processing (NLP) to automate abstract screening [ 10 ]. In this framework, abstract text undergoes preprocessing and vectorization. Supervised machine learning models, notably the support vector machine (SVM) and the random forest (RF), are then trained on this vectorized data to classify literature based on specific criteria [ 15 ]. The strength of this approach lies in its potential for efficiency. However, its efficacy and accuracy hinge heavily on the availability of a well-curated, labeled training set.

Zero-shot classification

A more recent and innovative approach is zero-shot classification, which was notably highlighted by Xian et al. [ 18 ]. Eschewing the need for an extensive labeled training dataset, zero-shot classification offers the allure of categorizing abstracts without prior domain-specific training. By calculating a probability score for each abstract, researchers obtain a dynamic measure of its alignment with predefined categories. While it does not require training for the machine learning model, the tradeoff is a loss of sensitivity and potential omissions of pertinent studies [ 19 ]. In this study, for a given set of abstracts, we first obtain embeddings (using OpenAI’s text embedding API, the babbage model) for the abstracts and for a pre-specified description of an ideal study to include. Specifically, we use the inclusion/exclusion criteria (see the “Examples of abstract screening by using LLMs,” “Automated workflow for streamlining abstract screening via ChatGPT and other tools,” and “ Study design ” sections) as the description of an ideal study to include. We then compute the cosine similarity score between the embedding of each abstract and the embedding of the pre-specified description. We classify the top 10% of abstracts (in terms of similarity) as positive (i.e., a study that should be included).
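The ranking step of the zero-shot procedure can be illustrated with a minimal sketch. It assumes the embeddings have already been obtained and are plain lists of floats. The function names, the toy three-dimensional vectors, and the rounding rule for the 10% cutoff are illustrative assumptions, not the authors' implementation (their R code is in the repository cited in the hybrid approach section).

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_screen(abstract_embeddings, criteria_embedding, top_fraction=0.10):
    """Score each abstract against the criteria text and flag the top fraction."""
    scores = [cosine_similarity(e, criteria_embedding) for e in abstract_embeddings]
    # round() avoids floating-point surprises such as 0.1 * 30 = 3.0000000000000004
    n_flagged = max(1, round(top_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_flagged])
    return scores, [i in flagged for i in range(len(scores))]

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only)
criteria = [1.0, 0.0, 0.0]
abstracts = [[0.1, 0.9, 0.0]] * 9 + [[0.9, 0.1, 0.0]]
scores, labels = zero_shot_screen(abstracts, criteria)
print(labels.index(True))  # position of the single flagged abstract
```

Because the cutoff is a fixed fraction rather than a similarity threshold, the number of flagged abstracts depends only on the size of the set, which is one way to see why sensitivity can suffer when more than 10% of abstracts are truly relevant.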

Hybrid approach

To reconcile the strengths and limitations of these models, a hybrid methodology has emerged [ 18 ]. It marries the rapid categorization capabilities of zero-shot classification with the precision of traditional machine learning models such as SVM and RF. Here, zero-shot classification provides an initial curation of all abstracts in the training set. Then, based on this curation, the abstracts with a classification score higher than a pre-specified threshold are selected to undergo manual review, which rectifies the zero-shot labels. The rectified classification then serves as a foundation for training traditional models, which are then employed for screening the broader dataset (e.g., the testing set). The hybrid approach balances the speed of zero-shot classification and the precision of traditional ML, and potentially offers enhanced accuracy with reduced human effort. However, this approach involves the use of multiple methodologies and still relies on well-curated, labeled training data (in this case, a subset of the whole training set). In this study, for a given set of abstracts, we rectify the “positive” abstracts (i.e., the top 10% of abstracts) identified by zero-shot, and then fit a classification model for the rectified labels to the text embeddings of the corresponding abstracts using SVM. This learned classification model is then used to predict the class of other abstracts. We added the R code for the zero-shot and hybrid approaches to the GitHub repository https://github.com/mikeli380/LLMAbstractScreening .
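The hybrid procedure above can be sketched as follows. This is an illustration under stated assumptions, not the authors' R implementation: a small linear SVM trained by sub-gradient descent stands in for whatever SVM routine they used, `human_review` is a hypothetical callback representing manual rectification, and the two-dimensional embeddings are toy values.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM fitted by sub-gradient descent on the regularised hinge loss.
    Labels must be +1 (include) or -1 (exclude)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:   # inside the margin: hinge loss is active
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:            # outside the margin: only shrink the weights
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def svm_predict(w, b, x):
    """True means 'include'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b > 0

def hybrid_screen(embeddings, flagged, human_review):
    """Train on human-rectified labels of the zero-shot positives,
    then predict the remaining abstracts."""
    flagged_set = set(flagged)
    X = [embeddings[i] for i in flagged]
    y = [1 if human_review(i) else -1 for i in flagged]
    w, b = train_linear_svm(X, y)
    return {i: svm_predict(w, b, e)
            for i, e in enumerate(embeddings) if i not in flagged_set}

# Toy data: abstracts 0-3 were flagged by zero-shot and reviewed by a human.
embeddings = [[2.0, 0.0], [1.5, 0.2], [0.0, 2.0], [0.1, 1.8], [1.8, 0.1], [0.2, 1.9]]
truth = [True, True, False, False, True, False]
predictions = hybrid_screen(embeddings, flagged=[0, 1, 2, 3],
                            human_review=lambda i: truth[i])
print(predictions)
```

Note that the classifier only sees abstracts that zero-shot ranked highly, so the rectified subset must contain both true and false positives for the SVM to learn a useful boundary.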

Active learning approach

Active learning [ 20 , 21 , 22 ] is an innovative approach to machine learning that optimizes the training process by allowing the model to selectively query a human annotator for labels on the most informative data points. This method is particularly advantageous in scenarios where labeled data is scarce or expensive to obtain. Active learning models can start with minimal datasets, often requiring only one relevant and one irrelevant abstract, making them particularly suitable for tasks such as abstract screening, where the acquisition of large labeled datasets can be prohibitive.

The main advantage of active learning is its efficiency. By focusing on the most informative samples, it reduces the amount of data that needs to be labeled while still effectively training the model. This can significantly reduce the time and resources required for the annotation process. However, the effectiveness of active learning depends heavily on the initial selection of samples and the criteria used to determine the informativeness of subsequent data points. If not well calibrated, the model may request labels for data that do not significantly improve its performance, resulting in an inefficient use of resources. In addition, the iterative nature of querying and updating the model based on new labels can introduce complexity into the training process, potentially requiring more sophisticated infrastructure and oversight than traditional supervised learning methods.

While this study did not test active learning approaches for abstract screening, readers interested in exploring this methodology further are referred to [ 23 ] for detailed information on the application of active learning in abstract screening contexts.

Large language models

Amidst the evolving methodologies described in the “ Existing approaches to abstract screening in systematic reviews ” section, modern AI tools based on large language models (LLMs), such as ChatGPT, PaLM, Llama, and Claude, are emerging as potential game-changers. Grounded in advanced language processing capabilities, these tools can be tailored to evaluate abstracts against nuanced criteria and offer detailed assessments and classifications. Their prowess signals transformative potential for abstract screening. In this section, we first present two examples to illustrate the potential of LLMs in the context of abstract screening, and then proceed with an in-depth discussion of the study’s workflow and design, which aims to critically investigate the performance of LLMs in this domain.

Examples of abstract screening by using LLMs

In this subsection, we demonstrate the process of LLM-based abstract screening using ChatGPT with two specified abstracts: (1) Millard et al. [ 24 ] and (2) Zawia et al. [ 25 ]. Using other LLM tools such as Llama, Google PaLM, or Claude for abstract screening is similar and will be skipped for brevity.

In constructing the prompts, we adopted a standardized approach to mimic a typical interaction between a senior researcher and a research assistant. Each prompt was designed to include three critical elements: a brief statement of the topic under review, the exact eligibility criteria as specified in the study guidelines, and the text of the abstract to be reviewed. This approach was designed to simulate a realistic and practical scenario that reflects common academic practices. In addition, to ensure consistency and objectivity in the responses, we required that the results of the LLM be in a binary “yes” or “no” format, directly indicating whether each abstract met the inclusion criteria. This format was chosen to eliminate any ambiguity in the model’s decisions and to facilitate straightforward analysis of its accuracy.

Abstract (1)

ChatGPT prompt: “I would like you to help me with conducting a systematic review on animal depression. I will provide the title and abstract for one journal article and would like you to screen the paper for inclusion. Here are the inclusion criteria: Does the study provide primary data of an animal model of depression or depressive-like phenotype with an appropriate control group? Here is the abstract:

Wistar-Kyoto (WKY) rats show endogenous depressive behavior that can be reversed by antidepressants. Given that WKYs exhibit decreased sensitivity to some antidepressants and treatment-resistant depressed patients often show hypothalamic-pituitary-thyroid (HPT) dysregulation, we examined the behavioral and HPT hormonal responses of WKYs to altered thyroid status. “Euthyroid” WKYs had elevated basal plasma TSH and T(3) levels as compared to Wistars. Hypothyroidism increased TSH levels more in WKYs than in Wistars and increased response latency in the open field test (OFT) of WKYs only. Administration of T(4) and T(3) suppressed plasma TSH equally in both strains. Wistars responded to increased T(3) levels with decreased response latency and increased activity in the OFT, but increased immobility in the forced swim test. In contrast, WKYs responded only to the high T(3) levels with decreased response latency in the OFT. These results suggest the existence of a decreased central nervous system sensitivity to thyroid hormones in WKYs that could be related to their depressive behavior.”

ChatGPT v4 response (accessed on 10/15/2023):

Given the abstract and the inclusion criteria, this study appears to meet the inclusion criteria for your systematic review on animal models of depression.

Consideration for inclusion:

 1. Animal model and depressive-like phenotype:

  • The study utilizes Wistar-Kyoto (WKY) rats which are mentioned to show endogenous depressive behavior.

  • This study investigates behaviors and hormonal responses associated with depression in this particular animal model, with a focus on the role of thyroid hormones.

 2. Primary data and appropriate control group:

  • The study provides primary data on behavioral and hormonal responses of WKY rats, comparing them to Wistars (likely serving as the control group), under altered thyroid status, allowing for comparisons between the depressive-like phenotype and controls.

  • The utilization of both “Euthyroid” WKYs and Wistars as controls helps in effectively investigating the behavioral changes and responses in WKY rats due to alterations in thyroid status.

 3. Behavioral test indicating depression:

  • Depressive behavior and responses are assessed using standardized tests like the open field test (OFT) and the forced swim test, which are conventional methods to investigate depressive-like behaviors in animal models.

  • The mention of antidepressant’s impact on WKY rats indicates a focus on depressive-like states.

 4. Relevance to human condition:

  • Insights gained from this study on decreased central nervous system sensitivity to thyroid hormones and its possible relation to depressive behavior in WKYs can have implications in understanding treatment-resistant depression in humans, making it valuable for the review.

Conclusion:

Given the provided abstract, this study does seem to fit the inclusion criteria for a systematic review on animal depression, as it provides primary data on an animal model demonstrating depressive-like behavior, includes an appropriate control group, and employs standardized tests to study the depressive-like states and their relation to thyroid hormones. The detailed exploration of depressive behaviors and hormonal imbalances in relation to altered thyroid status can offer valuable insights for understanding and treating depression.

Abstract (2)

The induction of the activity of cerebral ornithine decarboxylase (EC 4.1.1.17) and mRNA by electrical stimulation exhibits regional differences. The effects of the enzyme inhibitor difluoromethylornithine on these regional variations was examined. Administration of this inhibitor resulted in pronounced depression of both basal and induced activity of ornithine decarboxylase in the hippocampus. Basal activity of the enzyme in the neocortex and the cerebellum appeared to be resistant to difluoromethylornithine but the induced enzyme activity was sensitive to the effects of this inhibitor. Susceptibility to difluoromethylornithine may be directly correlated with a slower turnover rate for ornithine decarboxylase. These results suggest that ornithine decarboxylase in the hippocampus may possess a longer half-life than its counterparts in other regions of the brain. Pretreatment with difluoromethylornithine had no effect on the induced ornithine decarboxylase mRNA in the neocortex. Thus, elevated activity of ornithine decarboxylase enzyme, due to electrical stimulation, appears to not have any effect on either the transcription or the decay rate of the induced ornithine decarboxylase mRNA. These findings support the concept of region-specific regulation of cerebral ornithine decarboxylase. ”

Given the provided abstract and the inclusion criteria, this study does not appear to meet the criteria for inclusion in a systematic review on animal models of depression or depressive-like phenotypes with an appropriate control group.

Consideration for inclusion:

 1. Depression or depressive-like phenotype:

  • The abstract does not provide any information or data related to depression or depressive-like behavior or phenotypes in animals. It is mainly focused on the activity of cerebral ornithine decarboxylase and its regional differences in response to electrical stimulation and inhibitor administration.

 2. Animal model:

  • There is no mention of the use of any specific animal model or any behavioral assays to assess depressive-like symptoms or states in the provided abstract.

 3. Behavioral assessment and primary data:

  • The abstract does not contain any details about behavioral assessments related to depression or depressive-like states. It primarily provides data on the effects of enzyme inhibitor difluoromethylornithine on cerebral ornithine decarboxylase activity and mRNA, but does not relate these findings to depression or depressive-like behaviors.

 4. Relevance to depression:

  • While the study explores biochemical aspects in brain regions like the hippocampus, which is implicated in mood disorders like depression, the connection between the findings and depressive symptoms or states is not established in the provided abstract.

Conclusion:

This article does not seem to fit the inclusion criteria for a systematic review on animal models of depression or depressive-like phenotypes as it lacks primary data on depressive-like behaviors or phenotypes in animals, and there is no mention of the use of animal models representative of depression or an appropriate control group for such a comparison.

From the two examples above, it is evident that ChatGPT adeptly comprehends the posed queries and efficiently extracts the inclusion and exclusion criteria. It not only makes clear decisions regarding study inclusion but also articulates its rationale by cross-referencing each criterion.

It is worth noting that the same ChatGPT prompt might result in slightly different responses between different users due to the variability in its response generation, but the conclusion for the eligibility of an abstract is typically consistent. In addition, enhancing the specificity of the criteria within the prompts can further refine the screening process. To illustrate, consider the following enhanced prompts:

Enhanced ChatGPT prompt for abstract (1): “I seek assistance with a systematic review focused on animal depression. I’ll provide the abstract of a particular journal article and would appreciate an assessment for its inclusion based on the following criteria:

The study employs an animal model reflecting depression or a depressive-like phenotype.

It incorporates an appropriate control group.

Primary data is presented within the study.

and here are the exclusion criteria:

The study bypasses the use of an animal model.

The utilized animal model doesn’t characterize depression or a depressive-like phenotype.

Absence of a control group within the study.

The study is primarily a review, devoid of primary data.

Here is the abstract: …”.

Let’s assess the provided abstract based on the inclusion and exclusion criteria:

Inclusion criteria:

 1. Animal model reflecting depression: The study uses Wistar-Kyoto (WKY) rats, which are described as showing “endogenous depressive behavior.” Thus, this criterion is met.

 2. Appropriate control group: The study compares WKY rats to Wistars (another strain), which serves as a comparison or control group. This criterion is met.

 3. Primary data: The abstract mentions specific results such as the elevated basal plasma TSH and T(3) levels in “Euthyroid” WKYs compared to Wistars and the different responses in the open field test (OFT). This suggests the presence of primary data. Thus, this criterion is met.

Exclusion criteria:

 1. Bypasses the use of an animal model: The study uses WKY rats as an animal model. Thus, this criterion is not met, which is good for inclusion.

 2. Animal model doesn’t characterize depression: The WKY rats show “endogenous depressive behavior.” Thus, this criterion is not met, which again supports inclusion.

 3. Absence of a control group: The study does use a control group (Wistars). Thus, this criterion is not met, supporting inclusion.

 4. Study is primarily a review: The abstract presents experimental data and does not indicate that the study is a review. Thus, this criterion is not met, which is favorable for inclusion.

Conclusion:

Given that the abstract meets all the inclusion criteria and none of the exclusion criteria, it should be considered in your systematic review on animal depression.

With these revised prompts applied to the previously discussed abstracts, the conclusions remain consistent. Nonetheless, we contend that this enhanced structure not only provides a more rigorous framework but also aligns more closely with widely accepted practices in systematic review criteria definition.

Automated workflow for streamlining abstract screening via ChatGPT and other tools

For systematic reviews, particularly when dealing with voluminous data, efficient workflows are paramount. The ChatGPT API (application programming interface) offers a dynamic solution, enabling the automation of abstract screening on a large scale, which circumvents the labor-intensive process of manually inputting abstracts into a chatbot interface. In this subsection, we present an automated workflow for streamlining abstract screening via ChatGPT. Note that although this automated workflow uses ChatGPT as the platform, analogous workflows can be built for other AI platforms such as PaLM, Llama, and Claude.

Automated workflow:

Data collection: The preliminary step entails accumulating a list of titles and abstracts. By utilizing carefully crafted keywords, we retrieve these from PubMed and other pertinent databases. This comprehensive approach ensures the potential inclusion of all relevant studies for a subsequent detailed screening. It is worth noting that while this list is expansive, most of these studies may not find their way into the final meta-analysis post-screening.

Automation through Python: We have devised a Python script aimed at harnessing the capabilities of ChatGPT for evaluating the amassed abstracts.

This script interacts with the ChatGPT API (specifically, the GPT-4 version) and, when furnished with tailored prompts, extracts structured responses from ChatGPT.

Typically, the AI’s response commences with a succinct summary, delves into explanations aligned with each criterion, and concludes with a decisive judgment, as exemplified by the examples in the “ Examples of abstract screening by using LLMs ” section.

This automated process efficiently saves ChatGPT’s verdicts on each abstract for ensuing analyses. For instance, it extracts the final decisions regarding the inclusion or exclusion of studies and determines the stance on each pre-specified criterion for every abstract, as exemplified by the last example in the “ Examples of abstract screening by using LLMs ” section.

Additionally, to ascertain the efficiency and cost-effectiveness of this methodology, the script also monitors the time, token usage, and the financial implications of querying the OpenAI API.

In essence, we envision this procedure as delegating the meticulous task of poring over scientific summaries to an AI assistant. This virtual entity meticulously sifts through each summary, determining its alignment with stipulated criteria.
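The workflow above can be sketched in a few lines. This is not the authors' script: the prompt wording is adapted from the examples in the previous subsection, `ask_llm` is a hypothetical callable that wraps whichever API client is in use (it takes a prompt string and returns the reply text), and the stand-in model at the bottom exists only so the sketch runs without network access. Token and cost tracking are omitted because they depend on the provider's response format.

```python
import time

def build_prompt(topic, criteria, abstract):
    """Combine the three elements: review topic, eligibility criteria, abstract."""
    return (
        f"I would like you to help me with conducting a systematic review on {topic}. "
        "I will provide the title and abstract for one journal article and would "
        "like you to screen the paper for inclusion. "
        "Answer with a single word, yes or no, on the first line, then explain.\n"
        f"Here are the eligibility criteria: {criteria}\n"
        f"Here is the abstract: {abstract}"
    )

def parse_decision(reply):
    """Return True (include), False (exclude), or None if the reply is unclear."""
    words = reply.strip().split()
    first = words[0].strip(".,:;!\"'").lower() if words else ""
    return {"yes": True, "no": False}.get(first)

def screen_abstracts(abstracts, topic, criteria, ask_llm):
    """Query the model once per abstract and store verdict, timing, and raw reply."""
    results = []
    for record_id, text in abstracts.items():
        start = time.perf_counter()
        reply = ask_llm(build_prompt(topic, criteria, text))
        results.append({
            "id": record_id,
            "include": parse_decision(reply),
            "seconds": time.perf_counter() - start,
            "raw": reply,
        })
    return results

# Stand-in for a real API call, used only to make the sketch runnable offline.
def fake_llm(prompt):
    return "Yes. Uses WKY rats." if "rats" in prompt else "No. Not an animal model."

demo = screen_abstracts(
    {"a1": "Wistar-Kyoto rats show endogenous depressive behavior.",
     "a2": "Induction of cerebral ornithine decarboxylase by electrical stimulation."},
    topic="animal depression",
    criteria="primary data of an animal model of depression with a control group",
    ask_llm=fake_llm,
)
print([(r["id"], r["include"]) for r in demo])
```

Keeping the raw reply alongside the parsed verdict makes it possible to audit unclear answers (those parsed as None) by hand instead of silently treating them as exclusions.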

Tuning parameters in the ChatGPT API: The effectiveness of the ChatGPT API is not only dependent on the input data; it is also significantly influenced by adjustable parameters that can refine the responses. Parameters such as temperature, top k, and top p critically affect model performance by modulating the randomness and focus of the output. While tuning these parameters can improve results, it requires significant technical expertise and resources. Defaults, which are rigorously tested by developers, strike a balance between output quality and ease of use, making LLMs more accessible to a wider range of users without the need for complex parameter optimization. While customization holds promise for specialized applications, the default settings provide an efficient and practical solution that facilitates wider adoption of LLM technologies.

Given the complexity of fine-tuning these parameters to optimize performance specifically for abstract screening, our study primarily used the recommended default settings for these parameters as provided by the respective platforms (detailed in Table 1 ). This approach was chosen to maintain the feasibility of our experiments and to ensure that our findings are applicable to typical deployment scenarios.

The automated workflow described applies to other LLM tools or different versions of the same tools. Throughout our study, we have tested several popular LLM tools available to us. The specific LLM models and their versions used at the time of our first submission are detailed in Table  1 , along with the tuning parameters.

The field of large language models (LLMs) has evolved rapidly since we started this study in 2023. New models are frequently released, and existing versions are constantly updated. To account for these advances, we have expanded our analysis to include results from the latest versions of previously studied models, as well as a few entirely new ones. As a best practice, we set the temperature parameter to 0 for all latest models. Table 1 now includes these latest models and their versions.

A brief explanation of these parameters is as follows:

Temperature: The temperature controls the randomness of the outputs, with a range from 0 to 2. A temperature of 0 makes the output (nearly) deterministic, values above 1 make it increasingly random, and the maximum temperature of 2 gives the most creative and variable outputs.

Max length: The max length is the maximum number of tokens for the model to generate as a response. A typical English word is about 1 to 2 tokens (a token corresponds to roughly four characters of English text).

Stop sequences: This parameter controls which tokens or phrases will stop the LLM from generating more text.

Top p: When generating text, the model samples only from the smallest set of most likely tokens whose cumulative probability reaches top p. In other words, top p is the cumulative probability cutoff for the model’s selection of tokens to generate. Lower top p values mean sampling from a smaller, more top-weighted nucleus.

Top k: When generating text, the model samples only from the k most likely tokens. Lower values of top k make the output more focused; with top k equal to 1, the model always selects the single most likely token.

Frequency penalty: This parameter controls how strongly the LLM penalizes tokens in proportion to how often they have already appeared in the text. A frequency penalty of 0 means that token frequency has no effect, and the LLM generates tokens based solely on their probability.

Presence penalty: This parameter controls how strongly the LLM penalizes tokens that have already appeared in the text at least once, regardless of how often. A presence penalty of 0 means that prior appearance has no effect, and the LLM generates tokens based solely on their probability.
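To make the effect of temperature, top k, and top p concrete, the following sketch applies them to a toy set of next-token scores. It is a simplified illustration: the logit values are invented, the function name is hypothetical, and the order in which real platforms apply these filters is an assumption here and may differ between providers.

```python
import math

def next_token_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    if temperature == 0:
        # Temperature 0 is treated as greedy decoding: always the top token.
        best = max(logits, key=logits.get)
        return {best: 1.0}
    # Temperature rescales the logits before the softmax.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((tok, w / total) for tok, w in weights.items()),
                    key=lambda pair: pair[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]            # keep only the k most likely tokens
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, prob in ranked:           # smallest set reaching mass top_p
            kept.append((tok, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept
    norm = sum(prob for _, prob in ranked)
    return {tok: prob / norm for tok, prob in ranked}

# Invented logits for three candidate first words of a screening verdict.
logits = {"yes": 2.0, "no": 1.0, "maybe": 0.0}
print(next_token_distribution(logits))                   # unfiltered softmax
print(next_token_distribution(logits, temperature=2.0))  # flatter distribution
print(next_token_distribution(logits, top_p=0.9))        # "maybe" is dropped
```

For a binary yes/no screening task, this is why a temperature of 0 is attractive: the verdict token no longer varies from run to run.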

Study design

In our pursuit to assess ChatGPT’s proficiency in abstract screening, we selected certain benchmark databases that have existing performance data from other methodologies. This selection aids in a comparative analysis of performance.

In selecting the systematic reviews for our study, we used a systematic approach guided by specific criteria to ensure relevance and reliability. These studies were selected from the publicly available SYNERGY [ 23 ] dataset, which contains 26 systematic reviews from different disciplines. Key selection criteria included:

Clarity and conciseness of eligibility criteria: The selected studies had well-defined and explicit eligibility criteria. This clarity is essential for accurate replication of the study selection process, which is critical for assessing the performance of LLM tools in an analogous real-world application.

Completeness and cleanliness of data: We ensured that the selected reviews had complete datasets, with all necessary information on included and excluded studies clearly documented, minimizing the risk of ambiguities affecting our analysis.

In addition, to comply with the AMSTAR-2 [ 26 ] guidelines, in particular point 5, we reviewed the methodologies of these reviews to confirm that the selection of studies was performed in duplicate and that disagreements were resolved by consensus. While our analysis assumes that these systematic reviews adhere to high standards, we address the inherent limitations of using pre-existing datasets as a proxy for gold standards in the discussion section.

We picked the following 3 databases from the publicly available SYNERGY dataset [ 23 ]:

Bannach-Brown 2016 [ 27 ]—topic: use of animal models to study depressive behavior

Human-curated (gold standard) results: 1258 excluded abstracts and 230 included abstracts.

We randomly selected 100 excluded abstracts and 100 included abstracts for screening by LLM tools.

Meijboom 2021 [ 28 ]—topic: retransitioning of etanercept in patients with a rheumatic disease

Human-curated (gold standard) results: 599 excluded abstracts and 32 included abstracts.

We randomly selected 100 excluded abstracts and all 32 included abstracts for screening by LLM tools.

Menon 2022 [ 29 ]—topic: methodological rigor of systematic reviews in environmental health

Human-curated (gold standard) results: 896 excluded abstracts and 73 included abstracts.

We randomly selected 100 excluded abstracts and all 73 included abstracts for screening by LLM tools.

For each chosen database, abstracts were categorized as either “cases” (those included based on a gold standard) or “controls” (those excluded per the gold standard). From each category, we randomly selected 100 abstracts (we used all abstracts if a category contained fewer than 100 abstracts in a database). These abstracts underwent evaluation by ChatGPT (v4.0) as per our established workflow. Subsequently, ChatGPT’s decisions were juxtaposed against the gold standard to determine sensitivity, specificity, and overall accuracy. The same abstracts were also processed using other LLM tools as listed in Table  1 to record their respective verdicts.
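The case/control sampling rule can be sketched as below. The record layout (a dictionary with an `included` flag), the function name, and the seed are illustrative assumptions; the source does not say how the random selection was implemented.

```python
import random

def sample_for_screening(records, per_class=100, seed=2023):
    """Draw up to `per_class` cases and controls; keep a whole class if it is small."""
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    cases = [r for r in records if r["included"]]
    controls = [r for r in records if not r["included"]]

    def pick(group):
        if len(group) <= per_class:
            return list(group)             # fewer than the quota: use all
        return rng.sample(group, per_class)

    return pick(cases), pick(controls)

# Toy records with the class sizes reported for Menon 2022 (73 included, 896 excluded).
records = ([{"id": i, "included": True} for i in range(73)]
           + [{"id": 73 + i, "included": False} for i in range(896)])
cases, controls = sample_for_screening(records)
print(len(cases), len(controls))
```

Sampling a fixed number from each class, rather than from the pooled set, means the evaluation sample does not reflect the true inclusion rate, which matters for any metric that depends on prevalence.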

Statistical analysis

To quantify the efficacy of ChatGPT and other AI tools for each database, we calculated the following metrics: (1) sensitivity, (2) specificity, and (3) overall accuracy, where sensitivity is defined as the number of true positives divided by the sum of true positives and false negatives, specificity as the number of true negatives divided by the sum of true negatives and false positives, and accuracy as the sum of true positives and true negatives divided by the total number of abstracts. For each metric, associated 95% confidence intervals were also determined. Although it is very common in the field to report F1 score, recall rate, and precision, we believe it is more appropriate to report sensitivity and specificity given this study design. In addition, recall is identical to sensitivity, and precision and the F1 score can be derived from sensitivity and specificity together with the known numbers of cases and controls.
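The definitions above translate directly into code. The sketch below computes the three reported metrics from paired gold-standard and predicted decisions, and also shows how precision and F1 follow from the same confusion-matrix counts. The function name and the toy numbers (90 of 100 cases and 95 of 100 controls classified correctly) are invented for illustration and are not results from the study.

```python
def screening_metrics(gold, predicted):
    """Confusion-matrix metrics; True means 'include' in both lists."""
    tp = sum(g and p for g, p in zip(gold, predicted))
    tn = sum((not g) and (not p) for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(gold)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "precision": precision, "f1": f1}

# Invented example: 100 cases followed by 100 controls.
gold = [True] * 100 + [False] * 100
predicted = [True] * 90 + [False] * 10 + [False] * 95 + [True] * 5
metrics = screening_metrics(gold, predicted)
print(metrics)
```

With equal numbers of cases and controls, accuracy is simply the average of sensitivity and specificity; with unequal class sizes, as in the Meijboom 2021 and Menon 2022 samples, it is weighted toward the larger class.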

Furthermore, to explore the potential of a unified decision-making process, we combined the decisions from all AI tools using a voting mechanism. The majority decision across the tools was taken as the final verdict for each abstract. For this consolidated approach, we again computed sensitivity, specificity, overall accuracy, and the associated 95% CIs for each database. We also explore the use of latent class analysis (LCA), a model-based clustering approach, to derive consolidated decisions. More details on this LCA approach are provided in the “ Beyond majority voting ” section.
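A minimal sketch of the voting mechanism follows. The tool names and verdicts are invented, and the handling of ties (possible when an even number of tools is combined) is an assumption exposed as a parameter, since the source does not say how ties were resolved.

```python
def majority_vote(decisions_by_tool, tie_includes=False):
    """Combine per-tool include/exclude verdicts abstract by abstract.

    decisions_by_tool maps a tool name to a list of booleans (True = include),
    all of the same length and in the same abstract order.
    """
    verdict_lists = list(decisions_by_tool.values())
    n_tools = len(verdict_lists)
    combined = []
    for votes in zip(*verdict_lists):      # one tuple of votes per abstract
        yes = sum(votes)
        if yes * 2 == n_tools:             # tie: resolved by the assumed rule
            combined.append(tie_includes)
        else:
            combined.append(yes * 2 > n_tools)
    return combined

# Invented verdicts from three tools on four abstracts.
votes = {
    "tool_a": [True, True, False, False],
    "tool_b": [True, False, False, True],
    "tool_c": [True, True, False, False],
}
print(majority_vote(votes))
```

The combined verdicts can then be scored against the gold standard in exactly the same way as the verdicts of any single tool.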

For a given database, 100 cases and 100 controls yield a two-sided 95% confidence interval with a half-width equal to 0.048 when the underlying sensitivity (specificity) is approximately 95%.
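The precision statement above can be checked numerically. The source does not say which interval method was used; the sketch below assumes an exact (Clopper-Pearson) binomial interval, computed by bisection with only the standard library, for 95 correct decisions out of 100. Under that assumption the half-width should come out close to the reported 0.048; other methods, such as the normal approximation, give somewhat narrower intervals.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _bisect(func, target, increasing):
    """Solve func(p) = target for p in (0, 1), func being monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion x / n."""
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper

lower, upper = clopper_pearson(95, 100)
half_width = (upper - lower) / 2
print(round(lower, 4), round(upper, 4), round(half_width, 3))
```

Sensitivity is estimated from the 100 cases alone and specificity from the 100 controls alone, so each interval is based on n = 100 rather than on the 200 abstracts combined.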

All statistical analyses were conducted using the R statistical software (version 4.3.1). All tests were two-sided with an alpha level set at 0.05 unless otherwise mentioned.

To improve the transparency and reproducibility of studies using AI tools, we have included the TRIPOD + AI checklist [ 30 ] in our report. This checklist has been adapted to reflect the specifics of our research, which focuses on the evaluation of large language models for abstract screening rather than diagnostic or prognostic modeling. The completed checklist is presented in Table S 1 , to provide readers with a comprehensive overview of our study’s adherence to established reporting standards.

We present the results for each of the 3 databases. For each, we first present the prompts we used when we called LLM tools to screen an abstract, then present the performance data (accuracy, sensitivity, and specificity for each method or LLM tool), followed by a summary of the performance, and a comparison of the performance of the different methods to that based on ChatGPT v4.0.

Results on the Bannach-Brown 2016 database (see Table  2 )

The prompts we used for screening abstracts in this database are as follows:

Conduct a systematic review on animal depression. I provide the title and abstract for one journal article. Provide an overall assessment based on eligibility criteria with only one word answer yes or no with no explanation. Then, for each inclusion or exclusion criterion, answer with only one word, yes if it is included by the inclusion criterion or excluded by the exclusion criterion, and answer no if it does not meet the inclusion criterion or not excluded by the exclusion criterion. After answering all the criteria with yes or no, then provide an overall explanation.

Here is the eligibility criteria: Inclusion Criteria: 1. Any article providing primary data of an animal model of depression or depressive-like phenotype with an appropriate control group (specified above). 2. Animals of all ages, sexes and species, where depression-like phenotype intended to mimic the human condition have been induced. Including animal models where depressive-like phenotypes are induced in the presence of a comorbidity (e.g. obesity or cancer). 3. All studies that claim to model depression or depressive-like phenotypes in animals. Studies that induce depressive behavior or model depression and that also test a treatment or intervention (prior or subsequent to model induction), with no exclusion criteria based on dosage, timing or frequency. 4. Studies measuring behavioral, anatomical and structural, electrophysiological, histological and/or neurochemical outcomes and where genomic, proteomic or metabolomic outcomes are measured in addition to behavioral, anatomical, electrophysiological, histological or neurochemical outcomes. Exclusion Criteria: 1. Review article, editorials, case reports, letters or comments, conference or seminar abstracts, studies providing primary data but not appropriate control group. 2. Human studies and ex vivo, in vitro or in silico studies. Studies will be excluded if authors state an intention to induce or investigate only anxiety or anxious behavior. Studies will be excluded if there is no experimental intervention on the animals (e.g. purely observational studies). 3. Studies that investigate treatments or interventions, but no depressive behavior or model of depression is induced (e.g. toxicity and side-effect studies). 4. Where metabolic outcome measures are the primary outcome measure of a study. Where genomic, proteomic, metabolic or metabolomic outcomes are the sole outcome measures in a study, they will be excluded.

Here is the abstract:
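The prompt above asks for a one-word overall answer followed by one yes/no answer per criterion, and replies of that shape have to be converted into binary items before any metric can be computed. The sketch below shows one way to do this. The exact layout of the replies is not documented in the text, so the parser simply reads the first yes/no words in order; `parse_screening_reply` is a hypothetical name, and the behavior for incomplete replies (returning `None`) is an assumption.

```python
import re
from typing import Optional

_YES_NO = re.compile(r"\b(yes|no)\b", re.IGNORECASE)


def parse_screening_reply(reply: str, n_criteria: int) -> Optional[list[int]]:
    """Return [overall, criterion_1, ..., criterion_n] as 1 (yes) / 0 (no).

    Assumes the overall answer comes first and the criteria follow in order.
    Returns None when fewer than 1 + n_criteria answers are found.
    Limitation: a yes/no word inside the free-text explanation is counted if
    the structured answers before it are incomplete.
    """
    answers = [match.group(1).lower() for match in _YES_NO.finditer(reply)]
    needed = 1 + n_criteria
    if len(answers) < needed:
        return None
    return [1 if answer == "yes" else 0 for answer in answers[:needed]]


# Hypothetical reply for a prompt with two criteria.
example = "Yes\nInclusion 1: yes\nInclusion 2: No\nExplanation: primary animal data are reported."
print(parse_screening_reply(example, n_criteria=2))
```

Replies that cannot be parsed are better flagged for manual inspection than silently coded as exclusions.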

Among all the LLM tools we tested, ChatGPT v4.0 stood out with the highest accuracy (0.945) and specificity (0.960), and satisfactory sensitivity (0.930). The combined decision using majority voting improved sensitivity (0.970) considerably but reduced specificity (0.870). Comparatively, the zero-shot method was less effective across these metrics, while the hybrid method attained superior accuracy (0.961) and specificity (0.982), albeit with reduced sensitivity (0.843). We acknowledge that our findings regarding the zero-shot and hybrid methods differ from those reported in [ 18 ] for the same set of abstracts. This discrepancy could be attributed to the use of distinct sets of embeddings in our analysis compared to the previous study.

For this database, the newer versions of ChatGPT (3.5-Turbo and 4-Turbo) did not improve performance over ChatGPT (v4.0). Gemini-1.0-pro (vs. PaLM 2) and Llama 3 (vs. Llama 2) improved over their older versions, but did not surpass the performance of ChatGPT (v4.0). Claude 3 performed well, but still did not surpass the performance of ChatGPT (v4.0).

Comparison between LLM tools. We compared the performance (sensitivity and specificity) between ChatGPTv4.0 and other LLM tools using the McNemar test and found that ChatGPTv4.0 performed significantly better ( p value = 0.002) than Google PaLM 2 in terms of sensitivity; ChatGPTv4.0 performed significantly better than ChatGPTv3.5 ( p value = 0.008) and better than Llama-2 ( p value < 0.001) in terms of specificity. Combining the decisions of different LLM tools using majority voting did not improve the overall accuracy compared to ChatGPTv4.0. Specifically, there was no statistically significant difference ( p value = 0.134) in sensitivity between the combined decision (majority voting) and ChatGPTv4.0, and the combined decision was significantly worse ( p value = 0.008) than ChatGPTv4.0 in terms of specificity.

Comparison between ChatGPT v4.0 and zero-shot and hybrid methods: We assessed the performance (sensitivity and specificity) of ChatGPT v4.0 against both the zero-shot and hybrid approaches using the McNemar test. Specifically, we aligned the screening results from 100 cases and 100 controls as per the ChatGPT v4.0 method and similarly for the zero-shot and hybrid methods, testing for inconsistencies between these approaches as previously done. Our analysis revealed that ChatGPT v4.0 significantly outperformed the zero-shot method in sensitivity ( p value < 0.001) but showed comparable effectiveness in specificity ( p value = 0.37). Additionally, ChatGPT v4.0 demonstrated marginally superior sensitivity compared to the hybrid method ( p value = 0.07), while its performance in specificity was similar ( p value = 1.00).
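The paired comparisons above rest on the McNemar test, which uses only the abstracts on which two methods disagree. The sketch below uses the exact (binomial) form of the test. The text does not say whether the exact or the chi-square version was used, so this choice is an assumption, and the function names are hypothetical.

```python
from math import comb


def discordant_counts(a_correct: list[bool], b_correct: list[bool]) -> tuple[int, int]:
    """Count abstracts where only method A, or only method B, was correct."""
    only_a = sum(a and not b for a, b in zip(a_correct, b_correct))
    only_b = sum(b and not a for a, b in zip(a_correct, b_correct))
    return only_a, only_b


def mcnemar_exact(only_a: int, only_b: int) -> float:
    """Two-sided exact McNemar p value based on the discordant pairs."""
    n = only_a + only_b
    if n == 0:
        return 1.0  # the methods never disagree
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# To compare sensitivity, restrict both lists to the true cases;
# to compare specificity, restrict them to the true controls.
a_ok = [True, True, True, False, True, True]
b_ok = [True, False, False, False, True, False]
print(mcnemar_exact(*discordant_counts(a_ok, b_ok)))
```

With few discordant pairs the exact form is generally preferred over the chi-square approximation, which is why it is used in this sketch.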

Results on the Meijboom 2021 database (see Table  3 )

Conduct a systematic review on transitioning patients from an originator to a corresponding biosimilar.

I provide the title and abstract for one journal article. Provide an overall assessment based on eligibility criteria with only one word answer yes or no, with no explanation. Then, for each inclusion or exclusion criterion, answer with only one word, yes if it is included by the inclusion criterion or excluded by the exclusion criterion, and answer no if it does not meet the inclusion criterion or not excluded by the exclusion criterion.

After answering all the criteria with yes or no, then provide an overall explanation.

Here is the eligibility criteria:

Articles were included if they met the following criteria:

Study involved transitioning from a TNFα inhibitor (including etanercept, infliximab, and adalimumab) originator to a biosimilar

The number of patients who retransitioned was reported or could be calculated

The article was an original research article published in a peer-reviewed journal

The article included baseline characteristics of the patients who transitioned

The article was written in English

The full-text version of the article could be obtained.

Transitioning was defined as patients in whom the biosimilar was introduced after the originator, without treatment with other drugs in between. Retransitioning was defined as restarting the originator directly after discontinuing a biosimilar, without treatment with other drugs in between. In summary, transitioning was defined as switching from the originator to a biosimilar; retransitioning was defined as switching from the originator to a biosimilar and back to the originator. Both transitioning and retransitioning involved changes with the same active biological substance.

Among all the LLM tools we tested, ChatGPT v4.0 had the highest accuracy (0.840), but it did not have the highest specificity (0.860), and its sensitivity (0.812) was not satisfactory. Compared to ChatGPTv4.0, the combined decision using majority voting did not improve overall accuracy (0.720), but improved sensitivity (1.000) at the expense of specificity (0.630).

Comparison between LLM tools. We compared the performance (sensitivity and specificity) between ChatGPTv4.0 and other LLM tools using the McNemar test and found that ChatGPTv4.0 performed significantly better ( p value < 0.001) than Google PaLM 2, but significantly worse than ChatGPTv3.5 ( p value = 0.001) and Llama 2 in terms of sensitivity; ChatGPTv4.0 performed significantly better than ChatGPTv3.5 ( p value < 0.001) and better than Llama 2 ( p value < 0.001), but worse than Google PaLM 2 ( p value = 0.002), in terms of specificity. Combining the decisions of different LLM tools using majority voting did not improve the overall accuracy compared to ChatGPTv4.0. Specifically, there was a statistically significant difference ( p value = 0.008) in sensitivity between the combined decision (majority voting) and ChatGPTv4.0, and the combined decision was not significantly worse ( p value > 0.50) than ChatGPTv4.0 in terms of specificity.

For this database, the newer versions of ChatGPT (3.5-Turbo and 4-Turbo) did not improve performance over ChatGPT (v4.0), and Gemini-1.0-pro (vs. PaLM 2) did not improve performance either. However, Llama 3 (vs. Llama 2) improved over its older version and surpassed the performance of ChatGPT (v4.0). Claude 3 also slightly surpassed the performance of ChatGPT (v4.0).

Comparison between ChatGPT v4.0 and zero-shot and hybrid methods: We evaluated the performance of ChatGPT v4.0, focusing on sensitivity and specificity, in comparison with the zero-shot and hybrid approaches, employing the McNemar test as described above. In this analysis, we aligned the screening results from 32 cases and 100 controls for the tests. Our findings indicated that ChatGPT v4.0 significantly surpassed the zero-shot method in sensitivity ( p value = 0.0002) and exhibited marginally improved specificity ( p value = 0.099). Furthermore, ChatGPT v4.0 showed notably higher sensitivity than the hybrid method ( p value < 0.001), although its specificity was comparatively lower.

Results on the Menon 2022 database (see Table  4 )

“Conduct a systematic review on the methodological rigour of systematic reviews in environmental health.

I provide the title and abstract for one journal article.

Provide an overall assessment based on eligibility criteria with only one word answer yes or no, with no explanation.

Then, for each inclusion or exclusion criterion, answer with only one word, yes if it is included by the inclusion criterion or excluded by the exclusion criterion and answer no if it does not meet the inclusion criterion or not excluded by the exclusion criterion.

Here are the eligibility criteria:

To be eligible for inclusion in the SR sample, documents had to fulfill the following criteria:

Identify explicitly as a “systematic review” in their title

Assess the effect of a non-acute, non-communicable, environmental exposure on a health outcome. Environmental exposures can include air and water pollutants, radiation, noise, occupational hazards, lifestyle factors (like diet or physical activity) and lifestyle choices influenced by family and peers (like substance use), social and economic factors (like stress from work or living conditions).

Include studies in people or mammalian models

Be available in HTML format

Abstract X”

Among all the LLM tools we tested, ChatGPT v4.0 stood out with the highest accuracy (0.913) and sensitivity (0.932), but not the highest specificity (0.900). Compared to ChatGPTv4.0, the combined decision using majority voting did not improve overall accuracy (0.884) or sensitivity (0.808), but improved specificity (0.940).

Comparison between LLM tools. We compared the performance (sensitivity and specificity) between ChatGPTv4.0 and other LLM tools using the McNemar test and found that ChatGPTv4.0 performed significantly better than ChatGPTv3.5 ( p value < 0.001), Google PaLM 2, and Llama 2 ( p value = 0.02) in terms of sensitivity; ChatGPTv4.0 performed worse than ChatGPTv3.5 and Google PaLM 2 in terms of specificity. Combining the decisions of different LLM tools using majority voting did not improve the overall accuracy compared to ChatGPTv4.0. Specifically, there was a statistically significant difference ( p value = 0.008) in sensitivity between the combined decision (majority voting) and ChatGPTv4.0, and the combined decision was not significantly different ( p value = 0.134) from ChatGPTv4.0 in terms of specificity.

For this database, the newer versions of ChatGPT (3.5-Turbo and 4-Turbo) did not improve performance over ChatGPT (v4.0). However, both Gemini-1.0-pro (vs. PaLM 2) and Llama 3 (vs. Llama 2) improved over their older versions and surpassed the performance of ChatGPT (v4.0). Claude 3 also performed well but did not surpass the performance of ChatGPT (v4.0).

Comparison between ChatGPT v4.0 and zero-shot and hybrid methods: We aligned the screening results from 73 cases and 100 controls based on the ChatGPT v4.0 method, and similarly for the zero-shot and hybrid methods, to test for inconsistencies between these approaches, using the McNemar test as done in previous assessments. Our analysis showed that ChatGPT v4.0 significantly outperformed the zero-shot method in both sensitivity ( p value < 0.001) and specificity ( p value = 0.016). In comparison with the hybrid method, ChatGPT v4.0 also demonstrated superior sensitivity ( p value < 0.001) and better specificity ( p value = 0.04).

Monetary cost and time cost

To use the ChatGPT API or other LLM tools, the owners of these platforms charge a predetermined rate for access to the corresponding APIs. These fees are calculated in USD per thousand tokens, where tokens are the basic units used by these LLM platforms to quantify text length. In this context, a token can represent a word, a punctuation mark, or a character. The financial cost of screening 200 abstracts was approximately $6 for ChatGPT v4.0, $0.2 for ChatGPT v3.5, $10 for Llama 2 (using Replicate), while Google PaLM 2 offered its services for free to invited developers. Thus, the cumulative cost of evaluating 200 abstracts across all platforms was approximately $16.2. The cumulative cost of evaluating 200 abstracts across all latest models ($3 for GPT-4-Turbo, $0.05 for GPT-3.5-Turbo, free for Gemini-1.0-pro, $0.05 for Llama-3, $4 for Claude) was less, approximately $7.1. In terms of time efficiency, processing 200 abstracts with each of these LLM tools took approximately 10–20 min using a single thread. However, it is imperative to recognize that abstract screening lends itself well to parallelization. Consequently, one could significantly speed up the process by setting up multiple threads to simultaneously screen different subsets of abstracts, thereby reducing the overall time required for completion. This parallel approach not only increases efficiency, but also ensures that large amounts of data can be processed in a timely manner, making LLM tools even more attractive for large-scale abstract screening tasks. In summary, the monetary and time costs of using LLM tools for abstract screening are negligible compared to manual labeling.
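The parallelization mentioned above can be sketched with a thread pool, which suits this task because each call spends most of its time waiting for the API. No real LLM client is used here: `screen_fn` stands in for whatever function sends one abstract to a tool and returns its answer, and `fake_screen` is a purely hypothetical placeholder so that the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def screen_all(
    abstracts: list[str],
    screen_fn: Callable[[str], str],
    max_workers: int = 8,
) -> list[str]:
    """Screen abstracts concurrently; results keep the order of the input list."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs screen_fn in worker threads and yields results in input order
        return list(pool.map(screen_fn, abstracts))


def fake_screen(abstract: str) -> str:
    """Placeholder for an API call (hypothetical keyword rule, for illustration only)."""
    return "yes" if "depression" in abstract.lower() else "no"


if __name__ == "__main__":
    batch = ["Rat model of depression after chronic stress", "Statin use in adults"]
    print(screen_all(batch, fake_screen, max_workers=2))
```

In practice the number of workers is bounded by the rate limits of the provider, so a real `screen_fn` would also need retry handling, which is omitted here.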

Beyond majority voting

We have expanded our analysis to include a variety of approaches for synthesizing decisions across different LLM tools. Our methodology is inspired by the concept of combining multiple diagnostic tests in the absence of a gold standard, akin to situations where human expert consensus is not available. Several publications discuss such scenarios [ 31 , 32 ]; among the approaches they describe, we propose using latent class analysis (LCA) models.

Latent class analysis (LCA) is a statistical method used to identify subgroups within a population, which are not directly observed (hence “latent”) [ 33 , 34 , 35 ]. It is particularly useful when the research interest lies in categorizing individuals into mutually exclusive groups based on their responses to multiple observed variables. In the context of abstract screening, LCA can offer a sophisticated means of integrating decisions from different LLM tools without relying on a gold standard, typically provided by human expert consensus. This approach assumes that the unobserved subgroups (or “latent classes”) explain the dependence between the observed decisions made by each of the LLM tools.

Utilizing the LCA model, we treat the decisions from all LLM tools as dichotomous variables, corresponding to the adherence to each inclusion or exclusion criterion, as well as the overall decision. For instance, within the Bannach-Brown 2016 database (BB2016), there are eight criteria in total with four criteria each for inclusion and exclusion and one overall decision for inclusion, resulting in a total of nine binary items per LLM tool. In our analysis, we incorporated decisions from GPT v4.0, v3.5, and Llama 2. Decisions from Google PaLM 2 were excluded due to a high frequency (10% or more) of incomplete responses. Consequently, for the Bannach-Brown 2016 database, we worked with 27 binary items. For other databases such as Meijboom 2021 (Mj2021) and Menon 2022 (Me2022), the binary items totaled 21 and 15, respectively. It is important to note that LCA models were fitted to the binary data of each database independently.

The LCA model fitting process enables us to calculate the posterior probabilities of each abstract belonging to specific latent classes or subgroups. Abstracts are then categorized based on these probabilities, with assignment to the class for which an abstract has the highest posterior membership probability. The determination of the number of latent classes is a critical step in the LCA model fitting, which requires a priori specification. In our evaluation, we explored models with class numbers ranging from 2 to 6 and utilized the Bayesian information criterion (BIC) to identify the most “optimal” LCA model for our datasets.
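The fitting and model selection steps described above can be illustrated with a small expectation-maximization (EM) routine for binary items. This is a sketch under stated assumptions, not the code used in the study (which may rely on a dedicated LCA package): it assumes conditional independence of the items within a class, a single random start per model, and NumPy as a dependency. The names `fit_lca` and `select_lca` are hypothetical.

```python
import numpy as np


def _posterior(X, pi, theta):
    """Posterior class probabilities per row and the total log-likelihood."""
    # log P(class) + log P(row | class), items independent within a class
    log_joint = np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
    top = log_joint.max(axis=1, keepdims=True)
    log_norm = top + np.log(np.exp(log_joint - top).sum(axis=1, keepdims=True))
    return np.exp(log_joint - log_norm), float(log_norm.sum())


def fit_lca(X, n_classes, n_iter=200, seed=0, eps=1e-6):
    """Fit a latent class model to a 0/1 matrix (rows: abstracts, columns: items)."""
    X = np.asarray(X, dtype=float)
    n, n_items = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(n_classes, 1.0 / n_classes)                     # class weights
    theta = rng.uniform(0.25, 0.75, size=(n_classes, n_items))   # P(item = 1 | class)
    for _ in range(n_iter):
        post, _ = _posterior(X, pi, theta)                       # E-step
        weight = post.sum(axis=0)                                # M-step
        pi = (weight + eps) / (weight + eps).sum()
        theta = np.clip((post.T @ X + eps) / (weight[:, None] + 2 * eps), eps, 1 - eps)
    post, loglik = _posterior(X, pi, theta)
    n_params = (n_classes - 1) + n_classes * n_items
    bic = -2.0 * loglik + n_params * np.log(n)
    # each abstract goes to the class with the highest posterior probability
    return {"posterior": post, "labels": post.argmax(axis=1), "bic": float(bic)}


def select_lca(X, candidates=range(2, 7)):
    """Fit one model per candidate number of classes and keep the lowest BIC."""
    fits = {k: fit_lca(X, k) for k in candidates}
    best = min(fits, key=lambda k: fits[k]["bic"])
    return best, fits[best]
```

Because EM can stop at a local optimum, several random starts per class number (different `seed` values) would be advisable before comparing BIC values.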

Table 5 shows that after applying the Bayesian information criterion (BIC) to determine the most appropriate model for our data, we identified a 3-class model as the best fit for the binary decisions derived from the BB2016 database. Similarly, a 4-class model was optimal for the Mj2021 database, while a 3-class model was again best for the Me2022 database. The confusion matrices generated by the selected LCA models for each database provided a clear juxtaposition between the LLM-assigned classes and the actual labels of the abstracts (see Table  6 ).

The performance metrics derived from these models are noteworthy. For the BB2016 database, if we categorize abstracts in the class labeled 2 as “included” and assign those in the classes labeled 1 or 3 to the “excluded” category, the LCA model achieved a sensitivity of 93% and a specificity of 96%, indicating a high degree of accuracy in classifying relevant and irrelevant abstracts. For the Mj2021 database, if we view class 1 as the “included” group and classes 2 and 3 as the “excluded” group, the model achieved a perfect sensitivity of 100%, meaning that it correctly identified all relevant abstracts, although the specificity was lower at 79%, indicating a higher rate of false positives. Similarly, for the Me2022 database, with class 1 as the “excluded” category and classes 2 and 3 as the “included” group, the model showed a sensitivity of 94.5% and a specificity of 83%, a good balance between identifying relevant abstracts and avoiding false inclusions.

These results highlight the robustness of the latent class analysis approach to the abstract screening task, providing an effective method for classifying abstracts when a gold standard is not available. The varying levels of sensitivity and specificity across databases underscore the need to tailor the LCA model to the specific characteristics of each dataset, as well as the need for further research.

We have incorporated results from the latest LLM models to increase the robustness and relevance of our findings. However, to maintain consistency and comparability with our original analyses, we have not incorporated these new results into the previously established majority voting or latent class analysis (LCA) approaches. Instead, we have chosen to make all raw data, including results from these newer models, freely available in our GitHub repository. This approach allows interested researchers and practitioners to conduct further investigations or apply alternative methods of analysis. By providing access to this additional data, we aim to promote transparency and enable the broader community to engage with and potentially build upon our work.

This study began with a rigorous exploration of the capabilities of large language models (LLMs) in abstract screening. We used automation scripts developed in Python to interact with the APIs of several LLM tools, including ChatGPT v4.0, Google PaLM 2, and Meta Llama 2, as well as the latest versions of these tools. Our central goal was to evaluate the efficiency and accuracy of these tools across three different databases of abstracts, leading us to a nuanced understanding of their potential in this context.

Large language models (LLMs), particularly ChatGPT, have garnered global attention since their inception. Employing LLMs for abstract screening in systematic reviews is an innovative concept [ 32 , 33 ] and remains underexplored. This study presents the first comprehensive evaluation of LLMs applied to systematic review processes. The findings are encouraging, suggesting that LLMs could revolutionize abstract screening. Specifically, ChatGPT v4.0 exhibited stellar performance across three test scenarios, achieving an accuracy of at least 85%. Furthermore, it attained sensitivity and specificity rates ranging from 80% to an impressive 95%. These exceptional outcomes highlight the substantial promise of LLMs in abstract screening, offering an efficient and capable alternative to the conventional, laborious approaches that typically necessitate extensive human annotation.

However, it is important to acknowledge that we are still in the early stages of integrating LLM tools into the abstract screening field, and they are not without their imperfections. For example, even the best-performing LLM tool, ChatGPT v4.0, occasionally excluded many (20%) relevant studies. These tools are not a universal solution to all the challenges associated with abstract screening, and they are not ready to completely replace human expertise in this area. Instead, they should be embraced as invaluable assistants in the abstract screening process.

In discussing the limitations of our study, it is important to recognize the constraints associated with our dataset selection and model configuration. We used only three databases from the SYNERGY set, limiting the generalizability of our findings across disciplines and datasets. In addition, the reliance on human-curated labels as a gold standard, while necessary, introduces potential biases due to the retrospective nature of our analysis. These labels may contain errors, and the methodology used in the original reviews to resolve discrepancies may affect the validity of our conclusions.

The parameters chosen for our LLMs—temperature, top k, top p, and prompts—were set to defaults or based on natural conversations to balance output quality with ease of use. While this approach minimizes the need for technical expertise, it may not be optimal for all screening scenarios. In addition, the reliance of LLMs on abstracts alone, as opposed to full-text evaluations, presents a fundamental challenge; critical data influencing inclusion or exclusion may not be present in the abstracts, potentially compromising screening accuracy.

In addition, the rapid development of LLMs and their “black box” nature pose challenges to the transparency that is essential in scientific settings. The environmental impact of using these computationally intensive models is also significant [ 36 ], requiring sustainability considerations. Future research should focus on refining these tools to increase transparency and efficiency, and ensure their responsible development and use in systematic reviews.

Our research suggests that LLM tools are ready to take on a role in abstract screening and are poised to have an immediate and positive impact on the process. Their integration into abstract screening can manifest itself in a few ways. They can serve as autonomous AI reviewers, adding an extra layer of scrutiny and ensuring thoroughness. Our findings suggest that a collective decision, such as one derived from majority voting, can sometimes improve sensitivity, underscoring the potential of LLM tools as a reliable safeguard against oversight, ensuring both comprehensiveness and accuracy.

In addition, LLM tools can facilitate a synergistic partnership with human experts. They are adept at identifying “high-risk” abstracts where different LLM tools have reached different judgments, flagging them for further human evaluation, and promoting a careful and error-free screening process while minimizing human effort.
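Flagging “high-risk” abstracts as described above amounts to finding the abstracts on which the tools are not unanimous. A minimal sketch follows; the tool names and the `flag_disagreements` function are hypothetical.

```python
def flag_disagreements(decisions: dict[str, list[int]]) -> list[int]:
    """Return the indices of abstracts on which the tools did not all agree."""
    per_abstract = zip(*decisions.values())  # one tuple of votes per abstract
    return [index for index, votes in enumerate(per_abstract) if len(set(votes)) > 1]


# Hypothetical include (1) / exclude (0) decisions of three tools on five abstracts.
votes = {
    "tool_a": [1, 0, 1, 0, 1],
    "tool_b": [1, 0, 0, 0, 1],
    "tool_c": [1, 0, 1, 1, 1],
}
print(flag_disagreements(votes))  # abstracts to route to a human reviewer
```

Only the flagged abstracts would then need human review, while unanimous decisions could be accepted or spot-checked.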

Another exciting prospect is the integration of LLM tools with hybrid and active learning approaches. In this scenario, LLM tools could autonomously annotate abstracts in the training set, minimizing the need for human labeling. Subsequently, these labeled abstracts could be used to train custom NLP models, paving the way for a streamlined and efficient abstract screening process, resulting in significant time and resource savings. Further research is needed to understand how the “uncertainty” in LLM-based decisions, when human-curated labels are not used, affects the performance of the hybrid approach. We also note from the test examples that the performance of hybrid approaches varies widely from example to example and depends on the text embedding tools used. Extensive research is needed to perfect the hybrid approaches.

Another future research direction is to explore how to fine-tune different versions of LLM tools and how to derive collective decisions. One idea is that by using different tuning parameters, one could propose different versions of LLM tools, and if they perform similarly in terms of accuracy but give different decisions (e.g., not highly correlated), we expect that better collective decisions would be obtained, as observed in many ensemble learning experiences [ 37 ]. However, this could be costly and require a lot of exploration.

More importantly, it would be particularly valuable to prospectively test these LLM tools, or their integration with other approaches such as zero-shot and active learning methods, in real-world meta-analysis and systematic review projects. This would provide deeper insights into their practical utility and effectiveness, and a more comprehensive understanding of their impact in live research environments.

In summary, while LLM tools may not be able to fully replace human experts in abstract screening, their ability to transform the screening process is both undeniable and imminent. With continued advances in technology and ongoing refinement, these tools will play a critical role in the future of abstract screening, ushering in a new era of efficiency and effectiveness.

Availability of data and materials

The datasets analyzed during the current study are available in the SYNERGY repository, https://github.com/asreview/synergy-dataset . We have created a GitHub repository https://github.com/mikeli380/LLMAbstractScreening to host our code (R/Python), abstracts, and intermediate results so that interested readers can replicate our findings.

Higgins JPT, et al. Cochrane handbook for systematic reviews of interventions. 2019.

Moher D, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7). https://doi.org/10.1371/journal.pmed.1000097 .

Sackett DL, Rosenberg WMC, Gray JAM, Haynes RB,  Richardson WS. Evidence based medicine: what it is and what it isn’t. 1996. Clin Orthop Relat Res. 2007;455.  https://doi.org/10.1136/bmj.312.7023.71 .

Lavis JN, Posada FB, Haines PA, Osei E. Use of research to inform public policymaking. Lancet. 2004;364(9445). https://doi.org/10.1016/S0140-6736(04)17317-0 .

Bastian H, Glasziou P, Chalmers I. Seventy-five trials and eleven systematic reviews a day: how will we ever keep up? PLoS Med. 2010;7(9). https://doi.org/10.1371/journal.pmed.1000326 .

Polanin JR, Pigott TD, Espelage DL, Grotpeter JK. Best practice guidelines for abstract screening large-evidence systematic reviews and meta-analyses. Res Synth Methods. 2019;10(3). https://doi.org/10.1002/jrsm.1354 .

Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med. 2008;358(3). https://doi.org/10.1056/nejmsa065779 .

Edwards P, Clarke M, DiGuiseppi C, Pratap S, Roberts I, Wentz R. Identification of randomized controlled trials in systematic reviews: accuracy and reliability of screening records. Stat Med. 2002;21(11). https://doi.org/10.1002/sim.1190 .

Wang Z, Nayfeh T, Tetzlaff J, O’Blenis P, Murad MH. Error rates of human reviewers during abstract screening in systematic reviews. PLoS ONE. 2020;15(1). https://doi.org/10.1371/journal.pone.0227742 .

O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev. 2015;4(1). https://doi.org/10.1186/2046-4053-4-5 .

OpenAI. Introducing ChatGPT Plus. OpenAI Blog; 2023. https://openai.com/blog/chatgpt-plus . Accessed 8 Nov 2023.

Google DeepMind. Introducing PaLM 2. Google Blog; 2023.  https://blog.google/technology/ai/google-palm-2-ai-large-language-model/ . Accessed 8 Nov 2023.

Meta AI Research. Meta and Microsoft introduce the next generation of Llama. Facebook AI Blog; 2023. https://ai.meta.com/blog/llama-2/ . Accessed 8 Nov 2023.

Claude.ai. Claude AI API reference. 2024. https://docs.anthropic.com/en/api/getting-started . Accessed 4 Jun 2024.

Wallace BC, Small K, Brodley CE, Lau J, Trikalinos TA. Deploying an interactive machine learning system in an evidence-based practice center: Abstrackr. 2012. https://doi.org/10.1145/2110363.2110464 .

Alshami A, Elsayed M, Ali E, Eltoukhy AEE, Zayed T. Harnessing the power of ChatGPT for automating systematic review process: methodology, case study, limitations, and future directions. Systems. 2023;11(7). https://doi.org/10.3390/systems11070351 .

Qureshi R, Shaughnessy D, Gill KAR, Robinson KA, Li T, Agai E. Are ChatGPT and large language models ‘the answer’ to bringing us closer to systematic review automation?  Syst Rev. 2023;12(1).   https://doi.org/10.1186/s13643-023-02243-z .

Xian Y, Lampert CH, Schiele B, Akata Z. Zero-shot learning-a comprehensive evaluation of the good, the bad and the ugly. IEEE Trans Pattern Anal Mach Intell. 2019;41(9). https://doi.org/10.1109/TPAMI.2018.2857768 .

Moreno-Garcia CF, Jayne C, Elyan E, Aceves-Martins M. A novel application of machine learning and zero-shot classification methods for automated abstract screening in systematic reviews. Decis Anal J. 2023;6.  https://doi.org/10.1016/j.dajour.2023.100162 .

Ferdinands G, et al. Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records. Syst Rev. 2023;12(1). https://doi.org/10.1186/s13643-023-02257-7 .

Boetje J, van de Schoot R. The SAFE procedure: a practical stopping heuristic for active learning-based screening in systematic reviews and meta-analyses. Syst Rev. 2024;13(1). https://doi.org/10.1186/s13643-024-02502-7 .

Yang Y, Loog M. A benchmark and comparison of active learning for logistic regression. Pattern Recognit. 2018;83. https://doi.org/10.1016/j.patcog.2018.06.004 .

De Bruin RJ, Ma Y, Ferdinands G, Teijema J, Van de Schoot. SYNERGY - open machine learning dataset on study selection in systematic reviews. DataverseNL; 2023. https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/HE6NAQ .

Millard SJ, Weston-Green K, Newell KA. The Wistar-Kyoto rat model of endogenous depression: a tool for exploring treatment resistance with an urgent need to focus on sex differences. Prog Neuropsychopharmacol Biol Psychiatry. 2020;101. https://doi.org/10.1016/j.pnpbp.2020.109908 .

Zawia NH, Mattia CJ, Bondy SC. Differential effects of difluoromethylornithine on basal and induced activity of cerebral ornithine decarboxylase and mRNA. Neuropharmacology. 1991;30(4). https://doi.org/10.1016/0028-3908(91)90058-J .

Shea BJ, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358. https://doi.org/10.1136/bmj.j4008 .

Bannach-Brown A, Liao J, Wegener G, Macleod M. Understanding in vivo modelling of depression in non-human animals: a systematic review protocol. Evid Based Preclin Med. 2016;3(2). https://doi.org/10.1002/ebm2.24 .

Meijboom RW, Gardarsdottir H, Becker ML, ten Wolde S, Egberts TCG, Giezen TJ. Incidence of and reasons and determinants associated with retransitioning from biosimilar etanercept to originator etanercept. BioDrugs. 2021;35(6). https://doi.org/10.1007/s40259-021-00501-x .

Menon JML, Struijs F, Whaley P. The methodological rigour of systematic reviews in environmental health. Crit Rev Toxicol. 2022;52(3). https://doi.org/10.1080/10408444.2022.2082917 .

Collins GS, et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open. 2021;11(7). https://doi.org/10.1136/bmjopen-2020-048008 .

Joseph L, Gyorkos TW, Coupal L. Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard. Am J Epidemiol. 1995;141(3). https://doi.org/10.1093/oxfordjournals.aje.a117428 .

Hui SL, Walter SD. Estimating the error rates of diagnostic tests. Biometrics. 1980;36(1). https://doi.org/10.2307/2530508 .

Dendukuri N, Joseph L. Bayesian approaches to modeling the conditional dependence between multiple diagnostic tests. Biometrics. 2001;57(1). https://doi.org/10.1111/j.0006-341X.2001.00158.x .

Collins LM, Lanza ST. Latent class and latent transition analysis: with applications in the social, behavioral, and health sciences. 2010.


McCutcheon AL. Latent class analysis. Newbury Park: Sage; 1987.

Rillig MC, Ågerstrand M, Bi M, Gould KA, Sauerland U. Risks and benefits of large language models for the environment. Environm Sci Technol. 2023;57(9). https://doi.org/10.1021/acs.est.3c01106 .

Sagi O, Rokach L. Ensemble learning: a survey.  Wiley Interdiscip Rev Data Min Knowl Discov. 2018;8(4). https://doi.org/10.1002/widm.1249 .


Acknowledgements

The authors thank the reviewers and the Editor for their insightful comments and suggestions, which have greatly improved the quality of this work.

Not applicable.

Author information

Authors and affiliations.

Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA

Michael Li & Xianming Tan

Department of Mathematics and Statistics, University of North Carolina at Greensboro, Greensboro, NC, 27402, USA

Jianping Sun

Department of Biostatistics and Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA

Xianming Tan


Contributions

XT conceived the paper. ML and XT devised the study. ML, JS, and XT analyzed the datasets. ML, JS, and XT wrote the manuscript. All authors edited and approved the final manuscript.

Corresponding author

Correspondence to Xianming Tan .

Ethics declarations

Ethics approval and consent to participate, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1: Table S1. TRIPOD+AI checklist.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article.

Li, M., Sun, J. & Tan, X. Evaluating the effectiveness of large language models in abstract screening: a comparative analysis. Syst Rev 13 , 219 (2024). https://doi.org/10.1186/s13643-024-02609-x


Received : 20 December 2023

Accepted : 11 July 2024

Published : 21 August 2024

DOI : https://doi.org/10.1186/s13643-024-02609-x


  • Abstract screening
  • Automated text analysis
  • Google PaLM

Systematic Reviews

ISSN: 2046-4053


The role of emotions in academic performance of undergraduate medical students: a narrative review

  • Open access
  • Published: 23 August 2024
  • Volume 24 , article number  907 , ( 2024 )


  • Nora Alshareef 1 , 2 ,
  • Ian Fletcher 2 &
  • Sabir Giga 2  

This paper is devoted to a narrative review of the literature on emotions and academic performance in medicine. The review aims to examine the role emotions play in the academic performance of undergraduate medical students.

Eight electronic databases were used to search the literature from 2013 to 2023, including Academic Search Ultimate, British Education Index, CINAHL, Education Abstract, ERIC, Medline, APA Psych Articles and APA Psych Info. Using specific keywords and terms in the databases, 3,285,208 articles were found. After applying the predefined exclusion and inclusion criteria to include only medical students and academic performance as an outcome, 45 articles remained, and two reviewers assessed the quality of the retrieved literature; 17 articles were selected for the narrative synthesis.

The findings indicate that depression and anxiety are the most frequently reported variables in the reviewed literature, and they have both negative and positive impacts on the academic performance of medical students. The included literature also reported that a high number of medical students experienced test anxiety during their studies, which affected their academic performance. Positive emotions were associated with positive academic outcomes, and negative emotions with negative outcomes. However, feelings of shame did not have any effect on the academic performance of medical students.

The review suggests a significant relationship between emotions and academic performance among undergraduate medical students. While the evidence may not establish causation, it underscores the importance of considering emotional factors in understanding student performance. However, reliance on cross-sectional studies and self-reported data may introduce recall bias. Future research should concentrate on developing anxiety reduction strategies and enhancing mental well-being to improve academic performance.


Introduction

Studying medicine is a multi-dimensional process involving acquiring medical knowledge, clinical skills, and professional attitudes. Previous research has found that emotions play a significant role in this process [ 1 , 2 ]. Different types of emotions are important in an academic context, influencing performance on assessments and evaluations, reception of feedback, exam scores, and overall satisfaction with the learning experience [ 3 ]. In particular, medical students experience a wide range of emotions due to many emotionally challenging situations, such as experiencing a heavy academic workload, being in the highly competitive field of medicine, retaining a large amount of information, keeping track of a busy schedule, taking difficult exams, and dealing with a fear of failure [ 4 , 5 , 6 ]. Especially during their clinical years, medical students may experience anxiety when interacting with patients who are suffering, ill, or dying, and when working with other healthcare professionals. Therefore, it is necessary to understand the impact of emotions on medical students to improve their academic outcomes [ 7 ].

To distinguish the emotions frequently experienced by medical students, it is essential to define them. Depression is defined by enduring emotions of sadness, despair, and a diminished capacity for enjoyment or engagement in almost all activities [ 4 ]. Negative emotions encompass unpleasant feelings such as anger, fear, sadness, and anxiety, and they frequently cause distress [ 8 ]. Anxiety is a general term that refers to a state of heightened nervousness or worry, which can be triggered by various factors. Test anxiety, on the other hand, is a specific type of anxiety that arises in the context of taking exams or assessments. Test anxiety is characterised by physiological arousal, negative self-perception, and a fear of failure, which can significantly impair a student’s ability to perform well academically [ 9 , 10 ]. Shame is a self-conscious emotion that arises from the perception of having failed to meet personal or societal standards. It can lead to feelings of worthlessness and inadequacy, severely impacting a student’s motivation and academic performance [ 11 , 12 ]. In contrast, positive emotions indicate a state of enjoyable involvement with the surroundings, encompassing feelings of happiness, appreciation, satisfaction, and love [ 8 ].

Academic performance generally refers to the outcomes of a student’s learning activities, often measured through grades, scores, and other formal assessments. Academic achievement encompasses a broader range of accomplishments, including mastery of skills, attainment of knowledge, and the application of learning in practical contexts. While academic performance is often quantifiable, academic achievement includes qualitative aspects of a student’s educational journey [ 13 ].

According to the literature, 11–40% of medical students suffer from stress, depression, and anxiety due to the intensity of medical school, and these negative emotions impact their academic achievement [ 14 , 15 ]. Severe anxiety may impair memory function, decrease concentration, lead to a state of hypervigilance, and interfere with judgment and cognitive function, further affecting academic performance [ 16 ]. However, some studies have suggested that experiencing some level of anxiety has a positive effect and serves as motivation that can improve academic performance [ 16 , 17 ].

Despite the importance of medical students’ emotions and their relation to academic performance, few studies have been conducted in this area. Most of these studies have focused on the prevalence of specific emotions without correlating with medical students’ academic performance. Few systematic reviews have addressed the emotional challenges medical students face. However, there is a lack of comprehensive reviews that discuss the role of emotions and academic outcomes. Therefore, this review aims to fill this gap by exploring the relationship between emotions and the academic performance of medical students.

Aim of the study

This review aims to examine the role emotions play in the academic performance of undergraduate medical students.

A systematic literature search examined the role of emotions in medical students’ academic performance. The search adhered to the concepts of a systematic review, following the criteria of Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [ 18 ]. A narrative synthesis was then conducted to analyse the retrieved literature and synthesise the results. Combining a systematic search with a narrative synthesis provides broad coverage and the flexibility to explore and understand findings: the systematic search supports rigour and reduces bias, while the narrative synthesis allows flexible integration and interpretation. This balance improves the quality and utility of the review.

Eligibility criteria

Inclusion criteria.

The study’s scope was confined to January 2013 to December 2023, focusing exclusively on undergraduate medical students. The research encompassed articles originating within medical schools worldwide, accepting content from all countries. The criteria included only full-text articles in English published in peer-reviewed journals. Primary research was considered, embracing quantitative and mixed-method research. The selected studies had to explicitly reference academic performance, test results, or GPA as key outcomes to address the research question.

Exclusion criteria

The study excluded individuals beyond the undergraduate medical student demographic, such as students in other health fields and junior doctors. There was no imposed age limit for the student participants. The research specifically focused on articles within medical schools, excluding those from alternative settings. It solely considered full-text articles in English-language peer-reviewed journals. Letters or commentary articles were excluded, and the study did not limit itself to a particular type of research. Qualitative studies were excluded from the review because they did not have the quantitative measures required to answer the review’s aim. This review excluded articles on factors impacting academic performance, those analysing nursing students, and gender differences. The reasons and numbers for excluding articles are shown in Table  1 .

Information sources

Eight electronic databases were used to search the literature: Academic Search Ultimate, British Education Index, CINAHL, Education Abstract, ERIC, Medline, APA Psych Articles and APA Psych Info. The databases were chosen from several fields based on relevant topics, including education, academic evaluation and assessment, medical education, psychology, mental health, and medical research. Initially, with the help of a subject librarian, the researcher searched all of the above databases with specific keywords and terms, which were divided into three concepts: emotions, academic performance and medical students. Google Scholar, EBSCOhost, and the reference lists of the retrieved articles were also used to identify other relevant articles.

Search strategy

This review started with a search of the databases. Eight electronic databases were used to search the literature from 2013 to 2023. Specific keywords and terms were used to search the databases; the emotion-related terms alone retrieved 3,285,208 articles. After the three concepts were combined, the date limit was applied, and duplicates, letters and commentary were removed, 1,637 articles remained. Exclusion and inclusion criteria were then applied, resulting in 45 articles. After two assessors assessed the literature, 17 articles were selected for the review. The search terms are as follows:

Keywords: Emotion, anxiety, stress, empathy, test anxiety, exam anxiety, test stress, exam stress, depression, emotional regulation, test scores, academic performance, grades, GPA, academic achievement, academic success, test result, assessment, undergraduate medical students and undergraduate medical education.

Emotions: TI (Emotion* OR Anxiety OR Stress OR empathy) OR emotion* OR (test anxiety or exam anxiety or test stress or exam stress) OR (depression) OR AB ((Emotion* OR Anxiety OR Stress OR empathy) OR emotion* OR (test anxiety or exam anxiety or test stress or exam stress)) (MH “Emotions”) OR (MH “Emotional Regulation”) DE “EMOTIONS”.

Academic performance: TI (test scores or academic performance or grades or GPA) OR (academic achievement or academic performance or academic success) OR (test result* OR assessment*) OR AB (test scores or academic performance or grades or GPA) OR (academic achievement or academic performance or academic success) OR test result* OR assessment*.

Medical Students: TI (undergraduate medical students OR undergraduate medical education) OR AB (undergraduate medical students OR undergraduate medical education), TI “medical students” OR AB “medical students” DE “Medical Students”.
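The three concept blocks above follow a common pattern: synonyms within a concept are joined with OR, and the concepts are then combined with AND. The following is a minimal Python sketch of that pattern, not the authors' actual search script. The function names `or_block` and `build_query` are hypothetical, the term lists are a shortened selection from the keywords above, and the `TI`/`AB` field tags are assumed to behave as in the strings reported here; real database interfaces differ in syntax.

```python
def or_block(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts, fields=("TI", "AB")):
    """Combine one OR-block per concept with AND, searching each field tag."""
    parts = []
    for terms in concepts.values():
        block = or_block(terms)
        # Search the same block in the title and in the abstract.
        per_field = " OR ".join(f"{field} {block}" for field in fields)
        parts.append("(" + per_field + ")")
    return " AND ".join(parts)


# Shortened, illustrative term lists taken from the keywords above.
concepts = {
    "emotions": ["emotion*", "anxiety", "stress", "empathy", "test anxiety", "depression"],
    "academic performance": ["academic performance", "grades", "GPA", "test scores"],
    "medical students": ["medical students", "undergraduate medical education"],
}

query = build_query(concepts)
print(query)
```

Keeping each concept as a separate list makes it easier to add synonyms or translate the strategy for another database without disturbing the AND/OR structure.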

Selection process

This literature review attempts to gather only peer-reviewed journal articles published in English on undergraduate medical students’ negative and positive emotions and academic performance from January 2013 to December 2023. Their emotions, including depression, anxiety, physiological distress, shame, happiness, joy, and all emotions related to academic performance, were examined in quantitative research and mixed methods.

Moreover, to focus the search, the author specified and defined each keyword using advanced search tools, such as subject headings in the case of the Medline database. The author used ‘MeSH 2023’ as the subject heading, then entered the term ‘Emotion’ and chose all the relevant meanings. This method was applied to most of the keywords.

Studies were included based on predefined criteria related to study design, participants, exposure, outcomes, and study types. Two independent reviewers screened each record and each retrieved report. In the screening process, reviewers independently assessed each article against the inclusion criteria, and discrepancies were resolved through consensus during regular team meetings. In cases of persistent disagreement, a third reviewer was consulted. The EndNote library program was used for the initial screening phase: it identified duplicates, facilitated the independent screening of titles and abstracts, and helped to retrieve the full-text articles. The reasons for excluding the articles are presented in Table  1 .
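The dual-screening step described above can be sketched in a few lines of Python. This is an illustrative sketch only: the record identifiers, the decision labels and the function name `reconcile` are hypothetical, and the review itself used EndNote and team meetings rather than a script. The sketch simply separates records on which both reviewers agree from those that need consensus discussion or a third reviewer.

```python
def reconcile(reviewer_a, reviewer_b):
    """Split screening decisions into agreed records and conflicts.

    Both arguments map a record ID to "include" or "exclude".
    Conflicts are the records to discuss or pass to a third reviewer.
    """
    agreed = {}
    conflicts = []
    for record_id, decision_a in reviewer_a.items():
        decision_b = reviewer_b[record_id]
        if decision_a == decision_b:
            agreed[record_id] = decision_a
        else:
            conflicts.append(record_id)
    return agreed, conflicts


# Hypothetical decisions from two independent reviewers.
reviewer_a = {"rec1": "include", "rec2": "exclude", "rec3": "include"}
reviewer_b = {"rec1": "include", "rec2": "include", "rec3": "include"}

agreed, conflicts = reconcile(reviewer_a, reviewer_b)
print("Agreed:", agreed)
print("Needs consensus or third reviewer:", conflicts)
```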

Data collection process

Two independent reviewers extracted data from the eligible studies, with any discrepancies resolved through discussion and consensus. If the two primary reviewers could not agree, a third reviewer served as an arbitrator. For each included study, the following information was extracted and recorded in a standardised database: first author name, publication year, study design, sample characteristics, details of the emotions examined, outcome measures, and results.

Academic performance as an outcome for medical students was defined to include the following: Exam scores (e.g., midterm, final exams), Clinical assessments (e.g., practical exams, clinical rotations), Overall grade point average (GPA) or any other relevant indicators of academic achievement.

Data were sought for all outcomes, including all measures, time points, and analyses within each outcome domain. In cases where studies reported multiple measures or time points, all relevant data were extracted to provide a comprehensive overview of academic performance. If a study reported outcomes beyond the predefined domains, inclusion criteria were established to determine whether these additional outcomes would be included in the review. This involved assessing relevance to the primary research question and alignment with the predefined outcome domains.

Quality assessment

The quality and risk of bias in included studies were assessed using the National Institutes of Health (NIH) critical appraisal tool. The tool evaluates studies based on the following domains: selection bias, performance bias, detection bias, attrition bias, reporting bias, and other biases. Two independent reviewers assessed the risk of bias in each included study. Discrepancies were resolved through discussion and consensus, and in cases of persistent disagreement, a third reviewer was consulted.

To determine the validity of eligible articles, all the included articles were critically appraised, and all reviewers assessed bias. The validity and reliability of the results were assessed using objective measurement. Each article was scored out of 14, with 14 indicating high-quality research and 1 indicating low-quality research. High-quality research, according to the NIH (2013), includes a clear and focused research question, defines the study population, features a high participation rate, mentions inclusion and exclusion criteria, uses clear and specific measurements, reports results in detail, lists the confounding factors and lists the implications for the local community. Therefore, an article was scored 14 if it met all criteria of the critical appraisal tool. Based on scoring, each study was classified into one of three quality categories: good, fair or poor. The findings of poorly rated articles were considered unreliable, and these articles were not considered further; this applied to two articles [ 16 , 19 ]. Seventeen articles were chosen after critical appraisal using the NIH appraisal tool, as shown in Table  2 .

Effect measures

For each outcome examined in the included studies, various effect measures were utilised to quantify the relationship between emotions and academic performance among undergraduate medical students. The effect measures commonly reported across the studies included prevalence rates, correlation coefficients, and mean differences. For studies that did not report an effect size, the reviewer calculated it. The choice of effect measure depended on the nature of the outcome variable and the statistical analysis conducted in each study. These measures were used to assess the strength and direction of the association between emotional factors and academic performance.

The synthesis method

The findings of individual studies were summarised to highlight crucial characteristics. Due to the predicted heterogeneity, the synthesis involved pooling effect estimates and using a narrative method. A narrative synthesis approach was employed to assess and interpret the findings from the included studies qualitatively. It involved a qualitative examination of the content of each study, focusing on identifying common themes, and was used to categorise and interpret the data, allowing for a nuanced understanding of the findings. Themes related to emotions were identified and extracted for synthesis. Control-value theory [ 20 ] was used as an overarching theory, providing a qualitative synthesis of the evidence and contributing to a deeper understanding of the research question. Where the retrieved articles included populations other than medical students, such as dental students or non-medical students, the synthesis distinguished between them and summarised the findings for the medical students only, highlighting any differences or similarities.

The Control-Value Theory, formulated by Pekrun (2006), is a conceptual framework that explains the relationship between emotions and academic achievement through two fundamental appraisals: control and value. Control pertains to a learner’s perceived ability to influence their learning activities and the results they achieve. Value relates to the significance a student attaches to these activities and results. The theory suggests that students are prone to experiencing positive emotions, such as satisfaction and pride, when they have a strong sense of control over their academic tasks and attach importance to them. On the other hand, they are prone to experiencing negative emotions, such as fear and embarrassment, when they perceive a lack of control over, or value in, these tasks. These emotions subsequently affect students’ motivation, learning strategies, and, eventually, their academic achievement. Control-value theory is relevant to reviewing medical students’ emotions and their influence on academic performance for several reasons. It offers a comprehensive framework for understanding the intricate connection between emotions and academic achievement, and it considers both positive and negative emotions, providing a broad view of how emotions might influence learning and performance. The notions of control and value are particularly significant for medical students because of their frequent exposure to high-stakes tests and difficult courses. Gaining insight into students’ perceived control over academic tasks and the importance they attach to their medical education might aid in identifying emotional triggers and devising interventions. Multiple studies have confirmed the theory’s assertions, showing the critical influence of control and value appraisals on students’ emotional experiences and academic achievements [ 21 , 22 ].

Data extraction

For this step, a data extraction sheet was developed using the data extraction template provided by the Cochrane Handbook. To ensure the review is evidence-based and bias-free, the Cochrane Handbook strongly suggests that more than one reviewer review the data. Therefore, the main researcher extracted the data from the included studies, and another reviewer checked the included, excluded and extracted data. Any disagreements were resolved through discussion with a third reviewer. The data extraction table (Table  2 ) identified all study features, including the author’s name, the year of publication, the method used, the aim of the study, the number and description of participants, data collection tools, and study findings.

Finalisation of references and study characteristics

PRISMA sheet and the summary of final studies that have been used for the review.

When the keywords and search terms related to emotions, as listed above, were searched in the eight databases, 3,285,208 articles were retrieved. After using advanced search and subject headings, the number of articles increased to 3,352,371. Similarly, searching for the second concept, ‘academic performance’, using all the advanced search tools yielded 8,119,908 articles. Searching for the third concept, ‘medical students’, yielded 145,757 articles. All terms were searched in article titles and abstracts. The author then combined all search terms using ‘AND’ and applied the time limit from 2013 to 2023, which narrowed the search to 2,570 articles. After duplicates, letters and commentary were excluded, the number was reduced to 1,637 articles. After reading the titles and abstracts to determine relevance to the topic and applying the exclusion and inclusion criteria mentioned above, 45 articles remained; after two reviewers assessed the quality of the retrieved literature, 17 articles were selected for the review. Additionally, one article by Ansari et al. (2018) was initially considered for the review; it met most inclusion and exclusion criteria, except that its outcome measure was cognitive function rather than academic performance. Therefore, it was excluded from the review. Figure  1 shows the PRISMA flow diagram (2020) of studies identified from the databases.
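The counts reported in this paragraph can be checked with a short tally of how many records remained, and how many were removed, at each stage. The sketch below is illustrative only: the stage labels are paraphrased from the text, the function name `flow_summary` is hypothetical, and only the four counts after the concepts were combined are used.

```python
# Record counts taken from the paragraph above, in the order reported.
stages = [
    ("Combined search with 2013-2023 limit", 2570),
    ("After removing duplicates, letters and commentary", 1637),
    ("After title/abstract screening against the criteria", 45),
    ("After quality assessment by two reviewers", 17),
]


def flow_summary(stages):
    """Return (label, remaining, removed) for each stage after the first."""
    rows = []
    for (_, before), (label, after) in zip(stages, stages[1:]):
        rows.append((label, after, before - after))
    return rows


summary = flow_summary(stages)
for label, remaining, removed in summary:
    print(f"{label}: {remaining} remaining ({removed} removed)")
```

A tally like this makes it easy to confirm that the numbers in a flow diagram add up before it is finalised.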

Fig. 1 PRISMA flow diagram (2020)

Study characteristics

Table  2 , summarising the characteristics of the included studies, is presented below.

Findings of the study

Country of the study.

Many of the studies were conducted in developing countries. The largest number were conducted in Europe ( n  = 4), followed by Pakistan ( n  = 2), Saudi Arabia ( n  = 2), and the United States ( n  = 2). The rest of the studies were conducted in South America ( n  = 1), Morocco ( n  = 1), Brazil ( n  = 1), Australia ( n  = 1), Iran ( n  = 1), South Korea ( n  = 1) and Bosnia and Herzegovina ( n  = 1). No included studies were conducted in the United Kingdom.

Study design

Regarding study design, most of the included articles used a quantitative methodology, including 12 cross-sectional studies. There were two randomised controlled trials, one descriptive correlation study, one cohort study, and only one mixed-method study.

Population and study setting

Regarding population and setting, most of the studies focused on all medical students studying in a medical school setting, from first-year medical students to those in their final year. One study compared medical students with non-medical students; another combined medical students with dental students.

The study aims varied across the included studies. Seven studies examined the prevalence of depression and anxiety among medical students and their relation to academic performance. Four studies examined the relationship between test anxiety and academic performance in medical education. Four studies examined the relationship between medical students’ emotions and academic achievements. One study explored the influence of shame on medical students’ learning.

Study quality

The studies were assessed for quality using tools created by the NIH (2013) and then divided into good, fair, and poor based on these results. Nine studies had a high-quality methodology, seven achieved fair ratings, and only three achieved poor ratings. The studies that were assigned the poor rating were mainly cross-sectional studies, and the areas of weakness were due to the study design, low response rate, inadequate reporting of the methodology and statistics, invalid tools, and unclear research goals.

Outcome measures

Most of the outcome measures were heterogeneous, self-administered questionnaires; one study used focus groups and observation ward assessment [ 23 ]. All the studies used the medical students’ academic grades.

Results of the study

The prevalence rate of psychological distress in the retrieved articles.

Depression and anxiety are the most common forms of psychological distress examined in relation to academic outcomes among medical students. Studies consistently show concerningly high rates, with prevalence estimates ranging from 7.3% to 66.4% for anxiety and from 3.7% to 69% for depression. These findings indicate that psychological distress levels characterised as moderate to high, based on common cut-off thresholds, have a clear detrimental impact on academic achievement [ 16 , 24 , 25 , 26 ].

The studies collectively examine the impact of psychological factors on academic performance in medical education contexts, using a range of effect sizes to quantify their findings. Aboalshamat et al. (2015) identified a small effect size ( η 2 = 0.018) for depression’s impact on academic performance, suggesting a modest influence. Mihailescu (2016) found significant negative correlations between levels of depression and anxiety (rho = -0.14 and rho = -0.19) and academic performance and GPA among medical students. Burr and Beck Dallaghan (2019) reported that professional efficacy explained 31.3% of the variance in academic performance, indicating a significant effect size. However, Del-Ben et al. (2013) did not find a significant impact of affective changes on academic achievement, suggesting trivial effect sizes for these factors.
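Because these studies report different effect measures, it can help to put them on a roughly comparable footing. The sketch below is a hedged illustration using standard conversions, not the calculation the reviewers performed: it squares a correlation coefficient to approximate the proportion of shared variance (treating Spearman's rho like a Pearson coefficient is itself an approximation) and converts eta squared to Cohen's f. The function names are hypothetical; the input values are those quoted in the paragraph above.

```python
import math


def variance_explained(r):
    """Approximate share of variance accounted for by a correlation r."""
    return r ** 2


def cohens_f(eta_squared):
    """Convert eta squared to Cohen's f using f = sqrt(eta2 / (1 - eta2))."""
    return math.sqrt(eta_squared / (1.0 - eta_squared))


# Values quoted in the text above.
rhos = [-0.14, -0.19]
eta_squared = 0.018

for rho in rhos:
    print(f"rho = {rho}: about {variance_explained(rho):.1%} of variance")

print(f"eta squared = {eta_squared}: Cohen's f is about {cohens_f(eta_squared):.3f}")
```

Seen this way, the reported correlations each account for only a few percent of the variance in academic performance, which is consistent with the small effect described for depression.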

In conclusion, anxiety and depression, both indicators of psychological distress, are common among medical students. There is a link between distress and poor academic performance, implying that this relationship merits consideration. Table  3 below shows the specific values for depression and anxiety in the retrieved articles.

Test anxiety

In this review, four studies examined the relationship between test anxiety and academic performance in medical education [ 27 , 28 , 29 , 30 ]. The studies found high rates of test anxiety among medical students, ranging from 52% [ 27 ] to as high as 81.1% [ 29 ]. Final-year students tend to experience the highest test anxiety [ 29 ].

Test anxiety has a significant negative correlation with academic performance measures and grade point average (GPA) [ 27 , 28 , 29 ]. Green et al. (2016) found that test anxiety was moderately negatively correlated with USMLE score ( r = −0.24, p  = 0.00); high test anxiety was associated with low USMLE scores in the control group, further suggesting that anxiety can adversely affect performance. The finding that a test-taking strategy course reduced anxiety without improving test scores highlights the complex nature of anxiety’s impact on performance.

Nazir et al. (2021) found that high-achieving female medical students reported significantly lower test anxiety than those with low academic grades, with an odds ratio of 1.47, indicating that students with higher test anxiety are more likely to have lower academic grades. Kim’s (2016) research shows moderate correlations between test anxiety and negative achievement emotions such as anxiety and boredom, but, interestingly, this anxiety does not significantly affect practical exam scores (OSCE) or GPAs. However, one study found that examination stress enhanced academic performance with a large effect size (W = 0.78), with stress levels at 47.4% in its sample, suggesting that a certain level of stress before exams may be beneficial [ 30 ].
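
An odds ratio such as the 1.47 quoted above is easier to read once it is turned into probabilities. The sketch below shows the arithmetic under a stated assumption: the 20% baseline probability of low grades is purely hypothetical and is not taken from Nazir et al. (2021), and the function name is illustrative.

```python
def probability_after_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline_probability must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline: 20% of students with lower test anxiety have low grades.
hypothetical_baseline = 0.20
with_higher_anxiety = probability_after_odds_ratio(hypothetical_baseline, 1.47)
print(f"{hypothetical_baseline:.0%} -> {with_higher_anxiety:.1%}")
```

Under this assumed baseline, an odds ratio of 1.47 moves the probability of low grades from 20% to roughly 27%. The size of the shift depends on the baseline chosen, which is why an odds ratio on its own does not convey an absolute risk.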

Three papers explored shame’s effect on medical students’ academic achievement [ 24 , 31 , 32 ]. Hayat et al. (2018) reported that academic feelings, like shame, significantly depend on the academic year. Shame was found to have a slight but significant negative correlation with the academic achievement of learners ( r = -0.15). One study found that some medical students felt shame during simulation-based education examinations because they had made incorrect decisions, which decreased their self-esteem and motivation to learn. However, others who felt shame were motivated to study harder to avoid repeating the same mistakes [ 23 ].

Hautz et al. (2017) examined how shame affects medical students’ learning using a randomised controlled trial in which researchers divided the students into two groups: one group performed a breast examination on mannequins and the other group on actual patients. The results showed that students who performed the clinical examination on actual patients experienced significantly higher levels of shame but performed better in examinations than the mannequin group. In the final assessments on standardised patients, both groups performed equally well. Therefore, shame decreased with more clinical practice, but it was not significantly related to learning or performance. Similarly, Burr and Dallaghan (2019) reported that the shame level of medical students was 40% but had no association with academic performance.

Academic performance, emotions and medical students

Three articles discussed medical students’ emotions and academic performance [ 23 , 24 , 32 ]. Burr and Dallaghan (2019) examined the relationship between academic success and emotions such as pride, hope, worry, and shame in medical students. Their study emphasises the links between academic accomplishment and professional efficacy, as well as hope, pride, worry, and shame. Professional efficacy was the most significant factor linked to academic performance, explaining 31.3% of the variance. The influence of emotions on understanding, data processing, memory recall, and cognitive load is emphasised throughout the research. To improve academic achievement, efforts should be made to increase student self-efficacy.

Hayat et al. (2018) found that positive emotions and intrinsic motivation are strongly connected with academic achievement, and that emotions fluctuate between educational levels but not between genders. The correlations between negative emotions and academic achievement, ranging from -0.15 to -0.24 for different emotions, suggest small but statistically significant adverse effects.

Behrens et al.’s (2019) mixed-method study found that students felt various emotions during the simulation, mainly positive emotions and moderate anxiety. However, no significant relationships were found between positive emotions and the students’ performance during the simulation [ 23 ].

This review aims to investigate the role of emotions in the academic performance of undergraduate medical students. Meta-analysis could not be used because of the heterogeneity of the data collection tools and the different research designs [ 33 ]. Therefore, narrative synthesis was adopted in this paper. The studies are grouped into four categories as follows: (1) The effect of depression and anxiety on academic performance, (2) Test anxiety and academic achievement, (3) Shame and academic performance, and (4) Academic performance, emotions and medical students. The control-value theory [ 20 ] will be used to interpret the findings.

The effect of depression and anxiety on academic performance

According to the retrieved research, depression and anxiety can have both a negative and a positive impact on the academic performance of medical students. Severe anxiety may impair memory function, decrease concentration, lead to a state of hypervigilance, interfere with judgment and cognitive function, and further affect academic performance [ 4 ]. Most of the good-quality retrieved articles found that anxiety and depression were associated with low academic performance [ 16 , 24 , 25 , 26 ]. Moreira (2018) and Mihailescu (2016) found that higher depression levels were associated with more failed courses and a lower GPA. However, they did not find any association between anxiety level and academic performance.

By contrast, some studies have suggested that experiencing some level of anxiety reinforces students’ motivation to improve their academic performance [ 16 , 34 ]. Zalihic et al. (2017) investigated anxiety sensitivity in relation to academic success and noticed a positive relationship between anxiety level and high academic scores; they explained this by noting that when medical students feel anxious, they tend to prepare and study more, and they desire to achieve better scores and fulfil social expectations. Similarly, another study found that anxiety has a negative impact on academic performance when excessive and a positive effect when manageable, in which case it encourages medical students and motivates them to achieve higher scores [ 35 ].

In the broader literature, research findings on the impact of anxiety on academic performance are contradictory. While some studies suggest that having some level of anxiety can boost students’ motivation to improve their academic performance, other research has shown that anxiety has a negative impact on their academic success [ 36 , 37 ]. Attitudes towards education and anxiety also differ widely across cultures. High academic pressure and societal expectations might worsen anxiety in many East Asian societies. Education is highly valued in these societies, frequently leading to significant academic stress. This pressure encompasses attaining high academic marks and outperforming others in competitive examinations. The academic demands exerted on students can result in heightened levels of anxiety. The apprehension of not meeting expectations can lead to considerable psychological distress and anxiety, which can affect students’ physical and mental health and academic achievement [ 38 , 39 ].

Test anxiety and academic achievement

The majority of the studies reviewed confirm that test anxiety negatively affects academic performance [ 27 , 28 , 29 ]. Several studies have found a significant correlation between test anxiety and academic achievement, indicating that higher levels of test anxiety are associated with lower exam scores and lower academic performance [ 40 , 41 ]. For example, Green et al.’s (2016) RCT found that test anxiety has a moderate, significant negative correlation with the USMLE score. They found that medical students who took the test-taking strategy course had lower levels of test anxiety than the control group, and their test anxiety scores after the exam had improved from the baseline. Although their test anxiety improved after taking the course, there was no significant difference in the exam scores between students who had and had not taken the course. Therefore, the intervention was not effective in improving exam scores. According to the control-value theory, this intervention could be improved by designing an emotionally effective learning environment, using a straightforward instructional design, fostering self-regulation of negative emotions, and teaching students emotion-oriented regulation [ 22 ].

Additionally, according to this theory, students who perceive exams as difficult are more likely to experience test anxiety because test anxiety results from a student’s negative appraisal of the task and outcome values, leading to a reduction in their performance. This aligns with Kim’s (2016) study, which found that students who believed that the OSCE was a problematic exam experienced test anxiety more than other students [ 9 , 22 , 42 ].

In the wider literature, a meta-analytic review by von der Embse (2018) found a medium, significant negative correlation ( r = -0.24) between test anxiety and test performance in undergraduate educational settings [ 43 ]. They also found a small, significant negative correlation ( r = -0.17) between test anxiety and GPA. This indicates that higher levels of test anxiety are associated with lower test performance. Moreover, Song et al.’s (2021) experimental study examined the effects of test anxiety on working memory capacity and found that test anxiety negatively correlated with academic performance [ 44 ]. The evidence from Song’s study therefore suggests a small but significant effect of anxiety on working memory capacity. However, another cross-sectional study revealed that test anxiety in medical students had no significant effect on exam performance [ 45 ]. The complexities of this relationship necessitate additional investigation.

Since the retrieved articles are from different countries, it is critical to recognise the possible influence of cultural differences on the impact of test anxiety. Cultural factors such as different educational systems, assessment tools and societal expectations may lead to variances in the experience and expression of test anxiety across diverse communities [ 46 , 47 ]. Culture has a substantial impact on how test anxiety is expressed and evaluated. Research suggests that the degree and manifestations of test anxiety differ among cultural settings, emphasising the importance of using culturally validated methods to evaluate test anxiety accurately. A study conducted by Lowe (2019) with Canadian and U.S. college students demonstrated cultural variations in the factors contributing to test anxiety. Canadian students exhibited elevated levels of physiological hyperarousal, but U.S. students had more pronounced cognitive interference. These variations indicate that the cultural environment influences how students perceive and respond to test anxiety, resulting in differing effects on their academic performance in different cultures. Furthermore, scholars highlight the importance of developing rigorous instruments for assessing test anxiety that are comparable across diverse cultural cohorts, so that interpretations of test scores are reliable and can be compared across populations. Hence, it is imperative to understand and address cultural disparities in order to create effective interventions and support for students who encounter test anxiety in diverse cultural environments. Further studies are therefore needed to examine the level of test anxiety in relation to cultural context.

Shame and academic performance

The review examined three studies that discuss the impact of feelings of shame on academic performance [ 23 , 24 , 48 ]. Generally, shame is considered a negative emotion which involves self-reflection and self-evaluation, and it leads to rumination and self-condemnation [ 49 ]. Intimate examinations conducted by medical students can induce feelings of shame, affecting their ability to communicate with patients and their clinical decisions. Shame can increase the avoidance of intimate physical examinations and also encourage clinical practice [ 23 , 24 , 48 ].

One study found that some medical students felt shame during simulation-based education examinations because they had made incorrect decisions, which decreased their self-esteem and motivation to learn. However, others who felt shame were motivated to study harder to avoid repeating the same mistakes [ 23 ]. Shame decreased with more clinical practice, but shame did not affect their learning or performance [ 48 ]. The literature on how shame affects medical students’ learning is inconclusive [ 31 ].

In the broader literature, shame is considered maladaptive, leading to dysfunctional behaviour, encouraging withdrawal and avoidance of events and inhibiting social interaction. However, few studies have been conducted on shame in the medical field. Therefore, more research is needed to investigate the role of shame in medical students’ academic performance [ 49 ]. The literature offers several approaches to tackling the problem of shame in medical education. It is necessary to establish nurturing learning settings that encourage students to openly discuss their problems and mistakes without the worry of facing severe criticism. This can be accomplished by encouraging medical students to participate in reflective practice, facilitating the processing of their emotions, and enabling them to derive valuable insights from their experiences, all while avoiding excessive self-blame [ 50 ]. Offering robust mentorship and support mechanisms can assist students in effectively managing the difficulties associated with intimate examinations. Teaching staff can model appropriate behaviours and provide valuable feedback and effective mentoring [ 51 ]. Training and workshops that specifically target communication skills and the handling of sensitive situations can equip students to handle intimate examinations, reducing the chances of them avoiding such examinations due to feelings of shame [ 52 ].

The literature review focused on three studies that examined the relationship between emotions and the academic achievements of medical students [ 23 , 24 , 32 ].

Behrens et al.’s (2019) mixed-method study on the achievement emotions of medical students during simulations found that placing students in challenging clinical cases that they can handle raises positive emotions. Students perceived these challenges as a positive drive for learning, and mild anxiety was considered beneficial. However, the study also found non-significant correlations between emotions and performance during the simulation, indicating a complex relationship between emotions and academic performance. The results revealed that feelings of frustration were perceived to reduce students’ interest and motivation for studying, hamper their decision-making process, and negatively affect their self-esteem, which is consistent with the academic achievement emotions literature, where negative emotions are associated with poor intrinsic motivation and reduced ability to learn [ 3 ].

The study also emphasises that mild anxiety can have positive effects, a point corroborated by Gregor (2005), who posits that moderate degrees of anxiety can improve performance. The author suggests that an ideal state of arousal (which may be experienced as anxiety) enhances performance. Mild anxiety is commonly seen as a type of psychological stimulation that readies the body for upcoming challenges, frequently referred to as a “fight or flight” response. Within the realm of academic performance, this state of heightened arousal can enhance concentration and optimise cognitive functions such as memory, problem-solving skills, and overall performance. However, once the ideal point is surpassed, any additional increase in arousal can result in a decline in performance [ 53 ]. This is further supported by Cassady and Johnson (2002), who found that a specific level of anxiety can motivate students to engage in more comprehensive preparation, hence enhancing their performance.

The reviewed research reveals a positive correlation between positive emotions and academic performance and a negative correlation between negative emotions and academic performance. These findings align with the control–value theory [ 8 , 22 ], which suggests that positive emotions facilitate learning through mediating factors, including cognitive learning strategies such as strategic thinking, critical thinking and problem-solving and metacognitive learning strategies such as monitoring, regulating, and planning students’ intrinsic and extrinsic motivation. Additionally, several studies found that extrinsic motivation from the educational environment and the application of cognitive and emotional strategies improve students’ ability to learn and, consequently, their academic performance [ 23 , 24 , 32 ]. By contrast, negative emotions negatively affect academic performance. This is because negative emotions reduce students’ motivation, concentration, and ability to process information [ 23 , 24 , 32 ].

Limitations of the study

This review aims to thoroughly investigate the relationship between emotions and academic performance in undergraduate medical students, but it has inherent limitations. Overall, the methodological quality of the retrieved studies is primarily good and fair. Poor-quality research was excluded from the synthesis. The good-quality papers demonstrated strengths in sampling techniques, data collection, analysis and reporting. However, most of the retrieved articles used cross-sectional designs, which cannot establish causal relationships. Furthermore, given the reliance on self-reported data, there were concerns about potential recall bias. These methodological difficulties were noted in most of the examined research. When contemplating the implications for practice and future study, the impact of these limitations on the validity of the data should be acknowledged.

The review process also has limitations. The inclusion criteria restricted the review to articles published from January 2013 to December 2023, potentially overlooking relevant research conducted outside this timeframe. Additionally, the exclusive focus on undergraduate medical students may constrain the applicability of findings to other health fields or educational levels.

Moreover, excluding non-English articles and those not published in peer-reviewed journals introduces potential language and publication biases. Reliance on electronic databases and specific keywords may inadvertently omit studies using different terms or indexing. While the search strategy was meticulous, it might not cover every relevant study due to variations in indexing and database coverage. However, the involvement of two assessors in study screening, selection, data extraction, and quality assessment improved the robustness of the review and helped ensure that relevant research was included.

In conclusion, these limitations highlight the need for careful interpretation of the study’s findings and stress the importance of future research addressing these constraints to offer a more comprehensive understanding of the nuanced relationship between emotions and academic performance in undergraduate medical education.

Conclusion and future research

The review exposes the widespread prevalence of depression, anxiety and test anxiety within the medical student population. The impact on academic performance is intricate, with evidence of both adverse and favourable relationships. Addressing the mental health challenges of medical students necessitates tailored interventions for enhancing mental well-being in medical education. Furthermore, it is crucial to create practical strategies that consider the complex elements of overcoming test anxiety. Future research should prioritise the advancement of anxiety reduction strategies to enhance academic performance, focusing on the control-value theory’s emphasis on creating an emotionally supportive learning environment. Additionally, test anxiety is very common among medical students, but the literature has not conclusively determined its actual effect on academic performance. Therefore, there is a clear need for a study that examines the relationship between test anxiety and academic performance. Moreover, the retrieved literature did not provide effective solutions for managing test anxiety. This gap highlights the need for practical solutions informed by Pekrun’s Control-Value Theory. Ideally, a longitudinal study measuring test anxiety and exam scores over time would be the most appropriate approach. It is also necessary to explore cultural differences to develop more effective solutions and support systems tailored to specific cultural contexts.

The impact of shame on academic performance in medical students was inconclusive. Shame is a negative emotion that has an intricate influence on learning outcomes. The inadequacy of current literature emphasises the imperative for additional research to unravel the nuanced role of shame in the academic journeys of medical students.

Overall, emotions play a crucial role in shaping students’ academic performance, and research has attempted to find solutions to improve medical students’ learning experiences; thus, it is recommended that medical schools revise their curricula and consider using simulation-based learning in their instructional designs to enhance learning and improve students’ emotions. Studies have also suggested using academic coaching to help students achieve their goals, change their learning styles, and apply self-testing and simple rehearsal of the material. Moreover, the studies recommended improving medical students’ critical thinking and autonomy and changing teaching styles to support students better.

Data availability

All included articles are mentioned in the manuscript. The quality assessment of the included articles is located in supplementary materials file no. 1.

References

Weurlander M, Lonn A, Seeberger A, Hult H, Thornberg R, Wernerson A. Emotional challenges of medical students generate feelings of uncertainty. Med Educ. 2019;53(10):1037–48.

Boekaerts M, Pekrun R. Emotions and emotion regulation in academic settings. Handbook of educational psychology: Routledge; 2015. pp. 90–104.

Camacho-Morles J, Slemp GR, Pekrun R, Loderer K, Hou H, Oades LG. Activity achievement emotions and academic performance: a meta-analysis. Educational Psychol Rev. 2021;33(3):1051–95.

Aboalshamat K, Hou X-Y, Strodl E. Psychological well-being status among medical and dental students in Makkah, Saudi Arabia: a cross-sectional study. Med Teach. 2015;37(Suppl 1):S75–81.

Mirghni HO, Ahmed Elnour MA. The perceived stress and approach to learning effects on academic performance among Sudanese medical students. Electron Physician. 2017;9(4):4072–6.

Baessler F, Zafar A, Schweizer S, Ciprianidis A, Sander A, Preussler S, et al. Are we preparing future doctors to deal with emotionally challenging situations? Analysis of a medical curriculum. Patient Educ Couns. 2019;102(7):1304–12.

Rowe AD, Fitness J. Understanding the role of negative emotions in Adult Learning and Achievement: a Social Functional Perspective. Behav Sci (Basel). 2018;8(2).

Pekrun R, Frenzel AC, Goetz T, Perry RP. The control-value theory of achievement emotions: An integrative approach to emotions in education. Emotion in education: Elsevier; 2007. pp. 13–36.

Zeidner M. Test anxiety: The state of the art. 1998.

Cassady JC, Johnson RE. Cognitive test anxiety and academic performance. Contemp Educ Psychol. 2002;27(2):270–95.

Tangney JP, Dearing RL. Shame and guilt: Guilford Press; 2003.

Fang J, Brown GT, Hamilton R. Changes in Chinese students’ academic emotions after examinations: pride in success, shame in failure, and self-loathing in comparison. Br J Educ Psychol. 2023;93(1):245–61.

York TT, Gibson C, Rankin S. Defining and measuring academic success. Practical Assess Res Evaluation. 2019;20(1):5.

Abdulghani HM, Irshad M, Al Zunitan MA, Al Sulihem AA, Al Dehaim MA, Al Esefir WA, et al. Prevalence of stress in junior doctors during their internship training: a cross-sectional study of three Saudi medical colleges’ hospitals. Neuropsychiatr Dis Treat. 2014;10:1879–86.

Moreira de Sousa J, Moreira CA, Telles-Correia D. Anxiety, depression and academic performance: a study amongst Portuguese medical students versus non-medical students. Acta Med Port. 2018;31(9):454–62.

Junaid MA, Auf AI, Shaikh K, Khan N, Abdelrahim SA. Correlation between academic performance and anxiety in Medical students of Majmaah University - KSA. JPMA J Pakistan Med Association. 2020;70(5):865–8.

Mihăilescu AI, Diaconescu LV, Donisan T, Ciobanu AM. The influence of emotional distress on the academic performance in undergraduate medical students. Romanian J Child Adolesc Psychiatry. 2016;4(1/2):27–40.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372.

Hahn H, Kropp P, Kirschstein T, Rücker G, Müller-Hilke B. Test anxiety in medical school is unrelated to academic performance but correlates with an effort/reward imbalance. PLoS ONE. 2017;12(2):1–13.

Pekrun R. The control-value theory of achievement emotions: assumptions, corollaries, and Implications for Educational Research and Practice. Educational Psychol Rev. 2006;18(4):315–41.

Graesser AC. Emotions are the experiential glue of learning environments in the 21st century. Learn Instruction. 2019.

Pekrun R, Perry RP. Control-value theory of achievement emotions. International handbook of emotions in education: Routledge; 2014. pp. 120–41.

Behrens CC, Dolmans DH, Gormley GJ, Driessen EW. Exploring undergraduate students achievement emotions during ward round simulation: a mixed-method study. BMC Med Educ. 2019;19(1):316.

Burr J, Beck-Dallaghan GL. The relationship of emotions and Burnout to Medical Students’ academic performance. Teach Learn Med. 2019;31(5):479–86.

Zalihić A, Mešukić S, Sušac B, Knezović K, Martinac M. Anxiety sensitivity as a predictor of academic success of medical students at the University of Mostar. Psychiatria Danubina. 2017;29(Suppl 4):851–4.

Del-Ben CM, Machado VF, Madisson MM, Resende TL, Valério FP, Troncon LEDA. Relationship between academic performance and affective changes during the first year at medical school. Med Teach. 2013;35(5):404–10.

Nazir MA, Izhar F, Talal A, Sohail ZB, Majeed A, Almas K. A quantitative study of test anxiety and its influencing factors among medical and dental students. J Taibah Univ Med Sci. 2021;16(2):253–9.

Green M, Angoff N, Encandela J. Test anxiety and United States Medical Licensing Examination scores. Clin Teacher. 2016;13(2):142–6.

Ben Loubir D, Serhier Z, Diouny S, Battas O, Agoub M, Bennani Othmani M. Prevalence of stress in Casablanca medical students: a cross-sectional study. Pan Afr Med J. 2014;19:149.

Kausar U, Haider SI, Mughal IA, Noor MSA. Stress levels; stress levels of final year MBBS students and its effect on their academic performance. Prof Med J. 2018;25(6):932–6.

Hautz WE, Schröder T, Dannenberg KA, März M, Hölzer H, Ahlers O, et al. Shame in Medical Education: a randomized study of the Acquisition of intimate Examination skills and its effect on subsequent performance. Teach Learn Med. 2017;29(2):196–206.

Hayat AA, Salehi A, Kojuri J. Medical student’s academic performance: the role of academic emotions and motivation. J Adv Med Educ Professionalism. 2018;6(4):168–75.

Deeks JJ, Riley RD, Higgins JP. Combining Results Using Meta-Analysis. Systematic Reviews in Health Research: Meta‐Analysis in Context. 2022:159–84.

Aboalshamat K, Hou X-Y, Strodl E. The impact of a self-development coaching programme on medical and dental students’ psychological health and academic performance: a randomised controlled trial. BMC Med Educ. 2015;15:134.

Jamil H, Alakkari M, Al-Mahini MS, Alsayid M, Al Jandali O. The impact of anxiety and depression on academic performance: a cross-sectional study among medical students in Syria. Avicenna J Med. 2022;12(03):111–9.

Mirawdali S, Morrissey H, Ball P. Academic anxiety and its effects on academic performance. 2018.

Al-Qaisy LM. The relation of depression and anxiety in academic achievement among group of university students. Int J Psychol Couns. 2011;3(5):96–100.

Cheng DR, Poon F, Nguyen TT, Woodman RJ, Parker JD. Stigma and perception of psychological distress and depression in Australian-trained medical students: results from an inter-state medical school survey. Psychiatry Res. 2013;209(3):684–90.

Lee M, Larson R. The Korean ‘examination hell’: long hours of studying, distress, and depression. J Youth Adolesc. 2000;29(2):249–71.

Ali SK. 861 – Social phobia among medical students. Eur Psychiatry. 2013;28:1.

Bonna AS, Sarwar M, Md Nasrullah A, Bin Razzak S, Chowdhury KS, Rahman SR. Exam anxiety among medical students in Dhaka City and its Associated Factors-A cross-sectional study. Asian J Med Health. 2022;20(11):20–30.

Kim K-J. Factors associated with medical student test anxiety in objective structured clinical examinations: a preliminary study. Int J Med Educ. 2016;7:424–7.

Von der Embse N, Jester D, Roy D, Post J. Test anxiety effects, predictors, and correlates: a 30-year meta-analytic review. J Affect Disord. 2018;227:483–93.

Song J, Chang L, Zhou R. Test anxiety impairs filtering ability in visual working memory: evidence from event-related potentials. J Affect Disord. 2021;292:700–7.

Theobald M, Breitwieser J, Brod G. Test anxiety does not predict exam performance when knowledge is controlled for: strong evidence against the interference hypothesis of test anxiety. Psychol Sci. 2022;33(12):2073–83.

Lowe PA. Examination of test anxiety in samples of Australian and US Higher Education Students. High Educ Stud. 2019;9(4):33–43.

Kavanagh BE, Ziino SA, Mesagno C. A comparative investigation of test anxiety, coping strategies and perfectionism between Australian and United States students. North Am J Psychol. 2016;18(3).

Mihăilescu AI, Diaconescu LV, Ciobanu AM, Donisan T, Mihailescu C. The impact of anxiety and depression on academic performance in undergraduate medical students. Eur Psychiatry. 2016;33:S341–2.

Terrizzi JA Jr, Shook NJ. On the origin of shame: does shame emerge from an evolved disease-avoidance architecture? Front Behav Neurosci. 2020;14:19.

Epstein RM. Mindful practice. JAMA. 1999;282(9):833–9.

Hauer KE, Teherani A, Dechet A, Aagaard EM. Medical students’ perceptions of mentoring: a focus-group analysis. Med Teach. 2005;27(8):732–4.

Kalet A, Pugnaire MP, Cole-Kelly K, Janicik R, Ferrara E, Schwartz MD, et al. Teaching communication in clinical clerkships: models from the macy initiative in health communications. Acad Med. 2004;79(6):511–20.

Gregor A. Examination anxiety: live with it, control it or make it work for you? School Psychol Int. 2005;26(5):617–35.

Acknowledgements

I would like to thank Lancaster University Library for helping me to search the literature and find the appropriate databases, and Lancaster University for providing access to several software packages.

No funding.

Author information

Authors and affiliations.

King Abdulaziz University, Jeddah, Saudi Arabia

Nora Alshareef

Lancaster University, Lancaster, UK

Nora Alshareef, Ian Fletcher & Sabir Giga


Contributions

NA made substantial contributions throughout the systematic review process and was actively involved in writing and revising the manuscript. NA was responsible for the design of the study, the acquisition, analysis, and interpretation of data, and the drafting and substantive revision of the manuscript. NA has approved the submitted version and is personally accountable for her contributions, ensuring the accuracy and integrity of the work. IF was instrumental in screening the literature, extracting data, and conducting the quality assessment of the included studies. Additionally, IF played a crucial role in revising the results and discussion sections of the manuscript, ensuring that the interpretation of data was both accurate and insightful. IF has approved the submitted version and has agreed to be personally accountable for his contributions, particularly in terms of the accuracy and integrity of the parts of the work he was directly involved in. SG contributed significantly to the selection of papers and data extraction, demonstrating critical expertise in resolving disagreements among authors. SG’s involvement was crucial in revising the entire content of the manuscript, enhancing its coherence and alignment with the study’s objectives. SG has also approved the submitted version and is personally accountable for his contributions, committed to upholding the integrity of the entire work.

Corresponding author

Correspondence to Nora Alshareef.

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Consent for publication was obtained from the other authors.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Alshareef, N., Fletcher, I. & Giga, S. The role of emotions in academic performance of undergraduate medical students: a narrative review. BMC Med Educ 24, 907 (2024). https://doi.org/10.1186/s12909-024-05894-1


Received: 08 March 2024

Accepted: 12 August 2024

Published: 23 August 2024

DOI: https://doi.org/10.1186/s12909-024-05894-1


  • Medical students
  • Academic performance
  • Case Report
  • Open access
  • Published: 22 August 2024

Mycobacterium marseillense bloodstream infection combined with skin fungal infection: a case report and literature review

  • Danchun Wang 1,2
  • Dongjie Xue 1,3
  • Xin Chen 1,4
  • Xiong Wu 1,5
  • Wenjun Gao 6
  • Xinyue Liang 1
  • Wenli Yuan 1,7
  • Hui Wang 1
  • Qi Wang 1

BMC Infectious Diseases volume 24, Article number: 853 (2024)


Non-tuberculous mycobacteria (NTM) are present widely in the natural environment and can invade the human body through the respiratory tract, gastrointestinal tract, and skin. Immunocompromised patients are particularly prone to infection, which primarily affects multiple organs, including the lungs, lymph nodes, and skin. However, cases of NTM bloodstream infections are rare. Here, we report a rare case of Mycobacterium marseillense bloodstream infection with concurrent skin fungal infection in a patient after kidney transplantation. Related literature was reviewed to enhance the understanding of this rare condition.

Case presentation

A 58-year-old male with a history of long-term steroid and immunosuppressant use after kidney transplantation presented with limb swelling that worsened over the past two months. Physical examination revealed redness and swelling of the skin in all four limbs, with a non-healing wound on the lower left limb. Skin tissue analysis by metagenomic next-generation sequencing (mNGS) and fungal culture indicated infection with Trichophyton rubrum . Blood culture results suggested infection with Mycobacterium marseillense . After receiving anti-NTM treatment, the patient’s symptoms significantly improved, and he is currently undergoing treatment.

Mycobacterium marseillense is an NTM. It is easily missed on Gram staining, whereas the acid-fast staining result was positive. The bacterium was identified by mass spectrometry and mNGS analyses. Antimicrobial susceptibility tests for NTM were performed using the broth microdilution method. The results showed that Mycobacterium marseillense was susceptible to clarithromycin and had intermediate susceptibility to moxifloxacin and linezolid. Bacterial clearance requires a combination of drugs and an adequate course of treatment. NTM bloodstream infections are relatively rare, and early identification and proactive intervention are key to their successful management.


Non-tuberculous mycobacteria (NTM) are widely found in natural environments, most commonly in water and soil [1]. NTM infections are increasing worldwide, particularly among immunocompromised individuals, such as patients with HIV and those who have undergone solid organ or stem cell transplantation [2]; people with other underlying health conditions are also at risk. NTM infections typically manifest as pulmonary or skin infections, whereas bloodstream infections are rarely reported. In this case report, we detail a rare case of Mycobacterium marseillense bloodstream infection combined with a fungal skin infection of the lower limbs in a patient who had undergone a kidney transplant and was receiving long-term immunosuppressant therapy.

A 58-year-old male patient presented to the hospital with a chief complaint of limb swelling for the past 3 years, which worsened over the past 2 months. The patient had undergone renal transplantation 18 years previously because of glomerulonephritis and uremia. He was on a long-term medication regimen including tacrolimus (1 mg bid), mycophenolate mofetil (0.5 g bid), and prednisolone (4 mg bid). Three years earlier, the patient developed a localized swelling in the posterior aspect of the left lower limb following a mosquito bite, which progressively spread to all four limbs. Two months prior to admission, the patient underwent an incision and drainage procedure for the left lower limb; however, the wound did not heal well. The bacterial culture of the drainage fluid revealed Staphylococcus epidermidis . The patient was referred to our hospital for a skin biopsy. Metagenomic next-generation sequencing (mNGS) of the skin tissue revealed positive results for Trichophyton rubrum (1100 reads), Staphylococcus capitis (198 reads), and Corynebacterium tuberculostearicum (28 reads). Fungal cultures confirmed the presence of Trichophyton rubrum . Antimicrobial susceptibility test MIC results were as follows (µg/ml): terbinafine, 0.015; fluconazole, 2; posaconazole, 0.008; voriconazole, 0.015; itraconazole, 0.03; ketoconazole, 0.06; amphotericin B, 0.5. Histopathological examination revealed signs of fungal infection, including diffuse fibroplasia and neovascularization, neutrophilic infiltration, lymphocytes and plasma cells, focal fibrinoid necrosis, and the presence of fungal hyphae and spores.

The patient showed slight improvement after receiving an intravenous infusion of ceftriaxone 2 g qd and intermittent use of itraconazole. He initially presented with worsening pitting edema in all four limbs, along with redness, swelling, and fever in the lower limbs, particularly the left lower limb. Fluid discharge from the wound in the left lower limb was also observed (Fig.  1 ). His treatment with ceftriaxone and furosemide as an outpatient did not provide satisfactory results, leading to hospital admission for further diagnosis and treatment.

Fig. 1 The patient’s lower limb skin was affected by a fungal infection

Upon admission, the patient presented with normothermia (36.2 ℃), normal pulse rate (76 bpm), respiratory rate (18 breaths/min), and blood pressure (127/77 mmHg). The patient exhibited diffuse peripheral flushing and edema with tenderness in the lower extremities.

Laboratory results revealed leukocytosis (12 × 10⁹/L) with predominantly increased neutrophils (11 × 10⁹/L) and decreased lymphocytes (0.4 × 10⁹/L). Hypoalbuminemia was observed with a serum albumin level of 33.8 g/L. Furthermore, the absolute count of CD4+ T lymphocytes (139/µL) was reduced. C-reactive protein and procalcitonin levels were within the normal ranges. Following admission, the patient received linezolid 600 mg every 12 h and ceftriaxone 2 g once daily to manage the infection. After ten days of topical treatment with terbinafine and mupirocin for skin inflammation, the patient’s symptoms had not improved. Furthermore, there was a progressive decline in blood cell and platelet counts, suggesting a possible adverse reaction to linezolid, which was discontinued. Concurrently, a blood culture in Myco/F Lytic culture vials turned positive after 250 h, revealing gram-positive bacilli (Fig. 2) and positive acid-fast staining (Fig. 3). Xpert MTB/RIF testing was negative for Mycobacterium tuberculosis complex nucleic acids, suggesting NTM infection. A sample from the positive blood culture was immediately sent for mNGS analysis to determine the specific NTM species, which confirmed Mycobacterium marseillense. After 8 days, the blood agar plate incubated at 37 °C showed neatly edged, dry, and flat colonies (Fig. 4). Mycobacterium marseillense was identified using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) (Bruker, Germany). Chest computed tomography (CT) scans revealed patchy opacities, nodular shadows, and pleural effusion. We performed acid-fast staining on early skin tissue pathology slides, which revealed a small number of acid-fast bacilli (Fig. 5) and fungal hyphae (Fig. 6) upon microscopic examination. The sputum was cultured several times; Pseudomonas aeruginosa was detected in one culture, whereas no pathogenic bacteria were detected in the others. The patient may have had a disseminated Mycobacterium marseillense infection with a concurrent fungal infection of the skin. The patient was treated with azithromycin 250 mg qd, ethambutol 1 g qd, moxifloxacin 400 mg qd, contezolid 400 mg bid, and clofazimine 100 mg qd for NTM infection, and terbinafine 250 mg qd for fungal infection.

Fig. 2 Gram-positive bacilli in a Gram-stained smear from the positive blood culture bottle (magnification: ×1000)

Fig. 3 Acid-fast bacilli in an acid-fast stained smear from the positive blood culture bottle (magnification: ×1000)

Fig. 4 Mycobacterium marseillense grown on the blood agar plate incubated for 8 days at 37 ℃

Fig. 5 Acid-fast bacilli in an acid-fast stained skin histopathological section (magnification: ×1000)

Fig. 6 Fungal hyphae in an acid-fast stained skin histopathological section (magnification: ×1000)

The Thermo Scientific Sensititre SLOMYCO2 plate was used for the antimicrobial susceptibility test (AST). Table 1 presents the results. After 2 months of treatment, the patient’s limb redness and swelling had reduced, and the lower limb wound had healed. The patient continues to receive the same anti-NTM and antifungal medications, without any changes.

Literature search and review

We searched the literature on bloodstream infections caused by non-tuberculous mycobacteria. Using specific keywords, we queried the PubMed database and identified 103 articles. Among them, we found a literature review published in 2016 that effectively summarized 22 cases of NTM bloodstream infections [3]. We selected 43 articles published between 2017 and the present. Of them, 28 were excluded as they were not case reports. We then carefully read the remaining 15 case reports, extracted data, summarized the characteristics of the 18 cases (Table 2), and compared them with the 22 cases summarized in 2016. The most common pathogen showed a changing trend: Mycobacterium chelonae has been the prevailing pathogen from 2017 to the present, whereas Mycobacterium mucogenicum was more frequently reported before 2016. Most patients in both groups were immunocompromised and presented with underlying hematologic disorders, rheumatic diseases, solid tumors, HIV infection, or interstitial pneumonia. Hematologic disorders were the most prevalent underlying conditions, accounting for 37% of the cases since 2017 and 41% before 2016. Our analysis of 18 cases showed that 50% of the patients had received corticosteroid or immunosuppressant therapy, similar to our reported case. The primary organs affected in both groups were the lungs, followed by the endocardium, brain, bone marrow, and soft tissues. This is primarily because NTM are widely present in the environment and can easily cause disseminated infections by initially infecting the respiratory tract and then spreading through the bloodstream. Although the chest CT scan of our patient indicated lesions, the patient’s respiratory symptoms were not obvious, and no positive results proved that the lungs were infected with NTM. The mortality rates were 11–17% in the group reported since 2017 and 23–50% in the group reported before 2016. The lower mortality rate observed since 2017 may be attributed to the diversified identification methods used in recent years. With the availability of various identification methods, the early detection of NTM infections is possible, leading to early treatment and reduced mortality rates.

A Canadian study summarized 28 cases of bacteremia caused by NTM [4]. All patients had long-term indwelling vascular access or received intravenous drug injections. Among them, 95% had their central venous catheters removed, and all achieved successful treatment outcomes. The literature suggests that removing indwelling vascular access is a key factor for successful treatment in these patients. Of the 18 reported cases, 44% involved patients with indwelling catheters in their blood vessels or other locations. All of these patients underwent catheter removal during treatment, and 87.5% achieved successful treatment outcomes. Although our patient did not have a catheter inserted, the proportion of NTM bloodstream infections that are catheter-associated is high. Therefore, it is crucial to consider catheter removal during treatment.
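The screening counts and percentages quoted above can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration: the patient counts behind each percentage are not stated in the text, so the values 8, 7, 2, and 3 are assumptions back-calculated from the reported figures, and the helper name `pct` is hypothetical.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Screening flow as reported in the text.
selected = 43                   # articles published from 2017 onward
excluded_not_case_reports = 28  # excluded because they were not case reports
case_reports = selected - excluded_not_case_reports
cases = 18                      # individual cases described in those reports

# ASSUMED counts, inferred from the reported percentages (not given in the source).
catheter_cases = 8               # assumption: 8 of 18 is about 44%
catheter_success = 7             # assumption: 7 of 8 is 87.5%
deaths_low, deaths_high = 2, 3   # assumption: 2 to 3 of 18 is about 11-17%

summary = {
    "case_reports": case_reports,
    "catheter_pct": pct(catheter_cases, cases),
    "catheter_success_pct": pct(catheter_success, catheter_cases),
    "mortality_pct_range": (pct(deaths_low, cases), pct(deaths_high, cases)),
}
print(summary)
```

Under these assumed counts the rounded values agree with the figures quoted in the review (15 case reports, 44%, 87.5%, and 11–17%), which suggests the percentages are internally consistent with a denominator of 18 cases.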

NTM are classified into two types: rapid-growing and slow-growing. Slow-growing mycobacteria include Mycobacterium avium complex (MAC), Mycobacterium kansasii, Mycobacterium ulcerans, and Mycobacterium marinum. Mycobacterium marseillense is a slow-growing mycobacterium and one of the most recently described members of the MAC. In recent years, an increasing number of cases of NTM infections have been reported, with the main sites of infection being the lungs, skin, and soft tissues.

There is currently no literature reporting Mycobacterium marseillense causing bloodstream infections. Because of its slow growth and poor Gram staining, this bacterium can be easily missed in microbiology laboratory analyses. Molecular diagnostic techniques can be used to identify M. marseillense. Identifying the species of NTM is crucial because different species require different antibiotics and treatment durations [2,25]. Owing to the long-term use of immunosuppressive drugs after kidney transplantation, our patient had a weakened immune system and was classified as immunocompromised. Early mNGS of the skin tissue indicated infection with Trichophyton rubrum and Staphylococcus capitis. Clinically, the primary consideration was a fungal infection combined with a Staphylococcus infection of the patient’s skin, and the less common pathogen NTM, which can cause bloodstream infection, was not considered. Clinical cases of bloodstream infections caused by this bacterium are rare, and their clinical features may not be apparent, leading to challenges in diagnosis. Nevertheless, we successfully diagnosed the patient and initiated combination therapy with anti-NTM and antifungal medications. A remaining question is why the acid-fast stain of the patient’s skin tissue was positive but NTM was not detected by mNGS. It is possible that the skin tissue had a low bacterial load of NTM and that the NTM were difficult to lyse during nucleic acid extraction, yielding false-negative results. For mycobacteria in tissue samples, the sensitivity of mNGS may not be very high and, in some cases, is even lower than that of traditional culture methods [5]. Unfortunately, neither the patient’s skin tissue nor the wound pus yielded any NTM in culture. The inability to detect NTM in skin tissue may be due to the low bacterial load and slow growth. Other microorganisms, such as Trichophyton rubrum, may also have overgrown the NTM, so that too few NTM colonies were present in the culture for detection. A possible reason for failing to detect NTM in pus cultures is the short duration of laboratory cultivation. In addition, the clinical suspicion of NTM infection was not communicated to the laboratory in advance, resulting in a standard culture duration that may not have been sufficient to cultivate NTM. In general, NTM detected in blood cultures are considered pathogenic [6]. Among patients undergoing solid organ transplantation, kidney transplant recipients have the highest rate of NTM infection. This could be attributed to the higher frequency of kidney transplantation compared to other solid organ transplants [7]. The risk factors for NTM infection generally include immunodeficiency, genetic defects, monoclonal antibody therapy, and various other factors. Patients with chronic rejection after solid organ transplantation are at increased risk of developing NTM infections [7, 8]. Immunodeficiency can facilitate the progression of NTM infections, and enhanced immunosuppression (such as post-transplant antirejection therapy) further increases the risk of infection [9]. In addition, we believe that the association between the skin fungal infection and the NTM bloodstream infection was coincidental because superficial fungal infections caused by Trichophyton rubrum do not require an underlying disease condition.

Various molecular diagnostic techniques are available for identifying NTM strains, such as mNGS, PCR, and direct or indirect homologous gene sequence comparison. In clinical settings, implementing MALDI-TOF MS and mNGS has greatly improved the accuracy of NTM species identification while significantly reducing the identification time. Among them, mNGS is the most advanced and precise method for identifying and distinguishing NTM species [2].

According to relevant studies, combination therapy is superior to monotherapy for NTM bacteremia and is associated with a lower recurrence rate [10]. Currently, breakpoint criteria in the drug susceptibility reference CLSI M24-A2 exist only for clarithromycin, moxifloxacin, and linezolid. Breakpoints for the clinically useful ethambutol, rifampin, rifabutin, streptomycin, and amikacin have not yet been established. In this case, we selected azithromycin, moxifloxacin, ethambutol, clofazimine, and contezolid to combat the NTM infection in our patient. Drug susceptibility testing indicated susceptibility to clarithromycin and intermediate susceptibility to moxifloxacin and linezolid. Nevertheless, the clinical evaluation confirmed the effectiveness of the chosen treatment. NTM susceptibility may differ between in vivo and in vitro conditions; therefore, we did not switch antibiotics. According to the recommended guidelines [2], the treatment course for patients with disseminated MAC requires prolonged medication for one year or even lifelong. In our review of 18 cases, 83% of the patients had a favorable prognosis, indicating a relatively high cure rate for disseminated NTM infections. Nonetheless, tailoring the treatment duration to individual patient circumstances is crucial.

Preventing NTM bloodstream infections is crucial. First, in the hospital environment, avoiding exposure to tap water is a key step. Specifically, it is important to prevent tap water from coming into contact with injection sites, venous catheters, and surgical wounds, and to refrain from using tap water to rinse medical instruments. Second, personal and household prevention measures involve avoiding water from saunas and hot tubs, regularly replacing shower heads, periodically changing water filter cartridges, using sterilized water in humidifiers, and avoiding contact with patients with NTM [1].

Data availability

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.

Sharma SK, Upadhyay V. Epidemiology, diagnosis & treatment of non-tuberculous mycobacterial diseases. Indian J Med Res. 2020;152(3):185–226.

Schoenfeld N, Haas W, Richter E, Bauer T, Boes L, Castell S, Hauer B, Magdorf K, Matthiessen W, Mauch H, Reuss A, Schenkel K, Ruesch-Gerdes S, Zabel P, Dalhoff K, Schaberg T, Loddenkemper R. Recommendations of the German Central Committee against Tuberculosis (DZK) and the German Respiratory Society (DGP) for the diagnosis and treatment of non-tuberculous mycobacterioses. Pneumologie. 2016;70(4):250–76.

Bian S-N, Zhang L-F, Zhang Y-Q, Yang Q-W, Wang P, Xu Y-C, Shi X-C, Liu X-Q. Clinical and laboratory characteristics of patients with nontuberculous mycobacterium bloodstream infection in a tertiary referral hospital in Beijing, China. Chin Med J. 2016;129(18):2220–5.

Pérez-Cortés Villalobos A, Rotstein C. Mycobacterium mucogenicum and Mycobacterium neoaurum bacteremia in immunocompromised hosts. J Assoc Med Microbiol Infect Dis Can. 2021;6(1):55–62.

Miao Q, Ma Y, Wang Q, Pan J, Zhang Y, Jin W, Yao Y, Su Y, Huang Y, Wang M, Li B, Li H, Zhou C, Li C, Ye M, Xu X, Li Y, Hu B. Microbiological diagnostic performance of metagenomic next-generation sequencing when applied to clinical practice. Clin Infect Dis. 2018;67(suppl 2):S231–40.

Naito M, Fukushima K, Kusakabe S, Endo T, Shiroyama T, Ohira K, Azuma K, Tanizaki S, Yamamoto Y, Hosono Y, Naito Y, Futami S, Miyake K, Hirata H, Takeda Y, Kumanogoh A. Disseminated non-tuberculous mycobacterial infection caused by Mycobacterium obuense in an immunocompromised patient: a case report. BMC Infect Dis. 2023;23(1):517.

Abad CL, Razonable RR. Non-tuberculous mycobacterial infections in solid organ transplant recipients: an update. J Clin Tuberc Other Mycobact Dis. 2016;4:1–8.

Haverkamp MH, van Dissel JT, Holland SM. Human host genetic factors in nontuberculous mycobacterial infection: lessons from single gene disorders affecting innate and adaptive immunity and lessons from molecular defects in interferon-gamma-dependent signaling. Microbes Infect. 2006;8(4):1157–66.

Dorman S, Subramanian A, AST Infectious Diseases Community of Practice. Nontuberculous mycobacteria in solid organ transplant recipients. Am J Transplant. 2009;9(Suppl 4):S63–9.

Griffith DE, Aksamit T, Brown-Elliott BA, Catanzaro A, Daley C, Gordin F, Holland SM, Horsburgh R, Huitt G, Iademarco MF, Iseman M, Olivier K, Ruoss S, von Reyn CF, Wallace RJ Jr, Winthrop K, ATS Mycobacterial Diseases Subcommittee, American Thoracic Society, Infectious Disease Society of America. An official ATS/IDSA statement: diagnosis, treatment, and prevention of nontuberculous mycobacterial diseases. Am J Respir Crit Care Med. 2007;175(4):367–416.

Rodge G, Nagvekar V, Jhala D, George A. Mycobacterium abscessus causing native valve endocarditis due to peripherally inserted central catheter line. J Clin Tuberc Other Mycobact Dis. 2017;9:19–20.

Samaddar A, Srivastava S, Khan S, Tak V, Sharma A, Nag VL, Bohra GK. Mycobacterium chelonae bacteraemia in a patient with myasthenia gravis receiving long-term steroid therapy. Access Microbiol. 2019;1(10):e000069.

Ueda Y, Tokumasu K, Hagiya H, Iio K, Fujimori T, Kakehi A, Okura M, Minabe H, Otsuka F. Mycobacterium chelonae bloodstream infection induced by osteomyelitis of toe: a case report. J Infect Chemother. 2020;26(8):843–6.

Grandjean Lapierre S, Toro A, Drancourt M. Mycobacterium iranicum bacteremia and hemophagocytic lymphohistiocytosis: a case report. BMC Res Notes. 2017;10(1):372.

Furukawa Y, Hamada H, Kamikawaji K, Unoki T, Inoue H, Tashiro Y, Okamoto M, Baba M, Hashiguchi T. Successful treatment of an AIDS patient with prolonged Mycobacterium avium bacteremia, high HIV RNA, HBV infection, Kaposi’s sarcoma and cytomegalovirus retinitis. J Infect Chemother. 2020;26(2):279–81.

Watanabe Y, Fujita H, Fukushima S, Nakamura I. Disseminated Mycobacterium chelonae infection, including discitis. Intern Med. 2022;61(13):2085–6.

Zeitler K, Walter J, Khan F, Sriaroon C. Recurrent disseminated Mycobacterium avium in a female patient from Thailand with anti-interferon-gamma autoantibodies: dilemma on treatment approach. BMJ Case Rep. 2021;14(1):e237909.

Yamada T, Iwakami S-I, Hara SAM, Iwakami N, Nakamura A, Suzuki Y, Sasaki S-I, Takahashi K. Disseminated non-tuberculous mycobacterial infection caused by anti-interferon-γ autoantibodies in a patient of very advanced age. Geriatr Gerontol Int. 2018;18(7):1132–3.

Shabi Y, Haldane D, Bonnar P. Mycobacterium fortuitum pacemaker infection: a case report. J Assoc Med Microbiol Infect Dis Can. 2022;7(1):81–3.

Walayat S, Awwal T, Roy M, Ahmad S. Mycobacterium neoaurum line-related bacteremia with pulmonary involvement: case report and review of literature. IDCases. 2018;11:88–90.

Butt S, Tirmizi A. Mycobacterium smegmatis bacteremia in an immunocompetent host. IDCases. 2019;15:e00523.

Nagata A, Sekiya N, Najima Y, Horiuchi M, Fukushima K, Toya T, Igarashi A, Kobayashi T, Kakihana K, Ohashi K, Doki N. Nontuberculous mycobacterial bloodstream infections after allogeneic hematopoietic stem cell transplantation. Int J Infect Dis. 2020;97:131–4.

Shaharir SS, Sulaiman Sahari N, Mohamed Fuad Z, Zukiman WZHW, Mohd Yusof NH, Sulong A, Periasamy P. Non-tuberculous mycobacterium bacteraemia in a pregnant systemic lupus erythematosus (SLE) patient: a case review and pooled case analysis. Clin Rheumatol. 2018;37(3):837–47.

Takemori-Sakai Y, Iwata Y, Oe H, Sakai Y, Wada T. Bloodstream infection caused by Mycobacterium chelonae. Pediatr Int. 2018;60(6):599–600.


Acknowledgements

We would like to thank our patient and his family for giving us the opportunity to follow his case. We would like to thank Dr. Yuyao Yin (Department of Laboratory Medicine, Peking University People’s Hospital) and Dr. Yifan Guo (Department of Laboratory Medicine, Peking University People’s Hospital) for their explanations of the results in the mNGS test.

No funding was received for the conception, composition, editing, or submission of this manuscript.

Author information

Authors and affiliations.

Department of Clinical Laboratory, Peking University People’s Hospital, Beijing, China

Danchun Wang, Dongjie Xue, Xin Chen, Xiong Wu, Xinyue Liang, Wenli Yuan, Hui Wang & Qi Wang

Department of Clinical Laboratory, Shantou Central Hospital, Shantou, Guangdong Province, China

Danchun Wang

Department of Clinical Laboratory, Beijing Huairou Hospital, Beijing, China

Dongjie Xue

Department of Clinical Laboratory, Foshan Fosun Chancheng Hospital, Foshan, Guangdong Province, China

Department of Clinical Laboratory, The Third People’s Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang Uygur Autonomous Region, China

Department of Pulmonary and Critical Care Medicine, Peking University People’s Hospital, Beijing, China

Department of Clinical Laboratory, The Affiliated Hospital of Yunnan University, Kunming, Yunnan Province, China


Contributions

All authors contributed to the conception and design of the study. DCW collected clinical data and wrote the initial draft of the manuscript. DCW, DJX, XYL, WLY, and XW performed the review of the literature. DCW, XC, and WJG collected images. QW, HW, and DCW supervised and edited the manuscript. All the authors have read and approved the final version of the manuscript.

Corresponding author

Correspondence to Qi Wang.

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Informed consent was obtained from the patient for the publication of this case report and accompanying images.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article

Wang, D., Xue, D., Chen, X. et al. Mycobacterium marseillense bloodstream infection combined with skin fungal infection: a case report and literature review. BMC Infect Dis 24, 853 (2024). https://doi.org/10.1186/s12879-024-09741-y

Received: 04 February 2024

Accepted: 08 August 2024

Published: 22 August 2024

DOI: https://doi.org/10.1186/s12879-024-09741-y

  • Mycobacterium marseillense
  • Non-tuberculous mycobacterium
  • Bloodstream infection

BMC Infectious Diseases

ISSN: 1471-2334

COMMENTS

  1. 4. Search Terms & Strategies

    Systematic Reviews: Constructing a Search Strategy and Searching for Evidence from the Joanna Briggs Institute provides step-by-step guidance using PubMed as an example database. General Steps: Locate previous/ relevant searches. Identify your databases. Develop your search terms and design search. Evaluate and modify your search.

  2. A systematic approach to searching: an efficient and complete method to

    INTRODUCTION. Librarians and information specialists are often involved in the process of preparing and completing systematic reviews (SRs), where one of their main tasks is to identify relevant references to include in the review []. Although several recommendations for the process of searching have been published [2-6], none describe the development of a systematic search strategy from ...

  3. Systematic Reviews: Step 3: Conduct Literature Searches

    When developing a literature search, a good place to start is to identify searchable concepts of the research question, and then expand by adding other terms to describe those concepts. Read below for more information and examples on how to develop a literature search, as well as find tips and tricks for developing more comprehensive searches.

  4. Guides: Literature Review: How to search effectively

    Specific proximity symbols will vary. Check the 'Help' section of the database you are searching. 4. Improve your search results. All library databases are different and you can't always search and refine in the same way. Try to be consistent when transferring your search in the library databases you have chosen.

  5. How to Write a Literature Review

    Examples of literature reviews. Step 1 - Search for relevant literature. Step 2 - Evaluate and select sources. Step 3 - Identify themes, debates, and gaps. Step 4 - Outline your literature review's structure. Step 5 - Write your literature review.

  6. How to Construct an Effective Search Strategy

    Step 2: Brainstorm your search terms, including MeSH terms, that should be included in your search. Step 3: Use Boolean logic to combine your terms. Step 4: Perform a preliminary search to determine if there is any literature on your topic. (You can schedule an appointment with a librarian to assist you with performing your preliminary search.)

  7. Literature Reviews

    A literature review search is an iterative process. Your goal is to find all of the articles that are pertinent to your subject. Successful searching requires you to think about the complexity of language. ... Terms. Start by doing a preliminary search using the words from the key parts of your research question. Step #1: Initial Search.

  8. Defining the process to literature searching in systematic reviews: a

    Background Systematic literature searching is recognised as a critical component of the systematic review process. It involves a systematic search for studies and aims for a transparent report of study identification, leaving readers clear about what was done to identify studies, and how the findings of the review are situated in the relevant evidence. Information specialists and review teams ...

  9. Search Strategies for [Systematic] Literature Reviews

    A search strategy is the method by which relevant sources are found, usually by searching selected databases and search engines using a mix of keywords, controlled vocabulary and search operators. Relevance is determined by a review question for which guidelines can be found in Sections 4.2 and 4.3.

  10. Literature Review: Developing a search strategy

    This guide will define a literature review, describe the process of writing a literature review and examine where to search for literature. ... (Phelps et al. 2007, pp. 128-149), so that you can keep track of effective search terms, or to help others to reproduce your steps and get the same results. This record could be a document, table or ...

  11. Research Guides: Literature Reviews: Develop Search Strategies

    Developing a search strategy is a balance between needing a very precise search that yields fewer highly relevant results or a comprehensive search (high retrieval) with lower precision. The focus of a narrative literature review for a dissertation or thesis is thoroughness, so you should aim for high retrieval.

  12. Defining the process to literature searching in systematic reviews: a

    One area that is less well covered by the guidance, but nevertheless appears in this literature, is the quality appraisal or peer review of literature search strategies. The PRESS checklist is the most prominent and it aims to develop evidence-based guidelines to peer review of electronic search strategies [5, 122, 123]. A corresponding ...

  13. Introduction

    Students and researchers in the health sciences are often required to conduct literature searches for a number of reasons including identifying appropriate studies and methods to include in a literature review manuscript. Understanding the basics of database searching can allow you to effectively and efficiently find the information you need.

  14. Search Strategies

    Overview of Search Strategies. There are many ways to find literature for your review, and we recommend that you use a combination of strategies - keeping in mind that you're going to be searching multiple times in a variety of ways, using different databases and resources. Searching the literature is not a straightforward, linear process - it ...

  15. Researching for your literature review: Develop a search strategy

    The papers in your 'gold set' can then be used to help you identify relevant search terms. Look up your 'sample set' articles in a database that you will use for your literature review. For the articles indexed in the database, look at the records to see what keywords and/or subject headings are listed.

  16. How to undertake a literature search: a step-by-step guide

    Abstract. Undertaking a literature search can be a daunting prospect. Breaking the exercise down into smaller steps will make the process more manageable. This article suggests 10 steps that will help readers complete this task, from identifying key concepts to choosing databases for the search and saving the results and search strategy.

  17. Library Guides: Systematic Review: Developing a search strategy

    Revise your strategy/search terms until: the results match your research question; and you are confident you will find all the relevant literature on your topic. See Creating search strings for information on how to enter your search terms into databases. Example search string (using Scopus's Advanced search option) for the terms in the above table:

  18. Develop a search strategy

    Decide where to search. Develop a search strategy. Refine your search. Save your search. A search strategy is an organised structure of key terms used to search a database. The search strategy combines the key concepts of your search question in order to retrieve accurate results. Your search strategy will account for all: possible search terms;

  19. LibGuides: Literature Reviews: Developing a Search Strategy

    Once all terms have been identified, you need to put them together in a search string. You can export your search strategy in addition to the results, to use in your search documentation. A search string will generally look like: (Topic A term 1 OR Topic A term 2) AND (Topic B term 1 OR Topic B term 2) AND (Topic C term 1 OR Topic C term 2)

  20. Literature search for research planning and identification of research

    Abstract. Literature search is a key step in performing good authentic research. It helps in formulating a research question and planning the study. The available published data are enormous; therefore, choosing the appropriate articles relevant to your study in question is an art. It can be time-consuming, tiring and can lead to disinterest or ...

  21. Steps in Conducting a Literature Review

    A literature review may be a stand alone work or the introduction to a larger research paper, depending on the assignment. ... Use the key words in your question, as well as synonyms for those words, as terms in your search. Use the database tutorials for help. Save the searches in the databases. This saves time when you want to redo, or modify ...

  22. Literature Searching vs. Literature Review

    In general, a literature search is the process of seeking out and identifying the existing literature related to a topic or question of interest, while a literature review is the organized synthesis of the information found in the existing literature. In research, a literature search is typically the first step of a literature review. The ...

  23. Research Guides: Research at NJAES : Literature Reviews

    There are many different types of literature reviews from traditional literature reviews to rigorous systematic reviews. Each has its own methodology. Please review resources on this page and familiarize yourself with the task, commitment, and purpose of each before trying to decide on the type of review best fitting your research question.

  24. Working through interpreters in old age psychiatry: A literature review

    The strategy was considered especially important for this review which focuses on communication across many different languages. Thus, the authors conducted a search of the World Wide Web using Google Scholar, employing the search term Medical Interpreters and Mental Health. The search included literature in all languages.

  25. Evaluating the effectiveness of large language models in abstract

    This study aimed to evaluate the performance of large language models (LLMs) in the task of abstract screening in systematic review and meta-analysis studies, exploring their effectiveness, efficiency, and potential integration into existing human expert-based workflows. We developed automation scripts in Python to interact with the APIs of several LLM tools, including ChatGPT v4.0, ChatGPT v3 ...

  26. The role of emotions in academic performance of ...

    This paper is devoted to a narrative review of the literature on emotions and academic performance in medicine. The review aims to examine the role emotions play in the academic performance of undergraduate medical students. Eight electronic databases were used to search the literature from 2013 to 2023, including Academic Search Ultimate, British Education Index, CINAHL, Education Abstract ...

  27. Mycobacterium marseillense bloodstream infection combined with skin

    Literature search and review. We searched the literature on bloodstream infections caused by non-tuberculous mycobacteria. Using specific keywords, we queried the PubMed database and identified 103 articles. Among them, we found a literature review published in 2016 that effectively summarized 22 cases of NTM bloodstream infections. We selected ...
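
The search-string pattern quoted in item 19 above, (Topic A term 1 OR Topic A term 2) AND (Topic B term 1 OR Topic B term 2), can be sketched in a few lines of Python. This is a minimal illustration, not any database's official syntax: the function names `build_search_string` and `quote_phrase` are hypothetical, wrapping multi-word terms in double quotes is an assumption about phrase searching that varies between databases, and the example term "bacteremia" is an illustrative synonym that does not come from the sources listed here.

```python
def quote_phrase(term):
    """Wrap a multi-word term in double quotes so it is treated as a phrase.

    Assumption: the target database uses double quotes for phrase
    searching; check its Help section before relying on this.
    """
    term = term.strip()
    if " " in term and not term.startswith('"'):
        return f'"{term}"'
    return term


def build_search_string(concepts):
    """Combine synonyms with OR inside each concept, then concepts with AND.

    `concepts` is a list of lists: one inner list of terms per concept.
    Empty terms and empty concepts are skipped.
    """
    groups = []
    for terms in concepts:
        cleaned = [quote_phrase(t) for t in terms if t.strip()]
        if cleaned:
            groups.append("(" + " OR ".join(cleaned) + ")")
    return " AND ".join(groups)


if __name__ == "__main__":
    # Illustrative concepts only; "bacteremia" is an assumed synonym.
    concepts = [
        ["Mycobacterium marseillense", "non-tuberculous mycobacterium"],
        ["bloodstream infection", "bacteremia"],
    ]
    print(build_search_string(concepts))
    # ("Mycobacterium marseillense" OR "non-tuberculous mycobacterium") AND ("bloodstream infection" OR bacteremia)
```

Keeping each concept as its own list makes it straightforward to add or remove synonyms while revising the strategy, and the generated string can be pasted into the search documentation alongside the results.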