To carry out a healthcare systematic review, set a protocol, search widely, screen in pairs, judge bias, synthesize, and report with PRISMA 2020.
What A Systematic Review Delivers
A systematic review pulls together all trustworthy studies on a clear question and treats the process like a study in its own right. You set rules up front, follow them, and show your work. That way readers can see how the evidence was found, what was kept out, and why conclusions hold. In healthcare, this approach beats single-study takes because it pools signals and reduces random noise. It also shows where the body of evidence is thin, which helps funders, clinicians, and patients. The point is simple: a good review gives a stable view of what the totality of research says, not just one loud paper.
Carrying Out A Systematic Review In Healthcare: Scope & Setup
Start by pinning down the question. PICO works well for treatments; for diagnostics or public health, tweak the elements as needed. Bring in at least two reviewers and, if you can, a librarian. Next, write a protocol that names outcomes, study designs, time windows, and planned analyses. Register that protocol on PROSPERO to lock decisions and lower bias. Decide in advance how you’ll handle language limits, gray literature, preprints, and trial registries. Map the path for screening, including how disagreements get fixed. Sketch the data items you’ll capture, plus the risk-of-bias tool for each study design. Finally, outline the synthesis plan so your later choices aren’t driven by the results you find. These steps make later work faster, quieter, and far easier to defend.
| Stage | Decision You Record | Why It Helps |
|---|---|---|
| Define the question | Use PICO, PECO, or PICOS to list elements | Aligns scope and reduces drift |
| Eligibility criteria | Populations, settings, designs, dates | Prevents ad-hoc choices |
| Outcomes | Primary, secondary, time points | Keeps analyses consistent |
| Information sources | Databases, registries, gray sources | Broader coverage, fewer misses |
| Search strategy | Draft strings, limits, peer review | Reproducible and testable |
| Screening process | Dual screening, training, kappa targets | Cuts errors, boosts agreement |
| Data items | Variables, units, imputation rules | Cleaner extraction |
| Risk of bias methods | Tool per design; conflict handling | Transparent judgments |
| Synthesis plan | Effect measures, models, subgroups | Pre-set choices, honest calls |
| Reporting | PRISMA flow, tables, supplements | Readers can audit steps |
Build A Reproducible Search Strategy
Work with a librarian to shape the search. Combine subject headings and free-text terms, include synonyms, and test alternative spellings. Build the strategy with Boolean operators, and add proximity operators and truncation only when they improve recall without wrecking precision. Search at least MEDLINE via PubMed, Embase when relevant, and a nursing or allied health database such as CINAHL. Add CENTRAL for trials, plus major trial registries. Decide whether you need Scopus or Web of Science for citation chasing. Record every detail: database, platform, dates, full strings, and any limits. The Cochrane Handbook chapter on searching lays out sound practice and common pitfalls, so it's a handy cross-check.
For transparency later, save your strategies as text files and screenshots. Run a peer review of the search with another member using a short checklist. Plan one search update before combining results, and another just before submission if the field moves quickly. Don’t use study-design or language filters unless your protocol calls for them. Document gray sources: conference abstracts, dissertations, agency reports, and preprint servers. Keep export logs from each source so deduplication is traceable.
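If you script any part of the workflow, it also helps to keep the strategy itself as code or plain text rather than only inside a database interface. Here is a minimal Python sketch that assembles a PubMed-style Boolean string from term lists and writes it to a file; the terms and the file name are illustrative placeholders, not a validated strategy.

```python
# Illustrative only: assemble a PubMed-style Boolean string from term lists
# so the exact strategy can be saved, versioned, and peer reviewed.
# The terms and the file name are placeholders, not a validated search.

mesh_terms = ['"Diabetes Mellitus, Type 2"[mh]']
free_text = ['"type 2 diabetes"[tiab]', 'T2DM[tiab]', '"non-insulin dependent diabetes"[tiab]']
intervention_terms = ['"Metformin"[mh]', 'metformin[tiab]']

def or_block(terms):
    """Join synonyms with OR and wrap the block in parentheses."""
    return "(" + " OR ".join(terms) + ")"

population = or_block(mesh_terms + free_text)
intervention = or_block(intervention_terms)
query = f"{population} AND {intervention}"

# Keep a dated copy of the exact string alongside the export logs.
with open("search_pubmed.txt", "w") as fh:
    fh.write(query + "\n")
print(query)
```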
Screen Studies Without Bias
Import all records into a reference manager that preserves source tags. Deduplicate by matching fields and titles, checking borderline matches so records aren't dropped by accident. Pilot title-abstract screening on a small set, compare decisions, and tighten the criteria until agreement is steady. Then run dual screening for the full set. Record reasons for exclusion at full text, not just a code. Resolve conflicts in a short meeting or through a third reviewer. Keep a PRISMA flow diagram that shows counts at each step. This picture matters to readers; it shows where studies dropped out and keeps the chain of decisions clear.
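If screening decisions are exported as simple include/exclude labels, a small script can cross-check title-based deduplication and the agreement target in your protocol. The sketch below is a minimal example with made-up records and field names; reference managers and screening tools do this too, so treat it as a sanity check rather than the tool itself.

```python
import re
from collections import Counter

def normalize_title(title: str) -> str:
    """Lowercase, strip punctuation and extra spaces for title-based matching."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()

def deduplicate(records):
    """Keep the first record per normalized title; return kept records and drop count."""
    seen, kept = set(), []
    for rec in records:
        key = normalize_title(rec["title"])
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept, len(records) - len(kept)

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two screeners' include/exclude calls."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in set(labels_a) | set(labels_b)) / n**2
    return (observed - expected) / (1 - expected)

# Made-up records and pilot decisions for illustration
records = [
    {"id": 1, "title": "Metformin in type 2 diabetes"},
    {"id": 2, "title": "Metformin in Type 2 Diabetes."},  # duplicate after normalization
    {"id": 3, "title": "Sulfonylureas and cardiovascular outcomes"},
]
kept, dropped = deduplicate(records)
kappa = cohen_kappa(["include", "exclude", "include"], ["include", "exclude", "exclude"])
print(f"kept {len(kept)}, removed {dropped} duplicates, pilot kappa = {kappa:.2f}")
```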
Guard against eligibility drift. Revisit the protocol if tricky cases appear, write the exact rule you’ll use, and apply it forward to all records not yet screened. If a rule must change, note the change with a time stamp. When you reach full texts, track missing articles and try interlibrary access or author contact. The screening log will become a key supplement, so keep it tidy.
Extract The Data That Matters
Design an extraction form before touching the full set of papers. Pilot it on three to five papers from different designs and tweak it until the fields are unambiguous. Typical items include study setting, design, sample size, population features, intervention details, comparators, outcome units, follow-up time, effect estimates, and funding. Add columns for risk-of-bias signals that flow into your tool. Record who extracted each item and who verified it. If numbers don't add up, contact authors with a short, clear request. Store raw extraction sheets and decisions in a versioned folder so you can retrace every step.
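One lightweight way to keep the piloted form unambiguous is to mirror it in a typed structure so every extractor records the same fields in the same units. The sketch below is illustrative; the field names are examples to adapt to your protocol, not a fixed standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExtractionRecord:
    """One row of the extraction sheet; field names are illustrative, adapt to your protocol."""
    study_id: str
    design: str                        # e.g. "RCT", "cohort"
    setting: str
    sample_size: int
    population: str
    intervention: str
    comparator: str
    outcome: str
    outcome_unit: str
    follow_up_weeks: Optional[float]
    effect_estimate: Optional[float]   # in the protocol's chosen effect measure
    standard_error: Optional[float]
    funding_source: str
    extracted_by: str
    verified_by: Optional[str] = None
    notes: str = ""
```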
Plan how you’ll treat multiple reports from the same study. Link them with a study-level ID and use the most complete data while keeping time points aligned. If outcomes are reported in different formats, preplan conversions, such as standard errors to standard deviations. When scales differ, set rules for direction so higher scores always mean the same thing. Flag unit-of-analysis issues early for cluster trials or crossover trials.
Judge Risk Of Bias, Not Just Quality
Risk of bias asks whether study methods or conduct could tilt the effect estimate. Pick a tool that matches each design. For randomized trials, RoB 2 covers the randomization process, deviations from the intended interventions, missing outcome data, outcome measurement, and selection of the reported result. For non-randomized studies of interventions, ROBINS-I walks through confounding, participant selection, intervention classification, deviations, missing data, outcome measurement, and reporting. For diagnostic accuracy studies, QUADAS-2 fits better. Train your reviewers, pilot your judgments, and capture quotes or page numbers to back each call.
| Study Type | Suggested Tool | What It Covers |
|---|---|---|
| Randomized controlled trial | RoB 2 | Sequence, deviations, missing, measurement, reporting |
| Non-randomized intervention study | ROBINS-I | Confounding, selection, classification, deviations, missing, measurement, reporting |
| Diagnostic accuracy study | QUADAS-2 | Patient selection, index test, reference standard, flow |
| Systematic review | AMSTAR 2 | Search, selection, justification, synthesis |
| Prognostic study | QUIPS | Participation, attrition, measurement, confounding, analysis |
Plan The Synthesis
First decide whether a meta-analysis makes sense. If studies ask the same question and report compatible outcomes, pick the effect measure in your protocol: risk ratio, odds ratio, mean difference, or standardized mean difference. Use random-effects when you expect genuine between-study variation; fixed-effect only when a common effect is credible. Inspect heterogeneity with forest plots and I², but don’t chase a target number. Look for clinical and method differences that explain variation. Plan subgroup tests sparingly and write the exact hypotheses in advance. Sensitivity runs, such as removing high risk-of-bias studies, help show if a finding is fragile.
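To make the model choice concrete, here is a minimal sketch of inverse-variance pooling with the DerSimonian-Laird random-effects estimator and I², applied to invented log risk ratios. A real analysis would use an established meta-analysis package and follow the protocol's pre-specified measure and model; this only shows the arithmetic behind the forest plot.

```python
import math

# Illustrative random-effects pooling (DerSimonian-Laird) on log risk ratios.
# The numbers are invented for demonstration.
log_rr = [-0.35, -0.10, -0.42, 0.05]   # per-study log risk ratios
se =     [0.15, 0.20, 0.18, 0.25]      # their standard errors

var = [s**2 for s in se]
w_fixed = [1 / v for v in var]          # inverse-variance (fixed-effect) weights
sum_w = sum(w_fixed)
pooled_fixed = sum(w * y for w, y in zip(w_fixed, log_rr)) / sum_w

# Cochran's Q and the DerSimonian-Laird between-study variance tau^2
q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(w_fixed, log_rr))
df = len(log_rr) - 1
c = sum_w - sum(w**2 for w in w_fixed) / sum_w
tau2 = max(0.0, (q - df) / c)

# Random-effects weights, pooled estimate, and heterogeneity
w_re = [1 / (v + tau2) for v in var]
pooled_re = sum(w * y for w, y in zip(w_re, log_rr)) / sum(w_re)
se_re = math.sqrt(1 / sum(w_re))
i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

rr = math.exp(pooled_re)
ci = (math.exp(pooled_re - 1.96 * se_re), math.exp(pooled_re + 1.96 * se_re))
print(f"Pooled RR {rr:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f}), I^2 = {i2:.0f}%")
```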
When synthesis must be narrative, lay out why pooling was not possible, group studies by design or outcome, and keep the same order across text, tables, and figures. Check small-study effects with funnel plots when you have enough comparisons. For certainty ratings, GRADE gives a simple, transparent way to express how much trust to place in pooled results. Pre-specify what will trigger a downgrade and carry those rules through every outcome. End each comparison with a short take-home line that links the effect size, its precision, and the certainty.
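A funnel plot is just each study's effect estimate plotted against its standard error with the vertical axis inverted, so asymmetry is easy to spot by eye. A minimal sketch with invented data, assuming matplotlib is available; interpret it cautiously with fewer than about ten studies.

```python
import matplotlib.pyplot as plt

# Illustrative funnel plot: smaller studies (larger SE) sit lower on the plot.
# Data are invented for demonstration.
log_rr = [-0.35, -0.10, -0.42, 0.05, -0.20, -0.55, 0.10, -0.30]
se = [0.15, 0.20, 0.18, 0.25, 0.12, 0.30, 0.28, 0.10]

plt.scatter(log_rr, se)
plt.gca().invert_yaxis()          # most precise studies at the top
plt.axvline(0.0, linestyle="--")  # line of no effect for a log risk ratio
plt.xlabel("Log risk ratio")
plt.ylabel("Standard error")
plt.title("Funnel plot (illustrative data)")
plt.show()
```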
Report So Readers Can Trust It
Report exactly what you did and what you found. Use the PRISMA 2020 checklist to structure the write-up and supply a flow diagram. Put full search strings, screening forms, extraction forms, and any analytic code in a public supplement. State the date of the final search and of any update clearly, so readers know how current the evidence is. Tables should present baseline features, effect estimates, and risk-of-bias calls in a consistent order. Close the main text with a short plain-language wrap that states who the findings apply to, what the effect looks like, and what gaps remain.
Common Pitfalls And Simple Fixes
These traps are common, but each one has a clean fix.
- Single-reviewer screening: Two sets of eyes catch more mistakes; use dual screening for titles and full texts.
- Weak search: Missing one core database or trial registry can skew results. Draft strings with a librarian and peer review the strategy.
- Eligibility drift: If new rules crop up mid-screen, freeze the queue, revise the protocol, and restart from the top of the list.
- Outcome switching: Stick to outcomes named in the protocol. Add new ones only as “exploratory” and label them clearly.
- Quality scores: Avoid composite scores that hide where bias sits. Use domain-level judgments and show them alongside results.
- Over-weighting small studies: Check influence with leave-one-out or by down-weighting very small trials in sensitivity runs (see the sketch after this list).
- Opaque reporting: If a step isn’t reproducible from your paper and supplement, fix that gap before submission.
Use the list as a pre-submission sweep.
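For the small-study point above, a leave-one-out run is easy to script: re-pool the estimate with each study removed and see whether the conclusion shifts. The sketch below uses simple inverse-variance (fixed-effect) pooling on invented numbers for brevity; a real sensitivity run would use the model named in your protocol.

```python
import math

# Leave-one-out sensitivity check (illustrative): re-pool with each study removed.
log_rr = [-0.35, -0.10, -0.42, 0.05]   # invented log risk ratios
se = [0.15, 0.20, 0.18, 0.25]          # invented standard errors

def pool(effects, errors):
    """Simple inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1 / s**2 for s in errors]
    est = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return est, math.sqrt(1 / sum(weights))

full_est, _ = pool(log_rr, se)
print(f"All studies: RR {math.exp(full_est):.2f}")

for i in range(len(log_rr)):
    rest_y = log_rr[:i] + log_rr[i + 1:]
    rest_s = se[:i] + se[i + 1:]
    est, _ = pool(rest_y, rest_s)
    print(f"Without study {i + 1}: RR {math.exp(est):.2f}")
```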
Lightweight Timeline And Roles
Timelines vary with topic and team size, but a lean plan helps. Budget two weeks for protocol drafting and registration. Block three to six weeks for searching, deduplication, and screening titles and abstracts. Full-text screening often takes another two to four weeks. Data extraction and risk-of-bias work can run in parallel for three to five weeks. Synthesis and write-up usually need four to six weeks. Assign clear owners: a lead for methods, a search lead, two independent screeners, two extractors, and a stats lead. Meet weekly for 30 minutes to unblock issues and keep momentum. Pause between steps to lock logs and backups properly.