At this point we have merely established that in a single reported study there seems to be no difference in outcomes between those discharged early and those who are treated in a conventional manner. Would we regard this as conclusive proof? Would similar studies also report no significant difference? This is where the strengths of the systematic review come into their own. If a number of studies show the same effect, despite variations in setting or locality, we can be increasingly certain that the finding is real. If, however, the above study is seen to be a one-off occurrence that contradicts previous studies, we can focus on differences between the studies in terms of possible design flaws (bias) or factors that may confuse the picture because they were not taken into account at the beginning of the study (confounders). We would also look at any minor differences in the study population, the exact nature of the interventions used and how effects were measured. Any of the above factors may help us to explain the differences.
In practice we may find a number of studies with a wide variety of results, some in favour of and others against the effectiveness of an intervention. At a very basic level we could simply count each study as a "vote" and see where the balance of opinion lies. However, this would not take into account the relative size of the different studies. Alternatively, we could rank the studies in terms of the size of the study population and see which side of the line could command the largest support. However, this would not take into account the quality of the studies: some may be well designed and others poorly designed. In practice, therefore, the balance sheet is devised by taking into account both the size and quality of the component studies and then weighting them accordingly. In this way the cumulative or pooled result reflects both the statistical power and the quality of the studies that contribute to it.
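The contrast between vote counting and weighted pooling can be illustrated with a minimal sketch. The four studies, their relative risks, sizes and quality scores are all invented for illustration, and weighting by "size times quality" is a simplifying assumption rather than a standard method; formal meta-analyses usually weight each study by the inverse of the variance of its effect estimate.

```python
import math

# Hypothetical studies (not from any real review): "rr" is the reported
# relative risk (below 1.0 favours the intervention), "n" the number of
# participants, and "quality" an assumed methodology score scaled to 0-1.
studies = [
    {"rr": 1.2, "n": 40, "quality": 0.5},
    {"rr": 1.2, "n": 40, "quality": 0.5},
    {"rr": 1.2, "n": 40, "quality": 0.5},
    {"rr": 0.7, "n": 1000, "quality": 0.9},
]


def vote_count(studies):
    """Count studies for and against, ignoring size and quality."""
    favour = sum(1 for s in studies if s["rr"] < 1.0)
    return favour, len(studies) - favour


def weighted_pooled_rr(studies):
    """Pool relative risks on the log scale, weighting by size times quality."""
    weights = [s["n"] * s["quality"] for s in studies]
    weighted_sum = sum(w * math.log(s["rr"]) for w, s in zip(weights, studies))
    return math.exp(weighted_sum / sum(weights))


favour, against = vote_count(studies)
print(f"Votes: {favour} for, {against} against")
print(f"Weighted pooled relative risk: {weighted_pooled_rr(studies):.2f}")
```

In this invented data set the simple vote goes against the intervention, while the weighted result is dominated by the one large, well-conducted study and falls below 1.0.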
The following approach is suggested for secondary studies that synthesise or integrate information from multiple primary studies. These may be meta-analyses, systematic reviews, guidelines or economic analyses. Again, the checklist has the same three components identified from the User Guides for primary studies.
A. Are the results of the review valid?
1. Does the review set out to answer a precise question about patient care? (i.e. it is something more scientific than an attempt to gather everything written about malaria).
2. Have they sought out studies thoroughly by searching:
(for meta-analysis, it may be important to track down the raw study data for re-analysis, rather than the final report)
3. Have the authors included explicit inclusion and exclusion criteria for studies, taking account of the patients in the studies, the interventions used, the outcomes recorded and the methodology?
B. What are the results?
4. For a meta-analysis, how are the results presented?
In practice the results are often displayed graphically as horizontal lines representing the 95% confidence intervals of the effect of each trial (strictly, the 95% CIs of the relative risk of the intervention group compared to the control group). Sometimes there is a "blob" in the middle representing the single best estimate of the effect of the intervention found by that study. The results of the meta-analysis are represented by a diamond:
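[Figure: plot of six results, labelled i to vi below, against a relative risk axis. The vertical line at 1.0 is the "line of no effect"; the two sides of the line are labelled "Intervention group does better than control" and "Intervention group does worse than control".]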
i. Probably a small study, with a wide confidence interval, crossing unity (i.e. unable to say if the intervention works).
ii. Probably a small study, wide confidence interval, but does not cross unity: suggests intervention works but weak evidence.
iii. Larger study, narrow confidence interval: but crosses unity, so no evidence that intervention works.
iv. Large study, narrow confidence intervals: entirely to left of unity: suggests intervention works.
v. Small study, wide confidence intervals, suggests intervention is detrimental!
vi. Meta-analysis of all identified studies: suggests intervention works.
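The readings above can be reproduced numerically. The sketch below computes a relative risk and its 95% confidence interval from event counts using the usual log-scale approximation, then reports whether the interval crosses unity. The counts are invented, the function names are illustrative, and it assumes the outcome counted is an adverse one, so that a relative risk below 1.0 favours the intervention.

```python
import math


def relative_risk_ci(events_int, total_int, events_ctl, total_ctl, z=1.96):
    """Relative risk of intervention vs control with an approximate 95% CI."""
    rr = (events_int / total_int) / (events_ctl / total_ctl)
    # Standard error of log(RR) from the four counts.
    se_log = math.sqrt(
        1 / events_int - 1 / total_int + 1 / events_ctl - 1 / total_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def interpret(lower, upper):
    """Read the interval the way the list above reads the plotted lines."""
    if upper < 1.0:
        return "suggests intervention works"
    if lower > 1.0:
        return "suggests intervention is detrimental"
    return "crosses unity: unable to say if the intervention works"


# Same underlying relative risk (0.5), but very different sample sizes.
for label, counts in [("small", (3, 20, 6, 20)), ("large", (15, 100, 30, 100))]:
    rr, lower, upper = relative_risk_ci(*counts)
    print(f"{label}: RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f}) - "
          f"{interpret(lower, upper)}")
```

Both hypothetical studies estimate the same relative risk, but only the larger one has an interval narrow enough to exclude 1.0.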
5. Have the authors considered "homogeneity": the idea that the studies are sufficiently similar in their design, interventions and subjects to merit combination? (If you were looking at the effect of fruit consumption on cancer, you could combine studies about apple and orange consumption, but if you were only interested in the effect of citrus fruit, you wouldn't want the apple studies.) In meta-analysis, this is done either by eyeballing graphs like the one above, by applying chi-square tests (Thompson, 1995), or by plotting effect estimates against sample size and looking for a symmetrical "funnel plot", a shape like an inverted filter funnel.
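One common form of the chi-square test for homogeneity is Cochran's Q, sketched below with the related I-squared statistic. This is an illustration under stated assumptions, not necessarily the exact procedure any particular review uses: effects are taken to be log relative risks with known standard errors, and all the numbers are invented.

```python
def cochran_q(log_effects, std_errors):
    """Return Cochran's Q, its degrees of freedom, and I-squared (percent)."""
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    # Q sums the weighted squared distance of each study from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    # I-squared: share of the variation beyond what chance alone would explain.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical example: two precise studies pointing in opposite directions.
q, df, i2 = cochran_q([-0.8, 0.4], [0.1, 0.1])
print(f"Q = {q:.1f} on {df} degree(s) of freedom, I-squared = {i2:.0f}%")
```

Q is compared against a chi-square distribution with one fewer degrees of freedom than the number of studies; with two studies the 5% critical value is 3.84, so a Q far above that suggests the studies are too dissimilar to pool without further explanation.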
C. Will the results help locally?
6. In some ways this is easier than for a piece of original research. The various studies may have used patients of different ages or social classes, but if the treatment effects are consistent across the studies, then generalisation to other groups or populations is more justified. However, be wary of "sub-group analyses" where the authors attempt to draw new conclusions by comparing the outcomes for patients in one study with the patients in another study, rather than trying to draw together the patients in the control and intervention groups in all studies. Such conclusions have often later been shown to be artefacts and not justified. Also, be wary of "data-dredging" exercises, testing multiple hypotheses against the data, especially if the hypotheses were constructed after the study had begun or after the data had been collected.
7. Were all clinically important outcomes considered?
8. Are the benefits worth the harms and costs?
Worked example - critical appraisal of the systematic review article:
Effectiveness of home care programmes for patients with incurable cancer on their quality of life and time spent in hospital: systematic review. F W J M Smeenk, J C M van Haastregt, L P de Witte, H F J M Crebolder. BMJ 1998; 316: 1939-1944, and at:
http://www.bmj.com/cgi/content/full/316/7149/1939
A. Are the results of the review valid?
Screening Questions
1. Did the review address a clearly focused issue?
Yes: clearly stated: to examine if a comprehensive home care programme (intervention) is more effective than standard hospital based care for patients with terminal cancer (population), in terms of maintaining quality of life and reducing time as an in-patient (outcomes). Although a "comprehensive home care programme" is defined as something more than an intervention aimed at a single aspect of care at home, there still seems a degree of subjectivity about "comprehensive" which the authors solved by establishing an internal consensus.
2 Did the authors look for the appropriate sort of papers?
Partly: see question 3 below.
Is it worth continuing?
There is a risk that the authors’ literature search, although well constructed, may not be wide ranging enough. However, unless you can find a study with a more extensive exploration of material outside the main databases, it seems worth persevering.
Detailed Questions
3 Do you think the important, relevant studies were included?
Partly: There was an explicit search of databases, with defects addressed in the first paragraph of "Shortcomings" (p. 1942), but little evidence of a grey literature search. The authors did not apply language restrictions (i.e. they did not attempt to discriminate against non-English papers in their trawl of the databases). However, these databases themselves may have a bias in favour of English language publications, necessitating a wider trawl. The search strategy seems to have been appropriate for retrieving pertinent papers, but insufficiently thorough and wide ranging.
4 Did the review’s authors do enough to assess the quality of the included studies?
Yes: Studies were scored for methodology, and the scores were later used to weight them. Explicit inclusion/exclusion criteria cut out most identified studies (only 9 of 348 got through). Two investigators independently ranked studies for quality, and the articles were anonymised by removing authors, titles and journal name. This would reduce bias towards certain well respected authors or publications, and use of a third investigator to adjudicate over differences would deal with inter-observer differences. In such a small research group, some external advice on differences would be ideal. However, the resources and timescale of the study may have precluded this refinement. The criteria for inclusion of studies (study population and sampling, drop outs, description of intervention, outcome measures, and data handling and presentation) are comprehensive and relevant. The overall score for methodology was only "moderate" (48-68 marks out of a possible 100, mean 59).
5 If the results of the review have been combined, was it reasonable to do so?
There is no formal meta-analysis. Such an analysis is not practical, as the studies have different backgrounds (three hospital based, two hospice-based, one in a rehabilitation centre). The studies also use different tools to measure outcomes. Four report improvements in all the scales they used; two report deterioration in the scales used; one reports no significant change; and one shows improvements in two scales and deterioration in one scale. For some, but not all, studies more information is provided on two other outcomes: re-admission rates and survival. Inclusion of these data is erratic, and the only significant differences are in two studies which show reduced re-admission rates in the intervention group. The information on the interventions used by each study (table 3) is also incomplete. There was variable use of interventions like home visiting, technical care and team meetings. It is not so much that some studies specifically did not employ these methods, but that for many the table shows "not stated" (i.e. not clear from the description of the methodology) or "some" (i.e. not applied consistently to all subjects, and therefore introducing an extra variable into the study).
B. What are the results?
6 What is the overall result of the review?
No clear benefit was demonstrated for comprehensive home care programmes over current "standard provision." Re-admission rates were lower in the intervention groups, and significantly so for two studies, and five showed some evidence of physical improvement. However, these benefits are not consistent across the eight studies. Rather as a footnote, the authors seem to favour home visits and multi-disciplinary team meetings as improving patient satisfaction. These interventions were used in some (but possibly not all) of the studies, and to back this up the authors cite three more studies (14, 15, 22), all of which describe themselves as randomised trials!
7 How precise are the results?
The results are summarised, rather than being presented mathematically.
C. Will the results help locally?
8 Can the results be applied to the local population?
The "home care" programmes this study investigates may seem desirable or popular. They fit in with the move from hospital to community care; they may appear socially desirable ("getting people home more quickly!"); they may appear attractive to hard pressed hospital managers, wary of "bed blocking cases" (two studies suggested significant reduction in re-admission rates); the basic idea may appear "sensible" ("of course people will be more comfortable in their own homes"). But moving care into the community is not without resource implications, and this review does not come up with much rigorous evidence that such home care programmes deliver significant improvements. This work thus helps inform the debate about funding them, even if its main conclusion is to sound a note of caution.
Generalisability seems weak: there were eight US studies and one UK study, so one would have to consider the relative strengths of primary care in these countries; in the US, primary care is not as comprehensive or well developed as in most of Europe.
9 Were all important outcomes considered?
One problem is the use of many evaluation scales (over 20 in various combinations across the eight studies). It is not clear which end points each scale measures: one suspects that there may be a degree of overlap in the end points measured by, say, the P&S (Pain and symptoms), MMPQ (McGill-Melzack pain questionnaire), MHI (Mental Health Index) and PD (Psychological distress) scales. In some ways, the studies have set themselves too many outcomes (physical and psychological well being, ability to self care, patient satisfaction, re-admission rates and survival). It would have been more useful to separate the softer issues (patient satisfaction) from the harder ones (re-admission and survival).
10 Are the benefits worth the harms and costs?
There is no attempt to cost out care in hospital and care at home for cancer patients. The review is inconclusive as to whether care at home provides benefits. The authors do not seem to have gleaned financial information from the studies (it may not be there, or their criteria may not have included asking such questions). This would be problematic for the American studies, where free public access to health care is far from universal. One would need to have information on the patient mix, as to which were eligible for free care (under Medicare and Medicaid schemes), and which were covered by private or occupational insurance (and how comprehensive the various schemes were for terminal care).