The current study presents a systematic review of the literature on the accuracy of B-type natriuretic peptide for diagnosing HF in the emergency department. Our review differs from others that have addressed this issue. The review by Schwam[5] described the individual studies in detail but did not systematically assess their quality. A recent review by Wang et al[10] performed quality assessments but did not report all aspects of study quality. A review by Doust et al[11] performed quality assessments and pooled data across studies, but calculated a pooled diagnostic odds ratio, which is not directly applicable to clinical practice. As the use of BNP has become increasingly common, we believe there is a need for quality-assessed pooled data for clinical application. Our review is the first to present a complete quality assessment of the included articles and to pool data in a way that is relevant to practicing clinicians.
A BNP of 100 pg/ml is often cited as the best cutoff for the diagnosis of HF[43]. At this cutoff, we found a pooled positive LR of 3.4 and a negative LR of 0.14. At higher cutoffs of between 300 and 400 pg/ml, the positive LR rises to 7.6, with a negative LR of 0.17. Thus, our analysis suggests that among adult patients with suspected heart failure, a low BNP makes HF unlikely, and a very high BNP makes HF likely. BNP values between 100 and 300 pg/ml may not be helpful in diagnosing HF.
Table 4 demonstrates the utility of BNP in diagnosing HF in the context of varying pre-test probabilities. With a pre-test probability of 10% or 30%, a BNP of 300 pg/ml; while an elevated BNP in a patient with a low (30%) pre-test probability leads to a 77% chance of HF and might result in further diagnostic testing.
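The arithmetic behind such a table can be sketched as follows. This is a minimal illustration, not the code used to build Table 4: the function name `post_test_probability` is our own, and the only inputs taken from this review are the pooled LRs quoted above (0.14 at the 100 pg/ml cutoff, 7.6 at the 300 to 400 pg/ml cutoffs) and the 10% and 30% pre-test probabilities. The conversion itself is the standard one: probability to odds, multiply by the LR, then odds back to probability.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)   # probability -> odds
    post_odds = pre_odds * likelihood_ratio            # apply the LR
    return post_odds / (1.0 + post_odds)               # odds -> probability


# Pooled likelihood ratios reported in this review.
LR_NEG_AT_100 = 0.14  # negative LR at the 100 pg/ml cutoff
LR_POS_AT_300 = 7.6   # positive LR at cutoffs of 300 to 400 pg/ml

for pre in (0.10, 0.30):
    low = post_test_probability(pre, LR_NEG_AT_100)
    high = post_test_probability(pre, LR_POS_AT_300)
    print(f"pre-test {pre:.0%}: low BNP -> {low:.1%}, high BNP -> {high:.1%}")
```

With a 30% pre-test probability and the positive LR of 7.6, this calculation gives roughly 77%, matching the figure quoted above. The other values it prints are derived here and have not been checked against Table 4.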
Several limitations of the evaluated studies should be considered. First, there were potential methodological issues including possible selection bias in many of the studies and verification bias in a few. No studies reported inter-rater agreement for the reference standard, and a few studies failed to clarify important inclusion and exclusion criteria. Most important, however, may be the issue of spectrum. In studies of diagnostic tests, it is important that investigators enroll patients in whom there is diagnostic uncertainty [44], that is, patients in whom the test would be used in clinical practice. The studies of BNP recruited broadly, and likely included patients in whom there was diagnostic uncertainty in addition to patients in whom the diagnosis was clear. For example, some patients presenting with dyspnea certainly had obvious asthma exacerbations, and these patients were included in the studies of BNP although they are not patients in whom a BNP assay would be utilized in clinical practice. The broadness of the spectrum in this case may have resulted in a biased estimation of the accuracy of the diagnostic test[44,45]. This potential bias limits the applicability of the included studies and of our analysis to clinical practice. Some investigators explored the impact of this broad spectrum by performing subgroup analyses looking only at patients in whom there was diagnostic uncertainty. In the Breathing Not Properly study, BNP did perform well in the subset of patients with a history of pulmonary disease[30], with sensitivity and specificity of 93% and 77% at a cutoff of 100 pg/ml, similar to that of the larger population. This finding suggests that the accuracy of BNP as determined by the studies overall may in fact reflect the accuracy of the test in patients in whom there is a high degree of diagnostic uncertainty. We look forward to a confirmation of these findings in other populations.
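To relate the subgroup's sensitivity and specificity to the likelihood ratios used elsewhere in this discussion, the standard conversion can be sketched as below. The function name `likelihood_ratios` is our own, and the resulting LRs are derived here from the 93% and 77% figures; they are not values reported by the Breathing Not Properly investigators.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive LR, negative LR) from sensitivity and specificity."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must be strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)   # true-positive rate / false-positive rate
    lr_neg = (1.0 - sensitivity) / specificity   # false-negative rate / true-negative rate
    return lr_pos, lr_neg


# Subgroup with a history of pulmonary disease, cutoff of 100 pg/ml.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.93, specificity=0.77)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```

Under these inputs the positive LR is about 4.0 and the negative LR about 0.09, which is broadly in line with the pooled estimates at the same cutoff.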
The best evidence in support of the widespread use of a new diagnostic test is a randomized trial demonstrating that its use improves quality of care[46,47]. In the case of BNP, such a trial has already been published. In the B-type Natriuretic Peptide for Acute Shortness of Breath Evaluation (BASEL) Study[48], Mueller and colleagues performed a single-blind randomized trial in Switzerland in which patients presenting to the emergency department with acute dyspnea were assigned to a diagnostic strategy including a single bedside BNP measurement or a "standard" diagnostic strategy which excluded BNP. As in the studies included in our analysis, patients with obvious trauma were excluded. A total of 452 patients were randomized; 58% were men and the mean age was 71 years. Clinicians were given guidelines for the interpretation of BNP values. A BNP level of 500 pg/ml made HF the most likely diagnosis. No strong conclusions about HF were recommended for patients with values between 100 and 500 pg/ml. The study found lower rates of hospital admission (75% vs. 85%), significant reductions in time to discharge (8 vs. 11 days) and a $1850 reduction in cost in the group randomized to BNP testing. In-hospital and 30-day mortality were not significantly different in the two groups.
On initial inspection these strongly positive results seem inconsistent with our findings that BNP is only a moderately accurate diagnostic test. In fact, some have criticized the BASEL study [49]. Patients enrolled in the study received somewhat scripted care, which may have included testing that is not routinely performed [50]. The specific nature of the care of patients in each group and the diagnostic tests they received was not described in the publication [48]. The severity of illness in the study group is also striking. Although presenting symptoms seemed moderate, 80% of enrolled patients were hospitalized, of whom 20% were admitted to the intensive care unit, and the median length of stay in the control group was 11 days. While these characteristics may be related to the standards of practice in Switzerland or to the particulars of the study, it is unclear whether the population evaluated is representative of the broad group of patients presenting with dyspnea to emergency departments.
It is difficult, however, to disregard the dramatic finding of the BASEL study, and some of the utility of BNP may have been related to the way in which it was interpreted in the study. The study utilized two separate diagnostic cutoffs for BNP: a level of 100 pg/ml to rule out HF and a level of 500 pg/ml to rule in HF. Our determinations of the performance characteristics of BNP support the use of two diagnostic cutoffs. The pooled likelihood ratio (LR) was 0.14 for a BNP of 10 can similarly rule in disease independent of pre-test probability [44]. LRs of between 0.1 and 0.2 or between 5 and 10 have moderate ability to rule out or rule in disease [44]. The LR of 0.14 indicates that a BNP of [51].
Our study has some methodological limitations. We pooled data from different studies with obvious heterogeneity, utilizing the random effects model to determine pooled LRs. We opted to perform pooling despite the heterogeneity because we believe that there is a clinical need for a compilation of findings regarding BNP, so that clinicians may understand its utility in clinical practice. We utilized the random effects model to minimize the bias associated with heterogeneous results, but the validity of the pooled estimates may still be limited by the presence of heterogeneity. In addition, we were limited by the quality of the available data, which was not ideal.
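For readers unfamiliar with random effects pooling, the sketch below shows one common approach, the DerSimonian-Laird estimator applied to log-transformed positive LRs. It is an illustration under stated assumptions: we do not claim it reproduces the exact procedure or software used for our pooled estimates, the function names are our own, and the 2x2 counts in the usage example are hypothetical and do not come from any included study.

```python
import math


def log_lr_pos(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Log positive LR and its approximate variance from 2x2 counts."""
    sens = tp / (tp + fn)
    false_pos_rate = fp / (fp + tn)
    log_lr = math.log(sens / false_pos_rate)
    variance = 1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn)
    return log_lr, variance


def pool_random_effects(estimates, variances):
    """DerSimonian-Laird pooling; returns (pooled estimate, standard error, tau^2)."""
    k = len(estimates)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0   # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# HYPOTHETICAL 2x2 counts (tp, fn, fp, tn), for illustration only.
hypothetical_studies = [(90, 10, 40, 60), (45, 5, 10, 40), (180, 20, 50, 150)]
logs, variances = zip(*(log_lr_pos(*s) for s in hypothetical_studies))
pooled, se, tau2 = pool_random_effects(logs, variances)
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled LR+ = {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f}), tau^2 = {tau2:.3f}")
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effect weights. As heterogeneity grows, the weights become more nearly equal across studies and the confidence interval widens, which is why the pooled estimates remain limited by heterogeneity even under this model.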