
Interrater reliability of hemodynamic profiling of patients with heart failure in the ED

Brief Report


Ahmed Chaudhry DO, Adam J. Singer MD*, Jasmine Chohan BA, Valerie Russo BA, Christopher Lee MD

Department of Emergency Medicine, Stony Brook University Medical Center, Stony Brook, NY 11794-8350

Received 9 March 2007; revised 17 April 2007; accepted 21 April 2007


Objective: Hemodynamic profiling (HP) of patients with heart failure (HF) based on clinical assessment of central congestion and peripheral perfusion has been widely used by cardiologists to help guide therapy and determine prognosis but has never been tested or validated in the emergency department (ED). We hypothesized that the interrater reliability of HP in the ED would be good (κ of at least 0.6). Methods: Study design. This was a prospective, observational study. Setting. It was conducted in an academic suburban ED with an emergency medicine residency. Subjects. A convenience sample of patients presenting to the ED with suspected acute decompensated HF was enrolled. Measures. Demographic and clinical information was collected using standardized data collection forms. Two emergency medicine physicians (masked to each other) evaluated all patients and classified them into 1 of 4 hemodynamic categories based on the presence or absence of central congestion (wet or dry) and peripheral hypoperfusion (cold or warm): warm and dry, warm and wet, cold and dry, and cold and wet. B-type natriuretic peptide levels, objective tests of cardiac function, and final cardiologist diagnoses were obtained. Data analysis. Interrater reliabilities for the overall hemodynamic profile and the individual elements of congestion and perfusion were calculated using κ coefficients.

Results: Sixty-eight patients were enrolled. Their mean age was 72 ± 14 years, 53% were female, and 78% were white. Of the patients, 79% had a final diagnosis of HF. Most patients (>60%) were considered wet and warm. Interrater agreement for HP was 64%, κ = 0.28 (95% confidence interval, 0.01-0.51). Interrater agreement was poor to fair for all elements of congestion and perfusion except peripheral edema (κ = 0.67) and a narrow pulse pressure (κ = 0.66).

Conclusions: Hemodynamic profiling of patients with HF by emergency physicians in the ED is not highly reliable. One in 5 patients thought to have HF in the ED did not have a final diagnosis of HF.

© 2008


Presented in part at the Annual Meeting of the Society for Academic Emergency Medicine, May 2006, San Francisco, Calif.

* Corresponding author.

E-mail address: [email protected] (A.J. Singer).

0735-6757/$ – see front matter © 2008 doi:10.1016/j.ajem.2007.04.029

Heart failure (HF) is a common and costly problem in the United States, with 550 000 new cases annually [1,2] at a total cost of 10 000 to 38 000 million dollars [3], accounting for 1% to 2% of all health care expenditures [4]. Seventy percent of the costs associated with HF are incurred from hospitalization [4], and HF is the single leading cause of hospitalization in adults older than 65 years [5,6]. Patients with HF have traditionally been risk stratified by cardiologists using tools such as the New York Heart Association classification scheme and hemodynamic profiling (HP) [7,8].

Hemodynamic profiling allows physicians to stratify patients with HF into 1 of 4 profiles or categories based on physical examination findings of central congestion and peripheral perfusion. The concept of HP was first proposed for patients with acute myocardial infarction by Forrester et al [9] in 1976. Patients were classified clinically into 1 of 4 hemodynamic subsets (profile 1, no congestion or hypoperfusion; profile 2, congestion without hypoperfusion; profile 3, hypoperfusion without congestion; and profile 4, both congestion and hypoperfusion) based on clinical findings as well as pulmonary artery catheterization results. Profiles with hypoperfusion were noted to have a 10-fold increase in mortality vs the other subsets. Moreover, detection of hemodynamic abnormalities was possible with an accuracy of more than 80% using physical examination and ancillary tests when compared with Swan-Ganz hemodynamic measurements [9]. The concept of HP was later extended to HF by Stevenson and Perloff [10], who also demonstrated that physical signs have limited accuracy in estimating hemodynamics in chronic HF. A more extensive review of the accuracy of signs and symptoms in diagnosing HF further demonstrates the overall poor accuracy of the physical examination [11]. This review also failed to identify any study that evaluated the interrater reliability of the physical examination findings. Despite the lack of data on the reliability of physical findings, practice guidelines emphasize their importance in the evaluation of patients with HF [12].

Hemodynamic profiling of patients with HF has been shown to have both prognostic and therapeutic implications [1,13]. In one study, patients with HF who were classified as cold and wet were shown to have twice the mortality and need for cardiac transplant at 1 year compared with patients in the warm and wet category. In this study, most patients (67%) were in the warm and wet category, whereas only 4% appeared in the cold and dry profile [14]. In another study of 452 patients admitted with HF, 50% were profiled as wet and warm, 27% as dry and warm, 20% as cold and wet, and only 3% as dry and cold. The patients with a wet and cold profile had a lower survival than those in the warm and wet profile. Furthermore, both of the wet profiles had a significantly higher mortality in comparison with the warm and dry profile [13]. Hemodynamic profiling may also be used to help determine therapy. For example, patients with evidence of adequate perfusion and pulmonary congestion (warm and wet) are treated with diuretics and vasodilators, whereas patients with decreased perfusion require inotropic agents [14].

Hemodynamic profiling in the aforementioned studies was conducted by individual cardiologists, and interobserver reliability and justification for specific hemodynamic profile classification were not provided or verified by the authors [13]. Furthermore, HP was often limited to patients with chronic HF and in outpatient settings. Although the classification system was developed to help cardiologists guide therapy, it is unclear whether it serves this purpose, especially in the setting of the ED. The objective of the current study was to determine the interrater reliability of HP in the ED and to assess whether it was thought to be useful to emergency medicine physicians. We hypothesized that ED physicians can classify patients presenting with signs and symptoms of HF into 1 of 4 hemodynamic profiles with a high degree of reliability (ie, an interrater reliability of at least 0.60).


Study design

A prospective observational study was performed to determine the interrater reliability or agreement on HP of ED patients with HF by emergency physicians. The study was approved by the institutional review board, and all patients gave written informed consent.


The study was conducted at a suburban, university-affiliated ED with an emergency medicine residency training program and an annual census of 75 000.

Study population

Adult patients presenting with a chief complaint of shortness of breath, cough, fatigue, chest pain/pressure, and/or edema with suspected HF were eligible for enrollment. Patients were enrolled only when 1 of the investigators was present in the ED and therefore represent a convenience sample.


A standardized data collection form was used. The form included basic demographic information about the patient in addition to questions concerning prior diagnosis of HF, presenting complaint, initial vital signs, physical findings of pulmonary congestion and peripheral perfusion, hemodynamic profile designation (Fig. 1), and patient disposition. All study physicians were briefly in-serviced on the HF hemodynamic classification scheme before study initiation. The presence of the following items was used to help determine whether central congestion was present: orthopnea, crepitations, peripheral edema, jugular venous distention (JVD), a third heart sound (S3), loud pulmonic valve (P2) sound, ascites, and abdominojugular reflux. The presence of the following items was used to help determine the presence or absence of peripheral hypoperfusion: a narrow pulse pressure (<25 mm Hg), pulsus alternans, cool extremities, and hypotension (systolic blood pressure, ≤90 mm Hg).

Fig. 1 Hemodynamic profiles.
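The 2-by-2 scheme described above can be sketched in a few lines of Python. This is only an illustration, not part of the study protocol: the function name `hemodynamic_profile` and its two boolean arguments are hypothetical, and the sketch assumes the examining physician has already made an overall judgment about congestion and hypoperfusion. The study does not specify how many individual findings were required for either judgment.

```python
def hemodynamic_profile(congested: bool, hypoperfused: bool) -> str:
    """Map two bedside judgments onto one of the 4 hemodynamic profiles.

    congested    -- overall clinical impression of central congestion (wet vs dry)
    hypoperfused -- overall clinical impression of peripheral hypoperfusion (cold vs warm)
    Both inputs are the physician's global judgments (hypothetical inputs for
    illustration); no scoring rule for individual signs is implied.
    """
    perfusion = "cold" if hypoperfused else "warm"
    congestion = "wet" if congested else "dry"
    return f"{perfusion} and {congestion}"


# Enumerate all 4 combinations of the two judgments.
ALL_PROFILES = [
    hemodynamic_profile(congested, hypoperfused)
    for hypoperfused in (False, True)
    for congested in (False, True)
]
```

Because the profile is a combination of two separate judgments, a disagreement between observers on either congestion or perfusion is enough to place the same patient in different profiles.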

After the initial physician assessment, a second emergency medicine physician was instructed to obtain a targeted history and physical examination on the same patient and to independently fill out the standard HF data form. Henceforth, the initial examining physician will be referred to as physician A, and the second examining physician will be referred to as physician B. Only senior emergency medicine residents or attending emergency physicians participated in the study. Both physicians were instructed to fill out the forms before learning the results of the chest x-ray and/or B-type natriuretic peptide (BNP) assay. In addition, physicians were instructed to note whether HP influenced their choice of treatment in the ED. The final diagnosis of HF was based on the discharge diagnosis and a review by a board-certified cardiologist and a board-certified emergency physician. This determination was made based on BNP levels, inpatient echocardiogram results, and cardiac catheterization results, when available.


The primary outcome was the interrater reliability or agreement with regard to the hemodynamic profile assigned to the patient. Secondary outcomes were the interrater agreement on the individual elements making up the hemodynamic profile and the agreement between the ED diagnosis of HF and the final discharge diagnosis.

Data analysis

Continuous variables are presented as means and 95% confidence intervals (CIs), and categorical variables are presented as percentages. The gold standard for a diagnosis of HF, used to determine the validity of HP, was the final inpatient discharge diagnosis by a board-certified cardiologist. The interrater reliability of independent physician assessments of physical examination findings, historical information, and ultimate HF profile designation was measured using κ coefficients with 95% CIs. We also determined the percentage of cases where there was agreement between the observers in the hemodynamic profile classification. A κ value of 0.6 to 0.8 is considered good agreement, whereas a κ of 0.8 or greater is considered excellent. A κ of 0.4 to 0.6 is considered fair, whereas a κ value of less than 0.4 is considered poor. Arbitrarily, the descriptive statistics of chief complaints and symptoms are based on the findings of physician A. Data analysis was performed using SPSS 14.0 for Windows (SPSS Inc, Chicago, Ill) statistical software.
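For readers unfamiliar with the statistic, the following Python sketch shows how percentage agreement and a κ coefficient can be computed for 2 raters and then labeled with the bands given above. It assumes the unweighted Cohen κ, which is the usual 2-rater form; the analysis in this study was run in SPSS, not with this code. The function names and the 10-patient example ratings are hypothetical and are not study data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of patients on whom the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen kappa: agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical category for every patient.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


def interpret(kappa):
    """Label a kappa value using the bands stated in the Data analysis section."""
    if kappa >= 0.8:
        return "excellent"
    if kappa >= 0.6:
        return "good"
    if kappa >= 0.4:
        return "fair"
    return "poor"


# Hypothetical ratings for 10 patients (illustration only, not study data).
rater_a = ["wet"] * 6 + ["dry"] * 4
rater_b = ["wet"] * 5 + ["dry"] * 4 + ["wet"]

agreement = percent_agreement(rater_a, rater_b)  # 8 of 10 patients
kappa = cohen_kappa(rater_a, rater_b)            # (0.80 - 0.52) / (1 - 0.52)
label = interpret(kappa)
```

In this made-up example the raters agree on 80% of patients, yet κ is only about 0.58 because more than half of that agreement would be expected by chance. The same correction explains how a raw agreement in the mid-60% range can correspond to a κ in the poor range. The sketch also shows why κ cannot be calculated for a sign that neither rater ever records.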


During the study period, we enrolled a convenience sample of 68 patients presenting with an ED diagnosis of acute decompensated HF. The mean age for the enrolled patients was 72 ± 14 years, 53% were female, 78% were white, and 9% were black. Of all patients, 60% had a prior diagnosis of HF. Most patients presented with a chief complaint of shortness of breath (87.3%); other presenting symptoms included peripheral edema (30.2%), cough (22.2%), fatigue (20.6%), and chest discomfort (20%).

Using the final discharge diagnosis of HF as the gold standard, ED physicians were correct in making the diagnosis of HF in 79% of cases. New-onset atrial fibrillation, coronary artery disease without HF, and chronic obstructive pulmonary disease were among the most common diagnoses in patients discharged without a diagnosis of HF. Overall, more than 60% of patients were profiled by the examining physicians as warm and wet, with only 2.9% profiled as cold and dry (Table 1). Observers agreed on the hemodynamic profile in 44 (65%) patients. Classification of patients into a specific hemodynamic profile had an interrater reliability of 0.35 (95% CI, 0.16-0.53). When only patients with a final diagnosis of HF were included in the analysis, overall agreement for the hemodynamic profile was 64% (κ = 0.28; 95% CI, 0.01-0.51). Interrater agreement was poor for all elements of central congestion and peripheral perfusion except for the presence of peripheral edema (κ = 0.67) and a narrow pulse pressure (κ = 0.66). The interrater reliabilities of the various parameters that make up the HP are presented in Table 2.

Table 1 Hemodynamic profiling of study patients

Profile (n [%])      Physician A      Physician B
Warm and wet         45 (66.2)
Warm and dry         12 (17.6)
Cold and wet         9 (13.2)
Cold and dry         2 (2.9)

Table 2 Physical signs of central congestion and peripheral hypoperfusion

Sign (n [%])                Physician A    Physician B    Interrater agreement (κ [95% CI])
Central congestion
  Orthopnea                 36 (52.9)      37 (54.4)      0.20 (-0.03 to 0.44)
  JVD                       22 (32.4)      18 (26.5)      0.08 (-0.16 to 0.32)
  S3                        6 (8.8)        6 (8.8)        -0.10 (-0.15 to 0.04)
  P2                        0 (0)          0 (0)
  Edema                     39 (57.4)      36 (52.9)      0.67 (0.50 to 0.85)
  Ascites                   4 (5.9)        4 (5.9)        0.47 (0.03 to 0.91)
  Crepitations              49 (72.1)      44 (64.7)      0.43 (0.20 to 0.65)
  Abdominojugular reflux    0 (0)          1 (1.5)        NA
Peripheral hypoperfusion
  Hypotension               4 (5.9)        7 (10.3)       0.51 (0.14 to 0.88)
  Cool extremities          13 (19.1)      9 (13.2)       0.46 (0.18 to 0.74)
  Pulsus alternans          0 (0)          0 (0)          NA
  Narrow pulse pressure     1 (1.5)        2 (2.9)        0.66 (0.04 to 1.00)

NA, not available.

Classification of patients as wet or centrally congested was generally based on the presence of orthopnea, edema, or crepitations. Of these 3, only the presence of edema had good interrater reliability, as evidenced by a κ of 0.67 (Table 2). In detecting signs of inadequate peripheral perfusion, the most frequently reported finding was extremities that were cool to the touch. The interrater agreements for cool extremities and hypotension were only 0.46 and 0.51, respectively.


To be useful, a diagnostic tool (such as the HP) must be simple, logical, reliable, and valid [15]. A reliable tool is one that results in similar measurements when repeated by the same or different observers. Reliability is evaluated by repeating measurements of the same parameter while assuming that the clinical condition remains constant between measurements. Validity refers to how close a measure is to the true or criterion measure. Prior studies of HP that determined its association with central pulmonary artery pressures and clinical outcomes, such as mortality and need for cardiac transplantation, suggest that HP is valid. However, a prerequisite to using any clinical tool or measure is that it be reliable or that the same results be obtained when 2 different observers or physicians determine it. We are unaware of prior studies that evaluated the interrater or interobserver reliability of HP.

The results of our study demonstrate that emergency physicians accurately diagnose decompensated HF in 4 of 5 ED patients presenting with signs and symptoms consistent with suspected acute decompensated HF. These findings are similar to those of other studies that found similar diagnostic rates among ED physicians [3]. The current study, like others before, also indicates that most ED patients with HF are congested and well perfused (warm and wet) with very few patients hypoperfused, de-emphasizing the role of inotropes in the ED management of HF. Our study also found that the interrater reliability for HP among emergency physicians was poor to fair at best, with observers agreeing on the hemodynamic profile 64% of the time. This suggests that HP in the ED is fairly unreliable and may not be of great use in most patients. An informal survey conducted by the authors at a national meeting of more than 20 ED HF investigators found that very few ED physicians were familiar with and/or routinely used HP in their clinical practice (unpublished data). The results of the current study may therefore represent a self-fulfilling prophecy if one considers that few ED physicians are familiar with or rely on HP. Whether HP is more reliable among cardiologists or other specialists remains to be determined.

The presence of a highly reliable, valid, noninvasive method for classifying the hemodynamic profiles of patients with HF would be highly useful to emergency practitioners for a number of reasons. First, it would allow tailoring of specific therapies based on individual patient hemodynamic profiles. Second, it would allow early and simple bedside risk stratification of patients with HF that would be helpful in determining patient disposition and prognosis. For example, studies have consistently found that patients profiled as cold and wet have a higher mortality and need for cardiac transplant at 1 year when compared with patients profiled as warm and wet [13,14]. A third reason why bedside HP would be useful is that it would allow researchers and investigators to stratify study patients into treatment groups based on their hemodynamic profile and simplify extrapolation of study results from one study to another.

Prior studies have correlated clinical HP with the results obtained on pulmonary artery catheterization and echocardiography. However, these studies were performed by highly trained cardiologic specialists in the intensive care unit and outpatient settings and lack any testing of interrater agreement. Furthermore, these studies do not clarify how patients were classified into 1 of the 4 profiles. Although physical examination findings and ancillary tests have been considered to be accurate for the detection of pulmonary congestion and decreased perfusion when performed by cardiologists [13], they may be nonspecific and not always lead to an appropriate diagnosis [16].

Although HP was found to have low interrater reliability or agreement in the current study, this does not mean that HP does not have a role in clinical management. Prior studies clearly have demonstrated that HP performed by experienced cardiologists predicted prognosis. Thus, it is quite possible that interobserver agreement in patients at the extremes of the categories, in which pulmonary congestion and/or hypoperfusion are obvious, is actually quite high. In contrast, in patients with less obvious signs and symptoms of HF, the agreement may be much poorer. Thus, for the patients for whom accurate classification of their hemodynamic profile most matters, the HP may indeed be useful and reliable.

Similar to prior studies, most patients with suspected HF enrolled in this study presented with a complaint of dyspnea [16,17]. Dyspnea, the chief complaint in patients with congestive HF, is reported to be present in 2.7% of ED visits and 15% to 20% of all hospital admissions [16]. Orthopnea, on the other hand, is the most sensitive and specific symptom of elevated filling pressures [1,9]. In chronic HF, a proportional pulse pressure of less than 25% has been shown to correlate well with a cardiac index of less than 2.2 L/(min·m²) [10]. A positive hepatojugular reflux also correlates with elevated pulmonary capillary wedge pressures in chronic HF, whereas an abnormal arterial blood pressure response to a Valsalva maneuver predicts elevated pulmonary capillary wedge pressures with a sensitivity of 92% to 100% and a specificity of 83% to 91% [13]. Physical evidence of hypoperfusion includes a narrow pulse pressure, cool extremities, and occasionally an altered mental status, with supporting evidence sometimes provided by a decreasing serum sodium level and worsening renal function [1]. In the current study, we excluded patients who could not consent for themselves; therefore, none of our patients had an altered mental status.
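As a worked illustration of the proportional pulse pressure mentioned above, the sketch below uses the customary definition, pulse pressure divided by systolic pressure. The function name and the example blood pressure readings are hypothetical and are not study data.

```python
def proportional_pulse_pressure(systolic_mm_hg: float, diastolic_mm_hg: float) -> float:
    """Return (systolic - diastolic) / systolic as a fraction between 0 and 1."""
    if systolic_mm_hg <= 0 or diastolic_mm_hg < 0 or diastolic_mm_hg > systolic_mm_hg:
        raise ValueError("expected 0 <= diastolic <= systolic and systolic > 0")
    return (systolic_mm_hg - diastolic_mm_hg) / systolic_mm_hg


# Hypothetical readings (illustration only).
low_output_example = proportional_pulse_pressure(100, 80)  # 20 / 100 = 0.20, below 25%
typical_example = proportional_pulse_pressure(120, 80)     # 40 / 120, about 0.33
```

A reading of 100/80 mm Hg therefore falls below both the 25% proportional threshold discussed here and the absolute 25 mm Hg pulse pressure criterion used on the study form, whereas 120/80 mm Hg meets neither.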

The interrater reliability for detecting signs of congestion and perfusion was poor in most categories with the exception of edema (κ = 0.67) and a narrow pulse pressure (κ = 0.66). This may be because physical findings of JVD, an S3, a loud P2, and a hepatojugular reflux are difficult to detect [11]. Indeed, detection of signs of congestion such as elevated jugular venous pressure requires regular practice [12]; and although cardiologists attain a high agreement on the presence of elevated jugular venous pressure, it is likely that the reproducibility is much lower among nonspecialists. Similarly, although cardiologists may attain a high agreement for the presence of an S3 under study conditions, the interobserver agreement is less than 50% among nonspecialists. Pulmonary crepitations are not specific to HF, and interobserver agreement for this sign is poor [18]. Poor emergency physician interobserver reliability for detecting signs of central congestion and peripheral perfusion may also be a reflection of the noisy ED environment and the limited time accorded to the examination in the ED setting as well as to differences in levels of medical training. The poor reliability of auscultation of crepitations in the ED is further reflected by poor agreement of this finding with the chest x-ray findings and BNP levels in our study patients (results not shown). Surprisingly, interobserver agreement for hypotension and a narrow pulse pressure was also only fair to moderate. This may be a result of having the 2 physicians independently measure the patients' blood pressure. Prior studies have demonstrated considerable beat-to-beat variability in blood pressure measurements even when obtained nearly simultaneously.

Finally, with regard to whether HP was of any use to the ED physicians, more than 75% of physicians felt that profiling was helpful in guiding therapy. However, when we looked at whether physicians agreed that HP was helpful in individual patients, interobserver agreement was only 0.07.

Study limitations

Our study has several notable limitations. First, because of the small sample size, the CIs around the point estimates for interrater reliability are quite wide. Therefore, it is possible that a larger study may have found better or worse agreement between the ED physicians. Second, the criterion standard that was used for diagnosing HF (the cardiologist's discharge diagnosis) has been the subject of considerable debate. However, the lack of an appropriate gold standard would not have affected the measurement of interrater reliability. Third, the final diagnosis was determined by 1 cardiologist and 1 emergency physician instead of the standard 2 cardiologists. Fourth, some of the participating physicians were senior emergency medicine residents who may have lacked the experience of the physicians who originally described HP. Thus, it is possible that having more experienced physicians participate in the study may have improved interrater reliability. Finally, we did not look into other methods of classifying patients with HF, such as the New York Heart Association classification and the Framingham criteria, which may have been more reliable and helpful in the ED setting.

In conclusion, 4 of 5 patients with a diagnosis of HF in the ED had a final discharge diagnosis of HF. Our study demonstrated poor to fair interrater agreement for assignment of hemodynamic profiles in patients with HF, questioning its reliability and utility in the ED setting.


  1. Grady KL, Dracup K, Kennedy G, et al. Team management of patients with heart failure. A statement for healthcare professionals from The Cardiovascular Nursing Council of the American Heart Association. Circulation 2000;102:2443-56.
  2. American Heart Association. Heart disease and stroke statistics–2005 update. Dallas (Tex): American Heart Association; 2005.
  3. Maisel A, Krishnaswamy P, Nowak R, et al. Rapid measurement of B-type natriuretic peptide in the emergency diagnosis of heart failure. N Engl J Med 2002;347:161-7.
  4. McDonagh T, Morrison C, Lawrence A. Symptomatic and asymptomatic left-ventricle systolic dysfunction in an urban population. Lancet 1997;350:829-33.
  5. Cheng BS, Kazanagra R, et al. A rapid bedside test for B-type peptide predicts treatment outcomes in patients admitted for decompensated heart failure: a pilot study. J Am Coll Cardiol 2001;37:386-91.
  6. Heart disease and stroke statistics–2007 update. A report from the American Heart Association Statistics Committee and Stroke Statistics Subcommittee. Available at content/short/CIRCULATIONAHA.106.179918 [Accessed January 11, 2007].
  7. Hunt SA, Abraham WT, Chin MH, et al. ACC/AHA 2005 guideline update for the diagnosis and management of chronic heart failure in the adult: summary article: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Writing Committee to Update the 2001 Guidelines for the Evaluation and Management of Heart Failure). J Am Coll Cardiol 2005;46:1116-43.
  8. Nieminen MS, Bohm W, Cowie MR, et al. Executive summary of the guidelines on the diagnosis and treatment of acute heart failure: The Task Force on Acute Heart Failure of the European Society of Cardiology. Eur Heart J 2005;26:384-416.
  9. Forrester JS, Diamond GA, Swan JC. Correlative classification of clinical and hemodynamic function after acute myocardial infarction. Am J Cardiol 1977;39:137-45.
  10. Stevenson LW, Perloff J. The limited reliability of physical signs for estimating hemodynamics in chronic heart failure. JAMA 1989;261:884-8.
  11. Wang CS, FitzGerald JM, Schulzer M, Mak E, Ayas NT. Does this dyspneic patient in the emergency department have congestive heart failure? JAMA 2005;294:1944-56.
  12. Adams KF, Lindfield J, Arnold JMO, et al. Executive summary: HFSA 2006 comprehensive heart failure practice guidelines. J Card Fail 2006;12:10-38.
  13. Nohria A, Tsang S, BS, et al. Clinical assessment identifies hemodynamic profiles that predict outcomes in patients admitted with heart failure. J Am Coll Cardiol 2003;41:1797-804.
  14. Nohria A, Eldrin L, Stevenson L. Medical management of advanced heart failure. JAMA 2002;287:628-40.
  15. Singer AJ, Thode HC, Hollander JE. Research fundamentals: selection and development of clinical outcome measures. Acad Emerg Med 2000;7:397-401.
  16. Morrison KL, Harrison A, et al. Utility of a rapid BNP assay in differentiating congestive heart failure from lung disease in patients presenting with dyspnea. J Am Coll Cardiol 2002;39:202-9.
  17. Remes J, Miettinen H, Reunanen A, Pyorala K. Validity of clinical diagnosis of heart failure in primary health care. Eur Heart J 1991;12:315-21.
  18. Cleland J. Guidelines for the diagnosis of heart failure. The Task Force on Heart Failure of the European Society of Cardiology. Eur Heart J 1995;16:741-51.
