Last week, Medicare added patient satisfaction data to its hospital reporting website. This is progress, but it raises an interesting question: should patient satisfaction scores be case-mix adjusted?
The motivation to include patient satisfaction data comes from the Institute of Medicine’s inclusion of “patient-centeredness” as one key component of quality. And what could be simpler than asking patients a few questions, as the Centers for Medicare & Medicaid Services (CMS) survey does? (A pdf of the survey, formally known as HCAHPS, or “H-CAPS,” for Hospital Consumer Assessment of Healthcare Providers and Systems, is here).
I like the addition of the patient experience data and found the presentation on the CMS site to be fairly reader-friendly (as did US News & World Report’s Avery Comarow). For example, it only took a few seconds to find my hospital’s performance on the summary question, “Would you definitely recommend this hospital?”:
UCSF Medical Center: 80% yes
Average for Northern and Central California: 65% yes
Average for all U.S. hospitals: 67% yes
[You’ll note that we didn’t do too badly. But it would be legitimate to
wonder whether I, being relatively fond of my job and unenthusiastic
about being shunned by my colleagues, would have shown you something
that made us look crummy. You should have the same skepticism when you
look at every hospital’s website, a point Peter Pronovost, Marlene
Miller, and I made in this JAMA article.]
One can debate the relative value of considering patient experience vs.
harder measures of quality and safety forever. Personally, I want both:
great technical quality (which few patients will be able to judge) as
well as a clean room with nice people who listen and communicate well.
There’s no reason that the dexterous surgeon needs to be a jerk, nor
that the empathic internist needs to be a diagnostic imbecile.
But, like most things in healthcare, patient satisfaction measurement and reporting is trickier than it looks. Think about it in terms of the Donabedian triad of quality measurement: structure, process, and outcomes. One of the advantages of using processes (did the patient get a beta blocker?) and structure (are there intensivists available?) is that they avoid the case-mix adjustment that outcome measurement requires to prevent apples-to-oranges comparisons. Just comparing raw 30-day post-CABG mortality rates, for example, would clearly be misleading, since the superb surgeon who operates on older diabetic vasculopaths might well have a higher-than-average mortality rate, notwithstanding his excellence.
Just as importantly, if you don’t employ scientifically bullet-proof
case-mix adjustment, the Pavlovian response of every provider and
hospital whose outcomes are worse than average is… “But–but–but… You
don’t understand… My patients are sicker and older!”
Not everybody’s patients can be sicker and older. Except in some Bizarro-World version of Lake Wobegon. But believe me, that is what everybody will claim.
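To make the adjustment machinery concrete: the standard approach is to fit a risk model on patient characteristics alone, then compare each hospital’s observed deaths to the deaths the model would expect given its case mix. Here is a minimal sketch in Python with entirely fabricated data and made-up coefficients; the two covariates (age and diabetes) are stand-ins for a real risk model’s much longer list, not anyone’s actual methodology.

```python
# Toy illustration of case-mix adjustment via an observed/expected (O/E) ratio.
# All numbers are fabricated for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# Hospital A (0) takes older, more diabetic patients than Hospital B (1).
hospital = rng.integers(0, 2, n)
age = rng.normal(72 - 8 * hospital, 6, n)
diabetic = rng.random(n) < (0.5 - 0.2 * hospital)

# True mortality risk depends only on patient factors; quality is identical.
logit = -9 + 0.09 * age + 0.8 * diabetic
died = rng.random(n) < 1 / (1 + np.exp(-logit))

# Fit the risk model on patient characteristics alone, then compare
# observed deaths to expected deaths within each hospital.
X = np.column_stack([age, diabetic])
expected = LogisticRegression().fit(X, died).predict_proba(X)[:, 1]

for h, name in [(0, "Hospital A"), (1, "Hospital B")]:
    mask = hospital == h
    raw = died[mask].mean()
    oe = died[mask].sum() / expected[mask].sum()  # ~1.0 when quality is equal
    print(f"{name}: raw mortality {raw:.1%}, O/E ratio {oe:.2f}")
```

Run it and Hospital A’s raw mortality comes out well above Hospital B’s, while both O/E ratios hover near 1.0: the superb-surgeon-with-sicker-patients scenario, rescued by adjustment.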
Anyway, it is largely for this reason that most publicly reported quality measures to date have not been of outcomes – the science of case-mix adjustment has not been ready for prime time. But this science is getting better, and the world is clearly moving toward outcome measurement and reporting: CMS and several states now report case-mix-adjusted CABG mortality, and California is now reporting case-mix-adjusted ICU outcomes via its CalHospitalCompare project.
If you think about it, patient satisfaction is simply another outcome measure.
But do satisfaction survey responses need to be adjusted? Well, yes. For example, maternity patients tend to rate their experience more highly than do medical and surgical patients (no surprise there). Well-educated people tend to be more critical, and older patients are more forgiving.
Impressively, the H-CAPS folks thought of some of this, and the data you see on Hospital Compare have been adjusted for the following variables: service line (medical, surgical, or maternity care), age, education, self-reported health status, language other than English spoken at home, emergency room (ER) admission, and the time between discharge and survey completion.
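For the arithmetically curious, here is a minimal sketch of how patient-mix adjustment of this kind can work, again with fabricated data. The covariates echo a few from the CMS list above, but the method shown – a simple linear model with residual-style adjustment around the grand mean – is a simplification I am assuming for illustration, not the actual HCAHPS algorithm.

```python
# Sketch of patient-mix adjustment for satisfaction scores (fabricated data;
# simplified linear adjustment, not the real HCAHPS method).
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 5000

# Three equally good hospitals with different patient mixes.
hospital = rng.integers(0, 3, n)
maternity = rng.random(n) < 0.1 + 0.15 * hospital   # hospital 2 sees more maternity care
age = rng.normal(55 + 8 * hospital, 12, n)          # ...and older patients
college = rng.random(n) < 0.6 - 0.2 * hospital      # hospital 0's patients are more educated

# True 0-10 score depends only on patient mix plus noise.
score = np.clip(6.5 + 1.0 * maternity + 0.02 * age - 0.5 * college
                + rng.normal(0, 1, n), 0, 10)

# Model the score from patient characteristics alone, then credit each
# hospital with the grand mean plus what the model cannot explain.
X = np.column_stack([maternity, age, college]).astype(float)
predicted = LinearRegression().fit(X, score).predict(X)
adjusted = score.mean() + (score - predicted)

df = pd.DataFrame({"hospital": hospital, "raw": score, "adjusted": adjusted})
print(df.groupby("hospital").mean().round(2))
```

With this synthetic mix, the raw averages differ by the better part of a point while the adjusted averages land nearly on top of one another – as they should, since the hospitals were constructed to be equally good.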
But is this enough? Probably not. As reported in the New York Times,
states showed substantial variation in their average satisfaction
scores. For example, 79% of patients in Alabama hospitals “would
definitely recommend” their hospital to friends and family, while only
64% of folks in New Jersey, 61% in Florida, and 56% in Hawaii would do
the same. I’m guessing that these differences are more likely to be due to the characteristics of Floridians (spend a day visiting my family in Boca if you doubt this) or of Hawaiians (“hey, dude, I’m missing some gnarly grinders”) than to differences in the niceness of nurses and doctors.
Would it be possible to capture the personal characteristics that would fuel a robust satisfaction “case-mix adjustment” engine? I’d guess that insurance status would be a predictor of satisfaction; income might be as well. I’d also wager that Nordstrom shoppers are more demanding than Target shoppers, and that people with young kids or busy jobs are less tolerant of long waits than retirees.
The point is that we
don’t understand the interactions between these subtle sociocultural
and economic variables and the likelihood of rating your hospital or doctor highly. For now, this isn’t a big deal – the data are just
being put out there and folks can draw their own conclusions. (Medicare
penalizes hospitals – about $100 per hospital admission – for not
reporting, but there is no change in payment based on performance on
these measures. Yet.) But if satisfaction is ultimately tied to
reimbursement, or if patients or insurers begin making decisions based
on satisfaction data, it will be important to either adjust for these
variables or at least understand and describe them.
Some day, the presentation of patient satisfaction scores may be similar to that of presidential polling results: “Independents, single mothers, and Asian men over 55 really adored Hospital X.” Or perhaps it will be more like Amazon.com: “Customers like you prefer Hospital Y.”
Robert Wachter is widely regarded as a leading figure in the modern
patient safety
movement. Together with Dr. Lee Goldman, he coined the
term "hospitalist" in an influential 1996 essay in The New England
Journal of Medicine. His most recent book, Understanding Patient Safety (McGraw-Hill, 2008), examines the factors that have contributed
to what is often described as "an epidemic" facing American hospitals.
His posts appear semi-regularly on THCB and on his own blog "Wachter’s World."
I feel patient satisfaction doesn’t need to be advertised. If a patient is really satisfied, word of mouth will do the advertising. Anyway, thanks for sharing this!
I don’t think satisfaction scores should be adjusted. There are far too many factors and we’ll end up over-analyzing the data to a point where they are no longer meaningful.
I think the Medicare satisfaction data are meaningful as long as they are in aggregate form and are benchmarked against hospitals in the same region (which is how I expect an average consumer would naturally compare them when shopping for a hospital). Satisfaction data should always be juxtaposed with quality data to make intelligent decisions.
Furthermore, hospitals should work on managing patients’ expectations as one way to improve scores.
I’m working on a site that will help consumers review Medicare satisfaction and quality data. I would love to get comments at http://www.wheretofindcare.com/blog.aspx.
Absolutely they should be adjusted. Standard RATER scales for patient satisfaction have most of their variability due to tangibles (waiting, environment, personality, etc.). Since satisfaction = experience – expectations, the degree of complexity or specialty plays a major role. In familiar circumstances, patients are able to form expectations, and this will impact satisfaction. Whereas in uncommon circumstances (such as seeing a specialist), expectations create little or no variation in satisfaction estimates. The most basic example is waiting 30 minutes – for blood work it will seem intolerable, but for a specialist it will seem normal. Without case-mix adjustments, the effect of expectations is not transportable from one institution to another.
http://www.waittimes.blogspot.com
Hhmmh. Dr. Wachter touches on a significant issue – namely, the difficulty of comparing patient satisfaction scores, as well as the fact that most patients likely will not be able to judge “technical quality”.
I personally think that Dr. Wachter might be getting too much into the quality rating thing. One major problem I see is that many patients have expectations that should not be met, in their own, or society’s, interest (antibiotics for viral infections or nondescript headaches, superfluous scans, disability certifications, desire for narcotic or other medications that are not indicated). All this is not exceptional. Likely, doctors who do the “right thing” will be punished with lower scores (counselling is time intensive and will only go so far).
Overall, I feel this is a quite US-specific obsession: ranking things, even very complex ones. The difference between being hospital #1 and #2 (or 5 and 8, or whatever) in town matters only if there are substantial (at least statistically significant) differences between the mean or median ratings. In my MSG, satisfaction scores were all very high and, for almost all physicians, quite similar, despite obvious differences in style, communication, and possibly “technical quality”. If you want to quibble with these ratings, you would be speculating about nonsignificant rating differences.
Patient satisfaction is important, but it is not an absolute and never should be. Do work with ratings of the facility and the physician’s communication skills; otherwise, take the very few easily measurable quality indicators that exist (e.g., as mentioned, beta blockers or HbA1c) or do peer review of charts in order to rate “technical quality”. Do not bring this into a context where A is ranked superior to B when the differences are marginal or superficial. Do ask patients for anonymous feedback in open-ended questions and act on the reasonable input. Don’t obsess about whether the “overall experience” is rated, on average, 9.3 compared to 9.1.