Pay for Performance: Fad or Forever?

by Kaveh Safavi, M.D., J.D.

July 16, 2004

Pay for performance enters into virtually every contemporary conversation about rising health care costs or insufficient patient safety. To create or administer these models, participants must reach explicit agreement on what they mean and why.

Pay for performance in health care is a hot topic, but few tangible examples
of these models exist. Is pay for performance another panacea, as capitation
was in the 1990s, or a sustainable revolution in the way providers are paid?

To avoid the fate of other payment fads, the first step is agreement on the
meaning of these simple words. This is more difficult than it might seem, since,
typically, neither the payer nor the provider is concerned with defining terms.
Instead, both are apt to debate how to administer the program. This paper seeks
to clarify the possible meanings of “pay for performance” with enough specificity
that stakeholders can seek agreement first. Any of the proposed meanings can
be useful in the quest to improve health care quality in the United States,
but agreement must come first. A viable pay for performance model requires defining
both pay and performance.

What Is Payment?

First, consider the concept of “pay.” Providers commonly understand this to
mean more payment is needed to make the investment necessary to improve performance.
Implied is the belief that our delivery system is underfunded with respect to
information system infrastructure.

Payers start from the position that performance needs to be improved to justify
the current payment level. They believe we are already overpaying for the current
quality of care and any additional money must come from redistributing current
dollars from the poor performers to the better ones. Payers often assume improved
performance will mean they will spend less money on health care.

Providers think about money in terms of unit price, or reimbursement for an
individual unit of service. Payers think of payment in terms of total expenditure
for a population, or unit price times the number of services consumed (see Figure
1).

Figure 1. Paying for Care: Two Sides of the Same Coin

As a result, payers feel the fiscal impact when consumption of services rises
or falls, even when unit price is unchanged. Individual providers feel the fiscal
impact of growth or reduction in unit prices even when consumption is steady.
Therefore, if costs rise due to increased utilization, providers may question
where the additional dollars are being spent, since they may experience no growth
in the payment they receive per service. This leads to the popular perception
that society pays too much for health care, but does not pay enough for individual
patient care to support the necessary systems investment to change the current
delivery model.
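
The arithmetic behind these two views can be made concrete with a minimal sketch.
All figures below are hypothetical and chosen only for illustration, and the
function name is an assumption rather than any standard term. It shows how
total expenditure can grow while the payment per service stays flat.

```python
def total_expenditure(unit_price: float, services: int) -> float:
    """Payer view: total spend = unit price x number of services consumed."""
    return unit_price * services

# Hypothetical figures for illustration only.
unit_price = 100.0           # provider view: payment per service, unchanged
services_last_year = 1_000
services_this_year = 1_100   # utilization rises by 10 percent

last_year = total_expenditure(unit_price, services_last_year)
this_year = total_expenditure(unit_price, services_this_year)

# The payer sees spending growth; the provider sees no change per service.
payer_growth = (this_year - last_year) / last_year
unit_price_growth = 0.0

print(f"Payer spend grew {payer_growth:.0%}; unit price grew {unit_price_growth:.0%}")
```

In this sketch the payer's bill rises by a tenth even though no individual
provider receives a higher payment for any single service.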

There is significant risk in not clarifying this issue. If payers wish to redistribute
resources, a systemic approach to performance measurement will be needed, involving
many providers who may have no other relationship to each other. Providers,
on the other hand, think of performance improvement in terms of acts they can
individually undertake to maximize their relative position. However, since they
are incapable of redistributing money from underperforming peers to themselves
to create the new investment funds needed, they often see payer strategies as
“a dog that won’t hunt.”

What Is Performance?

Defining performance is no easier task. There are three common axes of performance
that require agreement between the parties being measured and the parties doing
the measuring. These are outcomes versus process of care; absolute versus relative
performance; and individual versus groups of providers. This debate is further
confounded by an unspoken dispute over the veracity of so-called administrative
data versus clinical data (see Figure 2).

Figure 2. Defining Performance

Process vs. Outcome Measures

First, one must consider the endpoint of measurement. If given the choice between
a better outcome or a better process, a patient would probably choose a better
outcome. However, this does not axiomatically translate into a pay for performance
model.

Clinical outcomes, such as survival rates, complication rates, and patient
perception of functional status, are the result of a complex interplay of factors,
some of which we currently understand and some we do not; some we can control
and some we cannot. There is also debate regarding what we do and don’t understand.
Should a woman between 40 and 50 years old get a mammogram or not, and if so,
will early detection make a difference in that woman’s life expectancy? Will
treating low-grade prostate cancer in a 70-year-old man make a difference in
his life expectancy? Should some women receive hormone replacement therapy?

The ambiguity of this debate is compounded by the fact that the statistical
performance of a group is composed of the results of individual patients, some
of whom responded to treatment and others who did not; and we cannot tell why
the ones who responded did so and why those who failed to respond did not. Some
look at this information
and see black and white; others see gray. This leads us back to the issue of
process versus outcome measures.

While we can measure outcomes, the real issue is whether we can change them. Or,
more precisely, can we change them using financial incentives that reward better
outcomes? Clearly, if we measured and rewarded overly complex outcomes, we should
not expect to see much of a change in outcomes despite pay for performance.

Agreement on this point is not hard to come by. It leads to the logical conclusion
that if we want to improve performance, we should focus on the processes of
care that are within the control of the measured providers and likely to contribute
to a better outcome if done perfectly. For example, was the right antibiotic given
to the patient admitted with pneumonia? Was it given as quickly as possible
after the physician recognized the need for it? There is an unstated assumption
that even perfectly performing these steps will not guarantee that the patient
will get better. Other factors, such as additional illnesses or genetic responsiveness
to a drug, may confound the most perfectly executed treatment plan. But it is
all we have. For the patient whose outcome did not change, did we improve something
meaningful for them?

Patients and consumer advocates tend to champion the notion of outcomes, biological
or perceptual. Providers seeking to improve the quality of care tend to organize
around processes they can manage. Payer and public health advocates must decide
if they are looking for the best result regardless of process, or seeking to
engineer a better system.

The payer perspective is currently the dominant accountability model found
in hospital and health plan accreditation and employer initiatives. These are
based largely on process measures. This supports the notion that payers see
themselves as participants in the improvement process itself. From the patient
perspective, is it really important what percent of all patients receive the
ordered antibiotic for pneumonia within two hours, or is it more important that
an individual improved and was treated with respect?

Absolute vs. Relative Measures

Think of this axis as the difference between a traditional 100-point grading
scale and a grading curve. A patient-centered approach might argue that what
is important is comparative information that differentiates relevant competitors
since a patient can only see one provider at a time. Furthermore, a patient’s
notion of value would direct them to look for relevant differences to determine
if a provider deserves their money, time, or loyalty. Patients will seek differentiation
with or without the provider’s help since they are interested in relative performance.

Conversely, an academic medical model would likely emphasize performance measures
that ensure everyone tending to patients treats them the same way. From this
perspective, incentives should be aligned to provide all patients a minimum
acceptable standard of care, preferably an evidence-based one. Evidence-based
care rests on observations made in investigational settings that cannot be
statistically explained as random events.

The opposing perspectives leave the issue of whether we should pay for superior
competitive performance or superior performance against minimum standards of
quality unresolved. Historically, performance measures have been organized around
absolute standards. After a decade of this, the need for a performance-based
payment model has not been met, and new models are being invented.
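
The difference between the two grading approaches can be illustrated with a
small sketch. The provider names, scores, threshold, and number of rewarded
providers below are all hypothetical assumptions made for illustration; no
actual program's rules are implied.

```python
# Hypothetical process-compliance scores (percent) for five providers.
scores = {"A": 96, "B": 94, "C": 93, "D": 91, "E": 90}

ABSOLUTE_THRESHOLD = 90   # assumed minimum standard (the 100-point scale)
N_REWARDED = 2            # assumed number rewarded under a curve

# Absolute measure: every provider meeting the standard is rewarded.
absolute_winners = sorted(p for p, s in scores.items() if s >= ABSOLUTE_THRESHOLD)

# Relative measure: only the top-ranked providers are rewarded,
# no matter how close the scores are to one another.
ranked = sorted(scores, key=scores.get, reverse=True)
relative_winners = ranked[:N_REWARDED]

print("Rewarded under absolute standard:", absolute_winners)
print("Rewarded under relative ranking: ", relative_winners)
```

With these made-up numbers, an absolute standard rewards all five providers and
differentiates none of them, while a curve rewards two providers whose scores
differ from the rest by only a few points.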

Group vs. Individual Performance

The third issue underlying performance measurement concerns whose performance
is being evaluated. This can be the individual care provider or the subsystem
of care (i.e., department, hospital, or the entire delivery system). The medical
model of care is rooted in measuring the performance of the expert authority
or physician. Complicating this is the concept that our health care system is
really a collection of independent actors with common characteristics acting
within the same frame of reference.

Individual performance measurement views the physician as the sole final accountable
actor. There is a reason that military metaphors such as “captain of the ship”
are used in health care, and it leads to the notion that meaningful change must
occur at the physician level. The current pay for performance model used by
many health plans is a physician-centric model reflecting the belief that physicians
largely control cost and quality.

Quality improvement experts recognize that system failures are often responsible
for breaches in good process and lead to bad outcomes. More significant are
the emerging models of improving hospital safety, which are patterned after the
aviation safety model and accept the fact that individuals make errors. A system
must be created to mitigate harm that could arise from those errors. The corollary
of this debate is: whose performance should be rewarded, individuals or systems?

Administrative vs. Clinical Data

The struggle to unilaterally define performance is aggravated by our limited
measurement system. A relevant measurement system needs to use information generally
available from all who are to be measured. The cost of collecting and analyzing
the information must be sufficiently low as to not create a barrier in itself.
Fortunately, everyone needs to get paid. The result is a rich and universal
set of data around diagnosis, procedures, supplies and drugs used, and tests.

Three decades of work have led to widely understood measures of care derived
from this administrative data. This data can reveal resource consumption or
outcomes such as morbidity, mortality, and complications. A variety of published
methods have been developed to try to account for the underlying illness burden
of patients to better evaluate provider efficiency or quality. Severity and
risk adjustment considerations are often included in these measures. Critics
argue this type of administrative data is inherently limited, perhaps to the
point of being useless, for studying meaningful measures of performance. They
argue the data is collected for payment purposes and is therefore incomplete
and misses a large amount of information collected in the medical record that
is useful in evaluating quality.

For example, a patient may not have received the expected drug for heart disease
because of a known and valid reason such as an allergy or history of a complicating
condition. This condition may not, in and of itself, be captured in the administrative
data, but it may be present in the clinical data. The providers may appear to
be giving suboptimal care when they are actually offering the correct care for
that patient.

Detractors also note the inherent limitations in models of adjustment for severity
and risk, which only account for about one-third of the observed variances in
patient resources consumed or survival rates. Therefore, it is possible that
two patients with different outcomes who seem matched for severity of illness
are, in fact, not matched, and have very different underlying conditions. Rather
than indicating a failed process, the outcome differences are readily explained
when one reviews the detailed medical record.

Doctors can cite many patient records that were flagged for suboptimal quality
on the basis of administrative data yet contain no evidence of poor care. These
critics argue that meaningful performance measurement cannot occur until
detailed clinical care can be examined. Given the current
system, a universal performance measurement model based on the clinical record
itself will require significant time and money to create. Do we have enough
credible information today to create a meaningful system of financial incentives
likely to result in better outcomes?

Emerging Trends

Emerging trends in pay for performance follow a few patterns. Several health
plan payers have released models focusing on physician performance. However,
they focus on groups of physicians rather than the individual physician. This
is largely because no valid measurement method is available at the individual
physician level. An individual physician sees so few patients with similar
conditions who are covered by any one insurance plan that it is hard to compare
performance.
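
A rough statistical sketch suggests why small patient counts make individual
comparisons unreliable. The patient counts and the 80 percent rate below are
hypothetical, the function name is an assumption, and the calculation uses the
normal approximation to the binomial, which is itself only a rough guide for
very small samples.

```python
import math

def margin_of_error(rate: float, n_patients: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an observed compliance rate.

    Uses the normal approximation to the binomial (an assumption made
    here for simplicity, not a recommended measurement method).
    """
    return z * math.sqrt(rate * (1.0 - rate) / n_patients)

# Hypothetical figures: the same 80% observed rate, measured on one
# physician's patients in a single plan versus a pooled physician group.
observed_rate = 0.80
single_physician = margin_of_error(observed_rate, n_patients=20)
physician_group = margin_of_error(observed_rate, n_patients=2000)

print(f"One physician (n=20):  +/- {single_physician:.1%}")
print(f"Group (n=2000):        +/- {physician_group:.1%}")
```

Under these assumptions the uncertainty around a single physician's rate is
ten times wider than the uncertainty around the group's, which is wide enough
to swamp most real differences between individual physicians.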

Another limitation of this model is the difficulty of making payment differences
at the individual physician level large enough to outweigh the price negotiation
dynamics that set the basic reimbursement level. This is particularly
true when measuring performance against an absolute standard. Furthermore,
health plans require large networks of physicians to be competitive. If nearly
every doctor needs to be in-network, there is little room for variable incentives.

Recently, a small number of hospital-based pay for performance models have
been introduced into some markets. Their success is likely to depend on each
program’s ability to negotiate a clear understanding with the network
providers of the definitions of pay and performance. They must also reach agreements
likely to result in behavior that improves performance in response to incentives. Similar
to physicians, hospitals often think that current payment levels are inadequate,
and have a limited appetite for making increases contingent upon improved performance.
Absent clarity of intent, these schemes are seen merely as discounting strategies.

The Centers for Medicare & Medicaid Services (CMS) is also undertaking a pilot
program to prove that a race to the top will result in overall improvement substantially
offsetting the cost of the incentives. CMS has indicated that its pay for
performance plan cannot result in significantly more dollars paid out to hospitals
in aggregate than would otherwise be expended. Therefore, the savings must come
through efficiency and redistribution from poor performers to superior performers.
Performance must be improved before payment, so providers cannot wait for money
to make investments. The federal government has an uncanny way of forcing market
dynamics once it puts its mind to it.
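
The idea of redistribution with no increase in aggregate payment can be sketched
as follows. This is an illustration only and is not the CMS formula: the
withhold rate, the weighting by score, the hospital names, and all dollar
figures are hypothetical assumptions.

```python
def redistribute(base_payments, scores, withhold_rate=0.02):
    """Budget-neutral redistribution (illustrative, not any payer's actual formula).

    Each hospital contributes withhold_rate of its base payment to a pool.
    The pool is then paid back in proportion to base payment times score.
    """
    pool = sum(p * withhold_rate for p in base_payments.values())
    weights = {h: base_payments[h] * scores[h] for h in base_payments}
    total_weight = sum(weights.values())
    return {
        h: base_payments[h] * (1 - withhold_rate) + pool * weights[h] / total_weight
        for h in base_payments
    }

# Hypothetical hospitals with equal base payments and differing scores (0 to 1).
base = {"North": 1_000_000.0, "South": 1_000_000.0, "East": 1_000_000.0}
scores = {"North": 0.95, "South": 0.80, "East": 0.65}

adjusted = redistribute(base, scores)
for hospital, payment in adjusted.items():
    print(f"{hospital}: {payment:,.0f}")
print(f"Total before: {sum(base.values()):,.0f}  after: {sum(adjusted.values()):,.0f}")
```

In this sketch the total paid out is unchanged. The highest scorer ends up
above its base payment and the lowest scorer below it, which is the
redistribution from poor performers to superior performers described above.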

There is also a smoldering consumer information movement targeted at discretionary
consumer decisions. These efforts include hospitals’ marketing of designations
such as Solucient’s 100 Top list, Healthgrades’ five-star rating, U.S. News and
World Report’s Best Hospitals, and J. D. Power’s Hospitals of Distinction.
However, what appear to be similar outputs come from very different inputs.
For example, U.S. News and World Report’s method relies primarily on physician
surveys of hospitals they would recommend. J. D. Power uses patient survey data.
Solucient’s 100 Top list and Healthgrades’ ratings rely primarily on measures
calculated from administrative data.

Based on how aggressively hospitals market such designations, there is a clear
belief that the real pay for performance will come from discretionary care attracted
to top performers. Many of these types of perceptual outcomes may not become
a mainstream part of academic provider performance improvement programs. However,
they are likely to throw much of the current investment in performance measurement
infrastructure up for grabs. Nothing will put a finer point on the debate of
process versus outcome measures than measuring performance rooted in a patient’s
or physician’s perception of care and service. Physicians have long known there
is a wide chasm between biologically or technically correct care and good care
from a patient’s perspective.

Pay for Value

Some of the contradictions uncovered by competitive notions of rewarding “good”
providers come from the recognition that purchasers pay for perceived value
received. Value is related to, but not synonymous with, perfect processes or
outcomes. A large body of research suggests that what patients want from their
health care providers is both compassion and competence. Traditional provider-based
measures of quality only reflect competence. Perhaps the compassion quotient
explains why preferred hospitals are not always the best hospitals. When a public
payer is involved, value is often ascribed to both conformity and competency.
In this case, good hospitals are ones that are clinically good and play by the
rules.

Pay for performance enters into virtually every contemporary conversation about
rising health care costs or insufficient patient safety. In order for it to
be worth anyone’s time to create or administer these models, the partners will
have to reach explicit agreement on what they mean and why.
