Pay for Performance: Fad or Forever?
Pay for performance in health care is a hot topic, but few tangible examples of these models exist. Is pay for performance another fad, like the capitation panacea of the 1990s, or a sustainable revolution in the way providers are paid?
To avoid the fate of other payment fads, the first step is agreement on the meaning of these simple words. This is more difficult than it might seem, since, typically, neither the payer nor the provider is concerned with defining terms. Instead, both are apt to debate how to administer the program. This paper seeks to clarify the possible meanings of pay for performance with enough specificity that stakeholders can seek agreement first. Any of the proposed meanings can be useful in the quest to improve health care quality in the United States, but agreement must come first. A viable pay for performance model requires defining both pay and performance.
What Is Payment?
First, consider the concept of pay. Providers commonly understand this to mean more payment is needed to make the investment necessary to improve performance. Implied is the belief that our delivery system is underfunded regarding the information system infrastructure.
Payers start from the position that performance needs to be improved to justify the current payment level. They believe we are already overpaying for the current quality of care and any additional money must come from redistributing current dollars from the poor performers to the better ones. Payers often assume improved performance will mean they will spend less money on health care costs in the future.
Providers think about money in terms of unit price, or reimbursement for an individual unit of service. Payers think of payment in terms of total expenditure for a population, or unit price times the number of services consumed (see Figure 1).

Figure 1. Paying for Care: Two Sides of the Same Coin
As a result, payers feel the fiscal impact when consumption of services rises or falls, even when unit price is unchanged. Individual providers feel the fiscal impact of growth or reduction in unit prices even when consumption is steady. Therefore, if costs rise due to increased utilization, providers may question where the additional dollars are being spent, since they may experience no growth in the payment they receive per service. This leads to the popular perception that society pays too much for health care, but does not pay enough for individual patient care to support the necessary systems investment to change the current delivery model.
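The two views of payment can be made concrete with a little arithmetic. The following is a minimal sketch, not a model of any actual payer; the dollar amounts, volumes, and the function name total_expenditure are hypothetical and chosen only to illustrate the relationship described above (total expenditure equals unit price times the number of services consumed).

```python
def total_expenditure(unit_price: float, services: int) -> float:
    """Payer view: total spend for a population = unit price x services consumed."""
    return unit_price * services


# Hypothetical figures, for illustration only.
unit_price = 100.0        # provider view: payment per unit of service
baseline_services = 1_000
higher_services = 1_100   # utilization rises 10 percent; unit price is unchanged

baseline = total_expenditure(unit_price, baseline_services)
higher_use = total_expenditure(unit_price, higher_services)

# The payer's total spending changes even though the provider's
# payment per service does not.
payer_change = (higher_use - baseline) / baseline
provider_change = (unit_price - unit_price) / unit_price

print(f"Payer total spending change: {payer_change:.0%}")
print(f"Provider unit price change:  {provider_change:.0%}")
```

In this sketch the payer's costs grow while each provider's payment per service stays flat, which is the gap in perception the paragraph above describes.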
There is significant risk in not clarifying this issue. If payers wish to redistribute resources, a systemic approach to performance measurement will be needed, involving many providers who may have no other relationship to each other. Providers, on the other hand, think of performance improvement in terms of acts they can individually undertake to maximize their relative position. However, since they are incapable of redistributing money from underperforming peers to themselves to create the new investment funds needed, they often see payer strategies as a dog that won't hunt.
What Is Performance?
Defining performance is no easier task. There are three common axes of performance that require agreement between the parties being measured and the parties doing the measuring. These are outcomes versus process of care; absolute versus relative performance; and individual versus groups of providers. This debate is further confounded by an unspoken dispute over the veracity of so-called administrative data versus clinical data (see Figure 2).

Figure 2. Defining Performance
Process vs. Outcome Measures
First, one must consider the endpoint of measurement. If given the choice between a better outcome or a better process, a patient would probably choose a better outcome. However, this does not axiomatically translate into a pay for performance scheme.
Clinical outcomes, such as survival rates, complication rates, and patient perception of functional status, are the result of a complex interplay of factors, some of which we currently understand and some we do not; some we can control and some we cannot. There is also debate regarding what we do and don't understand. Should a woman between 40 and 50 years old get a mammogram or not, and if so, will early detection make a difference in that woman's life expectancy? Will treating low-grade prostate cancer in a 70-year-old man make a difference in his life expectancy? Should some women receive hormone replacement therapy?
The ambiguity of this debate is compounded by the fact that the statistical performance of a group is composed of the results of individual patients, some of whom responded to treatment and others who did not, and we cannot tell why some responded and others failed to respond. Some look at this information and see black and white; others see gray. This leads us back to the issue of process versus outcome measures.
While we can measure outcomes, the real issue is whether we can change them. Or more precisely, can we change them using financial incentives that reward better outcomes? Clearly, if we measured and rewarded overly complex outcomes, we should not expect to see much of a change in outcomes despite pay for performance.
Agreement on this point is not hard to come by. It leads to the logical conclusion that if we want to improve performance, we should focus on the processes of care that are within the control of the measured providers and likely to contribute to a better outcome if done perfectly. For example, was the right antibiotic given to the patient admitted with pneumonia? Was it given as quickly as possible after the physician recognized the need for it? There is an unstated assumption that even perfectly performing these steps will not guarantee that the patient will get better. Other factors, such as additional illnesses or genetic responsiveness to a drug, may confound the most perfectly executed treatment plan. But it is all we have. For the patient whose outcome did not change, did we improve something meaningful for them?
Patients and consumer advocates tend to champion the notion of outcomes, biological or perceptual. Providers seeking to improve the quality of care tend to organize around processes they can manage. Payer and public health advocates must decide if they are looking for the best result regardless of process, or seeking to engineer a better system.
The payer perspective is currently the dominant accountability model found in hospital and health plan accreditation and employer initiatives. These are based largely on process measures. This supports the notion that payers see themselves as participants in the improvement process itself. From the patient perspective, is it really important what percent of all patients receive the ordered antibiotic for pneumonia within two hours, or is it more important that an individual improved and was treated with respect?
Absolute vs. Relative Measures
Think of this axis as the difference between a traditional 100-point grading scale and a grading curve. A patient-centered approach might argue that what is important is comparative information that differentiates relevant competitors since a patient can only see one provider at a time. Furthermore, a patient's notion of value would direct them to look for relevant differences to determine if a provider deserves their money, time, or loyalty. Patients will seek differentiation with or without the providers' help since they are interested in relative performance.
Conversely, an academic medical model would likely emphasize performance measures that ensure everyone tending to patients treats them the same way. From this perspective, incentives should be aligned to provide all patients a minimum acceptable standard of care, preferably an evidence-based one. Evidence-based care describes observations occurring in investigational settings in a manner that cannot be statistically explained as a random event.
The opposing perspectives leave the issue of whether we should pay for superior competitive performance or superior performance against minimum standards of quality unresolved. Historically, performance measures have been organized around absolute standards. After a decade of this, the need for a performance-based payment model has not been met and new ones are being invented.
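The difference between the two grading approaches can be illustrated with a short sketch. Everything here is an assumption made for illustration: the provider names, the scores, the 90 percent threshold, the top-quarter cutoff, and the function names are hypothetical and do not come from any actual program.

```python
def absolute_pass(scores: dict[str, float], threshold: float) -> set[str]:
    """Absolute standard: reward every provider at or above a fixed score."""
    return {name for name, score in scores.items() if score >= threshold}


def relative_top(scores: dict[str, float], top_fraction: float) -> set[str]:
    """Relative standard (grading curve): reward only the top-ranked share."""
    n_rewarded = max(1, int(len(scores) * top_fraction))
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return set(ranked[:n_rewarded])


# Hypothetical process-measure compliance rates for four providers.
scores = {"A": 0.95, "B": 0.92, "C": 0.91, "D": 0.70}

rewarded_absolute = absolute_pass(scores, threshold=0.90)
rewarded_relative = relative_top(scores, top_fraction=0.25)

print("Absolute standard rewards:", sorted(rewarded_absolute))
print("Relative standard rewards:", sorted(rewarded_relative))
```

Under the absolute standard, three nearly identical providers are all rewarded; under the curve, only one is, even though the gap between first and third place is small. Which result counts as fair depends on which definition of performance the parties have agreed to.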
Group vs. Individual Performance
The third issue underlying performance measurement concerns whose performance is being evaluated. This can be the individual care provider or the subsystem of care (i.e., department, hospital, or the entire delivery system). The medical model of care is rooted in measuring the performance of the expert authority or physician. Complicating this is the concept that our health care system is really a collection of independent actors with common characteristics acting within the same frame of reference.
Individual performance measurement views the physician as the sole final accountable actor. There is a reason that military metaphors such as captain of the ship are used in health care and it leads to the notion that meaningful change must occur at the physician level. The current pay for performance model used by many health plans is a physician-centric model reflecting the belief that physicians largely control cost and quality.
Quality improvement experts recognize that system failures are often responsible for breaches in good process and lead to bad outcomes. More significant are the emerging models of improving hospital safety, which are patterned after the aviation safety model and accept the fact that individuals make errors. A system must be created to mitigate harm that could arise from those errors. The corollary of this debate is, whose performance should be rewarded, individuals or systems?
Administrative vs. Clinical Data
The struggle to unilaterally define performance is aggravated by our limited measurement system. A relevant measurement system needs to use information generally available from all who are to be measured. The cost of collecting and analyzing the information must be sufficiently low as to not create a barrier in itself. Fortunately, everyone needs to get paid. The result is a rich and universal set of data around diagnosis, procedures, supplies and drugs used, and tests ordered.
Three decades of work have led to widely understood measures of care derived from this administrative data. This data can reveal resource consumption or outcomes such as morbidity, mortality, and complications. A variety of published methods have been developed to try to account for the underlying illness burden of patients to better evaluate provider efficiency or quality. Severity and risk adjustment considerations are often included in these measures. Critics argue this type of administrative data is inherently limited, perhaps to the point of being useless, for studying meaningful measures of performance. They argue the data is collected for payment purposes and is therefore incomplete and misses a large amount of information collected in the medical record that is useful in evaluating quality.
For example, a patient may not have received the expected drug for heart disease because of a known and valid reason such as an allergy or history of a complicating condition. This condition may not, in and of itself, be captured in the administrative data, but it may be present in the clinical data. The providers may appear to be giving suboptimal care when they are actually offering the correct care for that patient.
Detractors also note the inherent limitations in models of adjustment for severity and risk, which only account for about one-third of the observed variance in patient resources consumed or survival rates. Therefore, it is possible that two patients with different outcomes who seem matched for severity of illness are, in fact, not matched, and have very different underlying conditions. Rather than indicating a failed process, the outcome differences are readily explained when one reviews the detailed medical record.
Doctors may cite many patient records that have been supposedly selected based on administrative data that show suboptimal quality yet contain no evidence of poor care. These advocates argue that meaningful performance measurements cannot occur until detailed clinical care can be examined. Given the current system, a universal performance measurement model based on the clinical record itself will require significant time and money to create. Do we have enough credible information today to create a meaningful system of financial incentives likely to result in better outcomes?
Emerging Trends
Emerging trends in pay for performance follow a few patterns. Several health plan payers have released models focusing on physician performance. However, they focus on groups of physicians rather than the individual physician. This is largely due to the lack of any valid measurement method applicable at the individual physician level. Individual physicians see such a small number of patients with similar conditions, who are limited to one insurance plan, that it is hard to compare performance.
Another limitation of this model is the difficulty of making payment differentiation at the individual physician level meaningful enough to override the price negotiation dynamics that set the basic reimbursement level. This is particularly true when measuring performance against an absolute standard. Furthermore, health plans require large networks of physicians to be competitive. If nearly every doctor needs to be in-network, there is little play for variable incentives.
Recently, a small number of hospital-based pay for performance models have been introduced into some markets. Their success is likely to be aided or hindered by each program's ability to negotiate a clear understanding with the network providers of the definitions of pay and performance. They must also get agreement likely to result in behavior that improves performance due to incentives. Similar to physicians, hospitals often think that current payment levels are inadequate, and have a limited appetite for making increases contingent upon improved performance. Absent clarity of intent, these schemes are seen merely as discounting strategies.
The Centers for Medicare & Medicaid Services (CMS) is also undertaking a pilot program to prove that a race to the top will result in overall improvement substantially offsetting the cost of the incentives. CMS has indicated that its pay for performance plan cannot result in significantly more dollars paid out to hospitals in aggregate than would otherwise be expended. Therefore, the savings must come through efficiency and redistribution from poor performers to superior performers. Performance must be improved before payment, so providers cannot wait for money to make investments. The federal government has an uncanny way of forcing market dynamics once it puts its mind to it.
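To see how incentives can be funded without increasing aggregate spending, consider one possible budget-neutral scheme. This is a hypothetical sketch, not the CMS method: the withhold rate, the hospital names, the scores, and the rule of returning the pool in proportion to score-weighted base payments are all assumptions made for illustration.

```python
def redistribute(
    base_payments: dict[str, float],
    scores: dict[str, float],
    withhold_rate: float,
) -> dict[str, float]:
    """Withhold a share of every payment, then return the pooled money
    in proportion to each hospital's score-weighted base payment.
    Assumes at least one hospital has a positive score."""
    pool = sum(payment * withhold_rate for payment in base_payments.values())
    weights = {h: base_payments[h] * scores[h] for h in base_payments}
    total_weight = sum(weights.values())
    return {
        h: base_payments[h] * (1 - withhold_rate) + pool * weights[h] / total_weight
        for h in base_payments
    }


# Hypothetical hospitals with equal base payments but different scores.
base = {"Hospital 1": 1000.0, "Hospital 2": 1000.0}
scores = {"Hospital 1": 0.9, "Hospital 2": 0.6}

adjusted = redistribute(base, scores, withhold_rate=0.02)

for hospital, payment in adjusted.items():
    print(f"{hospital}: {payment:.2f}")
print(f"Total before: {sum(base.values()):.2f}, after: {sum(adjusted.values()):.2f}")
```

In this sketch the better performer ends up with slightly more than its base payment and the weaker performer with slightly less, while the aggregate amount paid is unchanged. Any gain for one provider is funded by a peer, which is why agreement on the performance definition matters so much to those being measured.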
There is also a smoldering consumer information movement targeted at discretionary consumer decisions. These efforts include hospitals' marketing of designations (e.g., Solucient's 100 Top, Healthgrade.com's five-star rating, U.S. News and World Report's Best Hospitals, and J. D. Power's Hospitals of Distinction). However, what appear to be similar outputs come from very different inputs. For example, U.S. News and World Report's method relies primarily on physician surveys of hospitals they would recommend. J. D. Power uses patient survey data. Solucient's 100 Top list and Healthgrades ratings rely primarily on measures calculated from administrative data.
Based on how aggressively hospitals market such designations, there is a clear belief that the real pay for performance will come from discretionary care attracted to top performers. Many of these types of perceptual outcomes may not become a mainstream part of academic provider performance improvement programs. However, they are likely to throw much of the current investment in performance measurement infrastructure up for grabs. Nothing will put a finer point on the debate of process versus outcome measures than measuring performance rooted in a patient's or physician's perception of care and service. Physicians have long known there is a wide chasm between biologically or technically correct care and good care from a patient's perspective.
Pay for Value
Some of the contradictions uncovered by competitive notions of rewarding good providers come from the recognition that purchasers pay for perceived value received. Value is related to, but not synonymous with, perfect processes or outcomes. A large body of research suggests that what patients want from their health care providers is both compassion and competence. Traditional provider-based measures of quality only reflect competence. Perhaps the compassion quotient explains why preferred hospitals are not always the best hospitals. When a public payer is involved, value is often ascribed to both conformity and competency. In this case, good hospitals are ones that are clinically good and play by the rules.
Pay for performance enters into virtually every contemporary conversation about rising health care costs or insufficient patient safety. In order for it to be worth anyone's time to create or administer these models, the partners will have to reach explicit agreement on what they mean and why.

