Adjusting Risk Adjustment in Medicare Advantage: A Conversation with Richard Kronick
Can an accelerating gravy train for Medicare Advantage be slowed or stopped?
This past March, as in many Marches previous, MedPAC’s annual report to Congress found that a) the federal government is paying Medicare Advantage plans more than it would pay to cover the same enrollees in traditional, fee-for-service Medicare; b) that excess payment is widening (from 104% of FFS spending in 2022 to 106% this year); c) almost all the excess payment (almost 5 percentage points) stems from a risk adjustment system that enables MA plans to inflate their enrollees’ risk scores; and d) the risk score gap between MA enrollees and FFS enrollees is also widening.
Sum it all up, and risk adjustment stands out as the engine by which MA is swallowing FFS Medicare. 2023 is the first year in which more Medicare enrollees are enrolled in MA than in FFS. MedPAC raises the possibility that, in some counties at least, FFS may no longer serve as a reliable benchmark for CMS’s capitated payment rates to MA plans. Those benchmarks (which, according to MedPAC, also require adjustment) are the tether that holds MA provider payment rates close to those set by FFS Medicare. That tether is basically the only effective control on provider payment rates.
A modest proposal: Revenue-neutral risk adjustment in MA
MA insurers’ inflation of their enrollees’ risk scores is so obvious and pervasive that CMS is statutorily required to shave a minimum of 5.9% off of MA risk scores. It’s not enough. MedPAC estimates that in 2022 MA risk scores exceeded the scores that MA enrollees would be ascribed in FFS Medicare by 10.8%. In November 2021, Richard Kronick, a former CMS official and current professor at UCSD, and F. Michael Chua, also of UCSD, pegged the MA coding excess at 20% — almost double the MedPAC estimate — and estimated that the resulting overpayments would total $600 billion from 2023 to 2031 if not adjusted.
For Kronick, the Nov. 2021 paper is the latest in a series of analyses (Kronick and Welch 2014, Kronick 2017) finding a steadily widening gap between MA and FFS risk scores. Kronick and Chua report that average risk scores in MA increased relative to FFS scores by 1% per year from 2006 to 2017 and by 2% per year from 2017 to 2019. At the same time, demographic and other evidence suggests that MA enrollees on average are no sicker, and in fact are probably slightly healthier, than FFS enrollees.
Kronick proposes a simple corrective: compare MA and FFS scores derived from basic demographic data (age, gender, Medicaid status) with risk scores derived from diagnostic data and cut MA risk scores across the board by the difference. The budget neutrality principle is that “aggregate payments to MA plans should be equal to the amount that would have been paid using demographic risk adjustment” (Kronick 2017). Diagnostic risk scores would then serve only to differentiate payment among MA plans, not to affect their payment rates relative to a FFS baseline.
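To make the arithmetic of that corrective concrete, here is a minimal sketch. It assumes the across-the-board cut is computed as a ratio of ratios, so that after adjustment the MA-to-FFS ratio of diagnostic scores equals the MA-to-FFS ratio of demographic scores. That is one reading of "cut by the difference," not necessarily the exact formula Kronick or CMS would use. The function name and all of the input numbers are hypothetical, chosen only for illustration.

```python
def coding_adjustment_factor(ma_demo, ffs_demo, ma_diag, ffs_diag):
    """Multiplier for MA diagnostic risk scores (illustrative assumption).

    After multiplying MA diagnostic scores by this factor, the MA-to-FFS
    ratio of diagnostic scores matches the MA-to-FFS ratio of scores
    built from demographics alone (age, gender, Medicaid status).
    """
    demo_ratio = ma_demo / ffs_demo   # relative risk implied by demographics
    diag_ratio = ma_diag / ffs_diag   # relative risk implied by diagnoses
    return demo_ratio / diag_ratio


# Hypothetical averages: MA looks slightly healthier than FFS on
# demographics, but 15% sicker on coded diagnoses.
factor = coding_adjustment_factor(ma_demo=0.98, ffs_demo=1.00,
                                  ma_diag=1.15, ffs_diag=1.00)
haircut = 1 - factor
print(f"Across-the-board cut: {haircut:.1%}")  # prints "Across-the-board cut: 14.8%"
```

Under this sketch, every MA plan's diagnostic score is multiplied by the same factor, so differences among plans are preserved while the MA average is pulled back to the demographic baseline.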
That very system was in place when CMS first phased in diagnostic risk adjustment in 2004. At that point, average MA diagnostic risk scores were 8% below the FFS average. Ironically, Kronick explains, Democrats killed the coding intensity adjustment because at that point it created excess payments for MA plans, which had lower average risk scores than FFS enrollees. But just how hard it would be politically to restore a revenue-neutral system of risk adjustment that was in place before overpayment of MA plans induced the program to metastasize is illustrated by the power struggle triggered by CMS’s more modest proposed adjustments, and still more modest finalized adjustments, to the current system.
In April this year CMS finalized far less sweeping tweaks to the MA risk adjustment system. I spoke to Kronick about the rule changes, as well as about his own proposal and other ways of reducing or eliminating overpayment through risk adjustment. An edited version of that conversation is below.
First, a review of CMS’s latest adjustments to the system, as well as remedies proposed by MedPAC. If this is familiar ground to you, skip to “A Conversation with Richard Kronick” below.
CMS takes a “baby step” toward curbing upcoding
In its Advance Notice of payment rule updates for MA released in February, CMS proposed some mild curbs to “coding intensity,” embedded in a transition of the diagnosis categories underlying the risk adjustment system from ICD-9 to ICD-10, which Medicare adopted for payment purposes in 2015. Mapping the updated diagnoses into the diagnosis clusters that determine risk scores (Hierarchical Condition Categories, or HCCs) entailed reducing the number of diagnoses incorporated into the HCCs that affect payment, though the number of HCCs that affect payment itself increased. The updated model cut about 2,000 ICD-10 codes out of the payment model. More to the point, CMS selectively cut another 75 codes “where there is wider variation in diagnosing and coding,” i.e., more opportunity for upcoding. In one widely cited example, codes for diabetes with unspecified complications or complications related to blood sugar were moved to the lowest payment rung, and drug-induced diabetes codes were categorized to a non-payment HCC. As proposed, the change to the risk adjustment formula would have cut payments by 3% on average, though other changes to the MA payment formula would have resulted in a net 1% average increase. In a March 21 Kaiser Family Foundation webinar, Kronick described these adjustments as a “baby step.”
But MA insurers treat any slowing of the risk adjustment gravy train as a call to battle. While rule finalization was pending, the health insurers’ lobbying group the Better Medicare Alliance unleashed a $13.5 million advertising barrage devoted to scaring seniors about purported likely cuts to benefits (stronger prior trims to MA overpayment have never resulted in benefit cuts). Insurers also recruited patients to flood CMS with cookie-cutter pleas not to “cut” MA. Result: in its final rule, CMS phased the new risk adjustment changes in over three years, cutting their impact in 2024 by two-thirds, and boosting the 2024 payment increase to 3%.
MedPAC’s proposed risk adjustment…adjustments
For many years (most notably 2016), MedPAC’s annual reports have recommended or floated somewhat strong means of curbing MA risk coding abuse. Proposed measures reviewed in this year’s report include:
Increase the across-the-board haircut that CMS imposes on MA risk scores — currently 5.9%, the minimum mandated by Congress. MedPAC estimates that MA risk scores exceed the scores that MA enrollees would be ascribed in FFS Medicare by 10.8%.
Prohibit the use of Health Risk Assessments (HRAs) and “chart reviews” (retrospective combing of providers’ reports in search of more diagnoses) in MA plans’ risk scoring. A 2021 report by HHS’s Office of the Inspector General flagged HRAs and chart reviews as the major engines of risk score inflation, accounting for 64% of, um, “coding intensity.” This year’s MedPAC report to Congress reiterates: “We find that nearly two-thirds of MA coding intensity could be due to use of diagnoses from chart reviews and health risk assessments, and that these two mechanisms are a primary factor driving coding differences among MA plans.”
Identify the main offenders (the 2021 OIG report suggests just 10% of MA insurers account for 79% of risk score inflation) and sort plans into tiers determined by their coding intensity, cutting the higher tiers more severely. For discussion of this option, and a dive into MA risk adjustment reform generally, see this Zoom conversation between healthcare journalists Merrill Goozner (also a Substacker), Bob Herman, and Mark Miller, part of this year’s Medicare Advantage Summit.
Develop a risk-adjustment model that uses two years of FFS and MA diagnostic data. MedPAC asserts that this “would improve the accuracy of both FFS and MA diagnostic information and would reduce year-to-year variation in documentation.”
A conversation with Richard Kronick
I spoke to Kronick about CMS’s adjustments, MedPAC’s proposed options, his own recommendations, and a different approach to revenue-neutral risk adjustment.
MA industry advocates have claimed that the HCCs that CMS cut out of the payment formula will lead plans to cut benefits important to those with targeted conditions, such as diabetes. I asked Kronick whether he saw any downsides to the adjustments, other than the political heat generated. He identified problems on the other end.
The updated coding regime “doesn’t discourage plans from trying to code more,” or reward those who don’t upcode as much, he said. Further, “The HCCs that are most disproportionately coded in MA are not the ones CMS flagged. Excluding some of those that are most disproportionately coded is tough to do because some of them have strong patient advocacy groups.” More broadly, adjusting HCCs is “a complementary approach to getting the coding intensity adjustment [the annual across-the-board haircut] right that will take away at least some of the easier methods plans are using to increase risk scores.” While the cut HCCs may not have gone as far as Kronick would like, “I am delighted that CMS has its nose in the tent,” he said.
I asked Kronick about the tiered approach that MedPAC has floated — that is, varying the haircut according to how intensely evidence suggests each plan upcodes. He suggested that tiering makes sense on the merits but is difficult both technically and politically.
“In 2009, CMS proposed a tiered approach in the Advance Notice, but did not finalize the proposal,” Kronick recounted. “I was a consultant to CMS at the time. CMS proposed identifying the top 25% of contracts based on how much they coded, and implementing an adjustment for those contracts. But developing a technique for reliably measuring how much plans are coding is not easy. The main method CMS was using was analyzing how much risk scores increase for members enrolled for 24 months: how much increase compared to what you’d expect? MedPAC has extended and refined that method in their analyses on differential MA coding.”
“Another decent measure,” he added, “is to compare the diagnostic risk score to a risk-score developed using information on prescription drugs. The good news is that there’s a strong relationship between the two. However, the bad news is that there are some contracts that look like they are coding a lot using one measure but not the other, and it is not completely clear which approach provides a better estimate of differential coding.” In sum, Kronick said, “it makes sense to try to have tier-specific adjustments. But it would take work to have a tier-specific adjustment that would withstand the loud screams sure to come from the MA industry.”
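The tiering idea Kronick describes (flag the top 25% of contracts by how much they code) can be sketched in a few lines. This is only an illustration under stated assumptions: it takes as given a per-contract measure of "excess" risk score growth (observed growth minus expected growth for continuously enrolled members), which is exactly the quantity Kronick says is hard to measure reliably. The function name, contract IDs, and values are all hypothetical.

```python
import math


def flag_top_coders(excess_growth, share=0.25):
    """Return the set of contract IDs in the top `share` by excess growth.

    excess_growth: dict mapping contract ID -> observed minus expected
    risk score growth (a hypothetical, precomputed measure).
    """
    n_flagged = math.ceil(len(excess_growth) * share)
    # Rank contracts from most to least excess risk score growth.
    ranked = sorted(excess_growth, key=excess_growth.get, reverse=True)
    return set(ranked[:n_flagged])


# Hypothetical contracts and excess-growth values.
contracts = {
    "H0001": 0.06, "H0002": 0.01, "H0003": 0.03, "H0004": -0.01,
    "H0005": 0.09, "H0006": 0.02, "H0007": 0.04, "H0008": 0.00,
}
flagged = flag_top_coders(contracts)
print(sorted(flagged))  # prints ['H0001', 'H0005']
```

A second measure, such as the gap between diagnostic and prescription-drug-based risk scores, could be ranked the same way. As Kronick notes, the two rankings would not always agree on which contracts to flag.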
Speaking of industry screaming: a coding intensity haircut based on the formula that Kronick has suggested — which would yield about three times the current haircut — would trigger a lot of it. In fact, in a 2015 advance notice of changes to its MA payment methods for 2016 (p. 19), CMS announced that it was considering calculating the coding intensity adjustment using a method similar to that subsequently supported by Kronick and others:
“Using such a model, we would first estimate the risk of MA-enrolled beneficiaries relative to the risk of beneficiaries in FFS. Next, we would calculate the ratio of MA-to-FFS risk using the CMS-HCC model. Using the difference between the two ratios, we would calculate the MA coding adjustment factor.”
CMS asked for comment. While industry objected, Kronick said, “they raised no concerns that I thought were troubling. We know enough to say that this method can stand up to industry objections.”
Revenue neutrality without haircuts
The purpose of risk adjustment is to remove plans’ incentives to attract healthier-than-average enrollees, and to pay plans appropriately for the risks they do assume. Most risk adjustment schemes are truly zero-sum: one plan’s loss (because its enrollees are healthier than average) is another’s gain. The Medicare Advantage risk adjustment system is unique in that plans are scored against a benchmark (FFS Medicare) that the majority of plans exceed. As with the plans' star ratings, in MA risk adjustment most of the children are above average.
I asked Kronick whether FFS Medicare couldn’t be left out of the MA risk adjustment formula entirely. He allowed that it could. “Medicaid programs use diagnostic risk adjustment — all in a zero sum environment, as do the plans in the ACA Marketplaces. By definition, the average risk score is 1.0. Without changing its benchmark approach to MA payment, CMS could assure that the average risk score was 1.0 and leave the bidding system in place.”
At least one issue would have to be worked out. Plans would have a timing problem, Kronick said: “A plan, when it was bidding, wouldn’t know what its risk score was going to be. You could potentially do it in arrears — take last year’s risk score.”
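The zero-sum approach Kronick describes, in which the average risk score is 1.0 by definition, amounts to dividing each plan's score by the enrollment-weighted average across all plans. The sketch below shows that normalization under simple assumptions: one average raw score per plan, weighted by enrollment. Plan names, enrollment counts, and scores are hypothetical.

```python
def normalize_scores(plans):
    """Rescale plan risk scores so the enrollment-weighted mean is 1.0.

    plans: dict mapping plan name -> (enrollment, average raw risk score).
    Returns a dict mapping plan name -> normalized risk score.
    """
    total_enrollment = sum(n for n, _ in plans.values())
    weighted_mean = sum(n * s for n, s in plans.values()) / total_enrollment
    return {name: s / weighted_mean for name, (_, s) in plans.items()}


# Hypothetical plans: (enrollment, average raw risk score).
plans = {
    "Plan A": (1000, 1.20),
    "Plan B": (3000, 1.00),
    "Plan C": (1000, 0.90),
}
normalized = normalize_scores(plans)
for name, score in normalized.items():
    print(name, round(score, 3))
```

Because every score is divided by the same weighted mean, a plan that codes more aggressively gains only at the expense of other plans, not at the expense of the Medicare program. The timing problem Kronick raises would show up here as the question of which year's raw scores feed the calculation.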
Perhaps ironically, Kronick suggested that one risk adjustment system redesign that some MA industry officials have supported implies a revenue-neutral environment. That is to recalibrate the HCCs to reflect MA plans’ own encounter data* instead of that of FFS Medicare. As an Urban Institute report on a roundtable summit devoted to the idea describes it:
Currently, the MA risk adjustment model is estimated using traditional Medicare data, meaning risk scores in MA reflect the treatment patterns and costs for conditions in traditional Medicare. Recalibrating the model with MA encounter data would make risk scores more accurately reflect the relative costs of providing care for patients with particular conditions in MA.
According to Kronick, if the HCCs were recalibrated in this way, “In year one, risk scores would average 1.0.” He added, however, that industry would not accept this willingly — “They would argue there needs to be some calibration of this model relative to FFS.”
There is a long-suffering backbeat to Kronick’s dispassionate analyses of MA upcoding and MedPAC’s often-repeated recommendations for curbing it. At bottom, the case for tinkering at the edges of the risk adjustment system for MA as opposed to ensuring that it is truly revenue neutral seems to come down to calibrating how much industry opposition CMS — or potentially, Congress — can withstand. On the bright side, a “baby step,” a “nose in the tent” — as with the first foray into Medicare drug price negotiation, mandated by the Inflation Reduction Act of 2022 — suggest that under certain political or fiscal conditions, stronger measures may be possible.
—
* Encounter data is the information submitted by healthcare providers, reflecting diagnoses and services provided. The Urban Institute report recounts: “Since 2012… CMS has also been collecting far more comprehensive encounter data —which covers all services provided to MA beneficiaries—from MA plans for use in calculating risk scores.” As noted above, however, the HCCs into which diagnoses are grouped for risk adjustment purposes are based on FFS data.