Fluorescence in situ hybridization (FISH) and immunohistochemistry (IHC) tests are commonly used to assess human epidermal growth factor receptor 2 (HER2) status of tumors in patients with breast cancer. This analysis evaluates the likely cost-effectiveness of expanded retesting to assess HER2 tumor status in women with early stage breast cancer.
We developed a decision-analytic model to estimate the incremental cost-effectiveness ratio (ICER) of expanded reflex testing from a US payer perspective. Expanded reflex testing is defined as retesting tumor specimens from patients whose tumors are IHC0, IHC1+, or FISH-negative on their first test. In the base case, we assumed that 80% of patient tumors are initially IHC-tested and 20% are FISH-tested. Testing outcomes for IHC and FISH with and without retesting were based on published meta-analyses. The cost of tests and treatment and the long-term health outcomes were obtained from the literature.
In the base case, we estimated that 2.27% of women who received expanded reflex testing would be HER2-positive and receive trastuzumab treatment; the projected ICER was $36,721 per life year or $39,745 per quality-adjusted life year (QALY). The ICER ranged from $47,100 per QALY to $35,500 per QALY when 1% to 8% of retested patients, respectively, were assumed to be reclassified as HER2-positive. The results were robust in deterministic and probabilistic sensitivity analyses. This strategy would result in 4700 (2000-17,000) patients being eligible to receive trastuzumab treatment annually.
Breast cancer, the second most common cause of cancer death in women, is responsible for approximately 15% of cancer deaths in the United States. It is estimated that in 2013 there will be 232,340 cases of invasive breast cancer in the United States and that 39,620 women will die from the disease.
Approximately 20% of patients with breast cancer have human epidermal growth factor receptor 2 (HER2)-positive disease, which is associated with a poor prognosis.[2-4] HER2-positive tumors are also responsive to treatment with trastuzumab (Herceptin; Genentech, South San Francisco, CA), a monoclonal antibody that targets HER2, reducing the risk of recurrence and improving survival.[5, 6]
Two pathological diagnostic tests are commonly used to test for HER2 status: immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH). IHC measures HER2 overexpression and is simple to implement, available in most laboratories, and relatively inexpensive. However, tissue handling and fixation, including time to fixation and fixation time, as well as test sensitivity and specificity can affect accuracy. FISH measures HER2 gene amplification and is less susceptible to tissue handling and fixation problems, but it is more complicated, requires training and experience, and is more expensive. The IHC test is scored as follows: 0 and 1+, negative; 2+, equivocal; and 3+, positive. The FISH test can also be positive, equivocal, or negative depending on the HER2 gene/chromosome 17 ratio (ie, a ratio of <1.8 is negative, a ratio of 1.8-2.2 is equivocal, and a ratio of >2.2 is positive). Because of the poor correlation between weak IHC positivity (IHC2+) and FISH positivity, it is currently recommended that tumors that are IHC2+ be considered equivocal and be reflex-tested with FISH to determine HER2 status.
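The scoring rules described above can be written down as a small lookup. The following is a minimal sketch that uses only the cutoffs quoted in the text; the function names are hypothetical and are not part of any laboratory system.

```python
def classify_ihc(score: str) -> str:
    """Map an IHC score ("0", "1+", "2+", "3+") to a HER2 call."""
    calls = {"0": "negative", "1+": "negative", "2+": "equivocal", "3+": "positive"}
    return calls[score]


def classify_fish(ratio: float) -> str:
    """Map a HER2 gene/chromosome 17 ratio to a HER2 call.

    Cutoffs as quoted in the text: <1.8 negative, 1.8-2.2 equivocal, >2.2 positive.
    """
    if ratio < 1.8:
        return "negative"
    if ratio <= 2.2:
        return "equivocal"
    return "positive"


print(classify_ihc("2+"), classify_fish(2.0))
```

Under current recommendations, only the equivocal IHC2+ call triggers a reflex FISH test.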
Tumors of approximately 80% of new invasive breast cancer patients are tested for HER2 using IHC and 20% are tested using FISH. The National Comprehensive Cancer Network (NCCN), American Society of Clinical Oncology (ASCO), the College of American Pathologists (CAP), and other groups recommend treating patients whose tumors test IHC3+ or FISH-positive with trastuzumab, treating patients whose tumors are IHC0 or 1+ or FISH-negative with standard chemotherapy, and retesting tumors that test IHC2+. ASCO and CAP also recommend either using IHC assays for initial evaluation of HER2 status followed by reflex testing by FISH of some IHC categories or primary use of FISH in initial testing.[10, 11] While concordance (defined as the proportion of patients who test the same on both tests) between unequivocal IHC and FISH results can be as high as 96%, recent studies have suggested that significant discordance remains.[12, 13]
Intratumoral discordance between test results may lead to false-negative results for HER2 overexpression status. Patients with a false-negative tumor result would be denied the clinical benefits of treatment with trastuzumab, which could have serious morbidity and mortality consequences. An additional important finding is that 2 large randomized trials now suggest that IHC-positive, FISH-negative patients demonstrate clinical benefit when treated with trastuzumab with a hazard ratio similar to that seen in patients whose tumors are both IHC- and FISH-positive. Finally, because of a small rate of discordance between IHC and FISH, some tumors may be positive by 1 test and negative by the second test, and a tumor may be mistakenly classified as HER2-negative if it falls into this category. Expanded reflex testing, as depicted in Figure 1, would reduce the likelihood of this occurrence.
Expanded reflex testing, represented schematically in Figure 2, implies taking a “believe the positive” approach (supported by extensive data) in which patients testing positive by either test would receive treatment with trastuzumab and patients testing negative on their initial test (IHC0, IHC1+, or FISH-negative) would receive the opposite test for confirmation. This would expand the categories of patients receiving additional testing for HER2 status. The result would be a substantial reduction in patients with false-negative results (who would have been denied the benefits of adjuvant anti-HER2 therapy), at the risk of increasing the number of patients with false-positive results (who would incur the added costs and possible adverse effects of trastuzumab).
The objective of this analysis was to estimate the potential cost-effectiveness of expanded reflex testing for HER2 status among patients with early stage breast cancer. We assessed whether substantially reducing false-negative results is a rational choice for both cost and effectiveness reasons.
MATERIALS AND METHODS
This economic evaluation was a cost-utility assessment that compared the costs and outcomes of 2 HER2 testing algorithms from a US payer perspective: expanded reflex testing and the standard testing algorithm. Expanded reflex testing was defined as retesting IHC0, IHC1+, or FISH-negative early stage breast cancer tumors for HER2 status, while standard HER2 testing involved retesting only IHC2+ specimens using FISH in line with NCCN guidelines.
The cost-effectiveness analysis utilized a decision-analytic model. In the standard testing algorithm, tumor specimens from patients with a diagnosis of breast cancer would be tested for HER2 status using either FISH or IHC. Patients whose tumors were initially tested using FISH and were found to be positive received treatment with trastuzumab, while patients with tumors found to be FISH-negative did not receive further testing and were not treated with trastuzumab.
In the expanded reflex testing arm, tumor specimens of patients with a diagnosis of breast cancer were also tested for HER2 status using either FISH or IHC. Patients whose tumors were first tested using FISH and found to be FISH-positive were treated with trastuzumab. However, patients whose tumors were FISH-negative received a second test (IHC) to ascertain HER2 overexpression. Patients in this group whose tumors were IHC3+ were treated with trastuzumab, while patients whose tumors were negative or equivocal by the IHC test (IHC0, IHC1+, IHC2+) were not treated with trastuzumab. Patients whose tumors were initially tested with IHC and found to be IHC3+ were treated with trastuzumab, while patients with IHC0, IHC1+, and IHC2+ results received a second test and were tested with FISH. Of these patients, those whose tumors were found to be FISH-positive were treated with trastuzumab.
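The two testing arms can be summarized as a decision rule. The sketch below is an illustrative reading of the algorithms as described in the text, not the decision tree used in the analysis; the function names and the string encoding of results ("positive"/"negative" for FISH, "0"/"1+"/"2+"/"3+" for IHC) are assumptions made for the example, and equivocal FISH results are not represented.

```python
from typing import Optional


def needs_reflex_test(first_test: str, first_result: str, expanded: bool) -> bool:
    """Return True if the algorithm calls for a second, opposite test."""
    if first_test == "FISH":
        # Standard algorithm stops after FISH; expanded retests FISH-negatives by IHC.
        return expanded and first_result == "negative"
    if first_test == "IHC":
        if first_result == "3+":
            return False
        if first_result == "2+":
            return True  # IHC2+ is reflex-tested by FISH under both algorithms
        return expanded  # IHC0 and IHC1+ are retested only under expanded testing
    raise ValueError(f"unknown test: {first_test}")


def treat_with_trastuzumab(first_test: str, first_result: str,
                           reflex_result: Optional[str], expanded: bool) -> bool:
    """Apply the 'believe the positive' rule to one patient's test results."""
    if (first_test, first_result) in {("FISH", "positive"), ("IHC", "3+")}:
        return True
    if not needs_reflex_test(first_test, first_result, expanded):
        return False
    # A reflex IHC must be 3+ and a reflex FISH must be positive to trigger treatment.
    positive_reflex = "3+" if first_test == "FISH" else "positive"
    return reflex_result == positive_reflex


# An IHC1+ tumor that is FISH-positive is treated only under expanded reflex testing.
print(treat_with_trastuzumab("IHC", "1+", "positive", expanded=True))
print(treat_with_trastuzumab("IHC", "1+", "positive", expanded=False))
```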
Table 1 shows a summary of the parameters used in the decision-analytic model. In the base case, it was assumed that 80% of patient tumors are initially IHC-tested, and 20% FISH-tested. The probability of FISH positivity at initial testing was 0.207, while the distribution of IHC test results at initial testing was 36.1% IHC0, 35.5% IHC1+, 12.0% IHC2+, and 16.2% IHC3+. The probability of testing FISH-positive after initial IHC test was 1.6% if previously IHC0, 4.9% if previously IHC1+, 29.8% if previously IHC2+, and 92.4% if previously IHC3+. The distribution of IHC outcomes following a negative FISH test was as follows: IHC0 and IHC1+, 66.1%; IHC2+, 31.3%; and IHC3+, 2.6%. Following a HER2-positive test, we assumed that 95% of patients accept the recommendation to receive trastuzumab.
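To show how these probabilities combine, the following sketch computes the fraction of tested patients classified as HER2-positive under each algorithm. It is a simplified reading of the parameters quoted above, not the full decision-analytic model; the variable and function names are hypothetical, and the absolute levels it produces are not expected to match Table 2. Only the increment between the two algorithms is of interest here.

```python
P_IHC_FIRST = 0.80
P_FISH_FIRST = 0.20
P_FISH_POS_INITIAL = 0.207
IHC_DIST = {"0": 0.361, "1+": 0.355, "2+": 0.120, "3+": 0.162}
P_FISH_POS_GIVEN_IHC = {"0": 0.016, "1+": 0.049, "2+": 0.298, "3+": 0.924}
P_IHC3_GIVEN_FISH_NEG = 0.026


def positive_fraction(expanded: bool) -> float:
    """Fraction of tested patients classified HER2-positive (before treatment uptake)."""
    # IHC-first arm: IHC3+ is positive; IHC2+ is positive if the reflex FISH is positive.
    ihc_arm = IHC_DIST["3+"] + IHC_DIST["2+"] * P_FISH_POS_GIVEN_IHC["2+"]
    # FISH-first arm: positive on the initial FISH test.
    fish_arm = P_FISH_POS_INITIAL
    if expanded:
        # Expanded testing also retests IHC0/IHC1+ by FISH and FISH-negatives by IHC.
        ihc_arm += sum(IHC_DIST[s] * P_FISH_POS_GIVEN_IHC[s] for s in ("0", "1+"))
        fish_arm += (1 - P_FISH_POS_INITIAL) * P_IHC3_GIVEN_FISH_NEG
    return P_IHC_FIRST * ihc_arm + P_FISH_FIRST * fish_arm


increment = positive_fraction(True) - positive_fraction(False)
print(f"Additional HER2-positive patients: {increment:.2%}")
```

Under these assumptions the increment comes out at roughly 2.27% of tested patients, consistent with the base-case figure reported in this analysis.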
Table 1. Parameters of Decision-Analytic Model
Abbreviations: FISH, fluorescence in situ hybridization; IHC, immunohistochemistry; QALYs, quality-adjusted life years.
Costs were expressed in 2010 US dollars. Costs from earlier years were converted into 2010 costs using the US Consumer Price Index for health care. Costs were divided into costs of HER2 testing with FISH or IHC, costs of breast cancer treatment without trastuzumab, and costs of trastuzumab. The cost of a FISH test was $339, while the cost of an IHC test was $118.[16] The lifetime cost of treatment without trastuzumab was $31,371, while the lifetime cost of trastuzumab was estimated based on 2010 Average Sales Price payment allowance limits for Medicare Part B at $57,677.[18] Infusion costs are included as well. Because we adopted a US payer perspective, we did not consider indirect time or productivity costs (eg, those associated with reduced labor force participation).
Life Years and Quality-Adjusted Life Years
Treatment with chemotherapy alone is associated with 11.88 life years (LYs) and 10.08 quality-adjusted life years (QALYs), whereas treatment with chemotherapy and trastuzumab is associated with 13.72 LYs and 11.78 QALYs.
Based on recent epidemiological data, we projected that approximately 218,000 women with early stage breast cancer would have their tumors tested for HER2 status in 2011. Given the test result probabilities, we estimated the percentage and number of women who would have an initial false-negative result.
Robustness of results was assessed using sensitivity analysis. The percentage of patients whose tumors would be reclassified as HER2-positive by reflex testing was varied between 1% and 8% to determine the impact of this parameter on the cost-effectiveness of reflex testing. Parameter values were also varied across a range of plausible values to determine the impact of variation on the overall results. When data were available, the 95% confidence or credibility ranges were used, and when unavailable, the most plausible values were used. Univariate deterministic sensitivity analysis was performed by assessing sensitivities to individual parameters. To further test the robustness of our results, probabilistic sensitivity analysis was conducted. Probability distributions for all parameters in the model were created. The base-case value was used for the mean, and the standard error was estimated based on the approximation that the range used for 1-way sensitivity analyses represented a 95% confidence interval, with the range approximately equal to 4 times the standard error. A beta distribution was used for probabilities and utilities, and a gamma distribution was used for costs. Monte Carlo simulation was used to create 10,000 iterations for which the expected outcome values were calculated. The probability that expanded reflex testing was cost-effective was then calculated for various levels of willingness to pay per QALY gained. Data analysis was performed using Microsoft Excel and TreeAge Pro.
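The distribution-fitting step can be illustrated with a method-of-moments sketch using Python's standard library. This is not the TreeAge Pro implementation used in the analysis; the helper names are hypothetical, and the two parameter ranges in the example are made up for illustration because the ranges from Table 1 are not reproduced here.

```python
import random


def se_from_range(low: float, high: float) -> float:
    """Approximate the standard error as one quarter of the sensitivity range."""
    return (high - low) / 4.0


def beta_params(mean: float, se: float) -> tuple:
    """Method-of-moments alpha and beta for a probability or utility."""
    common = mean * (1 - mean) / se ** 2 - 1
    if common <= 0:
        raise ValueError("standard error too large for a beta distribution")
    return mean * common, (1 - mean) * common


def gamma_params(mean: float, se: float) -> tuple:
    """Method-of-moments shape and scale for a cost."""
    return (mean / se) ** 2, se ** 2 / mean


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical ranges, for illustration only.
a, b = beta_params(0.026, se_from_range(0.01, 0.05))
shape, scale = gamma_params(339.0, se_from_range(254.0, 424.0))

prob_draws = [rng.betavariate(a, b) for _ in range(10_000)]
cost_draws = [rng.gammavariate(shape, scale) for _ in range(10_000)]
print(sum(prob_draws) / len(prob_draws), sum(cost_draws) / len(cost_draws))
```

Each of the 10,000 iterations would draw one value per parameter in this way and re-evaluate the model, so that the spread of the resulting cost and QALY differences reflects joint parameter uncertainty.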
RESULTS
In the base case, retesting IHC0, IHC1+, or FISH-negative tumors of patients with early stage breast cancer for HER2 status (expanded reflex testing) is estimated to increase the proportion of early breast cancer patients eligible to be treated with trastuzumab from 19.12% to 21.31%, an increment of 2.27% (Table 2). Expanded reflex testing is estimated to cost an additional $1455 per patient treated with an expected gain of 0.040 LYs and 0.037 QALYs (Table 2). Consequently, the incremental cost per LY gained is $36,721 and the incremental cost per QALY gained is $39,745.
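The base-case ratios can be approximated with a back-of-envelope calculation from the parameters quoted in the Methods. The sketch below is a reconstruction under stated assumptions, not the authors' model: it assumes that the only incremental costs are the extra reflex tests and trastuzumab for the additional treated patients, that the cost of treatment without trastuzumab is the same in both arms and cancels out, and that the IHC test cost is $118. All variable names are hypothetical.

```python
# Inputs quoted in the text
P_IHC_FIRST, P_FISH_FIRST = 0.80, 0.20
P_IHC0, P_IHC1 = 0.361, 0.355
P_FISH_POS_IF_IHC0, P_FISH_POS_IF_IHC1 = 0.016, 0.049
P_FISH_NEG = 1 - 0.207
P_IHC3_IF_FISH_NEG = 0.026
UPTAKE = 0.95                 # share of HER2-positive patients accepting trastuzumab
COST_FISH = 339.0             # cost of a reflex FISH test (USD)
COST_IHC = 118.0              # cost of a reflex IHC test (USD), taken as $118
COST_TRASTUZUMAB = 57_677.0   # lifetime cost of trastuzumab (USD)
LY_GAIN = 13.72 - 11.88       # life years gained per additional treated patient
QALY_GAIN = 11.78 - 10.08     # QALYs gained per additional treated patient

# Additional HER2-positive patients found by retesting IHC0, IHC1+, and FISH-negative tumors
extra_positive = (P_IHC_FIRST * (P_IHC0 * P_FISH_POS_IF_IHC0 + P_IHC1 * P_FISH_POS_IF_IHC1)
                  + P_FISH_FIRST * P_FISH_NEG * P_IHC3_IF_FISH_NEG)
extra_treated = extra_positive * UPTAKE

# Extra tests: FISH for IHC0/IHC1+ tumors, IHC for FISH-negative tumors
extra_test_cost = (P_IHC_FIRST * (P_IHC0 + P_IHC1) * COST_FISH
                   + P_FISH_FIRST * P_FISH_NEG * COST_IHC)

incremental_cost = extra_test_cost + extra_treated * COST_TRASTUZUMAB
icer_per_ly = incremental_cost / (extra_treated * LY_GAIN)
icer_per_qaly = incremental_cost / (extra_treated * QALY_GAIN)

print(f"Incremental cost per patient tested: ${incremental_cost:,.0f}")
print(f"ICER: ${icer_per_ly:,.0f} per LY, ${icer_per_qaly:,.0f} per QALY")
```

Under these assumptions the calculation lands close to the reported incremental cost and ICERs, which suggests that the result is driven almost entirely by testing costs, drug cost, and the per-patient health gain.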
Table 2. Cost-Effectiveness Results
Columns: Current Testing Algorithm; Expanded Reflex Testing. Row: % Treated with trastuzumab.
Abbreviations: LYs, life years; QALYs, quality-adjusted life years.
Values may be affected by rounding error.
Using the base case test result probabilities, we estimated that 2.27% of women would have a false-negative result in 2011. This would affect approximately 4700 women annually in the United States.
Varying the incremental proportion of women who were included in the trastuzumab treatment pool from 1% to 8% resulted in an incremental cost-effectiveness ratio (ICER) ranging from $47,110 per QALY to $35,579 per QALY (Figure 3), and the extra number of women treated annually in the United States varied from 2180 to 17,440.
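The direction of this relationship follows from the cost structure: the cost of the extra tests is fixed per patient tested, whereas drug costs and health gains both scale with the proportion reclassified. The sketch below illustrates this under the same simplifying assumptions as a back-of-envelope calculation (incremental costs limited to reflex tests and trastuzumab, IHC test cost taken as $118); it is not the authors' model, and the names are hypothetical.

```python
# Fixed cost of the extra reflex tests per patient tested (USD)
EXTRA_TEST_COST = 0.80 * (0.361 + 0.355) * 339.0 + 0.20 * (1 - 0.207) * 118.0
UPTAKE = 0.95
COST_TRASTUZUMAB = 57_677.0
QALY_GAIN = 11.78 - 10.08
WOMEN_TESTED = 218_000


def icer_per_qaly(reclassified: float) -> float:
    """ICER when a given fraction of tested patients is reclassified HER2-positive."""
    treated = reclassified * UPTAKE
    incremental_cost = EXTRA_TEST_COST + treated * COST_TRASTUZUMAB
    return incremental_cost / (treated * QALY_GAIN)


for fraction in (0.01, 0.0227, 0.08):
    women = WOMEN_TESTED * fraction
    print(f"{fraction:.2%}: ${icer_per_qaly(fraction):,.0f} per QALY, {women:,.0f} women")
```

As the reclassified proportion rises, the fixed testing cost is spread over more treated patients, so the ICER falls toward the ratio of drug cost to QALY gain.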
One-way sensitivity analysis of key parameters is illustrated as a tornado diagram (Figure 4). The ICER ($/QALY) was most sensitive to the probability that the reflex IHC test was IHC0 or IHC1+ if the index FISH test was negative and the utility associated with chemotherapy with trastuzumab. The ICER ($/QALY) was more robust to other probabilities, utility, life expectancy, and costs.
Probabilistic sensitivity analysis (Figure 5) showed that at a conservative cost-effectiveness threshold of $50,000 per QALY, the probability that expanded reflex testing would be cost-effective is approximately 87%; at a threshold of $100,000 per QALY, the cost-effectiveness would be nearly certain.
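The probability of cost-effectiveness at a given threshold is the share of simulated iterations with a positive net monetary benefit. A minimal sketch follows; the function name is hypothetical and the four (incremental cost, incremental QALY) pairs are made-up values for illustration, not simulation output from this analysis.

```python
def prob_cost_effective(draws, willingness_to_pay: float) -> float:
    """Share of iterations in which net monetary benefit is positive.

    Each draw is a pair (incremental cost in USD, incremental QALYs).
    """
    favorable = sum(1 for d_cost, d_qaly in draws
                    if willingness_to_pay * d_qaly - d_cost > 0)
    return favorable / len(draws)


# Illustrative draws only; a real analysis would use all 10,000 iterations.
draws = [(1455.0, 0.037), (1600.0, 0.030), (1300.0, 0.045), (2000.0, 0.025)]

for threshold in (50_000, 100_000):
    print(threshold, prob_cost_effective(draws, threshold))
```

Evaluating this share over a range of thresholds traces out a cost-effectiveness acceptability curve.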
DISCUSSION
In this study, the projected potential lifetime cost per QALY for implementing expanded reflex testing was estimated to be $39,745, a level that is below commonly accepted thresholds for cost-effectiveness for health care interventions.[21, 22] This projected benefit is driven by the additional yield of patients who are eligible for trastuzumab therapy: a 2.27% increase in the proportion of invasive breast cancer patients receiving this treatment. This increase amounts to 4700 additional women in the United States who would receive trastuzumab therapy annually in the base case. Over their lifetimes, they would be projected to gain an additional 9016 life years and 8330 QALYs (both discounted to a present value basis). The magnitude of this benefit is uncertain and could range from 2000 to 17,000 women within the plausible range. Nonetheless, the benefits are substantial and imply that the potential additional yield that results from reducing false-negative cases at the expense of increasing false-positive cases is cost-effective and may justify a change in clinical policy.
A recent retesting study performed in the VIRGO observational cohort, which included a substudy of 499 patients whose tumors were HER2-negative when locally tested, showed that 22 (4%) patients were found to be HER2-positive according to central laboratory testing. Importantly, of these 22 patients, 15 (68%) were found to be positive using a test that was not performed locally. This finding suggests that the 2.27% base case projection is likely to be conservative.[23-25] At that false-negative probability, the lifetime cost per QALY would improve, falling by 7% to about $37,000.
Our analysis implicitly assumes that all of the additional patients identified by reflex testing would accrue the additional benefit due to trastuzumab: that is, they would have the same biological response to treatment as patients identified in the definitive trials, who were identified by the current testing algorithm. This assumption is supported by data from the adjuvant N9831 trial and the HERA trial.
The other assumption implicit in our modeling framework is that there is no differential uptake or acceptance of trastuzumab treatment between women identified by the current testing algorithm and the extra women identified as a result of expanded reflex testing. This assumption is plausible because the implementation of reflex testing is likely to be a decision that is moved from the patient to the physicians. The pathologist would most likely be the decision maker and would simply perform any revised testing protocol: they would test each negative sample with the opposite test, and they would recommend trastuzumab to patients testing positive, regardless of whether they tested positive on the index or the reflex test, unless there are contraindications.
Misclassification of HER2 results can arise from the quality of laboratory testing and from the test itself. Misclassification due to quality of laboratory testing may be remedied by standardizing laboratory test procedures or repeating tests with independent pathologists. However, misclassification due to the test itself requires the development of a perfect test, which may not be feasible. Such a test might also be prohibitively costly. Although a few authors have suggested that FISH is superior to IHC in reproducibility and precision, recent data contradict this assumption; other data also suggest that both tests are similar in terms of reproducibility. Because both IHC and FISH retain a level of subjectivity in their interpretation, reflex testing might offer a cost-effective compromise between allowing women with potential false-negative tumor results to miss out on the benefits of trastuzumab and requiring perfect sensitivity and specificity, in which there are no false-positive or false-negative results. Another source of potential misclassification of HER2 results is intratumoral heterogeneity, and one might consider retesting excision specimens for HER2 status even if tumors tested HER2-negative at the initial biopsy.
The “believe the positive” approach, the argument behind expanded reflex testing, implies that patients who are incorrectly diagnosed as being HER2-positive in the face of imperfect tests are treated with trastuzumab, and data suggest that they benefit from this treatment. This is a testament to the difficulty of defining false-positivity for FISH and IHC in this early breast cancer context given that they are based on subjective interpretation of pathological samples and that a true gold standard may not exist. In theory, patients incorrectly diagnosed as being HER2-positive would receive trastuzumab inadvertently and would face both an unnecessary incremental cost (of medication and adverse events) and the disutility of adverse events. We find no evidence that this is the case, and it has been suggested that misclassification leading to missed HER2 cases has worse consequences than misclassification leading to HER2-negative cases getting treatment with trastuzumab because of recent advances in reducing trastuzumab-related level 3 and 4 cardiotoxicity.
We found no economic evaluations of expanded reflex testing in the literature. Other economic evaluations of HER2 testing have attempted to identify the most cost-effective combination of FISH and IHC to minimize both false-positive and false-negative results. Dendukuri et al compared the cost-effectiveness of the following strategies for identification of early stage breast cancer patients for treatment with trastuzumab in Ontario: 1) IHC followed by FISH for IHC2+ patients; 2) IHC only with IHC2+ and 3+ receiving trastuzumab; 3) IHC only with IHC3+ receiving trastuzumab; 4) IHC followed by confirmation of IHC1+ and IHC2+ by FISH; 5) IHC followed by confirmation of IHC2+ and IHC3+ by FISH; 6) IHC followed by confirmation of IHC1+, IHC2+, and IHC3+ by FISH; and 7) FISH only. Each was compared with strategy 1 (IHC followed by FISH for IHC2+ patients) in terms of incremental cost per accurate diagnosis. The investigators found that confirmation of HER2 status by FISH in IHC3+ patients was optimal, reducing the false-positive results to 0% and increasing the percentage accurately determined to 97.6% at an ICER of $6175 per case of HER2 status accurately determined. This study differs from our study because expanded reflex testing does not seek to maximize accuracy but to maximize yield of HER2 cases (ie, minimize false-negative results). This is the reason that our modeling framework allows for the treatment of patients with initially false-positive results. Additionally, their study considered only testing costs and disregarded costs of treatment, measuring cost per accurate test as the outcome, whereas our study used cost per QALY.
Blank et al used a lifelong Markov model to assess the cost-effectiveness of HER2 tumor testing strategies comparing IHC, FISH, the combination of the two, or FISH confirmation of IHC2+ in the Swiss healthcare system. They found that FISH alone is the most cost-effective at €12,245 per QALY. Their optimal strategy would also differ from expanded reflex testing because the up to 5%-8% of patients with false-negative tests (ie, tumors that are FISH-negative and IHC-positive) would not receive trastuzumab if FISH alone is used. The advantage of expanded reflex testing is that it further reduces the chance of false-negative results, albeit at the cost of treating some patients with false-positive results with trastuzumab.
One limitation of this analysis is that in taking the US payer perspective, we do not consider indirect time or productivity costs. In our previous analysis of trastuzumab in adjuvant therapy, we estimated that the time and travel costs associated with trastuzumab would be about $2000 over a lifetime, raising lifetime costs and the cost-effectiveness ratio only slightly (by 4.6%) to about $28,000 per QALY. This would still be considered cost-effective, but it also ignores any indirect benefits in terms of improved productivity or labor force attachment over the remainder of the patient's working life, which could be very substantial.
Several factors will affect the potential implementation of expanded reflex testing. In practice, many laboratories only have the capacity to perform IHC, given the additional equipment and expertise requirements for implementing FISH. Implementing expanded reflex testing will require improvements in capacity and investment in equipment, which may be costly. However, this would be less expensive than implementing FISH as the index test for HER2 status, as has been suggested, because the percentage of patients receiving the FISH test rises from 26.6% under the current testing algorithm to 86.9% under expanded reflex testing (compared with 100% if FISH were the index test). This would require increased capacity and reimbursement for extra tests. Furthermore, there remains uncertainty about whether particular payers will be willing to accept these changes. The US Food and Drug Administration has approved additional HER2 tests, such as silver in situ hybridization, chromogenic in situ hybridization (CISH), and dual in situ hybridization. These new tests may reduce analytic burden (eg, CISH requires widely available light microscopes instead of expensive fluorescent microscopes), thereby potentially reducing the costs of expanded reflex testing.
In addition to identifying patients for treatment with trastuzumab, HER2 testing is already being used to identify patients for treatment with the tyrosine kinase small molecule inhibitor lapatinib (Tykerb; GlaxoSmithKline, Philadelphia, PA), which has been approved recently in adjuvant and neoadjuvant metastatic disease settings. The emergence of newer therapies for adjuvant and metastatic treatment of breast cancer that target HER2 may impact the potential clinical utility and cost-effectiveness of expanded reflex testing, particularly if these drugs lead to additional gains in survival or QALYs, implying that the potential impact of false-negative results on patients is further amplified in LY or QALY terms.
HER2 testing is a cornerstone of treatment of patients with breast cancer. While accuracy in testing is important, a balance between having false-negative HER2 results and treating false-positive patients with trastuzumab is an important clinical and policy decision. Expanded reflex testing is projected to be cost-effective at $39,745 per QALY gained, and would affect approximately 4700 women annually in the United States. The estimated benefit would be even greater in younger patients. This base case projects a false-negative probability of 2.27%, but this may be a conservative figure given recent empirical evidence. Expanded reflex testing allows for a second opportunity to measure HER2 status accurately, correcting both handling errors and testing inconsistency.
Unrestricted support by Genentech, Inc., to VeriTech Corporation.
CONFLICT OF INTEREST DISCLOSURES
Dr. Brammer and Dr. Lalla are employed by Genentech, a member of the Roche Group, and own stock in Roche. Dr. Garrison has received honoraria from Genentech to plan, conduct, prepare, and present the current analyses and has acted as a consultant for Genentech and Roche Portugal and as a member of the Speakers Bureau for Roche Portugal. Dr. Wang has received honorarium from Genentech to plan, conduct, prepare, and present the current analyses and has acted as a consultant for Genentech. Dr. Babigumira has received a consulting fee/honorarium from Genentech.