Validity of the prognostication tool PREDICT version 2.2 in Japanese breast cancer patients

Abstract Introduction PREDICT is a prognostication tool that calculates the potential benefit of various postsurgical treatments on the overall survival (OS) of patients with nonmetastatic invasive breast cancer. Once patient, tumor, and treatment details have been entered, the tool shows the estimated 5‐, 10‐, and 15‐year OS outcomes, both with and without adjuvant therapies. This study aimed to conduct an external validation of the prognostication tool PREDICT version 2.2 by evaluating its predictive accuracy for the 5‐ and 10‐year OS outcomes among female patients with nonmetastatic invasive breast cancer in Japan. Methods All female patients who were diagnosed from 2001 to 2013 with unilateral, nonmetastatic, invasive breast cancer and who had undergone surgical treatment at Kyushu University Hospital, Fukuoka, Japan, were selected. Observed and predicted 5‐ and 10‐year OS rates were analyzed for the validation population and the subgroups. Calibration and discriminatory accuracy were assessed using the Chi‐squared goodness‐of‐fit test and the area under the receiver operating characteristic curve (AUC). Results A total of 636 eligible cases were selected from 1,213 records. Predicted and observed OS differed by 0.9% (p = 0.322) for 5‐year OS, and by 2.4% (p = 0.086) for 10‐year OS. Discriminatory accuracy for 5‐year (AUC = 0.707) and 10‐year (AUC = 0.707) OS was fairly good. Conclusion The PREDICT tool accurately estimated the 5‐ and 10‐year OS in the overall Japanese study population. However, caution should be used when interpreting the 5‐year OS outcomes in patients who are ≥65 years old, and the 10‐year OS outcomes in patients who are ≥65 years old, in those with histologic grade 3 or Luminal A tumors, and in those considering ETx or no systemic treatment.


| INTRODUCTION
Breast cancer is the most common cancer in women worldwide. It is also the most frequent cause of cancer mortality in women regardless of race or ethnicity. In 2018, over 2 million newly diagnosed cases of breast cancer and more than 600,000 breast cancer-specific deaths were reported worldwide. 1 In Japan, a total of 92,253 new breast cancer cases and 14,285 deaths were recorded in 2017. 2 Adjuvant systemic treatment is systemic therapy given after surgery to stop or prevent micrometastasis. It has been proven to reduce the risk of recurrence and breast cancer mortality. 3,4 Accurate appraisal of prognosis and of the prospective benefit from additional postsurgical systemic therapy could help minimize undertreatment and overtreatment. This can help both physicians and patients choose the treatment option that provides the optimal therapeutic benefit while reducing side effects and maintaining quality of life. 5,6 Several prediction models, such as Adjuvant! Online and PREDICT, have been created to assist in deciding which adjuvant systemic therapy is most suitable for a patient based on patient and tumor characteristics. In comparison with Adjuvant! Online, PREDICT incorporates other factors such as mode of detection, Ki-67 status, and HER2 status; however, it does not include the patient's comorbidities. 7 Validation studies in Asian patients have shown that Adjuvant! Online was overoptimistic in predicting overall survival. 8 Another study in Southeast Asian patients showed that PREDICT was accurate in most subgroups of patients, but was overoptimistic in young patients (<40 years) and in those receiving neoadjuvant chemotherapy. 7 Adjuvant! Online is a well-known and widely used clinical prognostication model; however, it has been unavailable for quite some time. 9,10 Currently, PREDICT is the only free online prediction tool that has been advocated by the American Joint Committee on Cancer. 11
PREDICT is a prognostication tool that shows how various postsurgical treatments can potentially improve the overall survival (OS) of patients with nonmetastatic invasive breast cancer. Once patient, tumor, and treatment details have been entered, the tool shows the estimated 5-, 10-, and 15-year OS rates, with and without postoperative treatments (systemic chemotherapy, hormonal therapy, trastuzumab, and bisphosphonates). Results are displayed in visual and textual formats. 12 PREDICT was a collaboration between the Cambridge Breast Unit, the University of Cambridge Department of Oncology, and the UK's Eastern Cancer Information and Registration Centre. The original model was based on cancer databank information from 5694 women treated in East Anglia from 1999 to 2003, and was validated using data from more than 5000 patients with breast cancer from the West Midlands Cancer Intelligence Unit. 12,13 The estimated benefits of the treatments were based on the Early Breast Cancer Trialists' Collaborative Group meta-analyses of clinical trials. 4 The pioneer model (version 1) of PREDICT was made available online in 2010, and the tool has become more well-known since then. 13,14 Several adjustments and updates have since been made that helped improve the tool's OS estimates. 15 The initial upgrade in 2011, version 1.1, incorporated human epidermal growth factor receptor 2 (HER2) status and included the estimated treatment effect if trastuzumab is given. 12,16 In version 1.2, the proliferation marker Ki-67 was added as a prognostic parameter. 17 In 2017, the model was refitted (version 2.0), with improvements relating to age at diagnosis, exact tumor size in millimeters, and number of positive lymph nodes. 12,18 In version 2.1, bisphosphonates were added as a treatment option and 15-year outcomes were added. In the latest model, version 2.2, an option for extended hormonal therapy for an additional 5 years was added. 12
PREDICT version 1 was validated in several studies from various countries, such as Canada, 19 the Netherlands, 9,20 Malaysia, 7 and the United Kingdom. 17,21 However, the validity of PREDICT has not yet been verified in the general population of Japanese breast cancer patients.
This study aimed to conduct an external validation of the prognostication tool PREDICT version 2.2 by evaluating its predictive accuracy for the 5- and 10-year OS outcomes among female patients with nonmetastatic invasive breast cancer in Japan.

| Patient selection
Data were retrieved from a hospital registry of consecutive patients diagnosed with breast cancer, identified through the breast cancer databank of Kyushu University Hospital (KUH), Fukuoka, Japan. All female patients who were diagnosed from 2001 to 2013 with unilateral, nonmetastatic, invasive breast cancer and who underwent corresponding surgery (mastectomy or breast-conserving surgery) were selected. Patients with unknown age, tumor size, tumor grade, estrogen receptor (ER) status, number of positive lymph nodes, or adjuvant treatment were excluded because these are required variables in PREDICT, and the tool does not accept missing values for them. Patients aged <25 or >80 years were also excluded, since PREDICT only accepts ages within that range. Follow-up data for each patient, including date of last follow-up and date of death, were acquired from the same breast cancer databank, and patients with unknown follow-up time were excluded.
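The selection rules above can be expressed as a simple record filter. The following is a minimal sketch only: the field names are hypothetical and do not reflect the actual structure of the KUH databank, and the sex, laterality, and invasiveness criteria are assumed to have been applied beforehand.

```python
# Hypothetical field names; the real databank schema is not described in the text.
REQUIRED_FIELDS = (
    "age",
    "tumor_size_mm",
    "tumor_grade",
    "er_status",
    "positive_nodes",
    "adjuvant_treatment",
    "follow_up_years",
)

MIN_AGE, MAX_AGE = 25, 80  # age range accepted by PREDICT, as stated in the text


def is_eligible(record: dict) -> bool:
    """Return True if the record has every required variable and an accepted age."""
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        return False  # a required variable is missing
    return MIN_AGE <= record["age"] <= MAX_AGE


example = {
    "age": 52,
    "tumor_size_mm": 18,
    "tumor_grade": 2,
    "er_status": "positive",
    "positive_nodes": 0,
    "adjuvant_treatment": "ETx",
    "follow_up_years": 7.5,
}
print(is_eligible(example))
```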

The treatment approach for the patients was based on the National Comprehensive Cancer Network (NCCN) clinical practice guidelines for breast cancer, 22

| Follow-up
The total duration of inclusion of each patient in the study was computed from the date of surgical treatment for breast cancer until death, until censoring at the completion of the follow-up period (October 1, 2019), or until 10 years of follow-up was reached, whichever came first.
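The follow-up rule can be sketched as a small function that returns the observed time and an event flag for each patient. This is an illustration under stated assumptions, not the authors' procedure: the function name and the 365.25-day year are our own choices, and earlier loss to follow-up is not modeled.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_END = date(2019, 10, 1)  # completion of the follow-up period (from the text)
MAX_YEARS = 10.0               # follow-up is capped at 10 years (from the text)
DAYS_PER_YEAR = 365.25         # assumption: average calendar year length


def follow_up(surgery: date, death: Optional[date] = None) -> Tuple[float, bool]:
    """Return (years of follow-up, True if death was observed within follow-up)."""
    end = min(death, STUDY_END) if death is not None else STUDY_END
    event = death is not None and death <= STUDY_END
    years = (end - surgery).days / DAYS_PER_YEAR
    if years > MAX_YEARS:
        # Anything beyond 10 years is censored at the 10-year mark.
        return MAX_YEARS, False
    return years, event


print(follow_up(date(2012, 1, 1), date(2014, 1, 1)))  # death observed
print(follow_up(date(2005, 1, 1)))                    # censored at 10 years
```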

| Immunohistochemistry (IHC) staining
The tumor subtypes were determined using IHC staining of the resected specimens. The tissue specimens used for IHC were placed in 10% neutral-buffered formalin within 1 h of surgical removal and fixed for 6 to 72 h. Specimens were interpreted as ER-positive or progesterone receptor (PR)-positive when ≥1% of tumor cells stained positive for ER or PR, respectively. 28 Tissue specimens were labeled as HER2-positive when IHC staining yielded a score of 3+ based on the standard criteria, or when the score was 2+ and fluorescence in situ hybridization showed HER2 gene amplification. 29,30 Tissue specimens were labeled as luminal B when Ki-67 status was high (>20%) or PR status was low (<20%) in ER-positive disease.
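The labeling rules in this section can be written as small predicates. The sketch below encodes only the thresholds stated in the text; the function names are illustrative, and treating an unknown Ki-67 value as "not high" is an assumption that follows the handling described in the Discussion.

```python
from typing import Optional


def is_receptor_positive(stained_pct: float) -> bool:
    """ER or PR is positive when >=1% of tumor cells stain positive."""
    return stained_pct >= 1.0


def is_her2_positive(ihc_score: int, fish_amplified: Optional[bool] = None) -> bool:
    """HER2-positive: IHC 3+, or IHC 2+ with gene amplification on FISH."""
    if ihc_score == 3:
        return True
    return ihc_score == 2 and fish_amplified is True


def luminal_subtype(er_pct: float, pr_pct: float,
                    ki67_pct: Optional[float]) -> Optional[str]:
    """Return 'Luminal A' or 'Luminal B' for ER-positive disease, else None."""
    if not is_receptor_positive(er_pct):
        return None
    # Assumption: an unknown Ki-67 value is treated as not high.
    high_ki67 = ki67_pct is not None and ki67_pct > 20.0
    low_pr = pr_pct < 20.0
    return "Luminal B" if (high_ki67 or low_pr) else "Luminal A"


print(luminal_subtype(er_pct=90, pr_pct=50, ki67_pct=None))
print(is_her2_positive(2, fish_amplified=True))
```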

| PREDICT scores
Predicted OS rates were obtained by manually entering the necessary details for each patient in the PREDICT tool (version 2.2), with blinding to patient outcomes. If information was missing for any of the nonmandatory variables, such as menopausal status, HER2 status, Ki-67 status, mode of detection, and presence of lymph node micrometastasis (if only one positive lymph node was harvested), the patient was not excluded from the study; the "unknown" option was selected instead. Ki-67 status was not routinely requested until late 2010; hence, for all cases before that, the "unknown" tab was selected for this variable.
The resulting predicted 5- and 10-year OS outcomes based on the actual therapy given to each patient were documented. Since the prognostic variables were manually entered in PREDICT, the results are prone to data-entry error. Hence, all the PREDICT scores were calculated three times to ensure the accuracy of the obtained data.

| Statistical analyses
All statistical analyses were done in IBM SPSS Statistics version 25. A p-value of ≤ 0.05 was defined as statistically significant.
The Kaplan-Meier method was used for the survival analysis. The 5- and 10-year observed OS rates for the whole validation population and the subgroups were based on the survival estimates from the Kaplan-Meier curve. The median values were used for the predicted 5- and 10-year OS outcomes for the whole population and for the subgroups. 7 To evaluate the goodness-of-fit of the tool, the observed and predicted events for the entire study population as well as for all subgroups were compared using the Chi-squared test. In line with the Dutch validation study, an a priori assumption was set that the PREDICT tool correctly prognosticated the OS rates if the difference between the predicted and observed outcomes was not more than 5%, since a larger difference would be considered clinically relevant. 6 Calibration of the model was assessed using the Chi-squared test and a calibration plot of the survival rates. The entire study population was first binned into quintiles of the predicted survival rates. A calibration plot was then made showing the observed 5- and 10-year OS outcomes per quintile against the median of the predicted OS outcomes per quintile. 7,16 To further evaluate the effect of endocrine therapy (ETx) on OS, model calibration was stratified by the presence and duration of ETx given.
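The observed survival estimate and the quintile-based calibration points can be illustrated in plain Python. The study itself used IBM SPSS, so this is only a sketch of the general technique: the function names are ours, and the handling of ties and bin boundaries is one reasonable choice among several.

```python
from statistics import median
from typing import List, Sequence, Tuple


def km_survival(times: Sequence[float], events: Sequence[bool],
                horizon: float) -> float:
    """Kaplan-Meier estimate of the survival probability at `horizon`."""
    surv = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - deaths / at_risk
    return surv


def calibration_points(predicted: Sequence[float], times: Sequence[float],
                       events: Sequence[bool], horizon: float,
                       bins: int = 5) -> List[Tuple[float, float]]:
    """Return (median predicted OS, observed KM OS) for each bin of predicted OS."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    points = []
    for b in range(bins):
        idx = order[b * n // bins:(b + 1) * n // bins]
        if not idx:
            continue  # fewer patients than bins
        pred = median(predicted[i] for i in idx)
        obs = km_survival([times[i] for i in idx],
                          [events[i] for i in idx], horizon)
        points.append((pred, obs))
    return points


# Toy data: five patients, two deaths, survival evaluated at 5 years.
print(km_survival([1, 2, 3, 4, 5], [True, False, True, False, False], 5))
```

Plotting the returned pairs against the line x = y gives a calibration plot of the kind described above.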
The discriminatory performance of PREDICT was assessed using receiver operating characteristic (ROC) curve analysis and by computing the area under the ROC curve (AUC) for both the 5- and 10-year OS. A plot was made comparing the proportion of patients who were alive at the time point of interest and were prognosticated accurately (sensitivity) against the proportion of patients who were deceased but were prognosticated to be alive (1-specificity). The AUC was used to measure the discriminatory accuracy of the tool, and can be interpreted as the probability that a randomly selected patient who was alive at 5 or 10 years was assigned a higher predicted survival than a randomly selected patient who had died. For an informative model, the AUC lies between 0.5 and 1, wherein a value of 0.5 suggests that the model has no capacity for discrimination, and a value of 1.0 suggests perfect discrimination. 6,7
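The AUC can be computed directly from its pairwise interpretation. The sketch below is not the SPSS routine used in the study: it assumes that each patient's vital status at the time point of interest is known, so patients censored before that time point would have to be handled separately.

```python
from typing import Sequence


def auc(pred_survival: Sequence[float], alive: Sequence[bool]) -> float:
    """Pairwise (Mann-Whitney) AUC for predicted survival versus vital status.

    Counts the share of (alive, deceased) pairs in which the patient who was
    alive had the higher predicted survival; ties count as half.
    """
    pos = [p for p, a in zip(pred_survival, alive) if a]
    neg = [p for p, a in zip(pred_survival, alive) if not a]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example: the two survivors have the two highest predicted survival values.
print(auc([0.9, 0.8, 0.7, 0.6], [True, True, False, False]))
```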

| RESULTS
A total of 1213 patients diagnosed with breast cancer who had undergone surgical treatment from 2001 to 2013 were identified. Male patients (n = 9), those with bilateral breast cancer (n = 13), and those with noninvasive breast cancer (n = 106) were excluded, since the data on which PREDICT was based did not include patients with these characteristics. Patients with unknown age (n = 1), tumor size and grade (n = 188), ER status (n = 227), number of positive lymph nodes (n = 26), chemotherapy regimen (n = 3), or follow-up date (n = 4) were also excluded because these are mandatory variables in PREDICT, and the tool does not accept missing values for them. After all the exclusions, a total of 636 patients remained in the study population (Figure 1).
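The exclusion counts reported above can be checked with simple arithmetic. The snippet assumes that the exclusion categories are mutually exclusive, which the reported total implies.

```python
TOTAL_RECORDS = 1213

# Exclusion counts as reported in the text.
EXCLUSIONS = {
    "male": 9,
    "bilateral breast cancer": 13,
    "noninvasive breast cancer": 106,
    "unknown age": 1,
    "unknown tumor size and grade": 188,
    "unknown ER status": 227,
    "unknown number of positive lymph nodes": 26,
    "unknown chemotherapy regimen": 3,
    "unknown follow-up date": 4,
}

excluded = sum(EXCLUSIONS.values())
eligible = TOTAL_RECORDS - excluded
print(excluded, eligible)
```

The result agrees with the 636 eligible patients stated in the text.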
Observed and predicted 5- and 10-year OS outcomes of the whole population and of the subgroups are presented in the accompanying tables. PREDICT accurately prognosticated the overall short-term survival of the study population. The difference between the observed 5-year OS (94.6%) and the predicted 5-year OS (93.7%) was only 0.9% (p = 0.322), which was not statistically significant. The largest difference was noted in the subgroup with 4-9 positive lymph nodes, in which the OS was underestimated by 10.2% (p = 0.12); however, this difference was also not statistically significant. The 5-year OS was significantly underestimated in patients who were ≥65 years old (6.7%, p = 0.004), in those with Luminal A subtype tumors (2.85%, p = 0.021), and in those who received ETx only as adjuvant systemic therapy (2.5%, p = 0.032). However, among these, only the ≥65 years old subgroup had a difference over 5% (Table 1).
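The two criteria used to read these results, statistical significance (p ≤ 0.05) and the a priori 5% threshold for clinical relevance, can be applied to the 5-year figures quoted in this paragraph. This is an illustrative sketch; the labels and the helper name are ours.

```python
ALPHA = 0.05          # significance level stated in the statistical analyses
THRESHOLD_PCT = 5.0   # a priori limit for a clinically relevant difference

# (subgroup, absolute OS difference in percentage points, p-value), from the text
FIVE_YEAR = [
    ("Overall", 0.9, 0.322),
    ("4-9 positive lymph nodes", 10.2, 0.12),
    ("Age >=65 years", 6.7, 0.004),
    ("Luminal A", 2.85, 0.021),
    ("ETx only", 2.5, 0.032),
]


def assess(diff_pct: float, p_value: float) -> dict:
    """Flag a difference as statistically significant and/or clinically relevant."""
    return {
        "significant": p_value <= ALPHA,
        "clinically_relevant": diff_pct > THRESHOLD_PCT,
    }


flagged = [name for name, diff, p in FIVE_YEAR if all(assess(diff, p).values())]
print(flagged)
```

With these inputs, only the ≥65 years old subgroup satisfies both criteria, in line with the paragraph above.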
[...] and 172 patients (36%) extended it up to 10 years. Figure 2 shows that the calibration of 5-year OS versus 10-year OS was accurate for the higher quintiles of PREDICT score and less accurate for the lower quintiles. The use of ETx and its extension up to 10 years were associated with more accurately predicted OS rates. Statistical analysis revealed that the 5-year OS (p = 0.322) and the 10-year OS (p = 0.086) were not significantly different from the line of perfect calibration (x = y).

| DISCUSSION
This study shows that PREDICT version 2.2 can accurately prognosticate the 5- and 10-year OS in the whole study population and in several subgroups. The 5-year OS outcomes were prognosticated accurately, except for the ≥65 years old subgroup, wherein the OS was underestimated by 6.7%. The 5-year OS differences for the Luminal A subtype and ETx-only subgroups were below 5% but were statistically significant, and they increased for the 10-year OS. The 10-year OS outcomes were predicted quite well, although significant underestimations were observed in several subgroups, including patients ≥65 years old, those with histologic grade 3 or Luminal A tumors, and those who received ETx or no systemic treatment. The discrepancies may be attributed primarily to the differences between the study population (Japan) and the population on which the PREDICT tool was based (United Kingdom). In 2018, the life expectancy of females in Japan was 87 years, while in the United Kingdom it was 83 years. 31 The higher life expectancy of the Japanese population could have contributed to the underestimation by PREDICT in patients ≥65 years old. The higher observed survival may also be due to the high proportion of censored data. Only 48 patients reached the event (death), and the other 588 were censored.
The Luminal A subtype generally has better survival than the other molecular subtypes. The underestimation in this subgroup could have been affected by the unknown Ki-67 value for the majority of patients. Introduction of Ki-67 testing into clinical practice was delayed in Japan, including at KUH, until around 2011, because most clinicians had been skeptical of the significance of Ki-67 expression until that time. Hence, Ki-67 status was unknown in 70.6% of patients in this study. Currently, a Ki-67 value of >20% classifies an ER-positive and PR-positive tumor as Luminal B rather than Luminal A; if Ki-67 is unknown, the tumor is categorized as Luminal A.
The underestimation of long-term survival in patients with histologic grade 3 tumors may be due to the limited representation of this group in the validation population. It may likewise be due to variations in treatment approaches or to differences in other prognostic variables between the study population and the tool development population.
PREDICT also underestimated the impact of breast-conserving surgery, no systemic therapy, ETx only, and 10 years of ETx on survival. These approaches are usually used in patients with lower-risk tumors, which explains the higher survival. Since PREDICT was based on a population gathered from 1999 to 2003, patients who had these treatments were probably underrepresented. Nowadays, substantially more patients are being treated with breast-conserving techniques and endocrine therapy.
The key strengths of this study are the large population size and the nearly complete data on nonrequired yet important variables in PREDICT, such as mode of detection and HER2 status. To our knowledge, this is the first validation study of the PREDICT tool version 2.2 based on a Japanese population. A limitation of this study is the lack of Ki-67 data for the majority of the validation population. Testing for Ki-67 was not routine at KUH, or throughout Japan, until the latter months of 2010. Ki-67 plays a significant role in breast cancer prognosis, 32 and the missing Ki-67 data for the majority of patients may have affected the results. Another weakness of this study is the heavy censoring in the validation population, which might have affected the observed survival estimates based on the Kaplan-Meier curve.

| CONCLUSION
The PREDICT tool accurately estimated the 5- and 10-year OS rates in the entire Japanese validation population. Hence, PREDICT may be considered a valid prognostication tool for breast cancer patients in Japan. However, caution should be used when interpreting the 5-year OS outcomes in patients who are ≥65 years old, and the 10-year OS outcomes in patients who are ≥65 years old, in those with histologic grade 3 or Luminal A tumors, and in those considering ETx or no systemic treatment.

FIGURE 3 Discriminatory accuracy of (A) 5-year and (B) 10-year overall survival