Fecal Fusobacterium nucleatum for the diagnosis of colorectal tumor: A systematic review and meta‐analysis

Abstract The fecal Fusobacterium nucleatum has been reported as a potential noninvasive biomarker for colorectal tumor in several studies, but its exact diagnostic accuracy was ambiguous due to the wide range of sensitivity and specificity. To assess the diagnostic accuracy of fecal F. nucleatum for colorectal tumor, we searched electronic databases including PubMed, Cochrane Library, Embase, and Web of Science, without any date and language restrictions. Two reviewers independently extracted data and appraised study quality with Quality Assessment of Diagnostic Accuracy Studies. We included ten studies comprising 13 cohorts for colorectal cancer (CRC) and seven cohorts for colorectal adenoma (CRA). A total of 1450 patients and 1421 controls for CRC and 656 patients and 827 controls for CRA were included. The pooled sensitivity and specificity of fecal F. nucleatum for CRC were 71% (95% CI, 61%‐79%) and 76% (95% CI, 66%‐84%), with the area under the receiver‐operating characteristics (AUC) curve of 0.80 (95% CI, 0.76‐0.83). The pooled sensitivity and specificity of fecal F. nucleatum for CRA were 36% (95% CI, 27%‐46%) and 73% (95% CI, 65%‐79%), with an AUC of 0.60 (95% CI, 0.56‐0.65). Substantial heterogeneity among studies existed, which was partly caused by DNA extraction kits, regions of study, sample size, and demographic characteristics of participants. Fecal F. nucleatum was valuable for the diagnosis of CRC although it performed below expectation. For CRA, the specificity of fecal F. nucleatum indicated the possibility of noninvasive screening. Subgroup analyses for adenoma were incomplete due to lack of data. Heterogeneity limited the credibility of the study.


| INTRODUCTION
Colorectal cancer (CRC) is the third most common cancer in the world and ranks as one of the five most common fatal cancers worldwide. 1 The high morbidity and mortality are mainly due to the fact that CRC is usually not diagnosed until it has reached an advanced stage. Both incidence and death rates of CRC have declined in recent years, largely thanks to the use of screening methods. 2 Several screening strategies including colonoscopy and fecal immunochemical test (FIT) are recommended by international guidelines. 3,4 However, due to economic limitations and the screening process, a majority of people have not undergone colonoscopy. 3 On the other hand, although FIT is a widely accepted screening strategy for CRC, its sensitivity for CRC ranges widely, from 25% to 100%, 5 and it performs poorly when detecting colorectal adenoma (CRA). [6][7][8][9] Moreover, some conditions such as hemorrhoids could increase the risk of false-positive FIT results. 10 Thus, more noninvasive and economical biomarkers for CRC detection are urgently needed.
Recently, increased attention has been paid to the effect of the microbiome in CRC. Since Fusobacterium nucleatum (F. nucleatum) infection was reported to be prevalent in CRC tissues, 11 many studies have focused on its role in the carcinogenesis and development of colorectal tumor. 12 In addition, F. nucleatum was also detected significantly more often in the feces of CRC patients than in those of healthy controls, 13 suggesting that fecal F. nucleatum may be helpful for noninvasive CRC screening. Fecal F. nucleatum has been reported as a potential novel biomarker for CRC, even with a higher detection rate for CRC than FIT. 14,15 Another study also reported that the diagnostic accuracy of fecal F. nucleatum for CRC was as good as that of FIT, with a better diagnostic accuracy for CRA. 16 These studies were encouraging, indicating that fecal F. nucleatum would be a promising biomarker for colorectal tumor and might even be comparable with FIT. However, some studies produced conflicting results. [17][18][19] The diagnostic characteristics of fecal F. nucleatum for CRC have been ambiguous, with sensitivity ranging from 45% to 100% and specificity ranging from 10% to 92%. 15,17,18 This variability has made it difficult to assess the diagnostic accuracy of fecal F. nucleatum for colorectal tumor. Therefore, we conducted a meta-analysis to explore the diagnostic accuracy of fecal F. nucleatum for CRC or CRA.

| Search strategy
We did a systematic search of several electronic databases, including PubMed, Cochrane Library, Embase, and Web of Science, without any date and language restrictions, for all studies about diagnostic performance of fecal F. nucleatum for CRC or CRA. We used medical subject headings and keywords for literature retrieval. The last search was performed in June 2018. Further details of the search strategy are provided in Supplementary Method.

| Selection criteria
Two reviewers independently checked titles and abstracts of all retrieved articles and determined final eligibility according to full texts. All disagreements were settled by discussion until consensus was reached. Studies were included if they met the following criteria: (a) studies evaluated the diagnostic accuracy of fecal F. nucleatum for CRC or CRA; (b) studies presented sufficient data to infer a two-by-two diagnostic table, comprising true positives (tp), false positives (fp), false negatives (fn), and true negatives (tn). We excluded letters, reviews, conference abstracts, and duplicate publications.
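As an illustration of why criterion (b) matters, the sketch below shows how sensitivity and specificity follow directly from the four cells of a two-by-two table. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Hypothetical container for one cohort's two-by-two diagnostic table."""
    tp: int  # patients with a positive fecal test
    fp: int  # controls with a positive fecal test
    fn: int  # patients with a negative fecal test
    tn: int  # controls with a negative fecal test

    @property
    def sensitivity(self) -> float:
        # Proportion of patients correctly identified as positive.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of controls correctly identified as negative.
        return self.tn / (self.tn + self.fp)


# Illustrative counts only (100 patients, 100 controls).
example = TwoByTwo(tp=71, fp=24, fn=29, tn=76)
print(f"sensitivity={example.sensitivity:.2f}, specificity={example.specificity:.2f}")
```

A study that reports only an AUC, without a cutoff-specific table or enough information to reconstruct one, cannot supply these four counts and therefore cannot enter this kind of pooling.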

| Data extraction and quality assessment
The same reviewers independently extracted study data to obtain relevant information. We contacted study authors to obtain data for extraction when necessary. For studies presenting results with different cutoff values, 16 different subspecies, 18 and different subject-recruiting sites, 15,19 we extracted all data to obtain as much information as possible from these studies. The revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) 20 was used to score the quality of included studies, and discrepancies were resolved.

| Data synthesis and analysis
We performed data synthesis and analysis with STATA 14.0 (StataCorp, College Station, Texas), using midas and metandi modules. P value was regarded as statistically significant when it was equal to or less than 0.05. We calculated the pooled sensitivity, specificity, positive-likelihood ratio, negative-likelihood ratio, diagnostic odds ratio (DOR), and summary receiver-operating characteristics (SROC) curve with 95% confidence interval (CI) by using hierarchical models. Hierarchical models include the bivariate model 21 and the hierarchical SROC model. 22 The former directly models the sensitivity, specificity and the correlation between them. The latter defines a hierarchical SROC (HSROC) curve by modeling functions of sensitivity and specificity.
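The relationship between these summary measures can be sketched with the standard textbook definitions. The function name `likelihood_ratios` is an illustrative assumption. The numbers printed are a back-of-the-envelope calculation from the pooled point estimates for CRC given in the abstract (71% and 76%); they are not the model-based pooled likelihood ratios or DOR produced by the hierarchical models in midas or metandi.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (positive LR, negative LR, diagnostic odds ratio).

    Standard definitions:
      LR+ = sensitivity / (1 - specificity)
      LR- = (1 - sensitivity) / specificity
      DOR = LR+ / LR-
    """
    plr = sensitivity / (1.0 - specificity)
    nlr = (1.0 - sensitivity) / specificity
    return plr, nlr, plr / nlr


# Point estimates for CRC from the abstract; illustrative use only.
plr, nlr, dor = likelihood_ratios(0.71, 0.76)
print(f"LR+={plr:.2f}, LR-={nlr:.2f}, DOR={dor:.2f}")
```

Because the bivariate model estimates sensitivity and specificity jointly and accounts for their correlation, its pooled likelihood ratios can differ from values computed this way.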
We used a Deeks' funnel plot to test for publication bias. It tests the association between the log diagnostic odds ratio (lnDOR) and the "effective sample size," which is a simple function of the numbers of diseased and nondiseased individuals. 23 The slope coefficient and relevant P value are associated with publication bias. We explored heterogeneity among included studies because study characteristics may be related to the study size and test accuracy. 23 The I-square (I 2 ) was calculated to estimate the heterogeneity. 24 In our analysis, we used the one with higher sensitivity for the study 18 presenting results with two different subspecies; we used the combined cohort, which included discovery cohort and validation cohort, for the study 16 presenting results with three cohorts (discovery cohort, validation cohort, and combined cohort) and different cutoff values.
We excluded studies one by one to assess the robustness of our findings.
We performed a series of predefined subgroup analyses according to DNA extraction kits (QIAGEN or not QIAGEN), regions (Asia or non-Asia), and sample size (<200 or >200). Cutoff values and internal controls were different among included studies, so we could not stratify studies by them. Furthermore, we conducted a bivariate random-effects metaregression analysis to explore the sources of heterogeneity further, with the following variables: the percent of early-stage patients in all CRC patients (<50% or >50%), the percent of patients in all participants (<50% or 50%), the percent of males in patients (<60% or >60%), and the average age of all participants.
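A minimal sketch of the regression behind a Deeks' funnel plot follows, assuming the usual formulation: lnDOR is regressed on the inverse square root of the effective sample size (ESS), weighted by ESS, where ESS = 4 n1 n2 / (n1 + n2) for n1 diseased and n2 nondiseased individuals. The function names and cohort counts are hypothetical, cells are assumed to be nonzero, and the P value for the slope (which would come from a t distribution) is omitted to keep the example dependency-free.

```python
import math


def effective_sample_size(n_diseased: int, n_nondiseased: int) -> float:
    # ESS = 4 * n1 * n2 / (n1 + n2)
    return 4 * n_diseased * n_nondiseased / (n_diseased + n_nondiseased)


def ln_dor(tp: int, fp: int, fn: int, tn: int) -> float:
    # Log diagnostic odds ratio; assumes no cell is zero.
    return math.log((tp * tn) / (fp * fn))


def weighted_line(xs, ys, ws):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical cohorts as (tp, fp, fn, tn); not data from the included studies.
cohorts = [(18, 6, 7, 19), (40, 12, 10, 38), (150, 40, 50, 160), (60, 25, 20, 75)]
xs, ys, ws = [], [], []
for tp, fp, fn, tn in cohorts:
    ess = effective_sample_size(tp + fn, fp + tn)
    xs.append(1 / math.sqrt(ess))
    ys.append(ln_dor(tp, fp, fn, tn))
    ws.append(ess)

slope, intercept = weighted_line(xs, ys, ws)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

A slope far from zero would suggest that smaller studies report systematically different diagnostic odds ratios, which is the asymmetry the funnel plot is meant to reveal.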
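The leave-one-out procedure can be sketched as below. As a loudly stated simplification, the pooling step here just sums the cells across cohorts; this naive pooling stands in for the hierarchical models actually used in the analysis and would not reproduce their estimates. The function names and counts are hypothetical.

```python
def naive_pool(cohorts):
    """Naive pooled (sensitivity, specificity) by summing two-by-two cells.

    Stand-in for a proper bivariate model; used only to show the loop.
    """
    tp = sum(c[0] for c in cohorts)
    fp = sum(c[1] for c in cohorts)
    fn = sum(c[2] for c in cohorts)
    tn = sum(c[3] for c in cohorts)
    return tp / (tp + fn), tn / (tn + fp)


def leave_one_out(cohorts):
    """Re-pool the data once per cohort, each time omitting that cohort."""
    results = []
    for i in range(len(cohorts)):
        remaining = cohorts[:i] + cohorts[i + 1:]
        results.append(naive_pool(remaining))
    return results


# Hypothetical cohorts as (tp, fp, fn, tn).
cohorts = [(40, 10, 10, 40), (30, 20, 20, 30), (45, 5, 5, 45)]
for i, (sens, spec) in enumerate(leave_one_out(cohorts), start=1):
    print(f"without cohort {i}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

If the re-pooled estimates stay close to the overall estimate regardless of which cohort is omitted, no single study is driving the result.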

| Study selection
We initially retrieved 243 articles through electronic databases, comprising 72 articles from PubMed, six articles from Cochrane Library, 86 articles from Embase, and 79 articles from Web of Science. After removing duplicates, we screened titles and abstracts of 146 articles and excluded 116 articles. We read the full text of 30 studies, and 10 studies [13][14][15][16][17][18][19][25][26][27] were finally eligible for the meta-analysis; six 14,16,17,19,25,26 of them also reported diagnostic results for CRA. We finally included 13 cohorts of CRC and seven cohorts of CRA because three articles 15,19,27 recruited two cohorts independently from different sites. The procedure of study selection is shown in Figure 1.

| Study characteristics
The main characteristics of 13 cohorts of CRC and seven cohorts of CRA are shown in Table 1.
One 25 of these studies did not mention exclusion criteria; the other studies [13][14][15][16][17][18][19]26,27 all excluded patients with conditions (such as a vegetarian diet or use of antibiotics within the recent 3 months) that may influence the intestinal microbiome or with medical conditions (such as inflammatory bowel disease or a history of cancer) that were relevant to colorectal tumor. One 18 of these studies grouped patients with small adenoma into the control group. All of these studies were case-control studies. One 25 study counted the absolute copy number of fecal F. nucleatum, while the other studies [13][14][15][16][17][18][19]26,27 detected its relative abundance. However, studies evaluating the relative abundance of fecal F. nucleatum chose different internal controls.
We analyzed data for CRC first. These studies included 1450 patients and 1421 controls. Sample sizes of cohorts ranged from 16 to 569. The exact cutoff values were available in four studies [14][15][16]25 and differed from each other.
The data available for CRA were fewer. In total, 652 patients with adenoma and 827 controls were included in the evaluation of diagnostic ability of fecal F. nucleatum for CRA. Sample sizes of cohorts varied widely, ranging from 17 to 386. The exact cutoff values were reported in three studies 14,16,25 and also different. Three 14,17,25 studies used the same cutoff values to diagnose adenoma as to diagnose CRC.

| Quality assessment
The results of the QUADAS-2 assessment for CRC and CRA are shown in Figure S1 and Table S1, indicating that the highest risk of bias existed in "patient selection" and "index test." The former arose because all of the included studies were case-control studies and the percentage of colorectal tumor patients among all subjects was inconsistent with the prevalence of the disease. However, except for the one study 25 that did not mention how subjects were selected, the other studies all recruited participants consecutively or randomly. The latter arose because cutoff values were not determined beforehand in all studies but one. 18 The highest concern about applicability came from "patient selection." Four studies [14][15][16]18 included subjects with gastrointestinal symptoms such as changes of bowel movement. Four studies [15][16][17]26 included participants aged 50 years or older. Two studies 13,19 included participants aged 40 years or above. One study included asymptomatic participants. 26 One study 25 did not report specific information about the inclusion and exclusion criteria of the study cohort.

| Assessment of publication bias and heterogeneity
The publication bias of studies for CRC and CRA is displayed in Figure S2. Both funnel plots were almost symmetrical. The P values of the slope coefficients of the two regression lines were 0.7 and 0.67, both greater than 0.1, suggesting a low likelihood of publication bias. Obvious heterogeneity existed when we calculated pooled sensitivity (I 2 = 88.45%), specificity (I 2 = 86.44%), positive DLR (I 2 = 83.57%), and negative DLR (I 2 = 85.83%) to evaluate the diagnostic ability of fecal F. nucleatum for CRC. The proportion of heterogeneity likely due to threshold effect was 31%. For CRA, substantial heterogeneity also existed in pooled sensitivity (I 2 = 86.18%), specificity (I 2 = 65.79%), positive DLR (I 2 = 47.61%), and negative DLR (I 2 = 72.96%). The percent of heterogeneity owing to threshold effect was 13%. Different thresholds among studies therefore made only a limited contribution to the heterogeneity for both CRC and CRA.
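For readers unfamiliar with the statistic, the sketch below computes I 2 from Cochran's Q using the standard univariate definition, I 2 = max(0, (Q - df) / Q) x 100%, with inverse-variance weights. The function name and inputs are hypothetical, and the values that midas reports may be derived differently, so this is not a reproduction of the figures above.

```python
def i_squared(effects, variances):
    """I-squared (%) from per-study effects and their variances.

    Uses Cochran's Q with inverse-variance weights and
    I2 = max(0, (Q - df) / Q) * 100, where df = k - 1.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0  # identical effects: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical logit-sensitivity estimates and variances for four cohorts.
logit_sens = [0.2, 1.1, 1.8, 0.6]
variances = [0.05, 0.08, 0.10, 0.04]
print(f"I2 = {i_squared(logit_sens, variances):.1f}%")
```

I 2 describes the share of total variation across studies that is attributable to between-study differences rather than sampling error, which is why values above 80% are read as substantial heterogeneity.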

| Sensitivity analysis
We explored the robustness of our results by removing studies one by one. The results of sensitivity analyses are shown in Table 2.

| DNA extraction kit
The pooled sensitivity of studies 13

| Region
We calculated the pooled diagnostic accuracy of fecal F. nucleatum for Asian studies. 15,16,19,25,27 For CRC, the pooled diagnostic accuracy of Asian studies was better than that of non-Asian studies. But the heterogeneity of sensitivity and negative DLR in non-Asian studies decreased significantly.
For CRA, compared with the overall pooled results, the pooled sensitivity and positive DLR of Asian studies 16,19,25 were higher, especially the sensitivity, and the pooled negative DLR was lower. Furthermore, the heterogeneity of all these indicators decreased significantly except that of specificity.

| Meta-regression
Studies of CRA diagnosis had limited information, so we only conducted a meta-regression for CRC. The meta-regression showed that the average age of all participants contributed to the heterogeneity of sensitivity. Besides, the percent of early-stage patients in all CRC patients, the percent of the males in all participants, and the average age of all participants were responsible for the overall heterogeneity (Table S2).

| DISCUSSION
Many studies have used tumor tissues to investigate the mechanism by which F. nucleatum instigates and potentiates colorectal tumor. It has been reported that the accumulation of F. nucleatum in colorectal tumor is partly due to fusobacterial lectin Fap2 28 and FadA adhesin 29 and that F. nucleatum can induce the carcinogenesis and development of colorectal tumor by the microRNA-21-mediated pathway 30 or inhibition of host adaptive immunity, 31-33 etc. Recently, our group also found that F. nucleatum promoted chemoresistance by modulating autophagy in CRC. 34 F. nucleatum was reported to be associated with the prognosis of CRC as well. 35,36 Some studies have also explored diagnosing colorectal tumor with F. nucleatum from tumor tissues 26,37 or feces. However, it was reported that the level of F. nucleatum in feces did not correlate with that in tumor tissues. 38 Our study focused on fecal F. nucleatum to explore its potential as a noninvasive screening method for colorectal tumor. In our meta-analysis, the overall pooled diagnostic accuracy of fecal F. nucleatum (AUC) was 0.80 for CRC; the pooled sensitivity and specificity of fecal F. nucleatum were 71% and 76% for CRC, indicating that fecal F. nucleatum has a certain value for the diagnosis of CRC. The pooled sensitivity (36%), specificity (73%), and AUC (0.60) for CRA were low. It is worth noting that the diagnostic performance of FIT for CRA was also poor, with sensitivity ranging from 15% to 26.3%. 7,9 Moreover, FIT has many limitations, such as the false-positive results mentioned above. In addition, FIT cannot diagnose patients with nonbleeding lesions, 14 a gap that could be complemented by fecal F. nucleatum. It was reported that combining fecal F. nucleatum with FIT or other microbial markers enhanced the detection ability of FIT for both CRC and CRA. 15,16,39 Thus, F. nucleatum shows promise as a biomarker for the noninvasive screening of colorectal tumor.
We conducted sensitivity analyses excluding studies one by one and found our results were stable, especially for CRC.
To investigate the source of heterogeneity, we did subgroup analyses and meta-regression. The DNA extraction kits, cohort regions, sample size, average age, and the percent of early-stage patients and of males were responsible for the high heterogeneity. The percent of heterogeneity owing to threshold effect was low.
It was reported that variations in DNA extraction protocol had a great impact on the observed fecal microbial composition. 40 This is consistent with the better diagnostic accuracy for CRC in the non-QIAGEN group than in the QIAGEN group. However, high heterogeneity still existed among studies using the same DNA extraction kit, although it was lower than the overall heterogeneity. For CRA, the QIAGEN group also showed high heterogeneity of sensitivity and specificity despite decreased heterogeneity of DLR. Besides the difference in DNA extraction kit, we found that methods of storing fecal samples before DNA extraction, targeted genes, primer sequences designed for PCR, and internal controls differed among studies as well. But we could not analyze their exact effect on heterogeneity due to limited data. Further studies are needed to explore whether these factors are responsible for heterogeneity and to find the best way to utilize fecal F. nucleatum for diagnosis of colorectal tumor. The diagnostic accuracy of fecal F. nucleatum for CRC was better in Asian studies than in non-Asian studies, and the heterogeneity also decreased to some extent. For CRA, the heterogeneity of sensitivity even dropped to zero. There was evidence suggesting that intestinal F. nucleatum could be influenced by diet. 41 Other studies also reported that a change of diet could alter the intestinal microbiome. 42,43 This may be associated with the different diagnostic accuracy among regions and the decreased heterogeneity in region subgroups.
In addition, the subgroup analysis according to sample size indicated that the fecal F. nucleatum had performed better for the diagnosis of colorectal tumor in larger cohorts and sample size was an important factor for heterogeneity.
The percent of early-stage patients and males and the average age were sources of heterogeneity as well. It was reported that the copy number of fecal F. nucleatum was the highest in stage II and the lowest in stage IV. 25 Another study also reported that the relative abundance of fecal F. nucleatum in CRC stages II and III was higher than in stage I. 39 This may explain the effect of stage composition on heterogeneity. Enrichment of fecal F. nucleatum in CRC stage II suggested that fecal F. nucleatum may play a role in detecting early-stage CRC.
Furthermore, it is worth noting that colorectal tumor is a disease arising from different genetic and epigenetic alterations, such as microsatellite instability and activation of oncogenes including KRAS and BRAF. It was also reported that F. nucleatum, which was influenced by food, lifestyle, and medications, was related to clinical and molecular pathologies of colorectal tumor. 16,36 Molecular pathological epidemiology (MPE), a transdisciplinary and interdisciplinary integrative field, studies associations of an exposure with molecular or pathological features of a certain disease. 44 In view of this, the factors studied by microbial MPE may contribute to the heterogeneity of colorectal tumor as well as to the heterogeneity in our study. In addition, in the era of precision medicine, MPE has the potential to play an important role in the future. 45 Thus, combining MPE research with fecal bacterial analyses, such as F. nucleatum, may advance investigations of the pathogenesis of colorectal tumor, help diagnose or classify this disease, and improve the development of precision medicine for colorectal tumor.
Our study has some limitations. First, the sizes of studies and samples for CRA were small, despite the use of a comprehensive search strategy. This limited the precision of pooled results and prevented complete subgroup analyses for CRA.
All of these studies were case-control designs, causing spectrum effects by restricting sampling of cases and controls. 46 This was likely to lead to high bias and inflated estimates of results. In addition, some included studies had small sample sizes and did not conform to the principles of diagnostic tests, such as blinding. 47 All of the above harmed the validity of the results in these studies.
Moreover, there were no unified detection methods or thresholds because assessment of the diagnostic accuracy of fecal F. nucleatum has emerged only over the past few years. The sensitivity and specificity were determined by the threshold chosen on the ROC curve. To some extent, the sensitivity and specificity are mutually dependent, where lowering the threshold to increase sensitivity will decrease specificity and vice versa. Most of the included studies defined their own threshold from their ROC curve and obtained different cutoff values. As a result, sensitivity and specificity were not directly comparable across these studies. For these reasons, further studies with rigorous design and randomized clinical trials are needed to define the best cutoff value for the diagnosis of colorectal tumor with fecal F. nucleatum and assess its performance.
In conclusion, fecal F. nucleatum is promising for the noninvasive diagnosis of colorectal tumor. It is a potential complement to FIT, especially for CRA. Given the results of subgroup analyses and meta-regression, further studies should be performed to determine standard F. nucleatum detection methods and a diagnostic threshold to reduce heterogeneity and enhance the clinical effectiveness of fecal F. nucleatum.
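The trade-off described above can be made concrete with a small sketch. The abundance scores below are synthetic and the function name is hypothetical; a sample is called positive when its score is at or above the threshold.

```python
def sens_spec_at(threshold, case_scores, control_scores):
    """Sensitivity and specificity when 'score >= threshold' means positive."""
    tp = sum(1 for s in case_scores if s >= threshold)
    tn = sum(1 for s in control_scores if s < threshold)
    return tp / len(case_scores), tn / len(control_scores)


# Synthetic marker levels (arbitrary units), not data from any study.
case_scores = [0.2, 0.4, 0.6, 0.8, 0.9]
control_scores = [0.1, 0.2, 0.3, 0.5, 0.7]

for threshold in (0.5, 0.3):
    sens, spec = sens_spec_at(threshold, case_scores, control_scores)
    print(f"threshold={threshold}: sensitivity={sens:.1f}, specificity={spec:.1f}")
```

With these synthetic scores, moving the threshold from 0.5 to 0.3 raises sensitivity while lowering specificity, so two studies measuring the same marker but choosing different cutoffs would report different sensitivity and specificity pairs.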