Deciphering intrafamilial phenotypic variability by exome sequencing in a Bardet–Biedl family

Bardet–Biedl syndrome (BBS) is a model ciliopathy characterized by a wide range of clinical variability. The heterogeneity of this condition is reflected in the number of underlying gene defects and the epistatic interactions between the proteins encoded. BBS is generally inherited in an autosomal recessive trait. However, in some families, mutations across different loci interact to modulate the expressivity of the phenotype. In order to investigate the magnitude of epistasis in one BBS family with remarkable intrafamilial phenotypic variability, we designed an exome sequencing–based approach using SOLID 5500xl platform. This strategy allowed the reliable detection of the primary causal mutations in our family consisting of two novel compound heterozygous mutations in McKusick–Kaufman syndrome (MKKS) gene (p.D90G and p.V396F). Additionally, exome sequencing enabled the detection of one novel heterozygous NPHP4 variant which is predicted to activate a cryptic acceptor splice site and is only present in the most severely affected patient. Here, we provide an exome sequencing analysis of a BBS family and show the potential utility of this tool, in combination with network analysis, to detect disease-causing mutations and second-site modifiers. Our data demonstrate how next-generation sequencing (NGS) can facilitate the dissection of epistatic phenomena, and shed light on the genetic basis of phenotypic variability.


Introduction
Ciliopathies are an emerging group of clinical disorders characterized by large genetic heterogeneity and clinical overlap. Due to the ubiquitous nature of the primary cilium, ciliopathies can affect many organ systems (Fliegauf et al. 2007). Bardet-Biedl syndrome (BBS; MIM 209900) is considered a model of ciliopathy (Badano et al. 2006a), and its prevalence varies from 1:160,000 in Europe (Klein and Ammann 1969;Beales et al. 1997) to 1:13,500 and 1:17,500 in Kuwait and Newfoundland, respectively (Farag and Teebi 1989;Green et al. 1989). BBS is a pleiotropic disorder that has variable expressivity and a wide range of clinical variability observed both within and between families (Beales et al. 1999). Primary features include retinitis pigmentosa (RP), polydactyly, obesity, genital defects, renal anomalies, and learning disabilities. Secondary features include speech delay, developmental delay, diabetes mellitus, dental anomalies, congenital heart disease, brachydactyly/syndactyly, ataxia/poor coordination, and anosmia/hiposmia (Forsythe and Beales 2013).
To date, mutations in 18 genes have been associated with BBS (Retinal Information Network website. Available: https://sph.uth.edu/retnet/) accounting for~70% of affected individuals (Muller et al. 2010). BBS is typically inherited in an autosomal recessive manner. However, the involvement of mutations in more than one locus has been substantially growing. Digenic triallelic inheritance has been reported in some patients with three mutations across two BBS loci that interact to cause disease (Katsanis et al. 2001;Badano et al. 2003;Beales et al. 2003). In addition, the presence of second-site modifiers that may modulate the expression of the clinical phenotype has been proposed to explain, in part, the significant interand intrafamilial clinical variability in BBS (Katsanis 2004;Badano et al. 2006b;Khanna et al. 2009).
In the last few years, the introduction of the next-generation sequencing (NGS) has revolutionized clinical genetics, making whole-exome sequencing (WES) a rapid way to elucidate the genetic basis of clinically and genetically heterogeneous Mendelian disorders (Bamshad et al. 2011;Ionita-Laza et al. 2011;Rabbani et al. 2012). At present, it is expected that NGS, in combination with network analysis (Minguez et al. 2009) and other advanced bioinformatics tools that allows prioritizing candidate genes because of their functional relationships (Ideker and Sharan 2008), will play an increasingly important role in the diagnosis of complex and oligogenic disorders. Moreover, WES provides a unique possibility for investigating the presence of additional mutations that may modify the expressivity of BBS phenotype. Other authors have been interested in determining the contribution of epistasis in BBS (Katsanis et al. 2002;Hichri et al. 2005;Smaoui et al. 2006;Abu-Safieh et al. 2012), but their approaches have been based on Sanger sequencing of known BBS loci (Katsanis et al. 2002;Hichri et al. 2005;Smaoui et al. 2006;Abu-Safieh et al. 2012) or on targeted exon capture sequencing of a number of genes (Redin et al. 2012).
In this study, we conducted a WES approach in a Spanish family with significant intrafamilial variability in the BBS phenotype, from full blown to relatively mild forms of BBS. Using this technology, compound heterozygous mutations have been found in MKKS and the contribution of additional changes has been dissected.

Subjects and clinical assessment
Our study involved one Spanish family (RP42) comprising three affected and four unaffected individuals ( Fig. 1), all derived from the Ophthalmology Department to our Genetic Department. The study was carried out in accordance with the tenets of the Declaration of Helsinki and the ethical guidelines of our institutions. A group of 200 matching control individuals was also recruited. Written informed consent was obtained from all participants.
Clinical diagnosis of retinal degeneration was based on visual acuity, fundus photography, computerized testing of central and peripheral visual fields, and electroretinography (ERG) findings. Typical ocular features include initial night blindness, restriction of visual field, bone spicule pigmentation, attenuation of retinal vessels, waxy disk pallor, and abnormal ERG findings in a rod-cone pattern when recordable.
All subjects underwent a peripheral blood extraction for genomic DNA isolation from leukocytes using standard protocols. DNA samples from individuals I:1, I:2, II:3, and II:4 were processed for NGS.
After the identification of the MKKS gene defects, affected individuals underwent clinical reevaluation focused on the identification of extraocular features associated with BBS.

Description of DNA library preparation and sequencing
Library preparation and exome capture were performed according to a protocol based on the Baylor College of Medicine protocol version 2.1 with several modifications. Briefly, 5 lg of input genomic DNA was sheared, end repaired, and ligated with specific adaptors. A fragment size distribution ranging from 160 bp to 180 bp after shearing and 200-250 bp after adaptor ligation was verified by Bioanalyzer (Agilent Technologies, Santa Clara, CA). The library was amplified by precapture linker-mediated polymerase chain reaction (LM-PCR) using Fast-Start High Fidelity PCR System (Roche, Indianapolis, IN). After purification, 2 lg of LM-PCR product was hybridized to NimbleGen SeqCap EZ Exome libraries V3. After washing, amplification was performed by postcapture LM-PCR using FastStart High Fidelity PCR System (Roche). Capture enrichment was measured by qPCR according to NimbleGen protocol. The successfully captured DNA was measured by Quant-iT TM PicoGreen â dsDNA reagent (Invitrogen, Carlsbad, CA) and subjected to standard sample preparation procedures for sequencing with SOLiD 5500xl platform as recommended by the manufacturer. Shortly, emulsion PCR was performed on E80 scale (about 1 billion template beads) using a concentration of 0.616 PM of enriched captured DNA. After breaking and enrichment, about 276 million enriched template beads were sequenced per lane on a six-lane SOLiD 5500xl slide.

Analysis of data from deep sequencing
SOLiD 5500xl reads were aligned against the human genome reference (hg19) using the program BFAST (Blat-like Fast Accurate Search Tool) allowing reads to map only to a unique position in the reference genome. Improperly mapped reads were filtered out with the SAMtools package, which was also used to sort SAM files and to generate and index BAM files. Variant calling was performed with the software GATK (Genome Analysis Toolkit) taking into account variants from NCBI database of Single Nucleotide Polymorphisms (dbSNP) for recalibration and realignment. Secondary analysis was performed by a custom shell script that queries VARIANT database (Medina et al. 2012), which includes SIFT (Kumar et al. 2009) and Polyphen-2 (Adzhubei et al. 2010) scores, to annotate all single-nucleotide variations (SNVs) and small insertions and deletions (INDELS). Variants with an allele frequency higher than 5% in the 1000 Genomes Project database were discarded. Only, exonic variants which produce a synonymous change in the open reading frame were discarded, whereas other type of variants in the exomes and variants in splice sites were kept for further analysis. Then, variants found in affected individuals were compared with variants present in not affected relatives. A last step was performed to compare the remaining variants with variants obtained from a group of healthy controls from the same local population as the family in study obtained from The Medical Genome Project (www.medicalgenomeproject.com). Finally, genes with variants in both alleles present in affected but not in healthy individuals of the family nor in the control local population and with a variant only in one allele in the carrier were ranked based on the analysis of the interactome, using Net-workMiner(Garcia-Alonso et al. 2012), to generate a list of candidate genes.

Verification and assessment of the pathogenicity of variants
Each predicted disease-causing variant was confirmed by Sanger sequencing, and cosegregation analysis was performed in the rest of the family members DNA samples.
As mentioned above, we used Polyphen-2 and SIFT scores to evaluate the potential impact of novel missense substitutions on the function of the encoded protein. through the alignment of orthologous MKKS protein sequences (Pan troglodytes, Mus musculus, Canis lupus familiaris, Gallus gallus, Xenopus tropicalis, and Danio rerio) with the human MKKS protein sequence, using Clustal Omega Tool (Sievers et al. 2011). Furthermore, splice site tool Prediction by Neural Network (Reese 1997;Reese et al. 1997), exonic splicing enhancer prediction programs ESE Finder (Cartegni et al. 2003;Smith et al. 2006), and NetGene2 server (Hebsgaard et al. 1996) were applied to estimate the pathogenic nature of intronic sequence variants that could affect the splicing process. The correct nomenclature for mutation was checked applying Mutalyzer (Wildeman et al. 2008) using the corresponding Genbank reference sequences (MKKS; NG_009109.1 and NPHP4; NG_011724.2). Novel variants included in this article were submitted to the respective Locus Specific Database (LSDB) (http://grenada.lumc.nl/LOVD2/eye/home. php?select_db=MKKS).

Network analysis
Network enrichment analysis has been performed using the program SNOW (Minguez et al. 2009) included in the Babelomics (Medina et al. 2010) package (http:// www.babelomics.org). Genes are mapped onto the interactome (obtained from the STRING [Franceschini et al. 2013] database), and the subnetwork connecting them is obtained. Several relevant parameters are calculated for this subnetwork, such as the connectivity or the number of components. An empirical distribution of the random expectation of these parameters is obtained by repeatedly sampling random sets of the same number of genes from the complete genome and calculating the average connectivity of their corresponding subnetworks. Thus, real values of the parameters obtained for the genes analyzed can be contrasted with respect to their random expectations (Minguez et al. 2009;Minguez and Dopazo 2010).

Clinical assessment
The clinical findings of the affected individuals (II:2, II:4, and II:5) are reported in Table 1. Ocular manifestations of the disease in the three siblings were fairly typical of an early onset and severe form of RP. Night blindness was reported from the first decade. Thereafter, the disease rapidly progressed, and by the age of 20, II:2, II:4, and II:5 were severely visually disabled. The fundus examination showed the typical signs of RP, including pale waxy disks, attenuation of the retinal vessels, and bone spicule pigmentation in the midperiphery. Photopic and scotopic ERGs were extinguished (II:2) or diminished (II:4) on examination at ages 21 and 16, respectively. Patient II:4 showed extraocular features commonly associated with BBS such as postaxial polydactyly, overweight, polycystic kidney, learning disabilities, and mild psychomotor delay. Patients II:2 and II:5 exhibited a less severe phenotype consisting of RP, postaxial polydactyly, and mild learning disabilities. Subject II:4 underwent renal transplantation due to polycystic kidney disease, whereas the other two did not manifest any renal abnormalities. Therefore, patient II:4 fulfilled the criteria for the diagnosis of classic BBS, whereas II:2 and II:5 were diagnosed of mild BBS.

Identification of causal mutations
Upon exome sequencing, 43,570 sequence variants were identified in patient II:4 ( Table 2). The two most probably pathogenic mutations located on exons 3 and 5 of   the apical domains via flexible hinges (Stone et al. 2000) (Fig. 3A). Alignment of MKKS amino acid sequence of various orthologs showed that the substituted residues (aspartic acid at position 90 and valine at position 396) are highly conserved across species from different evolutionary branches (Fig. 3B). In silico tools predicted that both p.D90G and p.V396F are probably damaging by Polyphen-2 (score = 1 for p.D90G and score = 0.987 for p.V396F) and SIFT (score = 0.01 for p.D90G and score = 0 for p.V396F).

Epistasis evaluation
Whole-exome data were subjected to exhaustive evaluation paying special attention in the identification of additional variants that may be acting as second-site modifiers making patient's II:4 BBS phenotype more severe.
Although  (Table  S1), we evaluated the exome data and found one novel variant in intron 8-9 of the NPHP4 gene (c.992 + 71G>T). This variant has never been reported in databases such as dbSNP and EVS. As coverage information for this position was not clear in the EVS data set, we checked the absence of c.992 + 71G>T in additional genomic databases which also include intronic variants (1000 genomes and 5000 genomes). Family segregation showed the presence of the heterozygous NPHP4 variant only in patient II:4 and in his unaffected parent (I:1) (Fig. 1). In silico tools predicted that the substitution of a G > T in this position activates a cryptic splice acceptor site in the intron 8-9 of NPHP4 (Table 3). The complete coding sequence was scanned to eliminate the possibility of the presence of other mutation in NPHP4. The absence in 400 control chromosomes and the results of the in silico prediction supported the pathogenic role of the NPHP4 c.992 + 71G>T variant.

Network analysis
In order to explore the possible physical interactions among the BBS proteins and NPHP4, network enrichment analysis, using the program SNOW (Minguez et al. 2009), has been performed. Figure 4 shows the proteins significantly connected (significantly smallest number of components, P-value = 0.01) allowing the introduction of one extra connecting node among the proteins studied.
The network documents the dense network of physical protein-protein interactions that connect BBS proteins and also documents how NPHP4 connects to several BBS proteins through different intermediates. Of special interest is RPGR, which connects NPHP4 to CEP290. RPGR, the RP GTPase regulator, is also involved in RP X-Linked, cone-rod dystrophy X-linked, and macular dystrophy X-linked. Another interesting gene is INVS, Inversin, which encodes a protein that protein may function in renal tubular development and function. Such connections reinforce the possible role of NPHP4 as modulator of the penetrance of the disease.

Discussion
In this report, a Spanish BBS family with three affected siblings is described. The mode of inheritance and the main clinical features correspond to autosomal recessive BBS. Exome analysis of four individuals (I:1, I:2, II:3, and II:4) led to the identification of two novel compound heterozygous mutations in the MKKS gene (c.269A>G; p.D90G and c.1186G>T; p.V396F) of affected individuals. These variants were absent in 200 control individuals and showed segregation with disease in the entire family. Mutation D90G lay within the predicted equatorial domain, the most conserved region among group I and II chaperonins. In the group II chaperonin family, this domain is responsible for ATP hydrolysis and the substitution might alter its structure. Mutation V396F lay within the predicted intermediate domain which connects the equatorial and the apical domains via flexible hinges (Stone et al. 2000). Bioinformatic analysis predicted pathogenic consequences for both missense mutations.
The molecular diagnosis is generally helpful to confirm a clinical diagnosis. Although many studies have dissected the clinical overlaps due to MKKS mutations, genotypephenotype correlations are still not well understood. The clinical evaluation of the affected members of the family did not reveal hydrometrocolpos, suggesting that these two missense mutations in the MKKS gene did not cause genital malformations typical of MKKS phenotype. In this family, mutations in MKKS resulted in a spectrum of BBS phenotypes, ranging from BBS with severe renal involvement in patient II:4 to milder forms of BBS (patients II:2 and II:5). The expression of the phenotype and the disease progression can vary greatly from patient to patient, even among members of the same family. Whereas retinal dystrophy, digit anomaly, and learning disabilities were highly penetrant traits for all patients, kidney abnormalities and obesity showed incomplete penetrance in our family.
The index patient investigated here had previously undergone selected genotyping (APEX analysis, Asper http://www.asperbio.com/asper-ophthalmics), but this mutational screening approach failed to identify the underlying gene defect in this family. The implementation of a reliable diagnostic system able to detect novel disease-causing mutations, even in genes not previously associated with an assumed clinical diagnosis, is necessary. Exome sequencing has proven to be an important diagnostic tool for disorders that are characterized by significant genetic heterogeneity. In the case of BBS, besides the high number of genes involved, oligogenic inheritance is also well documented (Katsanis 2004) adding a layer of complexity to the genetic characterization of such patients.
One advantage of analyzing the whole exome simultaneously is the unique possibility to investigate the involvement of second-site mutations which may be modulating the expression of the BBS phenotype (Katsanis et al. 2001;Badano et al. 2003;Beales et al. 2003;Katsanis 2004;Hjortshoj et al. 2010). However, the use of unbiased methodologies often produces many candidates that must be filtered out with in silico prioritization techniques. Thus, network analysis captures the relationships of mutated genes with already known disease genes allowing a rational prioritization of candidate genes. The combination of NGS with network analysis approach conducted here allowed us to identify a novel variantlocated on intron 8-9 of NPHP4which was predicted to enhance the use of a cryptic splice acceptor site. This would probably cause the introduction of a premature termination codon and the reduction in NPHP4 mRNA levels. Only the most severely affected patient (II:4) carries the variant (c.992 + 71G>T) in heterozygous state. In addition, the absence of the variant in control individuals seems to indicate that it is a pathogenic allele.
NPHP4 is a ciliary protein that is known to belong to a multifunctional complex and colocalizes with RPGRIP1, RPGR, and the serologically defined colon cancer antigen-8 (SDCCAG8), a protein thought to partake in the RPGRIP1 interactome and implicated also in retinal-renal ciliopathies (Roepman et al. 2005;Schaefer et al. 2011;Won et al. 2011;Patil et al. 2012). Mutations in NPHP4 have been associated with related ciliopathies such as NPH (Mollet et al. 2002;Hoefele et al. 2004) and SLS Schuermann et al. 2002).
Few splice site variants have been found to modulate the pathogenesis of BBS. One example is the allele C430T of MGC1203 which enhances the use of a cryptic splice acceptor site, causing the introduction of a premature termination codon and the reduction in the steady-state MGC1203 mRNA levels (Badano et al. 2006b). Although the MGC1203 mutations are probably insufficient to cause BBS, this gene may be involved in the pathogenesis of BBS by contributing hypomorphic mutations to an already sensitized genetic background. The authors conclude that suppression of MGC1203 exerts an epistatic effect on the developmental BBS phenotype. It is tempting to speculate that NPHP4 allele reported herein may also show a similar effect. We hypothesize that the NPHP4 variant may be acting as a second-site modifier affecting the likelihood of renal degeneration in the context of the other MKKS mutations. This hypothesis is also supported by the known physical relationships revealed by network analysis. The MKKS protein forms a significant cluster of proteins with several BBS proteins, as shown in Figure 4. The severe BBS phenotype observed in patient II:4 may be due to an epistatic effect of MKKS and NPHP4 mutations. This would provide an explanation for one of the distinct clinical manifestations observed in this family. Moreover, it is possible that other intrafamilial phenotypic variations may be due to alterations in additional loci as well as nonstrictly genetic factors not evaluated in this study. Therefore, our findings warrant further studies to evaluate the role of putative disease-modifying variants.
Identifying the presence of epistatic variants can be challenging, given that a large number of common and rare alleleslocated in known and in novel geneshas been shown to influence the expressivity of the BBS phenotype.
Our data demonstrate that NGS is a powerful approach to identify rare and high penetrant disease variants and to study the genetic contribution to phenotypic variability.
This genetic diagnostic tool can now be applied to large cohorts of Mendelian and oligogenic disorders and should rapidly provide the molecular diagnosis and the prevalence of associated genes.
In summary, we show how WES has clearly improved the molecular diagnosis of heterogeneous disorders such as BBS. In addition, our study highlights the usefulness of NGS approaches for the dissection of epistatic phenomena and provides promising findings to decipher the genetic basis of phenotypic variability.