New specific primers for amplification of the Internal Transcribed Spacer region in Clitellata (Annelida)

Abstract Nuclear molecular evidence, for example, the rapidly evolving Internal Transcribed Spacer region (ITS), integrated with maternally inherited (mitochondrial) COI barcodes, has provided new insights into the diversity of clitellate annelids. PCR amplification and sequencing of ITS, however, are often hampered by poor specificity of the primers used. Therefore, new clitellate-specific primers for amplifying the whole ITS region (ITS: 29F/1084R) and a part of it (ITS2: 606F/1082R) were developed on the basis of a collection of previously published ITS sequences with flanking rDNA coding regions. The specificity of these and other ITS primers used for clitellates was then tested in silico by evaluating their mismatches with all assembled and annotated sequences (STD, version r127) from EMBL, and the new primers were also tested in vitro for a taxonomically broad sample of clitellate species (71 specimens representing 11 families). The in silico analyses showed that the newly designed primers perform better than the universal ones when amplifying clitellate ITS sequences. In vitro PCR and sequencing using the new primers were successful, in particular for the 606F/1082R pair, which worked well for 65 of the 71 specimens. Thus, using this pair for amplifying ITS2 will facilitate further molecular systematic investigation of various clitellates. The other pair (29F/1084R) will be a useful complement to existing ITS primers when amplifying ITS as a whole.

FIGURE 1 The head end of a typical freshwater member of Naididae (Clitellata), Limnodrilus hoffmeisteri Claparède, 1862, today known to be a complex of cryptic species. The specimen is preserved and mounted on a microscope slide. The region of the clitellum (i.e., the "girdle") is the slight widening of the body in about the middle of the picture
Various universal primer pairs (Figure 2 and Table 1) have been used for amplification of the entire or parts of the ITS region in clitellate studies. However, universal primers sometimes have a low success rate in polymerase chain reactions (PCR) (Oceguera-Figueroa, 2012; Shekhovtsov, Golovanova, & Peltek, 2013; Trontelj & Utevsky, 2012; Vivien et al., 2015), due to poor specificity of these primers (Bellemain et al., 2010; Sipos et al., 2007). Furthermore, mismatches between primer and DNA templates might also introduce biases in PCR-based high-throughput Next Generation Sequencing (Aird et al., 2011; Deakin et al., 2014; Schirmer et al., 2015).
Universal primers thus often have to be modified to make them suitable for amplifications of specific organisms (Bellemain et al., 2010; Cheng et al., 2016; Kohout et al., 2014; Toju, Tanabe, Yamamoto, & Sato, 2012). For example, Källersjö, Von Proschwitz, Lundberg, Eldenäs, and Erséus (2005) amplified ITS sequences of freshwater bivalves using the more bivalve-specific forward primer MITS1F together with the universal primer ITS4, instead of using the primer pair ITS5/ITS4 (White, Bruns, Lee, & Taylor, 1990), which was originally developed for Fungi but is now used as a universal primer pair (see https://unite.ut.ee/primers.php). PCR failure may also be caused by intra-individual polymorphism (Kook et al., 2015), which has been found, for example, in the European earthworm Aporrectodea longa (Martinsson et al., 2017).
As yet, no clitellate-specific ITS primers have been formally proposed. In this paper, two new pairs of primers specifically designed to amplify the whole ITS region and ITS2 spacer in clitellates are proposed. One of them (606F/1082R for ITS2) was successfully tested also by Martinsson et al. (2017), and Liu et al. (2017).
FIGURE 2 Diagram mapping primers for amplification of ITS2, and the ITS region as a whole, in clitellate worms. Forward (cyan arrows) and reverse primers (orange arrows) of newly designed (arrows with a black arrowhead inside) and previously published primers (without arrowhead) are marked. In addition, the commonly used primer 28SC1 (Jamieson et al., 2002; purple arrow) for amplifying 28S, the reverse of ETTS1, is also shown here. The alignment shows partial sequences of the 5.8S rDNA (located between the two Internal Transcribed Spacers, ITS1 and ITS2) of the 27 haplotypes found in our newly amplified complete ITS sequences, ranked by numbers of mismatches (highlighted). The locations of three conservative motifs (CM1-3), recognized for eukaryotes by Harpke and Peterson (2008), are also shown. *VIII refers to a cryptic species in the L. hoffmeisteri complex (Liu, Fend, et al. 2017

| Primer design
In contrast to the fast-evolving ITS1 and ITS2 spacers, the flanking 18S and 28S rDNA, as well as 5.8S rDNA between the two spacers, are more conserved and thus suitable as annealing regions for primers.
An alignment was generated from a collection of 742 ITS sequences referred to Clitellata, that is, all those publicly available in GenBank (NCBI), and which include at least a part of 5.8S rDNA; several of them also include parts of 18S and/or 28S rDNA. Annotation and separation of ITS1, ITS2 and 5.8S rDNA are crucial for proper alignment, but aligning ITS sequences from divergent taxa may be problematic due to length variations (Alvarez & Wendel, 2003; Simmons & Freudenstein, 2003). Therefore, the three partitions of each downloaded ITS sequence were first identified using ITSx (Bengtsson-Palme et al., 2013).
In addition, boundaries of rDNAs were tested against the Rfam databases (Nawrocki et al., 2015), and the annotations of ITS2 were also checked using the Hidden Markov model (HMM) in the ITS2 database (E-value < .001, metazoan). Alignments of each ITS partition were conducted using the MAFFT v7.017 plugin with default settings as implemented in Geneious 6.1.8. Based on the consensus sequence of this alignment, primer candidates were identified within the retained series of multiple conservative sites (each >14 nucleotides long), and two primer pairs with the highest possible scores, for ITS as a whole and ITS2, respectively, were identified using the software Oligo 7 (Rychlik, 2007). Heterozygosity within PCR primer binding sites does have negative effects on amplification, but in most cases, heterozygosity is more commonly found in ITS spacer sequences than in the short flanking rDNA sequences (see Martinsson et al., 2017).
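The search for stretches of conservative sites longer than 14 nucleotides can be illustrated with a minimal Python sketch. It is not the Geneious/Oligo 7 workflow used here; the function name `conserved_runs`, the `min_identity` threshold, and the toy alignment are hypothetical and serve only to show the idea of scanning alignment columns for uninterrupted conserved runs.

```python
from collections import Counter


def conserved_runs(alignment, min_len=15, min_identity=1.0):
    """Return half-open (start, end) column ranges of conserved runs.

    A column counts as conserved when its most frequent base (gaps excluded)
    is shared by at least `min_identity` of all sequences. Only runs of at
    least `min_len` columns (i.e., >14 nucleotides by default) are kept.
    """
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    runs, start = [], None
    for col in range(n_cols):
        bases = Counter(seq[col] for seq in alignment if seq[col] != "-")
        top = bases.most_common(1)[0][1] if bases else 0
        conserved = top / len(alignment) >= min_identity
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    if start is not None and n_cols - start >= min_len:
        runs.append((start, n_cols))
    return runs


# Toy alignment (hypothetical): variable flanks around a 16-column conserved block.
toy = [
    "AC" + "GTCGATGAAGAGCGCA" + "TT",
    "TG" + "GTCGATGAAGAGCGCA" + "-A",
    "GA" + "GTCGATGAAGAGCGCA" + "CG",
]
print(conserved_runs(toy))
```

Lowering `min_identity` would tolerate occasional variable positions inside a candidate region, which is closer to working from a consensus sequence than the strict default.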

| Experimental verification of new primers
The universality of the new primers among clitellates was tested by PCR, amplifying specific fragments from 71 genomic DNA samples (47 genera, 11 families; Table 2); for extraction protocols, see Liu, Fend, et al. (2017). The samples were chosen to represent as many available families as possible, but also to cover several genera in the highly diverse family Naididae and to include some samples of very closely related species; three nominal naidids (Doliodrilus tener, Limnodrilus grandisetosus, and L. rubripenis) were even each represented by two specimens that are likely to be different (cryptic) species. A typical naidid, Limnodrilus hoffmeisteri, is shown in Figure 1. This mixture was chosen to obtain general information about ITS variability within both higher and lower taxa, which will facilitate a better annotation of new clitellate amplicons (as future reference sequences, for example, in secondary structure-based analyses of ITS). In addition, samples that did not successfully amplify with the new primers were also tested using the universal primer pair ITS5/ITS4 without additional primers (see Table 1).
The entire ITS and the ITS2 sequences were amplified, each with its new primer pair. The PCR reaction mixtures consisted of 15 μl of VWR red Taq Master Mix kit (We Enable Science, Denmark), 1 μl of primer (10 mmol/l), 2 μl of DNA template, and 6 μl distilled water.
The PCR protocol for both pairs was as follows: initial denaturation at 95°C for 5 min; 35 cycles of denaturation at 95°C for 45 s, annealing at 55°C for 60 s and elongation at 72°C for 90 s, followed by a final extension at 72°C for 8 min. Gel electrophoresis (1% agarose in 10 × TAE buffer) was carried out to check the quality of PCR products, which were then purified using 5 μl ExoTAP (Exonuclease I and FastAP Thermosensitive Alkaline Phosphatase). Amplicons were sequenced by Eurofins (Germany). For both of the new primer pairs, amplicons at least 200 bp long were regarded as successful. The amplified sequences were then checked for adherence to clitellates by blasting them against the NCBI database.
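For planning purposes, the cycling program above can be written down as a small data structure and its total programmed hold time computed. This is only a sketch: the dictionary layout and the function name `total_hold_time_s` are hypothetical, and ramping between temperatures, which depends on the thermocycler, is ignored.

```python
# Cycling program as stated in the text (temperatures in °C, times in seconds).
PROGRAM = {
    "initial_denaturation": {"temp_c": 95, "time_s": 5 * 60},
    "cycles": 35,
    "cycle_steps": [
        {"name": "denaturation", "temp_c": 95, "time_s": 45},
        {"name": "annealing", "temp_c": 55, "time_s": 60},
        {"name": "elongation", "temp_c": 72, "time_s": 90},
    ],
    "final_extension": {"temp_c": 72, "time_s": 8 * 60},
}


def total_hold_time_s(program):
    """Sum of programmed hold times; ramp times are not included."""
    per_cycle = sum(step["time_s"] for step in program["cycle_steps"])
    return (
        program["initial_denaturation"]["time_s"]
        + program["cycles"] * per_cycle
        + program["final_extension"]["time_s"]
    )


seconds = total_hold_time_s(PROGRAM)
print(f"{seconds} s = {seconds / 60:.2f} min")
```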

| Primer evaluation in silico
The specificity of the new primers to clitellates (relative to other organisms) was evaluated in silico by the number of mismatches between DNA templates and primers, and the results of this were also compared with the specificity of primers previously used in clitellate studies ( Figure 2). These analyses were performed using ecoPCR (Ficetola et al., 2010) against assembled and annotated sequences (STD, version r127) in EMBL. To achieve simulation under realistic PCR conditions, up to three mismatches between a primer and its annealing sequence were allowed. The complete length of clitellate ITS sequences at NCBI normally varies between 500 and 900 bp; however, members of Branchiobdellida have a rather long (about 1200 bp) ITS1 spacer (Williams, Gelder, Proctor, & Coltman, 2013). Thus, in the simulations, sizes of ITS (as a whole) between 400 and 2500 bp were allowed, and the minimum and maximum amplified ITS2 lengths were set as 200 and 1250 bp long, respectively.
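The principle of such a mismatch-tolerant in silico PCR can be sketched in a few lines of Python. This is a simplified illustration and not the ecoPCR implementation: the function names are hypothetical, amplicon length is measured here including the primers (a simplifying assumption), the template is assumed to contain only A, C, G, and T, and only the forward strand of the template is searched.

```python
# IUPAC nucleotide codes, so that degenerate primers (e.g., Y = C or T) can be used.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def revcomp(seq):
    """Reverse complement, also valid for IUPAC ambiguity codes."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(primer, site):
    """Number of positions where the template base is not allowed by the primer."""
    return sum(base not in IUPAC[p] for p, base in zip(primer, site))


def binding_sites(primer, template, max_mm):
    """All (start, mismatches) positions where primer matches with <= max_mm mismatches."""
    n = len(primer)
    hits = []
    for i in range(len(template) - n + 1):
        mm = count_mismatches(primer, template[i:i + n])
        if mm <= max_mm:
            hits.append((i, mm))
    return hits


def in_silico_pcr(template, fwd, rev, max_mm=3, min_len=200, max_len=1250):
    """Return (start, end, worst_mismatch_count) for every predicted amplicon."""
    rev_site = revcomp(rev)  # what the reverse primer looks like on the forward strand
    amplicons = []
    for f_start, f_mm in binding_sites(fwd, template, max_mm):
        for r_start, r_mm in binding_sites(rev_site, template, max_mm):
            end = r_start + len(rev)
            if r_start >= f_start + len(fwd) and min_len <= end - f_start <= max_len:
                amplicons.append((f_start, end, max(f_mm, r_mm)))
    return amplicons


# Hypothetical template: flank + 606F site + artificial spacer + 1082R site + flank.
fwd, rev = "GTCGATGAAGAGCGCAGCCA", "TTAGTTTCTTTTCCTCCGCTT"
template = "CCCC" + fwd + "ACGTTGCA" * 30 + revcomp(rev) + "GGGG"
print(in_silico_pcr(template, fwd, rev, max_mm=0))
```

The third element of each result records the larger of the two primers' mismatch counts, so a hit is classified by its worse-matching primer.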

| Annotation of ITS sequences and primer design
As mentioned above, 742 GenBank sequences, representing a total of at least 46 genera belonging to 14 clitellate families (Table S1), were obtained, annotated, and aligned. As expected, in this alignment, sequence variation is much greater in the ITS spacers than in the 18S, 5.8S, and 28S rDNA partitions. The majority of the published complete 5.8S sequences contain 153 ± 1 nucleotides. The motif CATTA was identified as the end of 18S by the software ITSx, and this ending motif was found in eukaryote sequences from the Rfam database. In addition, it has also been found that, in some fungi, the ITS1 spacer starts after this motif CATTA (Nagy et al., 2012; Schoch et al., 2014). The complete ITS1 sequences, which begin after the conserved motif CATTA, ranged from 314 to 1117 bp in the published clitellate sequences.
Two new primer pairs suggested by Oligo 7, and now referred to as 29F/1084R and 606F/1082R, were found to be suitable for amplifications of the whole ITS region, and the ITS2 subregion, respectively, of Clitellata. The forward primer 29F (AAAGTCGTAACAAGGTTTCCGTA) matches the terminal end of 18S but after E18S-2, with its anchoring sites partly overlapping with those of the old primers ITS5 and ETTS2, and the reverse primer 1084R (YGTTAGTTTCTTTTCCTCCGCTT) partly overlaps with ITS4 but is separated from ETTS1 and E28S-2 ( Figure 2 and Figure S1). The new forward primer for ITS2, 606F (GTCGATGAAGAGCGCAGCCA), partly overlaps with ITS3 and 5.8SF but was designed to fully match the motif CM1 (Figure 1), and the corresponding reverse primer, 1082R (TTAGTTTCTTTTCCTCCGCTT), is almost identical to 1084R (Figure 2 and Figure S1), but two nucleotides shorter at the 5′ end, which makes its melting properties similar to those of 606F.
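The relationships among the four new primers stated above (their lengths, the degenerate 5′ base of 1084R, and 1082R being 1084R without its first two nucleotides) can be checked with a short Python sketch. The primer sequences are taken from the text; the helper name `expand` and the dictionary layout are hypothetical.

```python
from itertools import product

# New primer sequences as given in the text (5' -> 3'); Y = C or T.
PRIMERS = {
    "29F": "AAAGTCGTAACAAGGTTTCCGTA",
    "1084R": "YGTTAGTTTCTTTTCCTCCGCTT",
    "606F": "GTCGATGAAGAGCGCAGCCA",
    "1082R": "TTAGTTTCTTTTCCTCCGCTT",
}

# Only the IUPAC codes needed here; extend the table for other degenerate primers.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT"}


def expand(primer):
    """List every non-degenerate oligonucleotide encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, {len(expand(seq))} variant(s)")

# 1082R equals 1084R with the two 5'-most nucleotides removed.
print(PRIMERS["1084R"][2:] == PRIMERS["1082R"])
```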

| Experimental verification of new primers
From our 71 genomic samples, 52 (73%) ITS amplicons were successfully amplified using the primer pair 29F/1084R, and 65 (91.5%) ITS2 amplicons were successfully amplified using 606F/1082R. Sequences are deposited in the NCBI database (for more details, see Table 2).
Interestingly, our attempt to amplify ITS of Chamaedrilus sphagnetorum (CE11317) using the new primer pair 29F/1084R failed, while a 909-bp-long ITS sequence (KF672519) was successfully amplified from the same individual using two pairs of primers (Martinsson & Erséus, 2014). Nevertheless, our new ITS2 amplicon (primers 606F/1082R) of this worm is identical to the corresponding part in KF672519.
The mismatches between the primers and their targeting 5.8S were investigated (see Figure 2 and Figure S1).

| Primer evaluation in silico
The in silico results varied considerably across simulations with different primer pairs (Figure 3 and Table S2). Generally, only a few ITS sequences of clitellates were successfully (in silico) amplified due to the limited number of full-length ITS sequences available. A much larger number of nonclitellate amplicons come from fungal groups, in particular, followed by, for example, chlorophytes (green algae) and some of the more species-rich invertebrate groups, such as Cnidaria, Nematoda, Arthropoda, and Platyhelminthes (Table S2). The amplified nonclitellate sequences using 5.8SF/ITS4 were also fewer than those using 606F/1082R, and even fewer than those using ITS3/ITS4.
In addition, the possible mismatches between each primer and the haplotypes of the corresponding template regions in the newly amplified (Figure 2) and previously published clitellate ITS sequences were estimated, and differences in all these mismatches (number and position) are summarized in Figure S1.
... with recognizable 5.8S region were selected for primer design. Many such published ITS sequences are commonly co-amplified with some rDNA residues, but the various parts of the (18S)-ITS1-5.8S-ITS2-(28S) sequences are neither properly annotated nor partitioned. It is widely accepted that an accurate alignment of positional homologies is highly important for the final phylogenetic reconstruction (Katoh & Standley, 2013; Ogden & Rosenberg, 2006). However, indel events make multiple alignment of divergent ITS sequences challenging, due to a high risk of inferring false-positive positional homologies and increasing artefactual support for incorrect relationships (Nagy et al., 2012

| Limitations of universal ITS primers
Universal ITS primers do not perfectly match the annealing template sequences of all organisms (see https://unite.ut.ee/primers.php). Even for the well-studied Kingdom Fungi, it is difficult to amplify the whole ITS region of all groups using a single universal primer pair (Konieczny, Roterman-Konieczna, & Spólnik, 2014). The in silico analyses of published data showed that the ITS primers traditionally used for clitellates are neither universal nor efficient enough for this group; for example, the primer 5.8SF may have up to five mismatches with its template DNA (Figure 2). Although this result may have been biased by the limited number of clitellate sequences (and lacking representation of some families) in the EMBL database, we also observed notable mismatches (Figure S1) between the newly amplified complete ITS sequences (using 29F/1084R) and primers targeting 5.8S rDNA: E58S-F1, ITS3, 5.8SF, 5.8SR, ITS1B, and E58S-R1 (see also Figure 2). Unfortunately, there is not much information about the flanking 18S rDNA (Figure S1) to optimize the specific clitellate primers for amplification of the whole ITS region. Still, however, as noted above, Martinsson and Erséus (2014)
For primers in general, even one or a few mismatches between primer and DNA template may jeopardize amplification (Bellemain et al., 2010; Bru, Martin-Laurent, & Philippot, 2008; Huang, Arnheim, & Goodman, 1992; Ihrmark et al., 2012; Wright et al., 2014; Wu, Hong, & Liu, 2009). In addition, especially for clitellates feeding on plant material and fungi (Bonkowski, Griffiths, & Ritz, 2000; Curry & Schmidt, 2006; Uchida et al., 2004), it could be hypothesized that universal primers may amplify fragments of contaminating plant or fungal sequences instead of sequences of clitellates. However, contamination, and also amplification of pseudogene sequences, can likely be avoided, or at least minimized, by using the new primer 606F, which targets a specific conservative motif in the clitellate 5.8S.
The sensitivity of PCR success rate to primer mismatches probably needs further investigation, but amplification of GC-rich ITS sequences may be improved by following a combination strategy of adding enhancers and modifying the PCR cycle conditions (Mamedov et al., 2008; Sahdev, Saini, Tiwari, Saxena, & Singh Saini, 2007). In our case, however, the GC contents of the whole ITS and its partial ITS2 sequence are almost equal. It seems that the length of target loci is more critical for successful amplification and sequencing than any of the other factors mentioned above. Using a single primer pair to amplify ITS sequences longer than about 1,500 bp is challenging. Thus, choosing one of the generally much shorter ITS spacers (with flanking rDNAs providing reliable primer templates) may be the optimal option for broad samples of clitellate taxa.

| Choosing primers
Although only 52 of the 71 clitellate samples (73%) were successfully amplified using the primer pair 29F/1084R, the in silico test showed that the specificity of this primer pair is better than that of ITS5/ITS4 and ETTS1/ETTS2 (Figure 3). Therefore, when this pair proves to work for some clitellate taxa, it is likely to be a good option for sequencing the ITS region as a whole, provided that the region is shorter than about 1,500 bp.
The in silico results not only give a hint about the relative performance of commonly used and new ITS primer pairs, but they also predict potential nontarget amplicons and length of amplicons before selecting a primer pair for studies of a specific clitellate group. In the in silico test of different ITS2 primers, 5.8SF/ITS4 theoretically performed better than 606F/1082R, that is, the former pair amplified more clitellate sequences and fewer nonclitellate sequences than the latter (Figure 3). However, this was only under rather relaxed conditions (2-3 mismatches allowed). Moreover, poor specificity of 5.8SF (as shown in Figure S1), originally designed for bivalves (Källersjö et al., 2005), limits the potential number of ITS2 amplicons.
Because of this, while ITS5/ITS4 produced almost 70,000 nonclitellate ITS amplicons, 5.8SF/ITS4 could only generate a very low number of ITS2 amplicons (Figure 3). On the other hand, 606F, targeting a conservative and unique 5.8S motif of clitellates, was much more specific than any of the older primers for clitellates (Figure 2; Figure S1). The pair 606F/1082R also had a low success rate in in silico amplifications of nonclitellate groups (Figure 3). Therefore, this new primer pair is more suitable than other published primers for amplifying the ITS2 regions from a taxonomically broad range of clitellates.
Primers with a 3′-terminal "A" nucleotide, that is, our new primers 29F, 606F, and 1082R, may be less efficient in amplifications using Taq DNA polymerase, regardless of the corresponding nucleotide in the template strand (Arezi, Xing, Sorge, & Hogrefe, 2003; Ayyadevara, Thaden, & Shmookler Reis, 2000). Therefore, alternative polymerases may help to increase the success rate for some clitellate specimens.
For some polyploid clitellates (e.g., within Lumbricidae, Enchytraeidae, and Naididae; see Casellato, 1984; Gregory & Hebert, 2002) with multiple copies of the ITS region, however, sequencing using our new primers may still be challenging. This is because the Sanger sequencing method can only be performed on a single pure amplicon. Using a particular PCR primer pair to amplify multiple copies of a gene may lead to double peaks in the chromatograms at sites that differ between the copies. The PCR may even fail completely because all sites after indels (introns leading to sequence length differences) will produce seemingly undecipherable double peaks (Griffin, Robin, & Hoffmann, 2011). In such cases, the software Champuru (http://seqphase.mpg.de/champuru/), which is able to detect and separate the gene copies, may be useful for diploids (Flot, 2007), while cloning or Next Generation Sequencing may be more practical tools for polyploids (Aversano et al., 2012; Brassac & Blattner, 2015; Griffin et al., 2011).
... Megascolecidae, Naididae, Phreodrilidae, and Randiellidae. This will facilitate many kinds of molecular systematic studies of this common and ecologically important group of worms. The other pair, 29F/1084R, amplifying the whole ITS, will be a useful complement to existing ITS primers.

ACKNOWLEDGMENTS
The first author was sponsored by a PhD student fellowship from the
FIGURE 3 In silico PCR output, that is, numbers of GenBank ITS and ITS2 sequences amplified, using different primer pairs, and allowing 0-3 nucleotide mismatches between published sequences and primers. Primers for amplifying complete ITS sequences are in bold face, those for ITS2 are not, and the newly designed pairs are marked with an asterisk (*). The colors blue, gray, yellow, and orange, respectively, separate sequences with no mismatches with primers (e0), or, for at least one primer in the pair, with 1 mismatch (e1), 2 mismatches (e2), or 3 mismatches (e3