The methylome is altered for plants in a high CO2 world: Insights into the response of a wild plant population to multigenerational exposure to elevated atmospheric [CO2]

Unravelling plant responses to rising atmospheric CO2 concentration ([CO2]) has largely focussed on plastic functional attributes to single generation [CO2] exposure. Quantifying the consequences of long‐term, decadal multigenerational exposure to elevated [CO2] and the genetic changes that may underpin evolutionary mechanisms with [CO2] as a driver remain largely unexplored. Here, we investigated both plastic and evolutionary plant responses to elevated [CO2] by applying multi‐omic technologies using populations of Plantago lanceolata L., grown in naturally high [CO2] for many generations in a CO2 spring. Seed from populations at the CO2 spring and an adjacent control site (ambient [CO2]) were grown in a common environment for one generation, and then offspring were grown in ambient or elevated [CO2] growth chambers. Low overall genetic differentiation between the CO2 spring and control site populations was found, with evidence of weak selection in exons. We identified evolutionary divergence in the DNA methylation profiles of populations derived from the spring relative to the control population, providing the first evidence that plant methylomes may respond to elevated [CO2] over multiple generations. In contrast, growth at elevated [CO2] for a single generation induced limited methylome remodelling (an order of magnitude fewer differential methylation events than observed between populations), although some of this appeared to be stably transgenerationally inherited. In all, 59 regions of the genome were identified where transcripts exhibiting differential expression (associated with single generation or long‐term natural exposure to elevated [CO2]) co‐located with sites of differential methylation or with single nucleotide polymorphisms exhibiting significant inter‐population divergence. This included genes in pathways known to respond to elevated [CO2], such as nitrogen use efficiency and stomatal patterning. This study provides the first indication that DNA methylation may contribute to plant adaptation to future atmospheric [CO2] and identifies several areas of the genome that are targets for future study.


| INTRODUC TI ON
Increased atmospheric CO 2 concentration ([CO 2 ]) is a significant driver of climate change, with [CO 2 ] of 430-1,000 ppm predicted for 2100, for the first time for several million years (IPCC, 2014;Pearson & Palmer, 2000). Understanding the molecular mechanisms underpinning plant response to rising [CO 2 ] is critical to a wider understanding of ecosystem change (Barnaby & Ziska, 2012), for crop performance (Myers et al., 2014) and the conservation of landscapes (Monroe et al., 2018). could be a driver of evolutionary change over macroevolutionary timescales (Franks & Beerling, 2009;Haworth, Elliott-Kingston, & McElwain, 2011). Additionally, selection experiments conducted in controlled environments generally suggest a role for elevated [CO 2 ] as a potential driver of contemporary evolution (Frenck, Linden, Mikkelsen, Brix, & Jørgensen, 2013;Ward, Antonovics, Thomas, & Strain, 2000). However, these studies are limited in number and infer evolution solely from morphological phenotypes. Furthermore, in realistic field conditions, adaptation to elevated [CO 2 ] is highly dependent on ecological context (Grossman & Rice, 2014;Kleynhans, Otto, Reich, & Vellend, 2016). Thus, a greater understanding of the mechanistic basis of plant adaptation to elevated [CO 2 ] is required. Natural CO 2 springs provide a powerful resource for investigating plant response to elevated [CO 2 ] over multiple generations without the extensive labour, time and financial costs associated with other systems (Körner & Miglietta, 1994). The responses of plants exposed to elevated [CO 2 ] at such springs are generally consistent in direction and magnitude to those observed in single generation free air CO 2 enrichment experiments, for a range of morphological traits (Saban, Chapman, & Taylor, 2018). However, crossed factored experiments have also shown that plant responses to elevated [CO 2 ] at natural CO 2 springs are not solely plastic, with adaptation to elevated [CO 2 ] inferred in some morphological traits in these plant populations (Nakamura et al., 2011;Watson-Lazowski et al., 2016).
One study identified adaptation to elevated [CO 2 ] in the gene expression profiles of plants at a CO 2 spring and nearby control site, but with very little genetic divergence between the two populations (Watson-Lazowski et al., 2016). This highlights a potential role for indirect genetic effects (e.g. DNA methylation changes) in the response of plant populations to multigenerational elevated [CO 2 ] exposure but to date, these have yet to be investigated.
Methylation of cytosine in DNA occurs more extensively in plants than animals, and with large variation in patterns between species (Niederhuth et al., 2016). In plants, DNA methylation occurs in three cytosine contexts, CG, CHH and CHG (where H is any base except G) (Henderson & Jacobsen, 2007) with different mechanisms of establishment and apparent function of methylation depending on both the cytosine sequence context and the wider genomic context (Song & Cao, 2017). Broadly, DNA methylation appears to function to silence the mobility of transposable elements, contribute to genome stability and integrity and may also play a role in gene expression regulation (Eichten, Schmitz, & Springer, 2014;Zhang, Kimatu, Xu, & Liu, 2010). Currently, there is no understanding of the relevance of this mechanism in determining plastic and adaptive responses to elevated [CO 2 ].
Given the potential role of DNA methylation in modulating gene expression as part of a plastic response to environmental cues (Garg, Chevala, Shankar, & Jain, 2015), coordination of some element of plant response to elevated [CO 2 ] by reprogramming of global DNA methylation is an attractive hypothesis to explain previous observations (Watson-Lazowski et al., 2016). Furthermore, the observation that in plants methylation can be maintained through mitotic and meiotic cell division (Quadrana & Colot, 2016;Verhoeven, Jansen, van Dijk, & Biere, 2010) has led to the hypothesis that methylation could provide transgenerational 'memory' of ancestral environment, contributing to phenotype expression in offspring (Heard & Martienssen, 2014;Quadrana & Colot, 2016). However, experimental evidence of environmentally induced methylation patterns that both influence phenotype and are inherited into the next generation is rare (Crisp, Ganguly, Eichten, Borevitz, & Pogson, 2016;Quadrana & Colot, 2016). The role of DNA methylation in coordinating plastic response to elevated [CO 2 ] has not been explored, despite recent progress in understanding trait responses. For example, the mechanistic basis of altered stomatal patterning (Engineer et al., 2016;Xu, Jiang, Jia, & Zhou, 2016) has identified genes regulating this pathway, while patterns of plant growth and metabolism are well established in response to exposure to elevated [CO 2 ] (Gamage et al., 2018). Plants growing at natural CO 2 springs provide a resource to explore the role of DNA methylation in both the plastic and evolutionary mechanisms coordinating the response to elevated [CO 2 ] exposure, where candidate genes involved in the plastic morphological response may have already been identified. The aim of the research described here was to explore the mechanistic basis for the plastic and adaptive response to elevated [CO 2 ] in the non-model plant species, Plantago lanceolata L., utilizing populations exposed to naturally elevated [CO 2 ] for many generations in the Bossoleto CO 2 spring in Italy. This spring is thought to be more than 100 years old, with P. lanceolata populations documented since at least 1992 (Miglietta, Raschi, Bettarini, Resti, & Selvi, 1993). A crossed factored experiment was conducted with progeny of these plants and those from a nearby control site, by growing seed in a common environment for one generation before growth in either elevated or ambient [CO 2 ] growth chambers. We combined previous phenotypic analysis of this experiment with a reanalysed RNA-Seq dataset and novel whole-genome sequencing (WGS) and whole-genome bisulphite sequencing (WGBS) datasets for 24 individuals to analyse genetic variation, DNA methylation patterns and gene expression. In applying multi-omic techniques to a crossed factored experiment, we addressed the following questions: (a) Is there evidence of genetic divergence between populations growing at elevated and ambient [CO 2 ]? (b) Does DNA methylation respond to single or multigenerational elevated [CO 2 ] exposure? (c) What role do genetic differentiation and methylation variation play in the plant response to elevated [CO 2 ]? This study also addresses the challenge of working with a non-model organism, deploying tools developed in model species to advance genomic resources of a non-model but ecologically important species. This approach is critically needed to advance the study of plant ecological epigenetics (Richards et al., 2017).

| Plant material and sampling site
The study organism for this work was P. lanceolata L. (Plantaginaceae), an herbaceous perennial with widespread geographical distribution. The experimental approach has been described previously in Watson-Lazowski et al. (2016) (Scholefield et al., 2004). Seeds obtained from the CO 2 spring and control sites were grown for one generation in the glasshouse at the University of Southampton and then crossed within maternal families to standardize parental effects. Seeds from crosses were transferred into one of eight experimental growth chambers. Four chambers were set to ambient [CO 2 ] (410.63 ± 33.74 ppm) and four chambers were set to elevated [CO 2 ] (718 ± 46.81 ppm, respectively). On the 58th day, the second or third youngest leaf was harvested and stored at −80°C for RNA and DNA extractions. Further details of the experimental design (previously reported in Watson-Lazowski et al., 2016) and the analysis summarized here are available in the extended methods, Methods S1.
The bioinformatics pipeline for analysis of genetic, methylation and gene expression variation in this experiment includes the integration of multi-omic data sets for this non-model species ( Figure 1).

| Genome assembly
To facilitate analyses of methylation and genetic variation in plants in the experiment, the P. lanceolata genome was assembled from short reads. DNA was extracted from leaf material from an individual taken as seed from the CO 2 spring site and grown in the University of Southampton glasshouse. DNA was sequenced as 150 bp paired ends with 350 bp insert size by Novogene Bioinformatics Institute (Beijing, China). Raw sequencing data were filtered for contaminants with FastQ Screen v0.11.3 (http://www.bioin forma tics.babra ham.ac.uk/proje cts/fastq_scree n/) and trimmed with Trimmomatic v0.36 (Bolger, Lohse, & Usadel, 2014). Error correction was performed using SOAPec (Luo et al., 2012) and reads were assembled using ABySS (Simpson et al., 2009; Table S1). The genome was filtered to contigs larger than 2 kb for further analyses using seqtk (Li, 2016), as a trade-off to increase computational efficiency while maximizing sequence availability for downstream analysis. Genome completeness was assessed using BUSCO v3 (Simão, Waterhouse, Ioannidis, Kriventseva, & Zdobnov, 2015).

| Chloroplast assembly
The chloroplast genome was assembled to test bisulphite conversion efficiency of sample sequences in methylation analysis.
Singleton SNPs were removed, and SNPs were filtered to include only those with identity information in at least eight of the 12 individuals per population. Variation was visualized using smartPCA within Eigensoft (Price et al., 2006) and basic population genetic F I G U R E 1 A pipeline for integrating multi-omic datasets in a non-model species. Plants derived from CO 2 spring and control sites were grown in ambient and elevated [CO 2 ] treatments. Leaf material was taken from six plants per treatment group (n = 24 total) and DNA and RNA were extracted for genome, methylome and transcriptome analysis. To facilitate these analyses, the genome of Plantago lanceolata (1C genome size = 1.4 Gb; Wong & Murray, 2012) was sequenced with short reads and assembled statistics were calculated with VCFtools (Danecek et al., 2011). D xy and F ST were calculated in 10 kb complete sliding windows with 1 kb step size, using publicly available python scripts (Martin, 2017).

| Whole-genome bisulphite sequencing
Bisulphite conversion of DNA was carried out using the Zymo EZ DNA Methylation-Lightning kit (Zymo Research) and bisulphite converted reads were prepared using the TruSeq DNA Methylation kit (Illumina). WGBS libraries were sequenced as 150 bp single end reads with a 20% phiX spike-in. Raw reads were trimmed using Trimmomatic v0.32 (Bolger et al., 2014) and phiX DNA and contamination was removed with FastQ Screen. Bisulphite conversion efficiency was calculated as the percentage of unmethylated cytosines that were not converted to thymine in chloroplast sequences, since the chloroplast genome is expected to be unmethylated (Fojtová, Kovařıḱ, & Matyášek, 2001).
Due to the high level of fragmentation in the assembled P. lanceolata genome, all WGBS reads were aligned to the single reference rather than WGS for each individual. This approach has the advantage of being computationally tractable and is widespread in the literature (e.g. Lu et al., 2017;Yaish et al., 2018) but has the disadvantage that cytosine (C) to thymine (T) SNPs between individuals cannot be distinguished from an unmethylated C in one individual and a methylated C in another following bisulphite conversion.
Therefore, sites of C-T polymorphism identified in the population were removed from the analysis of differential methylation.

1,000 bp tile analysis in methylKit. Methylation call files were
converted to coverage and bed files and sorted by chromosome with bismark2bedgraph (Krueger & Andrews, 2011) and then converted to report files with coverage2cytosine and read into Methylkit with coverage ≥3 (Akalin, 2016) in R v 3.4.1 (R Core Team, 2015). 1,000 bp tiles present across all samples were identified using Methylkit (Akalin et al., 2012). Global analysis of methylation in unfiltered 1,000 bp tiles was conducted using the PCAsamples function in Methykit.

DMS identification in DSS.
Differentially methylated sites were identified from methylation calls using the R package DSS (Park & Wu, 2016). DSS implements algorithms for the identification of differential methylation using the dispersion shrinkage method and Wald tests assuming a betabinomial error distribution. DMSs were identified from filtered coverage report files, including only sites with coverage ≥3 and with ≥3 samples per treatment group. Downstream analyses were conducted using DMSs filtered by FDR < 0.05. The number of DMSs was calculated in 10 kb complete sliding windows with a 1 kb step size and these were plotted against D xy and F ST in that region. Correlations were assessed with Spearman's rank correlation using ρ 2 to approximate the proportion of shared variance between the ranked variables.

| Gene expression
The RNA sequencing dataset previously reported in Watson- Adapters and poor-quality bases were trimmed from raw reads with Trimmomatic v0.32 (Bolger et al., 2014). The reference transcriptome was assembled de novo from normalized reads of 12 samples using Trinity v2.4.0 (Haas et al., 2013). Transcripts from the assembled transcriptome were aligned to The Arabidopsis Information Resource and transcripts were considered DE when FDR < 0.05. Transcripts were functionally annotated and assigned to Gene Ontology (GO) categories using DAVID (Dennis et al., 2003) and visualized using GOplot (Walter, Sánchez-Cabo, & Ricote, 2015).

| Co-location of DMSs, SNPs and DE transcripts
For analysis of the co-location of DE transcripts to regions of the genome with DMSs and SNPs, we mapped transcripts to the 2 kb filtered genome with BLAST (e < 1 × 10 −4 and >95% similarity). The locations of mapped DE transcripts were cross-referenced with the location of DMSs and F ST outlier SNPs using custom R scripts with GenomicRanges (Lawrence et al., 2013). LD decay was calculated using PopLDdecay (Zhang, Dong, Xu, He, & Yang, 2019) and 1 kb was used to identify DMSs and SNPs that were potentially in LD with DE transcripts, since r 2 had decreased to 0.11 at 1 kb ( Figure S3). For predicted gene features and transcripts mapping to the genome, methylation of those features was plotted against the expression of the transcript.

| RE SULTS
The study species for this analysis, P. lanceolata is a non-model species with previously limited genetic resources available. Here, a genome assembly, WGS, WGBS and RNA-Seq have been integrated in parallel, to explore the mechanistic basis of plant response to elevated [CO 2 ] (Figure 1).

| Genome assembly and feature prediction
Short read genome assembly produced a genome of 1,425,357 kb of sequence, which corresponds closely to the 1.4 Gb estimated F I G U R E 2 Principal component analysis of variation in omic datasets of Plantago lanceolata plants derived from CO 2 spring or control site populations, and grown under elevated or ambient [CO 2 ] (n = 24, six reps per treatment group). Principal components one and two are given with percentage of variance explained by the component. Treatment groups; CA, Control Ambient; CE, Control Elevated; SA, Spring Ambient; SE, Spring Elevated. All analyses show clustering by population site of origin (spring vs. control) in principal component 1 or principle component 2. (a) Single nucleotide polymorphism (SNP) variation from 8 million SNPs, with singletons removed and SNPs filtered to those that contain information for at least 8/12 individuals per population; (b) variation in gene expression profiles; (c) variation in methylation at cytosines in the CG context across complete 1 kb tiles; (d) variation in methylation in the CHG context; and (e) variation in methylation in the CHH context. Note that principal component analysis of SNP variation was conducted using smartPCA (Price et al., 2006), while prcomp() in R was used for gene expression and methylation variation. The scale of the principal components differs between these two methods but this does not affect interpretation of clustering genome size of P. lanceolata (Wong & Murray, 2012). The assembly was highly fragmented, consisting of 4,075,744 contigs with N50 of 1.8 kb (Table S1). When the genome was filtered to sequences >2 kb, 13.7% of the full sequence was represented, with 63.2% of single-copy orthologs (SCO) from Arabidopsis thaliana identified with BUSCO (Simão et al., 2015) as being complete and a further 7.2% present but fragmented (Table S2; Figure S4). The Maker annotation pipeline (Cantarel et al., 2008) predicted 16,039 genes with 26% of predicted genes having AED < 0.5 (90% is considered well-annotated; Campbell, Holt, Moore, & Yandell, 2014).
Gene prediction was generally poor, likely because of the high fragmentation of the contig assembly and lack of protein sequence evidence from Plantago or closely related species for training, an issue common to genetic analyses in non-model species.
Genes with AED < 0.5 were considered sufficiently supported for analysis of general trends of methylation across gene features.
The chloroplast was assembled as two alternative sequences of 149.6 kb, differing in the orientation of an inverted repeat sequence ( Figure S5).

| Low overall genomic differentiation between spring-and control-derived populations
WGS produced 57 million paired end reads per sample (15× cover-  (Table 1). Nucleotide diversity was significantly higher in the control population (Table 1). The exon regions of the spring population had a negative Tajima's D, indicating some purifying selection. Within each population, Tajima's D was positive across the whole genome, signifying low levels of extreme frequency polymorphisms characteristic of balancing selection or a decrease in population size. The most parsimonious explanation for this pattern of differentiation is that the spring population originated from the control population and underwent weak purifying selection in exon regions ( Figure S6). Of the SNPs, 974 (0.05%) were identified as putative targets of divergent selection between the spring and control populations as F ST outliers (q < 0.05).

| Divergence in methylation profiles between control and spring populations, with limited methylome remodelling under single generation exposure to elevated [CO 2 ]
Whole-genome bisulphite sequencing produced 113 million single end reads per sample (12× coverage). Conversion efficiency was >98.5% for all samples ( Note: Estimates are provided as mean ± SD. Tajima's D and nucleotide diversity were calculated in 100 bp windows. T tests were used to determine whether nucleotide diversity and Tajima's D significantly differed between the population derived from the spring site and from the control site.

TA B L E 1
Population genomic analysis statistics calculated from SNPs for wholegenome and exon-only regions relatively high with 83%, 70% and 15% of cytosines methylated in the CG, CHG and CHH contexts, respectively. As for other plant species, methylation was depleted at the transcription start site of predicted genes and CHG and CHH methylation were additionally depleted across the gene body (Figure 3d; Figure S7). Methylation in the CG context increased to near non-genic levels across the gene body and was depleted again at the transcription end site.
Differences in methylation between the two populations dominated the methylation variation identified in this experiment, both in unfiltered tile analysis and in differential methylation analysis. In global methylation patterns in unfiltered 1 kb tiles, PCs dividing the individuals by site of origin explained 8.8% of the variation in CG methylation (PC1), 5.6% of the CHG methylation (PC2) and 5.3% of the CHH methylation (PC2) (Figure 2c-e). This is similar in magnitude F I G U R E 3 There were more differentially methylated sites (DMSs) and regions (DMRs) associated with individuals that originate from different sites (spring or control) than there were associated with growth [CO 2 ] treatment (ambient or elevated).  Table S4).
Differential methylation was disproportionately represented among cytosine contexts in DMSs analysis, with CG and CHG DMS enriched relative to the distribution of total cytosine sites at which methylation status was called (Figure 3c; Table S5).
It is estimated from this analysis that 9.3% of identified DMS would have been called erroneously if the genome for each individual had not been sequenced to identify C->T variants in the population (Table S6). This exemplifies that alignment of WGBS reads to a reference genome other than the genome of the individual can erroneously inflate the number of DMSs called and serves to caution the interpretation of DMSs in the absence of WGS for each individual. This approach cannot identify differential methylation at sites if there is also C-T polymorphism, but exclusion of these sites at least gives a conservative estimate of differential methylation.

| Correlations between genetic and methylome variation account for approximately 1% of differences in methylation between populations
When the genome was analysed in complete 10 kb tiles with a 1 kb step size, there were small but significant positive correlations between the number of DMSs associated with population of origin and both absolute (D xy ) and relative (F ST ) sequence divergence across the tile (Figure 4). Correlation with sequence divergence explained approximately 1% of the variation in number of DMSs in a 10 kb window across the three sequence contexts.
This suggests that to some extent, regions of the genome with high sequence divergence are also more likely to harbour more DMSs (Table S7).

| Divergence in the gene expression response of spring-and control-derived plants to elevated [CO 2 ]
A total of 160,279 transcripts were de novo assembled in 100,890 components (loosely comparable to genes) in Trinity, and 86% of transcripts had a BLASTN match with e < 1 × 10 −

| Co-location of DE transcripts, DMSs and high divergence SNPs highlight potential targets for differential methylation and selection
In total, 7,089 transcripts (corresponding to 4,205 components) the exon region of a gene were identified, as well as a weak association between CHG methylation across the exon region and the expression of that region ( Figure S9; Table S9).
Among the 64 transcripts identified as co-locating to within 1 kb of a DMS or SNP, many were annotated as relevant to plant physiolog-  (a) and (b) are given with significance of adjusted p-value indicated as; **p < .01; ***p < .001 hypermethylated in plants derived from the spring site population, but this differential methylation does not correlate with the observed expression differences (Figure 7c).

| Low overall genomic differentiation between the spring-and control-derived populations
The low sequence divergence between spring-and control-derived populations calculated here is supported by previous calculations from transcriptome data (Watson-Lazowski et al., 2016) and F I G U R E 7 Selected differentially expressed transcripts that map to within 1 kb of a differentially methylated site (DMS) or single nucleotide polymorphism (SNP). These examples illustrate the potential role of methylation in plant adaptation to elevated [CO 2 ] the potential for adaptation through selection on the genetic sequence and the potential for variation in methylation and expression to be a component of broader processes in the plant response to elevated [CO 2 ]. TMM-normalized expression of the four treatment groups based on site of origin (spring or control) and growth [CO 2 ] (ambient or elevated; left-hand side) and a schematic of the location of DMS or SNPs within 1 kb of the region to which the differentially expressed transcripts map to in the fragmented genome (right-hand side). The location to which the transcript mapped is indicated with a striped box, but note that this represents the length of the match, not the length of the whole transcript (white boxes). The position of a DMS or SNP is indicated by a black triangle, with the reference/alternate nucleotide given for SNPs and the average percentage methylation with 95% confidence intervals (calculated with the adjusted Wald method) given for DMS: (a) putative nitrate transporter (NPF6.2), (b) NAD(P)-binding Rossmann-fold superfamily protein (FLDH) and (c) BURP domain-containing protein (RD22) between populations of P. lanceolata in other studies (Bos, Harmens, & Vrieling, 1986;Tonsor, Kalisz, Fisher, & Holtsford, 1993;Van Dijk, Wolff, & De Vries, 1988). Limited pollen dispersal distance (1.5 m; Bos et al., 1986), passive seed dispersal (0.08 m; Bos et al., 1986) and obligate outcrossing likely results in high within population genetic diversity and high genomic heterogeneity (Gáspár, Bossdorf, & Durka, 2018), and therefore interpopulation divergence is generally low. This is especially likely given the proximity (200 m) of the in situ spring and control populations used in this study. Lower genetic diversity in the spring population suggests that the spring population originated from the control population.

| Divergence in methylation profiles between control and spring populations, with limited methylome remodelling under single generation exposure to elevated [CO 2 ]
Both methods identified significant methylation differences between control and spring populations and these differences were 10-fold greater than those induced by exposure to elevated [CO 2 ] for a single generation. The divergence in CG and CHG methylation profiles between the control and spring populations was stable even after at least one generation of growth at ambient [CO 2 ], highlighting that a significant proportion of the methylation differences are not (or no longer) responsive to [CO 2 ]. Furthermore, since these methylation differences occur at high enough frequency to distinguish the spring and control population, the causative underlying genetic or epigenetic variation must be subject to either selection or drift.

| Correlations between genetic and methylome variation only account for approximately 1% of the large differences in methylation between populations
Three possible processes could contribute to the large variation in methylation (particularly CG) between the two populations.
Methylation variation may be associated with genetic variation either (a) in cis or (b) in trans and this is subject to selection or drift.
Alternatively, (c) spontaneous epimutations may arise and reach high frequency in the population as a result of selection or drift.
We found that only 1% of the variation in number of DMSs in a 10 kb window was explained by a positive correlation with sequence divergence of that window, and that DE transcripts more frequently co-located to DMSs than to SNPs. This implies that genetic variation associated with methylation variation in cis has a relatively small contribution to these observed differences. The amount of DNA methylation variation explained by genetic variation has previously been estimated in P. lanceolata at 2%-3% from MSAP and AFLP markers but with limited resolution (Gáspár et al., 2018). However, these analyses (including this one) do not consider the relationship between larger structural variants with variation in methylation profiles, which may provide further insight Kawakatsu et al., 2016;Schmitz et al., 2013). This may be particularly important considering the role of methylation in silencing transposons and the link between transposon activity and mutation rates (Wicker et al., 2016).
Since only 13% of the P. lanceolata genome is analysed here it has not been possible to characterize the association between genetic variation and methylation variation acting in trans. A study in Arabidopsis revealed gene body CG methylation variation in populations along a longitudinal gradient was linked to trans-acting polymorphisms subject to selection (Dubin et al., 2015). However, a review by Vidalis et al. (2016) argues that over short time scales methylome evolution is more likely to be driven by spontaneous epimutations while long-term methylome evolution is driven by genomic changes. Establishing the relative contribution of each of these processes to the observed methylation differences between populations would facilitate a greater understanding of the evolutionary processes driving this variation.
This could be achieved with the existing methylation dataset but will require a more complete genome assembly from long read sequencing. Whether these epimutations arise 'spontaneously' or whether they arise due to genetic variation, they could play a significant role in coordinating plant response to elevated [CO 2 ] if they impact the regulation of genes (Chinnusamy & Zhu, 2009).
Epigenetic mechanisms more broadly (including DNA methylation, histone modification or RNA interference) have well-documented roles in response to multigenerational abiotic stresses in some species (Lämke & Bäurle, 2017). Eriksson, Szukala, Tian, and Paun (2020) propose that epigenetic responses to such abiotic stimuli are complex with multiple pathways to modulate gene expression and are diffused across the genome rather than highly localized. This is consistent with our findings of DNA meth-   (Barnes et al., 1997;Nakamura et al., 2011). Although the mechanistic basis of this adaptation has not been well studied, a loss of plasticity in selected traits that are initially responsive to elevated [CO 2 ] has been observed in CO 2 spring plant populations, in support of our own findings (Barnes et al., 1997;Nakamura et al., 2011). Furthermore, a study of an annual plant species in a 7-year experiment identified genetic assimilation as the mechanism for this reduction in trait plasticity after multigenerational exposure (Grossman & Rice, 2014 (Körner & Miglietta, 1994) or altered microbial activity (Šibanc et al., 2018).

| Co-location of DE transcripts to DMSs and high divergence SNPs highlight the potential role of differential methylation and selection in plant adaptation to elevated [CO 2 ]
Through the identification of DE transcripts and their co-location to DMSs and outlier SNPs, we highlight a potential role for methylation and genetic variation underlying differential expres- This analysis would greatly benefit from an improved reference assembly, which would also facilitate an analysis of larger-scale genetic variation and provide greater resolution for identifying potential candidates. The current genome assembly facilitated the analysis of a proportion of the genome (13%, with 44% of the transcriptome mapping) where co-location of DMSs/outlier SNPs and DE genes will be missed if they map to different contigs that are, in reality, adjacent or in linkage disequilibrium. The co-location approach is also limited in that it can only detect where methylation could impact gene regulation through a cis acting mechanism and not those that act in trans (Niederhuth & Schmitz, 2017). Furthermore, identification of co-locating DE transcripts and DMSs/outlier SNPs is limited by estimates for LD, which is known to be highly heterogeneous across the genome (Gupta, Rustgi, & Kulwal, 2005). As genomic resources are further developed for non-model species found at CO 2 spring sites, we anticipate the identification of candidate genes whose expression are impacted by cis or trans methylation differences.
This study provides insight into the potential mechanisms coordinating plant plastic and evolutionary response to growth at elevated [CO 2 ] over multiple generations. A crossed factored experiment was combined with multi-omic approaches to study these evolutionary mechanisms in a high CO 2 spring system. The results highlight a role for both genetic variation and methylome variation in plant adaptation to elevated [CO 2 ]. We found substantial divergence in methylation profiles between populations naturally exposed to elevated [CO 2 ] at a CO 2 spring relative to a control population. This variation may be attributable to genetic variation in association with methylation variation in cis or trans or to evolutionary processes acting on spontaneous epimutations. In addition, there was some responsivity of the methylome to growth at elevated [CO 2 ], a proportion of which may be stably transgenerationally inherited. Taken together, these data, alongside plant phenotypic responses and altered gene expression profiles, suggest there is local adaptation to elevated [CO 2 ] in these CO 2 spring populations. The co-location of DE genes to differential methylation and/or genetic polymorphisms provide examples of the potential role of these mechanisms in coordinating adaptation.
A critical implication of this is that DNA methylation changes could facilitate adaptation to rising [CO 2 ] even in the absence of high intrapopulation genetic variation, with implications to conservation and crop-breeding. Of the 59 areas of the genome where DE genes co-locate to differential methylation or putatively divergent sequences, many were annotated as components of pathways responsive to elevated [CO 2 ]. For example, two DE transcripts in the stomatal patterning pathway co-located to these regions. Stomatal development and function is one of the few plant phenotypic traits where there is some evidence for adaptation to elevated [CO 2 ] in both the fossil record (Franks & Beerling, 2009;Hetherington & Woodward, 2003) and studies of plants at natural CO 2 springs (Saban et al., 2018).
Since changes in stomatal function and patterning have wider consequences for crop water use and hydrological cycling (Franks, Berry, Lombardozzi, & Bonan, 2017), these candidate genomic regions and evolutionary mechanisms are worthy of further exploration.

| CON CLUS ION
Utilizing P. lanceolata seed from a natural CO 2 spring in a crossed

ACK N OWLED G EM ENTS
This work was supported by the use of the IRIDIS High Performance Computing Facility and associated support services at the University of Southampton. We thank F. Miglietta for assisting with access to the CO 2 spring site and Y. Lin for experimental work. We also thank three anonymous reviewers whose insightful comments helped to substantially improve and clarify this manuscript. This work was

DATA AVA I L A B I L I T Y S TAT E M E N T
The raw whole genome sequencing and whole genome bisulfite sequencing data have been uploaded to the NCBI short read archive as BioProject PRJNA649873, alongside the contig genome and chloroplast assemblies. The corresponding RNA Seq raw reads are available from the NCBI short read archive BioProject PRJNA338760.