High‐Density RNA Microarrays Synthesized In Situ by Photolithography

Abstract
While high‐density DNA microarrays have been available for over three decades, the synthesis of equivalent RNA microarrays has proven intractable until now. Herein we describe the first in situ synthesis of mixed‐base, high‐density RNA microarrays using photolithography and light‐sensitive RNA phosphoramidites. With coupling efficiencies comparable to those of DNA monomers, RNA oligonucleotides at least 30 nucleotides long can now be prepared efficiently using modified phosphoramidite chemistry. A two‐step deprotection route unmasks the phosphodiester, the exocyclic amines and the 2′ hydroxyl. Hybridization and enzymatic assays validate the quality and the identity of the surface‐bound RNA. We show that high density is feasible by synthesizing a complex RNA permutation library with 262144 unique sequences. We also introduce DNA/RNA chimeric microarrays and explore their applications by mapping the sequence specificity of RNase HII.


General Experimental Methods
All solvents were purchased in anhydrous form from Biosolve or Sigma Aldrich and stored over activated 4 Å molecular sieves. Chemicals and reagents for microarray synthesis were purchased from Sigma Aldrich and ChemGenes and used without further purification. DNA phosphoramidites were obtained from Sigma Aldrich, Orgentis or Flexgen, while 5ʹ-NPPOC 2ʹ-O-ALE RNA phosphoramidites were prepared by ChemGenes according to published procedures, with various synthesis and purification improvements leading to the isolation of high-quality phosphoramidites in gram quantities. [1]

General Microarray Fabrication Procedure

Slide functionalization
Microarrays were synthesized according to procedures described elsewhere, which incorporate multiple technical improvements that form the basis of our current array synthesis protocol. [2] In the paired array approach, glass microscope slides (Schott Nexterion Glass D) are silanized using N-(3-triethoxysilylpropyl)-4-hydroxybutyramide (Gelest SIT8189.5). The silane reagent (10 g) is diluted into 500 ml EtOH/H2O 95:5 + 1 ml AcOH and the slides are submerged in the functionalization solution for 4 h at room temperature (r.t.) with gentle shaking. The slides are then washed twice in 500 ml EtOH/H2O 95:5 + 1 ml AcOH for 20 min at r.t., transferred into a dry, clean rack and cured overnight in a pre-heated oven at 120 °C under vacuum. After functionalization, the silanized slides are kept in a desiccator until further use. Prior to functionalization, one of the slides is drilled at two positions with a 0.9 mm diamond bit, washed, then rinsed in an ultrasonic bath for 30 min.

Nucleic acid synthesis by photolithography
The instrumental setup for the synthesis of microarrays by photolithography consists of four interconnected devices: a digital micromirror device (DMD), a UV source traversing a series of optical elements, a computer and an automated DNA synthesizer (Expedite 8909, PerSeptive Biosystems). The UV source, a 365 nm high-power UV-LED (Nichia NVSU333A), produces UV light that is first homogenized by passing through a light-pipe of square cross section before reflecting off the DMD, where the mirrors are tilted in either an ON or an OFF position. The image reflected off the ON mirrors is then projected through an Offner relay optical system, providing a 1:1 image of the light pattern on the synthesis surface. UV illumination of the slide triggers the removal of the 5ʹ-NPPOC protecting group only at defined locations ("features"), corresponding to the pattern of ON mirrors. The glass slides are encased in a reaction chamber attached to the DNA synthesizer, which controls the delivery of reagents to the surface. The computer synchronizes the exposure to UV with the synthesizer and instructs the DMD to tilt its mirrors in the required position. Light exposure is performed at an irradiance of ~100 mW/cm² for 60 s in order to reach a radiant energy density of 6 J/cm². During UV exposure, the slides are covered with a solution of DMSO containing 1% (w/w) imidazole. Besides the additional communication with the computer, the DNA synthesizer operates in a similar manner to traditional solid-phase synthesis via phosphoramidite chemistry:
• DNA and RNA phosphoramidites are diluted as 30 mM solutions in dry ACN
• 4,5-Dicyanoimidazole (DCI), 0.25 M in ACN, is used as the activator
• The oxidation step is performed using a mixture of I2 in pyridine/H2O/THF
DNA phosphoramidites are protected with a tert-butylphenoxyacetyl (tac) group for dA, iPrPac for dG and isobutyryl for dC. They are coupled for 15 s, followed by drying of the surface with helium for 10 s, a short (3 s) oxidation step and finally the exposure to UV. In RNA synthesis, the coupling time is the only change to the above protocol, with a 5 min coupling time for rA, rC and rG, and 2 min for rU.
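The exposure and timing figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original protocol: the function names are illustrative, and the per-cycle time only sums the four step durations stated in the text (coupling, helium drying, oxidation, UV exposure), ignoring reagent delivery and wash steps, whose durations are not given here.

```python
def radiant_exposure_j_per_cm2(irradiance_mw_per_cm2: float, seconds: float) -> float:
    """Radiant exposure H = E * t, converted from mJ/cm^2 to J/cm^2."""
    return irradiance_mw_per_cm2 * seconds / 1000.0


def cycle_seconds(coupling_s: float,
                  drying_s: float = 10.0,
                  oxidation_s: float = 3.0,
                  exposure_s: float = 60.0) -> float:
    """Lower bound on one synthesis cycle: only the steps with stated durations."""
    return coupling_s + drying_s + oxidation_s + exposure_s


# Values taken from the text: ~100 mW/cm^2 for 60 s.
dose = radiant_exposure_j_per_cm2(100.0, 60.0)

# Coupling times from the text: 15 s (DNA), 5 min (rA, rC, rG), 2 min (rU).
dna_cycle = cycle_seconds(15.0)
rna_cycle_acg = cycle_seconds(5 * 60.0)
rna_cycle_u = cycle_seconds(2 * 60.0)

print(f"dose = {dose} J/cm^2")
print(f"DNA cycle >= {dna_cycle} s, rA/rC/rG cycle >= {rna_cycle_acg} s, rU cycle >= {rna_cycle_u} s")
```

Under these assumptions, the longer coupling time dominates an RNA cycle, whereas the UV exposure dominates a DNA cycle.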

Synthesis area, number of features and density
The DMD is a digital light processor containing an array of 1024 × 768 mirrors with a ~14 μm pitch (0.7 XGA). Thus, the total number of individually addressable features amounts to 786432, each feature being 14 × 14 μm in size. All 786432 features, or "spots", are contained within a synthesis area of ~1.4 cm². The number of features used during microarray photolithography depends on the type of experiment and the "complexity" of the design (number of sequence permutations, control sequences and replicates). Higher densities for the photolithographic synthesis of nucleic acid microarrays may be obtained using higher resolution DMDs, such as those of dimensions 1920 × 1080 or 4096 × 2160 (~2 and ~9 million mirrors, with ~11 μm and 7.3 μm pitches, respectively). Alternatively, higher spot densities could in principle be reached without the need for higher resolution DMDs via optical demagnification.

[…] mM ethylenediaminetetraacetic acid, 0.01% Tween20, 0.05% bovine serum albumin (BSA)). The microarrays were covered in aluminum foil and placed in a hybridization oven (Boekel Scientific) at a slow rotation rate for 2 hours. The assay was performed at 42 °C for the 25 and 28mers whose sequences are given below:

Deprotection, Hybridization and RNase H assays
• 25mer on array (sequence given in DNA form):

After hybridization, the chamber was removed and the array washed sequentially in three wash buffers: first in Non-Stringent Wash Buffer (SSPE; 0.9 M NaCl, 0.06 M phosphate, 6 mM ethylenediaminetetraacetic acid, 0.01% Tween20) for 2 min, then in Stringent Wash Buffer (100 mM 2-(N-morpholino)ethanesulfonic acid, 0.1 M NaCl, 0.01% Tween20) for 1 min and finally in Final Wash Buffer (0.1× saline sodium citrate) for a few seconds. The arrays were then dried by centrifugation and scanned in a microarray scanner at either 2.5 μm or 5 μm resolution (GenePix 4400A or 4100A, respectively, Molecular Devices) with an excitation wavelength of 532 nm. The recorded fluorescence intensities are reported in arbitrary units; they were extracted from the scanned images using NimbleScan (Roche NimbleGen) and further processed using Excel.
The signal/noise ratios in hybridization experiments on DNA and RNA arrays were found to be largely similar and to vary between 300:1 and 500:1. In other words, hybridization intensities between 10000 and 30000 a.u. were recorded with background levels as low as ~45 a.u.

RNase H assay
A microarray containing the 25mer DNA and RNA sequences was first hybridized to the complementary, Cy3-labelled DNA strand as described above. A solution of 5 Units of RNase H (New England Biolabs) in RNase H buffer (75 mM KCl, 50 mM Tris-HCl pH 8.3, 3 mM MgCl2, 10 mM dithiothreitol) was pipetted into a hybridization chamber attached to the array. The assay was performed for 1 h at 37 °C in a hybridization oven, after which the chamber was removed and the array quickly washed in Final Wash Buffer, dried in a microcentrifuge and scanned. Next, the remaining duplexes on the microarray surface were washed off in H2O at 37 °C for 10 min. The array was dried and scanned, revealing fluorescence values reduced to background levels. Finally, the microarray was rehybridized to the same Cy3-labelled complement at 42 °C for 2 h according to the procedure described above.

Determination of the coupling efficiency
To measure the coupling efficiencies of DNA and RNA phosphoramidites, we employed the method of terminal labelling. Homopolymers of a single base and of various lengths, 1 to 12 nucleotides, were synthesized on multiple microarrays. The DNA and RNA versions of each homopolymer were synthesized in parallel on the same array. In toto, four microarrays were fabricated:
• Array #1 containing: poly-dA (1 to 12-nt) and poly-rA (1 to 12-nt)
• Array #2 containing: poly-dC (1 to 11-nt) and poly-rC (1 to 11-nt)
• Array #3 containing: poly-dG (1 to 12-nt) and poly-rG (1 to 12-nt)
• Array #4 containing: poly-dT (1 to 12-nt) and poly-rU (1 to 12-nt)
Each homopolymer of each length is terminally labelled with Cy3. Terminal labelling consists of two consecutive couplings of Cy3 phosphoramidite (50 mM, Link Technologies) for 5 min each. After each DNA or RNA phosphoramidite coupling, a capping step is performed by an additional coupling of DMTr-dT phosphoramidite (30 mM, 1 min). Indeed, since microarray synthesis by photolithography bypasses the use of an acidic detritylation event, coupling with a DMTr-protected monomer can essentially be regarded as capping. In addition, a certain number of NPPOC-dT couplings are performed on each homopolymer so as to keep a total sequence length of 12-nt (Figure S1). For example, the sequence dG2 was synthesized over a dT10 oligonucleotide, and dG8 over a dT4. This is important, as the intensity of the Cy3 dye is known to depend on the distance between the fluorophore and the surface of the array. However, those preliminary dT couplings are performed without capping. Finally, each homopolymer receives an uncapped 5ʹ-dT10 linker, regardless of oligonucleotide sequence, so as to distance the fluorophore from other nucleobases, which are known to influence the intensity of Cy3 fluorescence. [3]

Figure S1. Schematic representation of the sequences synthesized on a microarray for the determination of the coupling efficiency of a given phosphoramidite (here, dT). t = uncapped dT; T = capped dT. Each homopolymer is terminally labelled with Cy3.

• dA: 99.9%
• dC: 99%
• dG: 97.7%

[…] reported. We found a 30% decrease in fluorescence intensity relative to the Cy3-labelled DNA after the Et3N and hydrazine method, and this decrease reaches 50-60% when performing an extra ethylenediamine (EDA) step, which indeed suggests degradation (Figure S2). The ability of the deprotected RNA to still hybridize strongly may be the result of a higher duplex stability, but could also stem from variations in oligonucleotide surface density on the feature. [4] RNA degradation should be minimized by shortening the overall deprotection times, and the EDA step may be avoided altogether with an alternative dG base protection strategy. We also note that oligonucleotides containing a single RNA unit undergo very little degradation.

Figure S3. Fluorescence intensities of rX12-Cy3 sequences, relative to those of the corresponding dX12-Cy3, recorded before, during and after RNA deprotection. Error bars are SEM.

Figure S4. Fluorescence intensities (in arbitrary units) of the 25mer (sequence given above) either in pure DNA form or with all 6 dT positions substituted with rU, hybridized to the same Cy3-labelled complement. Error bars are SEM. The slightly weaker fluorescence signals for the rU-modified 25mer relative to the pure DNA sequence may be attributed to multiple A-B-form helical junctions within the DNA/RNA duplex.
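For the terminal-labelling series described in this section, one common way to turn Cy3 intensity versus homopolymer length into a stepwise coupling efficiency is to assume a single-exponential decay, I(n) = I0 · Eⁿ, and to fit ln I against n. The sketch below illustrates that assumption only; the fitting model is not stated in the text reproduced here, the function name is illustrative, and the intensities are synthetic values generated for the example, not measured data.

```python
import math


def stepwise_efficiency(lengths, intensities):
    """Least-squares fit of ln(I) = ln(I0) + n * ln(E); returns E."""
    xs = list(lengths)
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # slope of the log-linear fit equals ln(E)
    return math.exp(slope)


# Synthetic, noise-free example: lengths 1 to 12 with an assumed efficiency of 98%.
assumed_efficiency = 0.98
lengths = list(range(1, 13))
intensities = [20000.0 * assumed_efficiency ** n for n in lengths]

fitted = stepwise_efficiency(lengths, intensities)

# Under the same model, the fraction of strands reaching full length after
# N couplings is E**N, e.g. for a 30-nt oligonucleotide:
full_length_30 = fitted ** 30

print(f"fitted efficiency = {fitted:.4f}, full-length fraction (30-nt) = {full_length_30:.3f}")
```

With real data, scatter in the intensities makes the fitted value sensitive to the longest homopolymers, so replicate features and error estimates would be needed before quoting a figure.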

Construction of the 4⁹ high-density RNA library
The sequences to be synthesized on the microarray are first written as three separate text files: one for each of the conserved 5ʹ and 3ʹ tails, and one for the 9-nt permutation table. The microarray is designed to contain two replicates of each permutation, as well as multiple replicates of various single-point mutations of the full-match sequence and multiple replicates of extended and shortened regions. In detail, the RNA library is composed of the sequences listed in Table S1.
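The 9-nt permutation table can be enumerated programmatically. The sketch below is an illustration rather than the authors' design script: the function names are illustrative, and the tail sequences passed in the example are hypothetical placeholders, since the actual conserved tails are not given in this section. It also checks that two replicates of every permutation fit within the 1024 × 768 features of the DMD described earlier.

```python
from itertools import product

BASES = "ACGU"               # RNA alphabet
VARIABLE_LENGTH = 9          # 9-nt permuted region
REPLICATES = 2               # two replicates of each permutation (from the text)
DMD_FEATURES = 1024 * 768    # individually addressable features


def permutation_table(length=VARIABLE_LENGTH):
    """Yield every sequence of the given length over BASES, in lexicographic order."""
    for combo in product(BASES, repeat=length):
        yield "".join(combo)


def full_sequences(five_prime_tail, three_prime_tail, length=VARIABLE_LENGTH):
    """Yield tail + permutation + tail; the tails here are placeholders."""
    for core in permutation_table(length):
        yield five_prime_tail + core + three_prime_tail


library_size = sum(1 for _ in permutation_table())
features_needed = library_size * REPLICATES
features_left = DMD_FEATURES - features_needed  # room for mutants and controls

# Hypothetical tails, for illustration only.
first_example = next(full_sequences("GG", "CC"))

print(library_size, features_needed, features_left, first_example)
```

Using a generator avoids holding all 262144 strings in memory at once, which matters little at this scale but keeps the same code usable for longer variable regions.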

RNase HII assays
All sequences to be synthesized on a microarray are stored in a text file, which is then transformed into a series of virtual masks using a custom-built MATLAB program. The array design was chosen so as to include >30 replicates of each sequence, as well as negative controls and background features in a 4:9 feature size, totaling 85000 features. The synthesis area was covered with 80% background features and 20% actual hairpin sequences. Negative controls of the hairpin sequences included a DNA-only sequence as well as a hairpin where the single RNA insert […]

[…] where I stands for fluorescence intensity (in arbitrary units). Sequence motifs were generated by feeding a given list of sequences, containing the variable region only, to the WebLogo generator (weblogo.threeplusone.com). [5] After logo generation, the middle nucleotide (A, C, G, or T) was manually replaced with rA, rC, rG or rU, respectively.

As stated before, cytosine appears to be the preferred 5ʹ DNA base in the best RNase HII substrates, while the less-cleaved substrates very often show thymine 5ʹ to the RNA. This over-representation of dC in the most-cleaved sequences and dT in the least-cleaved sequences is observed regardless of the RNA base, though it is less pronounced in the better-cleaved rU-containing hairpins and the poorly-cleaved rA-containing hairpins (Figure S8). The table below sums up the information found in Figure S7 about the nature of the DNA nucleobase found 5ʹ to the RNA in the top 20 most-cleaved and top 20 least-cleaved sequences for all possible, fixed rX nucleotides:

We examined whether abundance of GC base pairs in the stem of the hairpin positively correlates with higher cleavage efficiency, since the corresponding hairpins are expected to be more thermally stable. The melting temperatures for all hairpin sequence combinations were predicted to range between 61 and 86 °C, sufficiently higher than the assay temperature (37 °C) to assume near-complete hairpin formation for all combinations.
We nonetheless found that the 100 most-cleaved candidates had an average of 3.5 GC base pairs (out of 5), hinting at the possibility of higher RNase HII activity on GC-rich constructs. However, this observation may be partially explained by the fact that the best substrates for RNase HII activity preferentially show rC as the RNA base, and dC 5ʹ to the RNA. Whether an increased cleavage rate with a cytosine base at these positions is due to the nature of the nucleobase itself or to the presence of a GC base pair is unclear at this point. It nevertheless seems fair to assume that if the nature of the base pair were central to the cleavage efficiency, then dG and rG nucleotides would be equally represented within the most-cleaved sequences.
On the other hand, AT-rich hairpins do not necessarily lead to lower cleavage efficiencies, as the 100 least-cleaved hairpins only had an average of 2.5 AT base pairs, and this is in spite of the fact that a dT base is preferentially found 5ʹ to the RNA in the worst RNase HII substrates.
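The two tallies discussed above (GC base pairs in the 5-bp stem, and the identity of the DNA base 5ʹ to the RNA among the best and worst substrates) can be sketched as follows. This is an illustration under stated assumptions: the record layout, the function names and the four example records are hypothetical, the cleavage score is a stand-in because its definition is not reproduced in this section, and each stem is assumed to be fully Watson-Crick paired so that counting G and C on one strand counts GC base pairs.

```python
from collections import Counter


def gc_pairs(stem_strand: str) -> int:
    """Count G/C on one stem strand; equals the number of GC base pairs
    if the stem is fully Watson-Crick paired (an assumption)."""
    return sum(1 for base in stem_strand.upper() if base in "GC")


def mean_gc_pairs(stem_strands) -> float:
    """Average GC base-pair count over a collection of stems."""
    stems = list(stem_strands)
    return sum(gc_pairs(s) for s in stems) / len(stems)


def five_prime_base_counts(bases) -> Counter:
    """Tally the DNA base found 5' to the RNA insert."""
    return Counter(b.upper() for b in bases)


# Hypothetical records: (stem strand, DNA base 5' to the RNA, cleavage score).
records = [
    ("GCGCA", "C", 0.92),
    ("GCCGT", "C", 0.88),
    ("ATGCA", "T", 0.15),
    ("ATTAG", "T", 0.10),
]

ranked = sorted(records, key=lambda r: r[2], reverse=True)
top_n = 2  # the text uses the top 100 and top 20; 2 keeps the toy example small
most_cleaved, least_cleaved = ranked[:top_n], ranked[-top_n:]

mean_gc_top = mean_gc_pairs(stem for stem, _, _ in most_cleaved)
mean_gc_bottom = mean_gc_pairs(stem for stem, _, _ in least_cleaved)
five_prime_top = five_prime_base_counts(base for _, base, _ in most_cleaved)
five_prime_bottom = five_prime_base_counts(base for _, base, _ in least_cleaved)

print(mean_gc_top, mean_gc_bottom, five_prime_top, five_prime_bottom)
```

Keeping the GC tally and the 5ʹ-base tally as separate functions makes it easier to test whether the GC enrichment persists once sequences with dC 5ʹ to the RNA are excluded, which is the confound raised in the text.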