Development of SCAR Molecular Markers in Eucalyptus saligna and Eucalyptus tereticornis

The genus Eucalyptus includes over 700 species, some of which are among the most widely planted hardwoods worldwide. Each species of Eucalyptus presents different characteristics regarding wood quality and yield, which makes it very important to work with known species to optimize the handling and conservation of forest resources. Some species are morphologically similar, making them difficult to differentiate by simple observation. An alternative approach is to develop molecular methods for species differentiation. Using Bulk Segregant Analysis (BSA) with 59 RAPD (Random-Amplified Polymorphic DNA) primers from Operon Technologies Inc. kits, DNA fragments polymorphic between Eucalyptus species were isolated, and SCAR (Sequence Characterized Amplified Regions) markers were designed for Eucalyptus saligna and Eucalyptus tereticornis.


Introduction
The genus Eucalyptus is an important forest crop in the world economy, owing to characteristics that confer advantages in both its introduction and its maintenance in different regions [1]. The genus presents wide species diversity, with many varieties and hybrids (more than 900) [2,4].
Eucalypts are native to Australia and the islands to its north, occurring from the tropics to latitude 43° south [5]. The latest taxonomic revision [3] of the eucalypts recognizes over 700 species that belong to 13 main evolutionary lineages. Most species belong to the subgenus Symphyomyrtus, and it is mainly species from three sections of this subgenus that are used in plantation forestry, such as Eucalyptus grandis and Eucalyptus urophylla (section Transversaria), Eucalyptus globulus (section Maidenaria), Eucalyptus camaldulensis (section Exsertaria), Eucalyptus saligna (section Latoangulatae) and Eucalyptus tereticornis. E. saligna, native to the southern coast of New South Wales and south Queensland, Australia, is found in a region where frost is frequent (more than 60% of the year), which justifies its wide use in breeding programs aimed at cloning frost-resistant species with increased growth and density in cold regions [1,7]. The natural occurrence of E. tereticornis in Papua New Guinea and Australia (Victoria, New South Wales and Queensland), regions with dry periods of up to seven months during the year, explains its importance in the development of drought-tolerant clones [1]. Besides drought tolerance, other outstanding characteristics are high disease-resistance potential and wood density [7].
Each species of Eucalyptus presents different characteristics regarding wood quality and yield, which makes it very important to work with known species to optimize the handling and conservation of forest resources. There are two situations in which correct identification of these species is very important and cannot be performed visually: in the nursery, on seedlings before planting, and in debarked wood.
Precise classification requires specific molecular-biology techniques [4] that can be applied alongside morphological identification. The development of species-specific molecular markers is a feasible alternative for solving this problem accurately, quickly and at low cost.
Forestry companies that use species-specific molecular markers to determine matrices and launch authentic hybrids [9][10][11][12] can avoid taxa introgression and its possible negative consequences, such as loss of variability and genetic assimilation. ISSR (Inter-Simple Sequence Repeat) molecular markers have already been developed for the species E. urophylla, E. grandis and E. camaldulensis [13]. RAPD (Random-Amplified Polymorphic DNA) analysis has become a method for estimating genetic diversity in plant populations or cultivars [14,15]; it was also used by Paran and Michelmore to develop a technique known as sequence characterized amplified regions (SCAR) [16].
We used this technology to develop SCAR markers for E. saligna and E. tereticornis that allow quick differentiation of these two species. Such specific primers give positive or negative amplification in target-containing and non-target-containing samples, respectively; they can also be used to generate amplification products of different sizes in closely related samples.

Plant Material
SCAR marker development was carried out in two groups of four-month-old Eucalyptus sp. seedlings (Table 1), supplied by the Suzano Pulp and Paper breeding program.

DNA extraction
DNA extraction followed the protocol of [23], with the CTAB concentration reduced from 10% to 5% and twice the volume used. Extracted DNA was quantified by spectrophotometry and by comparing band intensities with known standards of the GeneRuler 1 kb DNA Ladder (Thermo Fisher Scientific, GA, USA) on 0.8% agarose gels. Each DNA sample was adjusted to a concentration of 20 ng/μl in sterile Milli-Q water and stored at -20 °C.
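The adjustment of each sample to 20 ng/μl follows the usual dilution relation C1 × V1 = C2 × V2. The following is a minimal sketch of that arithmetic, not a description of the authors' procedure; the function name, the 100 μl final volume and the 250 ng/μl stock reading are illustrative assumptions, and only the 20 ng/μl target comes from the text.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=20.0, final_volume_ul=100.0):
    """Return (stock_ul, water_ul) needed to reach the target concentration.

    Applies C1 * V1 = C2 * V2. The 20 ng/ul default is the working
    concentration stated in the text; the 100 ul final volume is an
    assumption made only for this illustration.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical sample quantified at 250 ng/ul
stock_ul, water_ul = dilution_volumes(250.0)
print(f"{stock_ul:.1f} ul DNA + {water_ul:.1f} ul water")
```

A sample whose measured concentration is already below the target cannot be diluted to it, which is why the sketch raises an error in that case instead of returning a negative volume.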

BSA (Bulk-Segregant Analysis) and RAPD (Random-Amplified Polymorphic DNA)
Bulk Segregant Analysis (BSA) [17] was used to identify RAPD markers. Two separate DNA bulks were prepared, one per Eucalyptus species, each containing an equal amount of DNA (500 ng) from ten individuals, at a final concentration of 50 ng/μl. Each bulk was identified with the first letter of the species epithet. A total of 59 RAPD primers (Operon Technologies Inc.) were screened between the pools. RAPD reactions were performed in a 96-well thermal cycler (MJ Research PTC-100), with one step at 96°C for 3 min and 41 cycles at 92°C for 1 min, 35°C for 1 min, and 72°C for 2 min 30 s, followed by one step at 72°C for 10 min. Following 1.5% agarose gel electrophoresis and ethidium bromide staining, amplification patterns were visualized on a UV transilluminator and photographed with a digital camera. RAPD bands were scored visually according to their presence or absence in the species studied. Only clear, unambiguous and reproducible RAPD markers were taken into account. The reproducibility of each scored marker was checked in two RAPD experiments.
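The presence/absence scoring between the two bulks can be expressed as a simple comparison of band sizes. The sketch below is only an illustration of that logic: scoring in this study was done visually, and the function name, the 20 bp matching tolerance and the band sizes are all hypothetical.

```python
def polymorphic_bands(bulk_a, bulk_b, tolerance_bp=20):
    """Return bands (sizes in bp) seen in one bulk with no match in the other.

    Two bands are treated as the same fragment when their sizes differ by
    at most tolerance_bp; this tolerance is an assumption for illustration.
    """
    def unmatched(source, other):
        return sorted(
            band for band in source
            if not any(abs(band - o) <= tolerance_bp for o in other)
        )

    return {
        "only_in_a": unmatched(bulk_a, bulk_b),
        "only_in_b": unmatched(bulk_b, bulk_a),
    }


# Hypothetical band sizes scored for one primer in each bulk
saligna_bulk = [300, 520, 750, 1100]
tereticornis_bulk = [310, 515, 1100, 1400]

result = polymorphic_bands(saligna_bulk, tereticornis_bulk)
print(result)
```

Bands reported under "only_in_a" or "only_in_b" would be the candidates for excision and cloning; bands shared by both bulks carry no information for species differentiation.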

SCAR Marker Development
DNA fragments of selected RAPD markers were isolated from agarose gel with a Gel Band Purification kit (Amersham), as recommended by the manufacturer. Purified fragments were cloned into the pGEM-T Easy vector (pGEM-T Easy Vector System I, Promega) and then transformed into competent cells of the DH5α-FT UltraMax strain (Life Technologies, GibcoBRL), as recommended by the manufacturers.
DNA sequencing reactions were performed using 400 to 500 ng of plasmid DNA with forward and reverse M13 primers (1 mM), according to the protocol of the ABI Big Dye Terminator Version 3.1 Cycle Sequencing kit (Applied Biosystems). Amplification conditions in a thermocycler (MJ Research PTC-100) were an initial 2 minutes at 96°C, followed by 40 cycles of 30 seconds at 96°C, 30 seconds at 55°C and 4 minutes at 60°C. The reaction was purified by adding 80 μL of 75% isopropanol, incubating for 15 minutes at room temperature, and then centrifuging for 45 minutes at 4000 rpm. The supernatant was discarded, and 1 mL of 70% ethanol was then added for washing. Centrifugation was repeated for 10 minutes at 4000 rpm, whereupon the supernatant was again discarded and the pellet dried at room temperature. After resuspension in 10 μL of formamide, the sample was sequenced on an automated ABI/Hitachi 3100 Genetic Analyzer (Applied Biosystems).
Nucleotide sequence data were compared against the GenBank nucleotide sequence database (BLAST search) and the Phytozome database (http://www.phytozome.net). The sequences were then analyzed with Primer3 software [18] to design pairs of PCR primers (approximately 20-mers) for SCAR molecular markers characteristic of E. saligna and E. tereticornis. Oligonucleotides were synthesized by Sigma-Aldrich.
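Candidate primers of about 20 nucleotides are commonly screened for length, GC content and an approximate melting temperature before use. The sketch below applies the simple Wallace rule, Tm = 2(A+T) + 4(G+C), as a rough check; it is not the thermodynamic model used by Primer3, and the function name and the example sequence are hypothetical and are not the CAS or CHT primers.

```python
def primer_stats(seq):
    """Return length, GC percentage and Wallace-rule Tm for a primer.

    The Wallace rule (2 degrees C per A/T, 4 degrees C per G/C) is only a
    rough estimate for short oligonucleotides and is used here for
    illustration; Primer3 itself uses a nearest-neighbor model.
    """
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    gc = seq.count("G") + seq.count("C")
    at = len(seq) - gc
    return {
        "length": len(seq),
        "gc_percent": 100.0 * gc / len(seq),
        "wallace_tm_c": 2 * at + 4 * gc,
    }


# Hypothetical 20-mer, used only to exercise the function
stats = primer_stats("ATGCATGCATGCATGCATGC")
print(stats)
```

Forward and reverse primers of a pair would normally be compared on these values so that their estimated melting temperatures are close to each other.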
PCR reactions for the amplification of SCAR molecular markers were prepared in a final volume of 25 μl containing 150 ng DNA, 1x PCR buffer, 1.92 mM MgCl2, 1 μg/μL of Bovine Serum Albumin, 0.8 mM dNTPs, 15 ng of each primer (Operon Technologies Inc.) and 1 U Taq DNA polymerase (Invitrogen, Life Technologies). The final reaction volume was completed with 13 μL of autoclaved deionized water. All amplifications were repeated in at least three independent experiments.

Results and Discussion
Fifty-nine RAPD primers were tested on two bulks, one of E. saligna and one of E. tereticornis, with ten individuals per bulk, to select a set of RAPD primers that produced reliable and reproducible fingerprints for these two species. Reproducibility of the amplification pattern was checked by repeating each reaction at least twice without altering the protocol.
Only two RAPD primers (OPAD-01 and OPH-03; Table 2) among the 59 tested produced RAPD patterns that allowed differentiation of E. saligna and E. tereticornis. The selection criteria were sufficient fragment length, to maximize the availability of convenient sites for designing PCR primers, and well-defined DNA bands, to increase the chance of cloning the targeted molecular marker.
OPAD-01 yielded a RAPD product of around 750 bp in Eucalyptus saligna (Figure 1A). OPH-03 yielded no RAPD product in the 700 bp region in E. tereticornis (Figure 1B), which implies a negative marker for this species. The bulks were screened to confirm the polymorphisms and select the markers for subsequent SCAR marker development.
Two SCAR markers were developed, CAS and CHT. CAS produced a band common to all individuals of E. saligna and absent in all individuals of the other species studied. BLAST comparison with sequences deposited in GenBank revealed that most of the RAPD product sequences had no homology with known sequences at different sequence-similarity levels (data not shown). CHT was characterized by band absence, which means that it is a negative marker for E. tereticornis. The PCR primer pairs were further challenged using 40 individuals of the species targeted by each marker, plus 20 individuals of each of the remaining species. The percentage of efficiency was calculated from the primer pairs that did not amplify DNA from the other species. The CAS primer pair showed 90% efficiency in E. saligna detection and band absence in the other species (Figure 1C), whereas CHT showed band absence in E. tereticornis and band presence at different percentages in the other species: 20% in E. saligna and E. urophylla, 40% in E. brassiana and 60% in E. grandis (Figure 1D).
The CAS marker achieved 90% efficiency in E. saligna identification, thus proving to be a reliable tool in breeding programs. Analysis of this consensus-sequence marker in the E. grandis genome revealed its location between the 3' region of the AMP-binding-protein gene and the adjacent intergenic region. Because only one primer (CAS-R) shows identity in this region, amplification is not possible in E. grandis itself or in the other species. The specificity of the CAS-F primer is probably due to a mutation in the intergenic region in E. saligna. Although not fixed in the E. saligna genome, this mutation is very frequent in the population (90%).
Amplification of CHT, the negative species-specific marker for E. tereticornis, in other commercial species ranged from 20 to 60% (E. grandis 60%, E. brassiana 40%, and both E. saligna and E. urophylla 20%). Although in silico analysis of this marker in the E. grandis genome indicated its location in an intergenic region, individually sequenced representatives of the species showed no identity with the CHT primers, indicating that they represent the 40% of the E. grandis population without the tag. This implies that, although the alleles detected by the marker are absent in E. tereticornis, this may not hold in other species, which demonstrates the need to develop additional and more efficient identification strategies.
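The percentages reported for each marker correspond to the share of tested individuals in which the band was present. A minimal sketch of that calculation is given below; the function name and the per-individual band calls are hypothetical, constructed only so that panels of 40 and 20 individuals reproduce the 90% and 60% values, and they are not the study's raw data.

```python
def percent_amplified(calls):
    """Return the percentage of individuals in which the band was present.

    calls is a list of booleans, one per individual (True = band present).
    """
    if not calls:
        raise ValueError("at least one individual is required")
    return 100.0 * sum(calls) / len(calls)


# Hypothetical band calls, not the study's raw data
cas_in_saligna = [True] * 36 + [False] * 4    # assumed panel of 40 individuals
cht_in_grandis = [True] * 12 + [False] * 8    # assumed panel of 20 individuals

print(percent_amplified(cas_in_saligna))
print(percent_amplified(cht_in_grandis))
```

For a positive marker such as CAS, a high value in the target species and 0% elsewhere is the desired outcome; for a negative marker such as CHT, the value should be 0% in the target species and as close to 100% as possible in the others.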
One problem encountered in silviculture is the precise identification of the origin of both pure species and interspecific hybrids. This can be achieved by combining species-specific molecular markers with morphological analysis. With a view to minimizing the economic losses involved, emphasis was given to the potential use of these markers in the initial screening of Eucalyptus populations held by breeding companies. Screening would facilitate the selection of individuals according to band presence or absence, and would help guide and monitor breeding and hybridization programs.

Conclusions
A set of molecular markers (CAS and CHT) specific to E. saligna and E. tereticornis was designed to distinguish these species. This is a quick and efficient method with high specificity and reproducibility that can be used on seed, seedlings and stocked wood in the management of populations in forest breeding programs.