Introduction

Studies of the endemic floras of the high-altitude Northern and Central Andes are of great importance for understanding the evolution of the Páramo, one of the most diverse mountain ecosystems in the world in terms of species richness and endemism (Luteyn and Balslev 1992; Luteyn 1999; Sklenář et al. 2011, 2014; Madriñán et al. 2013; Hughes and Atchison 2015). Páramo habitats, characterized by tussock grasses, large rosette plants, shrubs with evergreen, coriaceous and sclerophyllous leaves, and cushion plants, are found in the tropical Andes from 3000 to 4500 m (Luteyn et al. 1999). They emerged by the end of the north-Andean orogeny ca. 3–5 Mya (van der Hammen and Cleef 1986; Graham 2009), and despite its recent age, the Páramo flora is especially rich in various groups, e.g., Asteraceae in general, Valeriana, Gentianella, and Lupinus (von Hagen and Kadereit 2001; Briceño and Morillo 2002; Bell and Donoghue 2005; Hughes and Eastwood 2006; Sklenář et al. 2011). Many authors have pointed out the important contribution of both the temperate areas of the Americas and the Neotropics as main sources of the lineages that colonized the Páramo (see Sklenář et al. 2011 for a summary). In recent years, many studies have focused on the evolution of plant groups from the Páramo and have been very useful for understanding the colonization of these environments (Bell and Donoghue 2005; Hughes and Eastwood 2006; Jabaily and Sytsma 2013). Yet we still know little about the processes governing diversification within the Páramo itself. The scarce information available suggests a complex interplay of the Andean uplift and Pleistocene glacial oscillations triggering allopatric differentiation, ecological speciation and hybridization, processes involved in the origin of novel species within the Páramo (Loricaria: Kolář et al. 2016; Senecio: Dušková et al. 2017; Diplostephium: Vargas et al. 2017; Espeletia: Pouchon et al. 2018; Lupinus: Vásquez et al. 2016).

The Asteraceae is among the richest plant families in the Andes and was used to define the Páramo itself phytogeographically (Cabrera and Willink 1973). Within the diverse tribe Astereae, Oritrophium (Kunth) Cuatrecasas is an American genus comprising 22 species (Arnelas et al. 2020). The genus is morphologically defined by its rosulate habit, scapose and monocephalous synflorescences, functionally male disk florets with the style lacking stigmatic lines, and narrowly infundibuliform, tubular disk corollas. At first, the members of the genus were described as a section within Aster (Kunth 1818) and later treated within Erigeron or Celmisia by several authors (e.g., Weddel 1857; Solbrig 1960). Cuatrecasas (1961) was the first to recognize Oritrophium as a distinct genus. Since then, Oritrophium has been placed in the subtribe Hinterhuberinae, based mostly on morphology and ecological inference (Nesom 1994; Cuatrecasas 1997; Nesom and Robinson 2007) but also on molecular data (Karaman-Castro and Urbatsch 2009). Molecular studies have mostly focused on examining the origin and relationships of the tribe Astereae in North America (Noyes and Rieseberg 1999) or on clarifying relationships in other groups (Karaman-Castro and Urbatsch 2009; Vargas et al. 2017), and included only two species of Oritrophium. Even though these studies clarified the position of Oritrophium in the tribe, the evolutionary history, monophyly and phylogenetic relationships within the genus have not yet been examined.

Most Oritrophium species commonly grow in wet places such as swamps in páramo, superpáramo and jalca (as Páramo is called locally in northern Peru) habitats of the Northern and Central tropical Andes of South America (Sklenář et al. 2005). However, there is a striking disjunction of two species, O. orizabense Nesom and O. duranguense Nesom, which inhabit the ‘zacatonal alpino’ or tropicalpine grassland of Mexico (Aranguren et al. 2008), approximately 2500 km north of the genus’ main distribution range, raising the question of the genus’ monophyly. The majority of Oritrophium species are concentrated in two areas: the Venezuelan Andes, which host nine species (six of which are endemic to that area), and southern Ecuador-northern Peru, which also hosts nine species (seven of which are endemic). Based on the spatial distribution of the species, Cuatrecasas (1997) postulated that Oritrophium originated in South America and explained the disjunct distribution by long-distance dispersal mediated by migratory birds, probably from the Venezuelan diversity center. However, the most species-rich area of a group does not necessarily correspond to its center of origin (Chen et al. 2012), and such a hypothesis calls for further validation in a phylogenetic framework. Although most species are restricted to one country, often being endemic to a specific region (e.g., O. blepharophyllum and O. marahuacense from Venezuela, O. ollgaardii from southern Ecuador), there are two widely distributed species spanning the entire Páramo region and even beyond, ranging from Venezuela to Bolivia (O. limnophilum and O. peruvianum) (Aranguren et al. 2008). Because widely distributed and variable species provide a great opportunity to infer the mechanisms driving ongoing differentiation within a group and potentially incipient speciation (Chalcoff et al. 2008), studying the geographic variation of O. limnophilum and O. peruvianum can provide valuable insights into the evolutionary history of Oritrophium and contribute to understanding the processes governing population and species diversification in the Páramo.

Here we aim to unravel the evolutionary history of Oritrophium, with a specific focus on identifying its center of origin and on population genetic differentiation within the two widespread species, to elucidate the drivers of species diversification within the genus. To address these points, we (i) reconstructed the phylogenetic relationships of Oritrophium using nuclear (ITS) and plastid (trnL-trnF) DNA sequences, (ii) reconstructed the ancestral area of the genus and the most likely biogeographic scenario, and (iii) inferred population genetic diversity and structure within the two widely distributed species of Oritrophium using range-wide population-level sampling and sequencing of the same markers. We addressed the following specific biogeographic hypotheses: (1) Oritrophium originated in the Ecuador/Peru border region, where it is currently most species rich, and Venezuela represents a secondary diversification center, explaining the current bimodal pattern of species diversity; (2) the 2500 km Andean-Mexican disjunction reflects a long-distance dispersal from the Venezuelan diversity center (Cuatrecasas 1997); and (3) the interspecific diversification pattern is mirrored by the intraspecific diversity and structure of the two widespread species O. peruvianum and O. limnophilum.

Materials and methods

Taxon sampling

A total of 19 species of Oritrophium were sampled, representing more than 85% of the species within the genus and covering its entire distribution range from Bolivia to Mexico (hereafter referred to as the 'genus-wide dataset'). One of the two Mexican species, O. duranguense, could not be included in this study because no specimens were found in the herbaria consulted. For the more widely distributed species we included multiple accessions covering the entire distribution range of the species. Additionally, for the two most widely distributed taxa, Oritrophium limnophilum and the O. peruvianum group (O. peruvianum and the related species O. callacallense, O. crocifolium, O. llanganatense, and O. mucidum), an additional 3–5 specimens from each of 19 and 24 populations, respectively, were sampled and analyzed separately to infer inter-population differentiation (the two 'population-level datasets'). These samples were not included in the genus-wide dataset to avoid an unbalanced design and redundancy. Vouchers and plant tissue samples were collected between 2008 and 2019 in Venezuela, Colombia, Ecuador, Peru, and Bolivia and deposited in BOL, QCA, PRC, F, and VEN. Some other relevant samples were sequenced from herbarium material (F, US, NY), and one ingroup sequence was available from Karaman-Castro and Urbatsch (2009) (Supplementary Information Table S1). Based on taxonomy (Cuatrecasas 1997) and molecular data (Noyes and Rieseberg 1999; Vargas and Madriñán 2012; Vargas et al. 2017), we downloaded sequences of closely related genera from GenBank and included them in the phylogenetic analyses (Supplementary Information Table S2; Chiliotrichum, Darwiniothamnus, Diplostephium, Erigeron, Hinterhubera, Lepidophyllum, Linochilus, and Llerasia).

DNA extraction and sequencing

Genomic DNA was isolated from silica-dried leaf tissue or herbarium material using the Invisorb Spin Plant Mini Kit (INVITEK) according to the manufacturer’s instructions. The nuclear ribosomal internal transcribed spacer DNA (ITS1 and ITS2 regions) has proven to be a powerful marker for phylogenetic reconstruction in the Asteraceae, particularly within the tribe Astereae, and was therefore selected here for the nuclear phylogenetic reconstruction. The chloroplast regions psA-ycf3ex3 (primers Cp051L-Cp052R), trnG-trnG2, psbA-trnH and trnL-trnF were tested at the beginning of the laboratory work, and trnL-trnF was selected because it showed the highest percentage of variable sites.

The plastid DNA region trnL-trnF (primers c and f, Taberlet et al. 1991) and the nuclear ribosomal DNA regions ITS1 and ITS2 (primers ITS4 and ITS5, White et al. 1990) were amplified. PCR reactions were performed in a 20 μL final volume with approx. 5 ng of genomic DNA, 4 μL My Taq Red Reaction Buffer, 0.4 μL of each 20 μM primer, and 0.2 μL of My Taq HS Red DNA Polymerase (Bioline). An initial denaturation step at 95 °C for 1 min was followed by 35 cycles of denaturation (95 °C for 20 s), annealing (50–58 °C for 30 s) and extension (72 °C for 30 s to 1 min), and a final extension at 72 °C for 7 min. PCR products were purified using the Agencourt AMPure XP PCR Purification Kit (Agencourt Bioscience) following the manufacturer’s instructions and subsequently sequenced (Macrogen, Inc.).

DNA sequences were edited and assembled in AliView v. 1.26 (Larsson 2014) and aligned in the MAFFT v. 7 online application (Katoh and Standley 2013) using the default settings. For the ITS data set, we newly sequenced 183 samples of Oritrophium. To analyze the ITS matrix, we used the IUPAC codification of the ambiguities. For the trnL-trnF data set, we newly sequenced 176 samples of Oritrophium. The alignments generated for this study are available from the corresponding author.

Phylogenetic analyses and divergence time estimation

To test the monophyly and phylogenetic relationships of Oritrophium, we performed phylogenetic analyses under both Bayesian inference and maximum parsimony, for the ‘genus-wide dataset’. MrBayes v. 3.2.2 (Ronquist et al. 2012) was used in analyses under Bayesian Inference, run on the CIPRES Science Gateway (Miller et al. 2010). We ran ITS and trnL-trnF separately, treated gaps as missing data, and applied the SYM and GTR substitution model for ITS and plastid region respectively (as selected by the Akaike Information Criterion -AIC- in JModeltest 2 v.2.1.10; Darriba et al. 2012) with a gamma distribution of rate heterogeneity. We ran simultaneously two MCMC runs with four chains for 20,000,000 generations each, sampling every 1000th generation using the default priors. Chain convergence was evaluated in Tracer v. 1.7 (Rambaut et al. 2018). The posterior probability of the phylogeny and its branches was determined from the combined set of trees, discarding the first 25% of trees of each run as burn-in.
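As a rough check on the size of the posterior sample implied by these settings, the following Python sketch counts the trees retained after burn-in. It assumes that only generations after the starting state are counted as samples and that the burn-in is rounded down to a whole number of trees; the function name is ours and is not part of MrBayes.

```python
def retained_samples(generations, sample_every, burnin_fraction, runs):
    """Count posterior tree samples kept after discarding burn-in from each run."""
    # Samples per run (assumption: the initial state at generation 0 is not counted).
    per_run = generations // sample_every
    # Burn-in trees discarded from each run, rounded down.
    discarded = int(per_run * burnin_fraction)
    kept_per_run = per_run - discarded
    # The combined set pools the retained trees of all runs.
    return per_run, kept_per_run, kept_per_run * runs


# Settings described above: 20,000,000 generations, every 1000th sampled,
# 25% burn-in, two simultaneous runs.
per_run, kept_per_run, kept_total = retained_samples(20_000_000, 1000, 0.25, 2)
print(per_run, kept_per_run, kept_total)  # 20000 15000 30000
```

Under these assumptions, the posterior probabilities would be summarized from 30,000 trees in total.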

TNT v.1.1 (Goloboff et al. 2008) was used for phylogenetic analyses of the ‘genus-wide dataset’ under the parsimony criterion. All characters were considered unordered, and parsimony-uninformative characters were excluded from the analysis. For all matrices, we performed a first search using ‘Traditional Search’ and then ran ‘New Technology’ searches using the trees in memory, under default settings for Sectorial Searches and Tree Fusing, retaining a maximum of 20,000 trees in total. A strict consensus tree was generated from the most parsimonious trees. Branch support was assessed by bootstrapping (Felsenstein 1985) with 5000 replicates, each using a heuristic search of ten addition sequences swapped with Tree Bisection and Reconnection (TBR), with five trees saved per replicate.

We also employed a molecular dating approach using BEAST v. 2.5.2 (Bouckaert et al. 2019) to estimate divergence times of the genus Oritrophium as well as of the two widespread species within the genus. We used BEAUti v. 2.6.2 to configure the settings. As there are no fossils of Oritrophium, we used a secondary calibration approach based on a previously dated Asteraceae-wide phylogeny inferred from ndhF, rbcL and trnL-trnF (Kandziora et al., submitted; see details below). For the fossil calibration, we used a starting tree, and for the secondary calibration a random starting tree. For all analyses, an uncorrelated lognormal relaxed clock and a Yule speciation process as tree prior were used. We conducted two independent runs of 50 million generations each, sampling every 1000 generations. The two independent runs were combined, excluding a burn‐in of 20% for the Asteraceae-wide dating and 10% for the Oritrophium phylogeny, using LogCombiner v. 2.6.2 and TreeAnnotator v. 2.6.2 (BEAST package). In all dating analyses, chains converged, and the effective sample size was above 200 as checked with Tracer v. 1.7 (Rambaut et al. 2018). We visualized the trees in FigTree v. 1.4.4 (Rambaut 2018).

The Asteraceae-wide molecular dating is based on two fossil calibration points, each with an exponential prior distribution with a mean of 1.5 and an offset corresponding to the fossil age (Famatinanthoideae‐fossil: 47.5 Myr; Barnadesioideae‐fossil: 72.1 Myr; Barreda et al. 2010, 2015). The age of the root of the tree was calibrated with a uniform distribution of 73–101 Myr (maximum age of Asterales according to Beaulieu et al. 2013). The Asteraceae phylogeny does not completely agree with the phylogenetic relationships shown by Panero et al. (2014) despite the use of a starting tree, but the age estimates are congruent with other age estimates (Diplostephium and Linochilus clade by Vargas et al. 2018; Senecioneae by Kandziora et al. 2017; Euryops by Devos et al. 2010). We then used the resulting crown age estimate of the Erigeron-Diplostephium clade (7.7–22.06 million years [Myr] according to the highest posterior density [HPD] values) as a secondary calibration point to date the root of the nuclear genus-level Oritrophium matrix, which had been expanded using PhylUp (Kandziora 2020) to include a larger outgroup (Diplostephium, Erigeron, Parastrephia, Symphyotrichum).
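To illustrate what an offset exponential calibration of this kind implies, the following Python sketch computes quantiles of an exponential distribution with mean 1.5 shifted by each fossil age. It assumes the prior is parameterized by its mean and that the offset is simply added to the exponential variate; the function name is illustrative and not part of BEAST.

```python
import math


def offset_exponential_quantile(p, mean, offset):
    """Quantile of an exponential distribution with the given mean, shifted by offset."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    # Inverse CDF of the exponential: -mean * ln(1 - p), then add the offset.
    return offset - mean * math.log(1.0 - p)


# Fossil ages (Myr) used as offsets in the text; the mean of 1.5 is also from the text.
fossils = {"Famatinanthoideae": 47.5, "Barnadesioideae": 72.1}
for name, age in fossils.items():
    median = offset_exponential_quantile(0.5, 1.5, age)
    upper = offset_exponential_quantile(0.95, 1.5, age)
    print(f"{name}: median {median:.2f} Myr, 95% quantile {upper:.2f} Myr")
```

Under this parameterization, the fossil age acts as a hard minimum, and most of the prior mass lies within a few million years above it.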

To infer in detail the divergence times of the most widely distributed species, O. limnophilum and O. peruvianum, we applied two additional secondary calibrations for the population-level datasets. The HPD interval of the respective crown node, as obtained from the genus-level phylogeny, was used to calibrate the crown node of O. limnophilum with a uniform distribution ranging from 0.34 to 2.32 Mya and that of the O. peruvianum species complex with a uniform distribution ranging from 0.63 to 2.95 Mya. Based on the results of jModelTest, we used, for O. limnophilum, a TN93 model with four gamma categories and equal base frequencies to fit the selected TrNef + G model and, for the O. peruvianum complex, a TN93 model with four gamma categories, invariant sites and equal base frequencies to fit the selected TrNef + I + G model.

Biogeographic analyses

Ancestral area estimation was conducted using the R (R Core Team 2020) package BioGeoBEARS v.1.1.2 (Matzke 2013), following the code available from the developer (https://github.com/nmatzke/BioGeoBEARS). As Oritrophium was not retrieved as monophyletic, the analyses were carried out for Oritrophium s.s. only; for that purpose we trimmed the dated BEAST tree using the drop.tip function in ape (Paradis et al. 2004) to retain only one representative per species. We defined four areas that encompass the distribution of the clade (Fig. 3): Mexico (M), Northern Páramo from Cordillera de Merida to Pasto (N), Pasto-Giron/Paute (E), South Giron/Paute-Bolivia (S). We performed analyses under several combinations of constraints: (1) no constraints (‘4Areas’), (2) maximum range size 3 (‘4Areas_range3’), (3) maximum range size 3 and adjacency allowed between all areas in South America (‘4Areas + AdjSA’), (4) maximum range size 3, adjacency allowed between all areas in South America and dispersal with equal probabilities between N-M and S-M (‘4Areas + Adj + dispersion_S-M/N-M’), and (5) two further options increasing the relative probability of dispersal (0.1 vs 1), one to evaluate Cuatrecasas’ hypothesis of long-distance dispersal from Venezuela to Mexico (‘4Areas + Adj + dispersion_N-M’) and an alternative one from southern Ecuador/Peru to Mexico (‘4Areas + Adj + dispersion_S-M’). In each analysis we tested six models: DEC, DIVALIKE, BAYAREALIKE, and each of the three with the additional J parameter (DEC + J, DIVALIKE + J, BAYAREALIKE + J; Matzke 2014). We selected among the models and constraint sets based on AIC.
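The AIC-based comparison can be sketched as follows. This is a minimal Python illustration rather than the BioGeoBEARS workflow itself: the log-likelihoods are invented placeholders, not results of this study, and the parameter counts assume two free parameters (d and e) for the base models and three for the + J variants.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion, 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_by_model):
    """Relative support for each model within the compared set."""
    best = min(aic_by_model.values())
    rel = {m: math.exp(-0.5 * (a - best)) for m, a in aic_by_model.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


# HYPOTHETICAL (log-likelihood, number of free parameters) pairs, for illustration only.
hypothetical_fits = {
    "DEC": (-30.0, 2),
    "DEC+J": (-25.0, 3),
    "DIVALIKE": (-31.0, 2),
    "DIVALIKE+J": (-26.5, 3),
    "BAYAREALIKE": (-36.0, 2),
    "BAYAREALIKE+J": (-27.0, 3),
}

aic_by_model = {m: aic(lnl, k) for m, (lnl, k) in hypothetical_fits.items()}
weights = akaike_weights(aic_by_model)

# Print models from best (lowest AIC) to worst.
for model in sorted(aic_by_model, key=aic_by_model.get):
    print(f"{model:14s} AIC = {aic_by_model[model]:6.2f}  weight = {weights[model]:.3f}")
```

The extra parameter of a + J model is penalized by two AIC units, so a + J model is preferred only if it improves the log-likelihood by more than one unit over its base model.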

Population analyses

For each of the Oritrophium limnophilum and O. peruvianum group population-level datasets, plastid haplotype and nucleotype (ITS haplotype) networks were constructed under parsimony from the trnL-trnF plastid and nuclear ITS regions separately. Guided by the results of the phylogenetic reconstructions, we also performed a general analysis including all accessions of the Oritrophium s.s. clade. For all analyses we used the TCS 1.21 algorithm (Clement et al. 2002) as implemented in PopART (http://popart.otago.ac.nz), with the default connection limit of 0.95.

To infer population structure and to estimate the optimum number of groups considering geographic information of each population sampled within both species, we performed a spatial analysis of molecular variance using SAMOVA 2.0 (Dupanloup et al. 2002) of each population-level dataset. The approach recognizes groups of populations genetically homogeneous and maximally differentiated from each other without the constraint of being geographically close. Different numbers of groups (K) were tested, and a simulated annealing procedure permitted the identification of the group numbers that maximizes the FCT index (proportion of total genetic variance due to differences between groups). The program was run for two to seven groups (K = 2 to K = 7) each time with the simulated process repeated 100 times.
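The selection of K by maximizing FCT can be sketched as follows. This Python fragment is only an illustration of the criterion: the variance components are invented placeholders rather than SAMOVA output, and the function names are ours.

```python
def fct(var_among_groups, var_among_pops_within_groups, var_within_pops):
    """Proportion of the total genetic variance due to differences between groups."""
    total = var_among_groups + var_among_pops_within_groups + var_within_pops
    return var_among_groups / total


def best_k(components_by_k):
    """Return the number of groups K whose variance components give the highest FCT."""
    return max(components_by_k, key=lambda k: fct(*components_by_k[k]))


# HYPOTHETICAL variance components (among groups, among populations within groups,
# within populations) for K = 2 to K = 7; not taken from the study.
components_by_k = {
    2: (2.0, 1.5, 1.5),
    3: (3.0, 0.8, 1.2),
    4: (2.9, 0.9, 1.2),
    5: (2.7, 1.0, 1.3),
    6: (2.6, 1.1, 1.3),
    7: (2.5, 1.2, 1.3),
}

for k, comps in components_by_k.items():
    print(f"K = {k}: FCT = {fct(*comps):.3f}")
print("Selected K:", best_k(components_by_k))
```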

For both species and their individual populations, genetic diversity was quantitatively estimated using DnaSP 5 (Librado and Rozas 2009) for both ITS (previous phases reconstructed using PHASE v2.1 implemented in DnaSP 5) and trnL-trnF data.
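For orientation, the two diversity statistics reported below (hd and Pi) can be computed from first principles as in the following Python sketch. It uses the standard estimators (haplotype diversity as n/(n − 1) × (1 − Σp²) and nucleotide diversity as the mean pairwise proportion of differing sites) on toy data; it does not reproduce how DnaSP treats gaps or ambiguity codes, and the function names are ours.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Haplotype (or nucleotype) diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Toy data, not from the study: four individuals carrying three haplotypes,
# and three short aligned sequences.
hd = haplotype_diversity(["H1", "H1", "H2", "H3"])
pi = nucleotide_diversity(["ACGT", "ACGA", "ACGT"])
print(f"hd = {hd:.4f}, Pi = {pi:.4f}")
```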

Maps were constructed using QGIS (www.qgis.org).

Results

Oritrophium phylogeny and time of divergence

For the genus-wide dataset, the ITS alignment included 58 taxa and comprised 725 bp, with 5.43% missing data and 205 parsimony-informative sites. The plastid alignment included 43 taxa (fewer than in the ITS dataset because of the limited availability of outgroups and of one ingroup in GenBank) and comprised 830 bp, with 0.26% missing data and 29 parsimony-informative sites.

Phylogenetic analyses using nuclear DNA, under Bayesian Inference as well as Maximum Parsimony, recovered Oritrophium as polyphyletic (Fig. 1). Venezuelan O. nevadense and Peruvian O. hirtopilosum clustered with South American Erigeron species in the ‘Erigeron + Oritrophium clade’. Three other Venezuelan endemic species (O. blepharophyllum, O. figuerasii, and O. venezuelense) formed another clade that was part of a polytomy together with Hinterhubera and the aforementioned ‘Erigeron + Oritrophium clade’, and sister to the clade comprising Diplostephium + Linochilus. The remaining species of Oritrophium, including the type species of the genus, O. peruvianum, formed a well-supported clade (hereafter called Oritrophium s.s.). Within Oritrophium s.s., O. peruvianum appeared non-monophyletic but formed a well-supported clade (hereafter O. peruvianum group) together with several narrowly distributed species (O. mucidum from Colombia and Venezuela, O. llanganatense endemic to Ecuador, O. callacallense endemic to Peru, O. cocuyense endemic to Colombia, and O. crocifolium from Peru and Ecuador). The endemic Ecuadorian species, O. yacuriense, appeared embedded in a well-supported clade together with O. repens from Peru and Ecuador. The Mexican species, O. orizabense, was sister to Peruvian O. ferrugineum.

Fig. 1

Bayesian phylogram for the ITS marker rooted with Chiliotrichum diffusum. Posterior probabilities from the Bayesian analysis and bootstrap support from the MP analysis are indicated above each branch, before and after the slash, respectively. Lateral bars mark the clades discussed in the manuscript. The capital letters at the end of the accession names refer to the country of collection of the newly sequenced accessions: BOL (Bolivia), COL (Colombia), EC (Ecuador), MEX (Mexico), PER (Peru), VZ (Venezuela)

Topologies from Bayesian Inference and Maximum Parsimony analyses using plastid trnL-trnF data also recovered Oritrophium as polyphyletic with high support (Fig. 2). The endemic Venezuelan species either fell within Erigeron (O. nevadense, along with the Peruvian O. hirtopilosum) or were at least closer to Erigeron, Diplostephium and Linochilus than to Oritrophium s.s. (O. blepharophyllum, O. figuerasii, and O. venezuelense). In congruence with the ITS tree, the plastid phylogeny also recovered Oritrophium s.s. as monophyletic. The early-branching species differ between the nuclear and plastid phylogenies. In the plastid phylogeny, the Mexican O. orizabense and the Ecuadorian O. ollgaardii form a basal polytomy together with a clade encompassing all remaining species. This latter clade is itself a polytomy, which includes a clade encompassing most sequences of the two widely distributed species O. limnophilum and O. peruvianum. Oritrophium repens, O. yacuriense and O. ferrugineum were also part of this polytomy, in contrast to the ITS tree, where they occupied a basal position relative to O. ollgaardii, O. limnophilum and the O. peruvianum group.

Fig. 2

Bayesian phylogram for the trnL-trnF marker (rooted with Chiliotrichum diffusum, which was cut out to aid legibility). Posterior probabilities from the Bayesian analysis and bootstrap support from the MP analysis are indicated above each branch, before and after the slash, respectively. Lateral bars mark the clades discussed in the manuscript. The two to three capital letters at the end of the names refer to the country of collection of the newly sequenced accessions: BOL (Bolivia), COL (Colombia), EC (Ecuador), MEX (Mexico), PER (Peru), VZ (Venezuela)

Our Bayesian analysis to estimate the origin and divergence times of Oritrophium and its closest relatives recovered the ‘Erigeron + Oritrophium clade’ diverging from the Diplostephium and Oritrophium s.s. clades at 10.28 Myr (7.7–18.94 Myr 95% HPD). The Diplostephium clade diverged from the Oritrophium s.s. clade at 6.68 Myr (3.32–12.87 Myr 95% HPD; posterior probability [PP] = 0.93). The time calibration provided an age estimate for Oritrophium s.s. of 4.19 Myr (2.01–8.45 Myr 95% HPD; PP = 1). Within Oritrophium s.s., the divergence of the Mexican O. orizabense occurred early in the history of the genus, at 2.35 Myr (0.76–5.04 Myr 95% HPD; PP = 1), with the Peruvian species O. ferrugineum as its sister (Supplementary Information Figure S1).

Biogeographic analyses within Oritrophium s.s.

The DEC + J model was the best-fitting model according to AIC, compared with BAYAREALIKE(+ J) and DIVALIKE(+ J). Among the several DEC + J scenarios tested, the best was the ‘4Areas + AdjSA’ model, in which we configured all areas within South America as adjacent, allowed adjacency between the southernmost portion of the distribution within South America (S) and Mexico (M), and applied no dispersal constraints (Fig. 3, Supplementary Information Table S3). This ancestral area estimation suggested that the most recent common ancestor of Oritrophium s.s. originated in the southern part of its current range, that is, within the area between southern Ecuador and northwestern Bolivia (S). The subsequent biogeographic history of the genus within South America can be explained by dispersal from the southern area (S) to the north of Ecuador (E) and further to Colombia and Venezuela (N), with a few back-colonizations in the opposite direction.

Fig. 3

DEC + J for ‘4Areas + AdjSA’. Ancestral area reconstruction on the pruned ITS BEAST tree, with percent probabilities of the different ancestral areas shown as pie charts. Letters indicate the biogeographic areas considered in the analysis: Mexico (M), Northern Páramo from Cordillera de Merida to Pasto (N), Pasto-Giron/Paute (E), South Giron/Paute-Bolivia (S). The letter on each pie chart indicates the most probable area for that node

The area ‘S’ was also the most likely source area from which a founder event to Mexico (M) occurred. When we increased the probability of dispersion from N to M to specifically test Cuatrecasas’ hypothesis of Venezuela as the source area of a long-distance dispersal to Mexico, or alternatively dispersion from S to M, considering both centers of diversity as a potential source area, we still obtained the best likelihood for S as source area.

Intraspecific genetic structure within O. peruvianum and O. limnophilum

For population analyses of the two widely distributed species we used an expanded sampling across the complete distribution range of each species (the population-level datasets). To root the phylogenetic trees encompassing these accessions, we also included all other available sequences of Oritrophium s.s. The complete ITS matrix included 175 sequences, of which 75 corresponded to O. limnophilum and 91 to the O. peruvianum group. The matrix comprised 674 bp with 0.29% missing data and 37 gaps of 1 to 4 bp. The complete trnL-trnF matrix included 169 sequences, of which 77 corresponded to O. limnophilum (two accessions included here failed to amplify for ITS) and 83 to the O. peruvianum group (seven accessions of the group failed to amplify for trnL-trnF). It comprised 850 bp with 0.12% missing data and 35 gaps of 1 to 22 bp.

Phylogenetic reconstructions based on ITS data showed Oritrophium limnophilum to be monophyletic with high support (PP = 1/100; Supplementary Information Figure S2). Venezuelan populations were sister to five Colombian accessions from the adjacent Cordillera Oriental (corresponding to nucleotypes N1, N2, N3, N4, N8, N9 and N10), this clade likely being of late Pleistocene origin [0.18 Myr (0.065–0.530 Myr 95% HPD; PP = 0.99); Supplementary Information Figure S3]. The remaining accessions (Ecuador, Peru, Bolivia, and the rest of Colombia) mostly formed an unresolved basal polytomy. The analysis with expanded O. peruvianum sampling found that some narrowly distributed species (O. callacallense, O. cocuyense, O. crocifolium, O. llanganatense, O. mucidum) were embedded within this species, altogether forming the well-supported ‘O. peruvianum group’ clade (PP = 1; Supplementary Information Figure S4). Within the O. peruvianum group, there was no support for internal clades, except for several populations retrieved as monophyletic (Supplementary Information Figure S5).

Due to limited intraspecific resolution of the markers used, we also reconstructed finer relationships using haplo/nucleotype networks. For ITS data, O. limnophilum as well as O. peruvianum group were well-defined by their unique sets of ITS nucleotypes. In contrast, the plastid trnL-trnF dataset comprised 26 haplotypes, one of which (denoted H1) was widely distributed and shared by several Oritrophium s.s. species including most accessions of O. limnophilum and many of O. peruvianum group, thus representing likely an ancestral haplotype for both species (Supplementary Information Figure S6).

For Oritrophium limnophilum, the ITS network (Fig. 4a, Supplementary Information Table S4) exhibited 22 nucleotypes with a star-like arrangement and some defined groups separated by ≤ 3 mutational steps. One of those groups separated by a single mutational step is the one that gathered all the populations and nucleotypes exclusive to Venezuela (N19, N20, N21, and N22) which were closest to N18 found in 2 accessions from a population in southern Ecuador (PS11125). Another group, well defined by a single mutational step, is the one that grouped four exclusive nucleotypes (N1, N2, N3, and N4) belonging to a single population of the Cordillera Oriental of Colombia (PS12202). The most widely distributed nucleotype (N7) occurred from Colombia to southern Peru, including populations from almost the entire distribution range. The trnL-trnF network for O. limnophilum (Fig. 4b, Supplementary Information Table S4) presented weak signal, showing nine haplotypes with a star arrangement, one most frequent central haplotype (H1) and the other haplotypes differentiated by one to three steps.

Fig. 4

Geographic distribution and nucleotype (A) and haplotype (B) diversity in O. limnophilum populations based on ITS and trnL-trnF sequences, respectively. In the networks, numbers refer to sampled nucleo/haplotypes, circle sizes are proportional to frequencies, small transverse bars represent mutational steps, and small white circles are extinct or unsampled hypothetical nucleo/haplotypes. The color of H1 is the same in Figs. 4B and 5B to show that H1 is shared by both species. In the maps, pie charts reflect the frequency of each nucleo/haplotype in each population, and circle sizes are proportional to the number of individuals sampled per population. Asterisks (*), black circles (●), and black squares (■) accompanying the names denote groups 1, 2 and 3, respectively, resulting from the SAMOVA analyses. The single Bolivian population is shown on a separate outline of Bolivian territory at the side to preserve the scale and resolution of the map

The ITS network of the O. peruvianum group revealed 21 nucleotypes in a star-like arrangement, with few nucleotypes differentiated by ≥ 3 steps (Fig. 5a, Supplementary Information Table S4). The most widely distributed nucleotype was the central N1. There were two nucleotype groups unique to Venezuela and the adjacent Colombian Cordillera Oriental (N20 + N21 and N9 + N10, separated from the central nucleotype by one and two steps, respectively). Three nucleotypes (N1, N2, and N20) were shared by populations associated with different putative species (O. crocifolium, O. callacallense, O. mucidum). The plastid trnL-trnF network for the O. peruvianum group exhibited 14 haplotypes in a star-like arrangement, with one central and widespread haplotype (H1, also shared with O. limnophilum) and two well-separated groups (≥ 4 steps). One of them (H3) encompassed all Venezuelan populations. The second group (H4, H5, H6, and H7) was frequent in central and southern Ecuador (Fig. 5b, Supplementary Information Table S4). Haplotypes H1 and H4 were shared by populations associated with different putative species (O. cocuyense, O. crocifolium, O. callacallense, and O. mucidum).

Fig. 5

Geographic distribution and nucleotype (A) and haplotype (B) diversity in O. peruvianum group populations (O. peruvianum: Op, O. mucidum: Om, O. callacallense: Ocallac, O. crocifolium: Ocro, O. cocuyense: Ococ, O. llanganatense: Ollan) based on ITS and trnL-trnF sequences, respectively. In the networks, numbers refer to sampled nucleo/haplotypes, circle sizes are proportional to frequencies, small transverse bars represent mutational steps, and small white circles are extinct or unsampled hypothetical nucleo/haplotypes. The color of H1 is the same in Figs. 4B and 5B to show that H1 is shared by both species. In the maps, pie charts reflect the frequency of each nucleo/haplotype in each population, and circle sizes are proportional to the number of individuals sampled per population. Asterisks (*) and black circles (●) accompanying the names denote groups 1 and 2, respectively, resulting from the SAMOVA analyses

Genetic structure and genetic diversity analyses

Applying spatial analyses of molecular variance (SAMOVA) to the nuclear data of both widely distributed species, we found genetic structure in the distribution of ITS nucleotypes (Supplementary Information Table S5). For O. limnophilum, we recovered three groups: the first included all Venezuelan populations, the second comprised two populations from southern Ecuador, and the third included all Colombian, Peruvian, and Bolivian populations, along with the remaining Ecuadorian populations (Fig. 2a). For the O. peruvianum group, we recovered two groups: the first comprised three populations of O. peruvianum and one population of O. crocifolium from Ecuador, and the second encompassed all other populations (Fig. 4a).

Results of the nuclear genetic diversity estimation for O. limnophilum showed high diversity (hd ≥ 0.9) in three Colombian populations (25_COL, 38_COL, and PS12202_COL), three Ecuadorian populations (PS11125_EC, PS11556_EC, and PS12447_EC) and the only population sampled in Bolivia (OB_BOL, Table 1). The SAMOVA group 3, encompassing samples from Colombia, Ecuador, Peru, and Bolivia, contained the highest genetic diversity (Table 2).

Table 1 Diversity of Oritrophium limnophilum (above) and O. peruvianum group (below) populations, based on nuclear (left) and plastid (right) data
Table 2 SAMOVA grouping of Oritrophium limnophilum (above) and the O. peruvianum group (below) based on nuclear data, with nucleotype (hd) and nucleotide (Pi) diversity per group; N, number of sequences per group after PHASE reconstruction

Results of the nuclear genetic diversity estimation for O. peruvianum showed five highly diverse (hd ≥ 0.9) populations: two from Colombia, one from the Cordillera Occidental in the north (PS12238_COL) and one from Nariño in the south (PS12390_COL), and three from southern Ecuador (PS11126_EC, PS11127_EC, and PS11166_EC).
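
As an illustration of the two diversity statistics reported above (hd and Pi), the following minimal sketch computes them for a set of aligned sequences. It assumes the standard estimators of Nei (1987): hd = n/(n − 1) · (1 − Σ p_i²), and Pi as the mean proportion of differing sites over all sequence pairs. The function names and the toy sequences are hypothetical and are not taken from the study, and the software actually used for the estimates is not stated in this section.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Nei's haplotype diversity: hd = n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)  # identical sequences count as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(sequences):
    """Pi: mean proportion of differing sites over all pairs of aligned sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    pair_diffs = [
        sum(x != y for x, y in zip(a, b)) / length
        for a, b in combinations(sequences, 2)
    ]
    return sum(pair_diffs) / len(pair_diffs)


# Toy alignment (hypothetical data, for illustration only)
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(f"hd = {hd:.4f}, Pi = {pi:.4f}")
```

Under this definition, a population in which every sampled sequence is a different nucleotype has hd = 1, so the threshold hd ≥ 0.9 used above singles out populations in which nearly all sampled sequences differ. The sketch treats gaps and ambiguity codes as ordinary characters, which dedicated population-genetic software handles more carefully.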

Discussion

Polyphyly of Oritrophium and species relationships

Plastid and nuclear molecular data suggest that Oritrophium as traditionally circumscribed is not monophyletic, which conflicts with previous taxonomic and morphological studies (Cuatrecasas 1961, 1997; Aranguren et al. 2008). Four Venezuelan species (O. blepharophyllum, O. figuerasii, O. nevadense, and O. venezuelense) and one Peruvian species (O. hirtopilosum) are either more closely related to Erigeron (O. nevadense and O. hirtopilosum) or form a polytomy with Erigeron, Hinterhubera, Diplostephium and/or Linochilus (O. blepharophyllum, O. figuerasii, and O. venezuelense). As Oritrophium peruvianum, the nomenclatural type of the genus, appears in the Oritrophium s.s. clade, those five species should be excluded from Oritrophium to keep the genus monophyletic. Although the species to be excluded share the diagnostic morphological characters defined by Cuatrecasas (1961) for Oritrophium (i.e., rosulate habit, scapose and monocephalous synflorescences, functionally male disk florets, and narrowly infundibuliform disk corollas), further exploration of morphological features is required to identify synapomorphies defining Oritrophium s.s. The species of Oritrophium s.s. seemingly have only one type of uniseriate hair [type A following Nesom (1976), or malpighiaceous in O. olgaardii], whereas the species to be excluded from the genus have two types of hair, including typically biseriate-capitate ones [type A + C following Nesom (1976)]. Further, karyological evidence shows that the Peruvian species O. hirtopilosum (Dillon and Turner 1984) and the Venezuelan species O. venezuelense (Spooner 1995) have a chromosome number of 2n = 34–36, while the species of Oritrophium s.s. for which data are available mostly share the number 2n = 18 (Turner et al. 1967, unpublished data). Although these data are not available for all species of the group, they reveal aspects that deserve more detailed study in Oritrophium s.s. to further explore the mechanisms structuring its intrageneric diversity.
The polyphyletic nature of Oritrophium suggests that convergent evolution may have produced similar morphologies in these Astereae taxa. In particular, the two traits defining the genus, i.e., functionally male disk florets and a monocephalous basal rosette, might be the result of convergent evolution. A similar pattern was found for Diplostephium s.l. regarding habit and leaf size, was extrapolated to the whole tribe (Vargas 2018), and was interpreted as an adaptation to the specific Páramo climate. Whether these morphological characteristics of Oritrophium s.l. represent cases of convergent evolution as an adaptation to the environment needs to be further evaluated. The nuclear and plastid data were congruent and sufficient to recover Oritrophium s.s. as a well-supported clade that is sister to the Erigeron-Hinterhubera-Diplostephium clade. There are a few incongruences between the nuclear and plastid data within the Oritrophium s.s. clade, potentially reflecting incomplete lineage sorting or hybridization during the early radiation. The low number of informative characters in the plastid data does not allow further evaluation of these incongruences. For this reason, we mainly discuss the relationships within the Oritrophium s.s. clade in the light of the ITS phylogeny, pointing to incongruence with the plastid data where relevant.

Within Oritrophium s.s., we found one clearly monophyletic species, O. limnophilum, and two monophyletic species groups: (i) the O. repens + O. yacuriense group, and (ii) the O. peruvianum group, comprising the nominal species together with several spatially restricted species: O. callacallense, O. mucidum, O. crocifolium, O. cocuyense and O. llanganatense. Morphologically, O. yacuriense can be differentiated from O. repens by its lanceolate, slightly conduplicate leaves with a dentate, flat margin (Arnelas et al. 2017). Our field observations support differences in habit between the two species. However, further studies including extensive sampling and exhaustive molecular and morphological analyses should evaluate relationships within this species group.

Within the Oritrophium peruvianum group, O. llanganatense is well supported as a distinct lineage characterized by private haplotypes and nucleotypes recovered in the parsimony networks. As O. llanganatense morphologically resembles O. limnophilum rather than O. peruvianum, analyses evaluating its possible hybrid origin would help to elucidate its identity. In contrast, several other morphologically distinct species that have long been recognized as separate units (i.e., O. peruvianum, O. callacallense, O. mucidum, O. crocifolium; Cuatrecasas 1961, 1997; Aranguren et al. 2008; Arnelas et al. 2020) are part of this clade but do not form monophyletic entities and instead share plastid haplotypes and ITS nucleotypes with each other. Cuatrecasas (1961) pointed to the morphological similarity between O. crocifolium and O. peruvianum var. lineatum; nevertheless, Arnelas et al. (2020) maintained both as separate taxa. Detailed morphological studies combined with analyses based on phylogenetically more informative markers are needed to clarify the species delimitation of O. crocifolium, O. callacallense and O. mucidum, or to determine whether they should become part of a broadly circumscribed O. peruvianum.

Our results support the assignment of O. orizabense from Mexico to Oritrophium. As we were not able to gather the second species described from the Mexican zacatonal, i.e., Oritrophium duranguense, its generic status remains to be evaluated.

Biogeographic and evolutionary history of Oritrophium s.s.

Genetic and geographic patterns inferred from our phylogenetic analyses suggest that Oritrophium s.s. originated in the Central Andes at 4.19 Mya (2.01–8.45 Mya 95% HPD, Supplementary Information Figure S1), during the Early Pliocene. Our estimate agrees with the stem age of Oritrophium (6.51 Mya, 1.5–13 Mya 95% HPD) reported by Vargas et al. (2017). The origin of Oritrophium s.s. is possibly related to the intense Andean uplift and volcanism of this period and to the appearance of new high-mountain habitats starting in the Miocene (Simpson 1975; Gregory-Wodzicki 2000; Graham 2009), in line with the timing of diversification of many other high-Andean plant lineages (reviewed by Luebert and Weigend 2014). Diversification within Oritrophium s.s. occurred after the emergence of the Páramo ca. 3–5 Mya during the Pliocene (van der Hammen and Cleef 1986), with rapid species diversification after 3 Mya, in accordance with the internal node ages of our time estimation.

Two centers of species diversity have traditionally been considered for Oritrophium: first, the Andes on both sides of the Huancabamba deflection in the Ecuador-Peru border region and, second, the Andes of Venezuela (Aranguren et al. 2008). Because our phylogenetic reconstruction excludes the Venezuelan endemic species from Oritrophium s.s., the hypothesized secondary diversification center in Venezuela is rejected for Oritrophium. Instead, the ancestral area reconstruction suggested that the most recent common ancestor of Oritrophium s.s. originated and diversified in the southern part of its current range, in the area between southern Ecuador and northwestern Bolivia (S). This area also includes the center of extant Oritrophium s.s. species diversity, i.e., the Ecuador-Peru border region, and harbors most of the basal lineages. The topographically complex valleys of Girón-Paute, where this area (S) begins and from which it extends to the south, have been recognized as a barrier for many Páramo species (Jorgensen et al. 1995). Our results also showed the southern region (S) as the most probable source area for dispersal within South America, first to the north of Ecuador (E) and then to Colombia and Venezuela (N), and as the area from which a founder event to Mexico (M) would have occurred in the early history of Oritrophium. The scenario of a long-distance dispersal from the Central Andes of South America to Mexico confirms Cuatrecasas’ (1997) hypothesis about the direction of the migration of Oritrophium between the Americas; however, it rejects the original hypothesis suggesting northern Venezuela as the source area. Cuatrecasas (1997) suggested that this disjunct distribution pattern was mediated by migratory birds. In agreement with this hypothesis, our inferred age of origin of Oritrophium s.s. and the median node age of 2.35 Myr at the base of the clade O. orizabense/O. ferrugineum overlap with the period when intensified avian interchange from south to north began, following the completion of the land bridge between South and North America (Weir et al. 2009). Nevertheless, because Oritrophium does not have palatable fruits and its achenes are very small, light, and commonly hairy (Aristeguieta 1964; Cuatrecasas 1997), with a pappus adapted mainly to wind dispersal (Ulian et al. 2013), bird-mediated dispersal remains to be tested. There are numerous trans-tropical disjunctions in the flora between North and South America, e.g., Larrea (Lia et al. 2001), Lithospermum L. (Weigend et al. 2010), and Piptochaetium Presl. (Williams 1975). If the long-distance dispersal hypothesis is supported, Oritrophium would be the first record of plant long-distance dispersal from this area in the Andes of South America to North America, i.e., with the Páramo habitats acting as a source and not only as a sink of plant diversity. Long-distance dispersals to the equatorial Andes during a similar time window from North or Central America or Australasia have been implied for Valeriana (Bell and Donoghue 2005), Oreobolus (Chacón et al. 2006), and Hypericum (Nürk et al. 2018). Other initially South American lineages are cases of amphitropical disjunctions in subtropical, warm arid, semi-arid, or desert regions (e.g., Larrea, Lia et al. 2001; Lycium, Levin and Miller 2005; Hoffmannseggia, Simpson et al. 2005; Glandularia and Verbena, Yuan and Olmstead 2008), whereas Oritrophium is a high-Andean tropical element. In other cases, migrations occurred from North or Central America to South America by unknown mechanisms, for example in Gentianella (von Hagen and Kadereit 2001), Astragalus (Scherson et al. 2008), and Polystichum (McHenry and Barrington 2014). Range expansion into contiguous areas (“stepping-stone model”) was inferred, for example, for Halenia (von Hagen and Kadereit 2003) and Lupinus (Drummond et al. 2012).

Population differentiation patterns in the widely distributed lineages (O. limnophilum and the O. peruvianum group) may inform on the processes shaping recent population divergence and possibly also incipient speciation in the group. In both lineages, Ecuador harbors the highest frequencies of the ancestral haplotype as well as a high diversity of additional locally restricted haplotypes, suggesting that their centers of origin lie in our defined area ‘S’, in line with the overall genus phylogeny (Figs. 3, 4). Northward migration from this likely ancestral area is suggested by the presence of additional, younger, Pleistocene-derived haplotypes (O. peruvianum) and nucleotypes (O. limnophilum and partly also O. peruvianum) in northern Colombia and Venezuela (Supplementary Information Figs. S2, S3). In fact, the low haplotype and nucleotype diversity in this northernmost outpost of both species’ ranges suggests a stepwise colonization linked with a population bottleneck, a hypothesis that should be further tested by deeper population sampling. Our results suggest congruent centers of genetic diversity and a south-to-north migration from the Páramo of Ecuador towards the Páramo of Colombia and Venezuela during the Pleistocene, following the progressive rise of the Andes (van der Hammen 1974; Simpson 1975; van der Hammen and Cleef 1986).

The genetic patterns within Oritrophium s.s. suggest that the climatic fluctuations of the glacial and interglacial cycles, which repeatedly connected and fragmented islands of Páramo vegetation, shaped the diversity and genetic structure of the genus, as in other taxa distributed in the central and northern Andes (Madriñán et al. 2013). However, the biogeographic scenarios reconstructed here differ from the late Tertiary/Pleistocene evolutionary history of other genera previously studied in the region (mostly summarized in Flantua et al. 2019), which involves an origin in, or a step via, North or Central America followed by later immigration or spread towards the Páramo in South America. Instead, Oritrophium s.s. exhibits an evolutionary history consistent with that of several South American lineages which also followed the south-to-north migration route along the Andes (e.g., Oxalis, Emshwiller 2002; Oreobolus, Chacón et al. 2006, Gómez-Gutiérrez et al. 2017; Calceolaria, Cosacov et al. 2009; Lupinus, Contreras-Ortiz et al. 2018; Puya, Jabaily and Sytsma 2013; Polystichum, McHenry and Barrington 2014; Senecio, Dušková et al. 2017; Loricaria, Kolář et al. 2016). This coincides in space and time with the rise of this mountain range and the advent of new high-mountain environments, such as the Páramo (van der Hammen 1974; Simpson 1975; van der Hammen and Cleef 1986). Southern Ecuador represents a likely ancestral area not only for Oritrophium but also for certain lineages within Senecio, which then migrated northwards to Colombia and Venezuela (Dušková et al. 2017). For Espeletia and Linochilus (Asteroideae), both characteristic elements of the high tropical Andean flora, allopatric speciation was detected as the predominant driver of diversification, although ecological differentiation also plays an important role (Cortés et al. 2018; Vargas and Simpson 2019); this pattern remains to be tested for Oritrophium.

Our work is the first attempt to understand evolution between and within the species of this recently diversified Andean genus. We show that Oritrophium originated in the Early Pliocene, in high-altitude areas between southern Ecuador and Bolivia. Our analyses provide the first insights into species relationships in a phylogenetic context and clearly point to the need for a recircumscription of the genus. We further formally tested different biogeographic hypotheses explaining its disjunct distribution as a result of long-distance dispersal from the ancestral area. The interspecific dispersal pattern we detected within South America is also congruent with the intraspecific population diversity and structure of the two most widespread species. Future efforts will be necessary to resolve the fine-scale relationships within Oritrophium s.s. and to address the degree of incomplete lineage sorting and/or hybridization within the genus. Further, potential convergent evolution between Oritrophium and the species excluded from the genus will need to be evaluated in the broader context of the tribe Astereae.