Classic speciation models such as the birth–death process (BDP) assume that new species will emerge and current species will become extinct at certain rates that are measured in unit time (Barraclough and Nee, 2001 (link)). Usually, a time-calibrated tree is required as an input. Thus, for molecular sequence data, a molecular clock model must be applied to calibrate the tree. Coalescent theory also relies on unit time to describe the relationships among ancestors and descendants in a population.
Instead, we may consider the number of substitutions between branching and/or speciation events, by modeling speciations using the number of substitutions instead of the time. The underlying assumption is that each substitution has a small probability of generating a speciation. Note that the substitutions are independent of each other. If we consider one substitution at a time in discrete steps, the probability of observing η speciations for κ substitutions is given by a binomial distribution. Because we assume that each substitution has a small probability ρ of generating a speciation, and the number of substitutions in a population of size η is large, the process follows a Poisson distribution in continuous time with rate . Therefore, the number of substitutions until the next speciation event follows an exponential distribution.
Comparing this with the assumptions of a BDP, we observe that each generation (e.g. with a generation time of 20 years) on a time-calibrated ultrametric tree has a small probability of speciation. The BDP does not model substitutions, thus, substitutions are superimposed onto the BDP, whereas PTP explicitly models substitutions. Substitution information can easily be obtained by using the branch lengths of the phylogenetic input tree. Thus, in our model, the underlying assumptions for observing a branching event are consistent with the assumptions made for phylogenetic tree inference.
We can now consider two independent classes of Poisson processes. One process class describes speciation such that the average number of substitutions until the next speciation event follows an exponential distribution. Given the species tree, we can estimate the rate of speciations per substitution in a straightforward way. The second Poisson process class describes within-species branching events that are analogous to coalescent events. We assume that the number of substitutions until the next within-species branching event also follows an exponential distribution. Thus, our model assumes that the branch lengths of the input tree have been generated by two independent Poisson process classes.
In the following step, we assign/fit the Poisson processes to the tree. Let T be a rooted tree, and Pi be a path from the root to leaf i, where and l is the number of leaves. Let be the edge lengths of Pi, representing the number of substitutions. We further assume that are independent exponentially distributed random variables with parameter λ. Let be the sum over the edge lengths for . We further define . Bik is the number of substitutions of the kth branching event, and is the number of branching events below Bik. Note that constitutes a Poisson process. Thereby, T and together form a tree of Poisson processes, which we denote as Poisson Tree Processes (PTP). To a rooted phylogeny with m species, we apply/fit one among-species PTP and at most m within-species PTPs. An example is shown in Figure 1.

Illustration of the PTP. The example tree contains 6 speciation events: R, A, B, D, E, F, and 4 species: C, D, E, F. Species C consists of one individual; species D, E, F have two individuals each. The thick lines represent among-species PTP, and the thin lines represent within-species PTPs. The Newick representation of this tree is ((C:0.14, (d1:0.01, d2:0.02)D:0.1)A:0.15, ((e1:0.015, e2:0.014)E:0.1, (f1:0.03, f2:0.02)F:0.12)B:0.11)R. The tree has a total of 16 different possible species delimitations. The maximum likelihood search returned the depicted species delimitation with a log-likelihood score of 24.77, and = 8.33 and = 55.05

In analogy to BP&P (Yang and Rannala, 2010 (link)) and GMYC (Pons et al., 2006 (link)), we conduct a search for the transition points where the branching pattern changes from an among-species to a within-species branching pattern. The total number of possible delimitations on a rooted binary tree with m tips ranges between m (caterpillar tree) and , depending on the actual tree shape (Fujisawa and Barraclough, 2013 (link)). Because the search space is generally too large for an exhaustive search, we need to devise heuristic search strategies. Given a fixed species delimitation, we fit two exponential distributions to the respective two branch length classes (among- and within-species branching events). We calculate the log-likelihood as follows:

where x1 to xk are the branch lengths generated by among-species PTPs, to xn are the branch lengths of within-species PTPs, is the speciation rate per substitution and is the rate of within-species branching events per substitution. The rates and can be obtained via the inverse of the average branch lengths that belong to the respective processes. Based on Equation (1), we search for the species delimitation that maximizes L. A standard likelihood-ratio test with one degree of freedom can be used to test if there are indeed two classes of branch lengths. Large P-value indicates that either all sequences are one species or that every sequence represents a single species.
We developed and assessed three heuristic search strategies for finding species delimitations with ‘good’ likelihood scores, which are described in the online supplement. For the experimental results presented here, we used the heuristic that performed best, based on our preliminary experiments.
Free full text: Click here