Event–based biogeographic methods rely on explicit models with states (distributions) and transitions between states (biogeographic processes). The most commonly used model includes four different processes (PAGE, 1995): vicariance, duplication, extinction and dispersal. Vicariance (v) is allopatric speciation in response to a general dispersal barrier (i.e., a barrier affecting many organisms simultaneously). Duplication (d) is sympatric speciation or, alternatively, allopatric speciation due to idiosyncratic events such as a temporary dispersal barrier affecting only a single organism lineage. Extinction (e) may simply mean that organisms become extinct in an area but it can also result from the organisms occupying only part of a large ancestral area and therefore being absent in one of the fragments resulting from division of this area. Dispersal (i) occurs when organisms colonize a new area separated from their original distribution by a dispersal barrier; this is assumed to be followed by allopatric speciation separating the lineages in the new and old areas.
Once each event type is associated with a cost, the cost of fitting a TAC to a particular GAC can be found by simply summing over the implied events. The GAC with the lowest cost, the most parsimonious GAC, is that which best explains the taxon distributions in the TAC. This optimal GAC can be found, for instance, by explicit enumeration of all possible GACs or by heuristic search for the best GAC. Because inference is based on cost minimization, this approach may be referred to as parsimony–based tree fitting. Similar methods are applicable to problems in coevolutionary inference and in gene tree–species tree fitting (RONQUIST, 1995, 1998a; PAGE & CHARLESTON, 1998).
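The cost summation and the choice of the most parsimonious GAC can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the cited software: the function names, the candidate labels (GAC_A to GAC_C) and the event counts are hypothetical, and the event costs are the values adopted for the examples in this paper (v = d = 0.01, e = 1.0, i = 2.0).

```python
def fitting_cost(event_counts, event_costs):
    """Sum the cost over all events implied by fitting a TAC to one GAC."""
    return sum(event_costs[event] * n for event, n in event_counts.items())


def most_parsimonious(candidates, event_costs):
    """Return (name, cost) of the candidate GAC with the lowest total cost."""
    costs = {name: fitting_cost(counts, event_costs)
             for name, counts in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


# Event costs used for the examples in this paper.
EVENT_COSTS = {"vicariance": 0.01, "duplication": 0.01,
               "extinction": 1.0, "dispersal": 2.0}

# Hypothetical event counts implied by three candidate GACs (illustration only).
CANDIDATES = {
    "GAC_A": {"vicariance": 3},
    "GAC_B": {"vicariance": 2, "dispersal": 1, "extinction": 1},
    "GAC_C": {"vicariance": 1, "duplication": 2, "extinction": 3},
}

best_name, best_cost = most_parsimonious(CANDIDATES, EVENT_COSTS)
print(best_name, round(best_cost, 2))
```

In a real analysis the event counts would not be supplied by hand; they would come from the optimal reconstruction of the TAC on each GAC.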
An important problem in event–based methods is determining the cost of each type of biogeographic process. The most common approach is to work with simple event–cost assignments that focus on one or two of the events and ignore the others (RONQUIST & NYLIN, 1990; RONQUIST, 1995). An example of this is Maximum vicariance (or Maximum cospeciation; PAGE, 1995; RONQUIST, 1998a, 1998b), in which vicariance events are maximized by associating them with a negative cost (a “benefit”, v = –1), whereas the other events are not considered in the calculations (duplication (d) = extinction (e) = dispersal (i) = 0). An alternative approach is to set the cost assignments according to some optimality criterion. A reasonable optimality criterion is to maximize the likelihood of finding phylogenetically conserved distribution patterns (RONQUIST, 1998a, 1998b, in press). Assume that we test for conserved distribution patterns by randomly permuting the terminal taxa of the TAC and comparing the cost of the permuted data sets with the cost of the original data set. Examination of simulated and real data suggests that, in most cases, the chances of finding conserved patterns are best when duplication and vicariance events carry a small cost relative to extinctions and dispersals (RONQUIST, in press). This occurs because both vicariance and duplication are phylogenetically constrained processes, whereas dispersal and extinction are not. In practice, it is often found that the optimal solution is the same under a relatively wide range of event–cost assignments. In the examples discussed in this paper, the costs of vicariance (v) and duplication (d) events are arbitrarily set to 0.01, extinction events (e) to 1.0, and dispersal events (i) to 2.0.
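The permutation test described above can be sketched as follows. This is an illustrative outline under several assumptions: permuting the terminal taxa is modelled as shuffling the observed distributions among the terminals of a fixed TAC; the p-value is estimated as the proportion of data sets (the original included) that cost no more than the original; and `cost_fn` is a hypothetical placeholder for a routine that returns the cost of the best fit for a given assignment of distributions to terminals. The toy cost function is only a stand-in so that the sketch runs, not a biogeographic model.

```python
import random


def permutation_test(distributions, cost_fn, n_permutations=999, seed=0):
    """Estimate how often permuted data fit as cheaply as the original data.

    distributions: areas occupied by the terminals, in TAC terminal order.
    cost_fn: hypothetical callable returning the fitting cost of an assignment.
    """
    rng = random.Random(seed)
    observed = cost_fn(distributions)
    as_low = 0
    for _ in range(n_permutations):
        shuffled = list(distributions)
        rng.shuffle(shuffled)  # reassign distributions to terminals at random
        if cost_fn(shuffled) <= observed:
            as_low += 1
    # Count the original data set as one of the data sets compared.
    return (as_low + 1) / (n_permutations + 1)


def toy_cost(distributions, target=("A", "B", "C", "D")):
    """Stand-in cost: number of terminals not in the target area (toy only)."""
    return sum(1 for got, want in zip(distributions, target) if got != want)


p_value = permutation_test(["A", "B", "C", "D"], toy_cost)
print(p_value)
```

A low p-value indicates that few permuted data sets fit as cheaply as the original, which is the signal of a phylogenetically conserved distribution pattern that the cost assignments are chosen to detect.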
A simple example may illustrate parsimony–based tree fitting in historical biogeography. Consider a TAC with four terminals distributed in four areas (fig. 3a). Each possible GAC for the four areas (there are 15 in all) is fitted in turn to the TAC. For example, only three vicariance events are needed to fit the TAC to GAC1 (fig. 3b), whereas extra dispersal and extinction events must be postulated to explain the observed TAC on GAC2 and GAC3 (figs. 3c–d), and extra duplication and extinction events are needed for GAC4 (fig. 3e). Clearly, GAC1 is the most parsimonious solution among those considered in figure 3 given the chosen event–cost assignments. In fact, GAC1 remains optimal under a much wider range of cost assignments: as long as dispersals and extinctions cost more than vicariance events, the optimal solution is the same. By explicitly enumerating all 15 GACs and finding the cost of fitting each of them to the given TAC, it can also be demonstrated that GAC1 is the optimal solution among all possible GACs.
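The figure of 15 possible GACs for four areas can be checked by explicit enumeration. The sketch below assumes that a GAC is a rooted, fully bifurcating cladogram and represents it as nested tuples; the function name and the area labels A to D are illustrative.

```python
def rooted_binary_trees(leaves):
    """Yield every rooted, fully bifurcating tree on the leaves as nested tuples."""
    leaves = list(leaves)
    if len(leaves) == 1:
        yield leaves[0]
        return
    first, rest = leaves[0], leaves[1:]
    # Split the leaves into two non-empty groups at the root. Keeping the
    # first leaf on the left side means each unordered split is visited once.
    # The mask with every bit set is excluded, since it leaves the right empty.
    for mask in range(2 ** len(rest) - 1):
        left = [first] + [x for k, x in enumerate(rest) if (mask >> k) & 1]
        right = [x for k, x in enumerate(rest) if not (mask >> k) & 1]
        for left_tree in rooted_binary_trees(left):
            for right_tree in rooted_binary_trees(right):
                yield (left_tree, right_tree)


gacs = list(rooted_binary_trees("ABCD"))
print(len(gacs))
for gac in gacs[:3]:
    print(gac)
```

Each enumerated cladogram could then be passed to a cost routine in turn, which is what exhaustive search for the optimal GAC amounts to.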
The optimal reconstruction and the cost for any TAC–GAC combination can be found using fast dynamic programming algorithms (RONQUIST, 1998b). This means that a particular GAC can be fitted to a large set of TACs quickly. Nevertheless, searching for the best GAC using exhaustive algorithms is impractical for problems with more than around 10 areas, in which case heuristic algorithms or other types of exact algorithms should be used instead.
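The practical limit on exhaustive search follows from how quickly the number of candidate GACs grows with the number of areas. Assuming again that GACs are rooted, fully bifurcating cladograms, their number for n areas is the standard count (2n − 3)!! = 3 × 5 × … × (2n − 3). The short sketch below (function name illustrative) tabulates this count.

```python
def count_rooted_binary_trees(n_areas):
    """Number of rooted, fully bifurcating cladograms on n labelled areas."""
    if n_areas < 1:
        raise ValueError("need at least one area")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for k in range(3, 2 * n_areas - 2, 2):
        count *= k
    return count


for n in (4, 6, 8, 10, 12):
    print(n, count_rooted_binary_trees(n))
```

Four areas give the 15 GACs of the example above, whereas ten areas already give 34,459,425, each of which would have to be fitted to every TAC in the data set.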