Cladistic biogeography focuses on hierarchical (“branching”) patterns, in which a sequence of vicariance events successively divides a continuous ancestral area and its biota into smaller components (fig. 4a). This history is described by the GAC (fig. 4b). The terminal branches in the GAC correspond to present areas (A, B, C) and the internal branches to ancestral areas (E, D), which are combinations of present areas.
In event–based methods (and in pattern–based methods), organism lineages are commonly assumed to be restricted to a single area at a time (for an exception see RONQUIST, 1997); that is, an ancestral distribution must be either a single present area or one of the ancestral areas (combinations of present areas) specified by the GAC. The one area–one lineage assumption makes parsimony–based tree fitting mathematically more tractable, but it is also biologically sound: evolving lineages are not normally expected to maintain their coherence over long time periods across major dispersal barriers. However, the assumption causes problems with widespread terminals: how do we reconcile the observation of widespread terminals with the assumption of one area per lineage? The problem is analogous to that of treating polymorphic characters in standard parsimony analysis, in which ancestors are normally assumed to be monomorphic (MADDISON & MADDISON, 1992).
An obvious way of solving the dilemma is to assume that the widespread terminal is in reality not a homogeneous evolutionary lineage but an unresolved higher taxon consisting of a number of lineages, each occurring in a single area (fig. 5a). This does not necessarily imply that the widespread taxon actually comprises different species that have failed to be distinguished (HUMPHRIES & PARENTI, 1986; WILEY, 1988; ENGHOFF, 1996; ZANDEE & ROOS, 1987; VAN VELLER et al., 1999), but it suggests that the widespread distribution is a temporary condition. Now, assuming that the widespread taxon is a soft (unresolved) terminal polytomy with one lineage for each area occupied by the taxon, we can obtain the minimum cost over all possible resolutions of the polytomy for each ancestral distribution at the base of the polytomy (the node marked with a black dot in the TAC, fig. 5a). For each possible ancestral distribution (i.e., each area in the GAC; fig. 5b), the terminal polytomy is resolved such that the cost of that distribution being ancestral is minimized (fig. 5c). This cost, in turn, is used in the subsequent fitting of the TAC to the GAC. The cost will depend on the GAC because the same ancestral distribution of a widespread taxon may have different costs on different GACs (see fig. 6, table 1).
In determining the possible ancestral distributions of the widespread taxon, we suggest three different options: the recent, ancient and free options. These options constrain the possible ancestral distributions of the widespread taxon in different ways, just like the traditional Assumptions A0, A1 and A2. However, unlike the traditional assumptions, the event–based options constrain the solutions by explicitly specifying the processes allowed in explaining the origin of the widespread distribution. Furthermore, each allowed solution is associated with a specific set of events and a specific cost. When many solutions are allowed, they often differ in cost such that they still convey useful information about the grouping of areas in the GAC. Below, the event–based options are described in more detail and compared with Assumptions A0, A1 and A2, both in terms of how they explain the widespread distribution (fig. 5) and how they affect the testing of alternative GACs (fig. 6).
The recent option is applicable when the widespread distribution can be assumed to be of recent origin. One of the areas inhabited by the widespread taxon is considered the true ancestral area (the center of origin of the taxon) and the others are treated as if added by recent, independent dispersal.
The possible ancestral distributions of the widespread taxon are only those terminal areas occupied by the taxon (B, C, E in fig. 5b). Regardless of whether Maximum Vicariance or any other set of cost assignments is used, the cost C of a present area being the ancestral distribution is simply determined by:
C = (n – 1)i
In terms of explaining the widespread distribution, the recent option (“only dispersal allowed”) is not directly comparable to any of the traditional assumptions. In the context of testing alternative GACs, it will weigh against A0 solutions in which the areas inhabited by the widespread taxon form a monophyletic clade (fig. 6b; table 1). It will also weigh against “Full set” solutions in which all areas harboring the widespread taxon occur in the GAC in positions other than that predicted by the place of the widespread taxon in the TAC (fig. 6e, table 1). These solutions, of course, violate A2.
The ancient option is applicable when the widespread distribution can be assumed to be of ancient origin. All areas inhabited by the widespread taxon are considered part of the ancestral distribution. Any mismatch between this distribution and the GAC is then explained as due to extinction; dispersals are not allowed. Under the ancient option, the only possible ancestral distribution of the widespread taxon is the most recent common ancestor in the GAC (“MRCA”) of all of the areas inhabited by the widespread taxon (H in fig. 5b). The GAC areas that are not ancestral to all of the recent areas inhabited by the taxon (A–G in fig. 5b) would require at least one dispersal; they are therefore disallowed under the ancient option and assigned infinite cost (fig. 5c). Areas in the GAC that are ancestral to the MRCA (I in fig. 5b) are allowed but will never occur in optimal reconstructions, as they will always be more costly than the MRCA (fig. 5c).
The cost of the MRCA is calculated assuming that the terminal polytomy is resolved so that the topology fits the GAC perfectly. Under these conditions, only extinction and vicariance events need to be considered because duplications are not required and dispersals are, of course, not allowed. The cost (C) of the MRCA is then given by
C = pe + (n – 1)v
where p is the number of required extinction events, n is the number of areas inhabited by the widespread taxon, and e and v the costs of the extinction and vicariance events, respectively. The number of required extinction events (p) is computed as follows:
In the GAC, focus on the subtree subtended by the MRCA: ((B, C), (D, E)) in fig. 5b. Assign 1 to the areas harboring the taxon (B, C, E) and 0 to the other areas (D). Then, find the number of losses (p) in this presence/absence character assuming irreversibility (1 → 0). In fig. 5b, there would be only one loss, in area D, so the cost is
C = 1e + (3 – 1)v = 2v + e
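The counting procedure for p and the resulting MRCA cost can be sketched in Python. This is a minimal illustration, not the authors' implementation: the nested-tuple encoding of the GAC, the function names, and the numeric costs e = 3 and v = 1 are assumptions chosen for the example, and the topology (A, ((B, C), (D, E))) is inferred from the description of fig. 5b in the text. Under irreversibility, with the taxon present at the MRCA, each maximal clade containing no occupied area counts as one loss.

```python
def leaves(tree):
    """Return the set of terminal areas in a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))

def mrca_subtree(tree, occupied):
    """Return the smallest subtree that contains every occupied area."""
    if not isinstance(tree, str):
        for child in tree:
            if occupied <= leaves(child):
                return mrca_subtree(child, occupied)
    return tree

def count_losses(tree, occupied):
    """Count losses (1 -> 0) assuming presence at the root of `tree`:
    one loss per maximal clade that contains no occupied area."""
    if not (leaves(tree) & occupied):
        return 1
    if isinstance(tree, str):
        return 0
    return sum(count_losses(child, occupied) for child in tree)

def ancient_cost(gac, occupied, e, v):
    """Cost of the MRCA under the ancient option: C = p*e + (n - 1)*v."""
    occupied = set(occupied)
    subtree = mrca_subtree(gac, occupied)
    p = count_losses(subtree, occupied)
    n = len(occupied)
    return p * e + (n - 1) * v

# Assumed GAC topology, inferred from the text's description of fig. 5b.
GAC = ("A", (("B", "C"), ("D", "E")))

# Hypothetical event costs: extinction e = 3, vicariance v = 1.
cost = ancient_cost(GAC, {"B", "C", "E"}, e=3, v=1)  # 1*3 + (3 - 1)*1
```

With the taxon in B, C and E, the only all-absent clade below the MRCA is area D, so p = 1 and the sketch reproduces C = 2v + e for whatever values of e and v are supplied.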
In terms of explaining the widespread distribution, the ancient option is similar to A1 in that it allows extinctions but not dispersals. In the context of testing alternative GACs, however, it will strongly favor A0 solutions in which the areas inhabited by the widespread taxon form a monophyletic clade (fig. 6b; table 1). Thus, widespread taxa provide strong evidence for grouping the areas inhabited by them under the ancient option.
Under the free option, all possible ancestral areas are considered and any mismatch between the areas inhabited by the widespread taxon and the GAC is explained by the most favorable combination of events. The minimum cost of each possible ancestral distribution is calculated without any constraints on the type of assumed events: dispersals, extinctions, duplications and vicariance events are all allowed.
For the Maximum Vicariance method, the optimal cost of each possible ancestral distribution is found if the terminal polytomy is resolved so that it becomes congruent with the GAC. This might hold for more complex event–cost assignments as well, if the cost of the ancestral distributions is found with algorithms ignoring the complexity of dispersals, the so–called lower–bound algorithms (RONQUIST, 1995, 1998b, in press). The complexity of dispersals has to be ignored because optimal solutions may occasionally require combinations of dispersals that are impossible on terminal trees congruent with the GAC; however, it seems that these conflicts can always be solved by rearranging the terminal tree without increasing the total cost (Ronquist, unpublished data). The lower–bound algorithms are computationally extremely efficient, so the implementation of the free option is straightforward if this conjecture is true.
In terms of explaining the widespread distribution, the free option is similar to A2 in that it allows all types of events. However, in the context of comparing alternative GACs, the free option will favor solutions in which the areas inhabited by the widespread taxon form a monophyletic clade, i.e., A0 solutions (fig. 6b, table 1). The relative cost difference between other solutions will depend on the set of areas inhabited by the widespread taxon and their position in the GAC (table 1). It is interesting to note that, although the free option is similar to A2 in terms of allowed events, it obviates one of the main criticisms raised against A2, namely that it is indecisive. According to the traditional view of A2, GACs 1–3 (figs. 6b–6d) would be equally probable solutions, whereas the free option selects GAC 1 (fig. 6b) as the most parsimonious solution. Thus, in this case the free option allows effective selection among alternative GACs.
In the cost formula for the recent option, C = (n – 1)i, n is the number of areas inhabited by the widespread taxon and i is the dispersal cost (e.g., C in fig. 5c). The cost of all other GAC areas (terminal areas A, D and ancestral areas F–I in fig. 5b) is set to infinity (an arbitrarily high cost) (fig. 5c), since they are not allowed as ancestral distributions.
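The cost assignment under the recent option can be sketched as a simple lookup table. This is an illustrative sketch only: the function name, the use of `math.inf` for the "arbitrarily high" cost, and the dispersal cost i = 2 are assumptions, and the area labels A–I follow the description of fig. 5b in the text.

```python
import math

def recent_costs(gac_areas, occupied, i):
    """Cost of each GAC area being ancestral under the recent option.

    Occupied terminal areas cost (n - 1) * i, since the other n - 1 areas
    are each reached by one dispersal; every other area is disallowed.
    """
    n = len(occupied)
    return {
        area: (n - 1) * i if area in occupied else math.inf
        for area in gac_areas
    }

# Terminal areas A-E and ancestral areas F-I, as labeled in the text.
areas = ["A", "B", "C", "D", "E", "F", "G", "H", "I"]

# Hypothetical dispersal cost i = 2; the taxon occupies B, C and E.
costs = recent_costs(areas, {"B", "C", "E"}, i=2)
```

Here B, C and E each receive the cost (3 – 1)i, while A, D and F–I receive infinite cost and so can never be chosen as the ancestral distribution in the subsequent fitting.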