CommonLounge Archive

Character Based Evolutionary Tree Construction

February 16, 2018

In this article, we’ll talk about a method for constructing evolutionary trees, known as character based evolutionary tree construction. It was initially designed to infer evolutionary relationships based on morphological and physiological characters.

In character based tree construction, we are given a DNA segment for multiple species coming from the same part of the genome (for example, the same gene). Given these DNA sequences, we could like to construct the evolutionary tree, i.e. predict which species are more closely related and have a recent common ancestor, vs species that are not closely related and diverged earlier.

Parsimony score

Character based tree construction method is based on Occam’s razor principle which states “when several hypotheses with different degrees of complexity are proposed to explain the same phenomenon, one should choose the simplest hypothesis”. In terms of tree building, all other things being equal, the best hypothesis is the one that requires the fewest evolutionary changes.

Parsimony score is defined as total cost of all mutations in a tree that are required to explain the observed differences. The parsimony score depends on the structure of the tree, the locations assigned to each of the species, as well as the inferred DNA sequences for each of the common ancestors.


Let’s say we have we have 4 species and the nucleotides (DNA sequence) from each of the specifies. They are as follows:

As per the table, species A and B have same nucleotide at 1st position and C and D have same nucleotides at 1st position.

Based on the above information, we might predict a evolutionary tree that looks as follows (figure 1). Note that the prediction also includes the positions of species A, B, C and D as well as the sequences for the common ancestors.

Figure 1

The dashes on the edges label the number of differences between a node and its ancestor. For example, species A (GGG) differs from its parent (GTG) at one location (G vs T, in location 2).

Let us consider another possible evolutionary tree. As per this tree, species A and C are more closely related. Again, dashes on the edges label the number of differences.

Figure 2

We have drawn two trees considering the nucleotide positions of species. But how we know which tree is better or more parsimonious tree?

We can see that the parsimony score of the first tree is 4 and for the second tree it is 5. Since we are trying to minimize the number of mutations, the first tree is better than the second tree.

Weighted parsimony score

The above definition can be extended to include cases different nucleotide mutations are given different scores. This is described below with some examples:

Unweighted Parsimony Score = 5

Weighted Parsimony Score = 20

If we have multiple characters per sequence, we simply compute the above score for each position, and add them up.

Parsimony Problems

Our broad strategy for finding the evolutionary tree with the minimum parsimony score will be as follows:

Small Parsimony problem: We are given a tree, but need to determine labeling of the tree that provides the lowest parsimony score. That is, we need to find the nucleotide sequences of the ancestors.

Large Parsimony problem: Given the solution to the above problem, we’ll see how to find the best tree structure.

Small Parsimony Problem: Sankoff’s Algorithm

Given a tree, we need to determine a labeling for the tree that provides the lowest parsimony score. That is, we need to find the nucleotide sequences of the ancestors.

Assumption: We assume that the characters in the string are independent. Hence, the Small Parsimony problem can be solved independently for each character. Therefore, to devise an algorithm, we can assume that every leaf is labeled by a single character rather than by a string of m characters.


We will use Sankoff’s algorithm to solve the above problem.

Let Sj(i) be the smallest (weighted) number of steps for the subtree at or below node j, given that node j has nucleotide i. Also, let cik be the cost of changing nucleotide i to nucleotide k.

Step 1: We will calculate Sj(i) from the leafs to the root. Initially, at the leafs (say leaf j), we have

$$ S_j(i) = \begin{cases} 0 &\text{if node j has nucleotide i} \\ \infty &\text{if node j has any other nucleotide} \end{cases} $$

Step 2: Then proceeding up the tree for node whose immediate descendants are l and r, we have

$$ S_j(i) = min_k [c_{ik} + S_l(k) ] + min_k [c_{ik} + S_r( k) ] $$

That is, when computing the parsimony score of subtree at node j given it has nucleotide i, we consider all possible nucleotide values for the children of node j.

Step 3: The minimum number of (weighted) steps for the tree is found by computing at the root node (say 0) the S0(i) and taking the smallest of these.


Let’s take an example and proceed stepwise to understand better.

Step 1: Begin at leaves:

  • If leaf has the character in question, score is 0
  • Else score is ∞

Step 2: Values for internal notes (calculated bottom to top). Lets calculate the score for nucleotide A of the first parent node.

Scoring matrix cij (given)

The scoring matrix cij is given above. Sl(k) and Sr(k) are already calculated (tree above).

We calculate Sl(i), CiA, and Sum = Sl(i) + CiA.

We calculate Sr(i), CiA, and Sum = Sr(i) + CiA.

Min(left sum) + Min(right sum) = 0 + 1 = 1. Therefore, sum of the scores we get for nucleotide A = 1.

Other scores are calculated using same approach for all the nodes including root node. After calculating all the scores, tree will look like this:

Step 3: Smallest score at root node is minimum parsimony score. Here the smallest score is 6. Hence, root node assigned as “A”

Step 4: After the root scores are calculated, we can travel down the tree and assign each vertex with optimal character.

Score 6 came from 1 + 5 of A nucleotide from both the descendent nodes (refer figure above). Hence, A will be assigned to both the descendent nodes (refer figure below). And the tree is thus labeled as:

Figure 10

Large Parsimony Problem

Given the solution to the small parsimony problem, we’ll see how to find the best tree structure.

In general, finding the optimal tree is a NP complete problem, i.e. there is no fast algorithm for this problem. Hence, we use an algorithm which won’t always find the optimal tree, but works quite well in practice.


  1. Set current tree equal to arbitrary binary tree structure
  2. Go through all edges and perform nearest neighbor interchanges (we’ll see an example of nearest neighbor interchange below).
  3. Solve small parsimony problem on each tree
  4. If any tree has lower parsimony score, then update the current tree and go to step 2, otherwise stop.

This method is called the Greedy Heuristic method.

Figure 11


  1. Let’s take an unrooted binary tree shown in Figure 11.
  2. Nearest neighbor interchange: We will delete the two nodes “c” and “d” connected by an internal edge. Rearranging the subtrees (a, b, e and f) gives us the following possible tree structure are (shown in figure below).

Figure 12

  1. For every possible tree using small parsimony algorithm lowest score is calculated.
  2. Lowest score tree is returned.


In this article, we saw a character based evolutionary tree construction method based on parsimony score. First, we solved the small parsimony problem, which finds the parsimony score given a tree structure and leaves. Then, we looked at the large parsimony problem which finds a good tree structure starting from a tree structure chosen at random by performing nearest neighbor interchanges.

© 2016-2022. All rights reserved.