Sequence alignment using Longest Common Subsequence algorithm
In molecular biology, DNAs and proteins can be represented as a sequence of alphabets. DNA sequences consist of A, T, G, C representing nucleobases adenine, thymine, guanine and cytosine. Proteins consist of 20 different letters indicating 20 different amino acids.
Comparison of two sequences, known as sequence comparison, either from the same organism or from different organism is an important task in molecular biology. It is helpful in providing solutions to many biological questions, for example:
predicting structure and function of proteins
inferring evolutionary history and relatedness of species
locating common subsequences in genes / proteins to identify common motifs,
as a sub-problem in genome assembly for DNA sequencing
Evolutionary Tree Construction: Neighbor-Joining Algorithm
Evolutionary Tree Construction
The problem of evolutionary tree construction is inferring the topology and the branch lengths of the evolutionary tree that may have produced the given gene sequence data. The number of leaf nodes in the inferred tree should be equal to the number of gene sequences in the given data.
The Neighbor-Joining algorithm
Neighbor-Joining (NJ) tree inference method was originally written by Saitou and Nei in 1987. It belongs to a class of distance-based methods used to build evolutionary trees. NJ method takes a matrix of pairwise evolutionary distances between the given sequences to build the evolutionary tree.
The pairwise distances are typically obtained from s...
Small correction: r(D) = 7 + 10 + 7 + 5 + 9 = 38, you wrote r(D) = 7 + 10 + 7 + 5 + 9 = 41
Read more… (30 words)
Read (30 words)
Character Based Evolutionary Tree Construction
In this article, we'll talk about a method for constructing evolutionary trees, known as character based evolutionary tree construction. It was initially designed to infer evolutionary relationships based on morphological and physiological characters.
In character based tree construction, we are given a DNA segment for multiple species coming from the same part of the genome (for example, the same gene). Given these DNA sequences, we could like to construct the evolutionary tree, i.e. predict which species are more closely related and have a recent common ancestor, vs species that are not closely related and diverged earlier.
Character based tree construction method is based on Occam’s razor principle which states “when several hypotheses with different degrees of complexity are proposed to explain the same phenomenon, one should choose the simplest hypothesis”. In terms of tree buil...
Genes encode and can be used to synthesize proteins, and this process is known as gene expression. In higher organisms like humans, thousands of genes express together by different amounts depending upon various factors such as the type of cell (nerve cell or heart cell), environment and disease conditions. For example, different types of cancers invoke different gene expression patterns in humans. These different gene expression patterns under different conditions can be studied using Microarray technology.
Microarrays and Gene Expression profiling
Data from a Microarray can be imagined as rectangular matrix or a grid with each cell in the matrix corresponding to a gene expression value under a particular condition. As shown in the figur...