In this article, we'll talk about a method for constructing evolutionary trees, known as character-based evolutionary tree construction. It was initially designed to infer evolutionary relationships based on morphological and physiological characters.
In character-based tree construction, we are given a DNA segment from each of several species, all coming from the same part of the genome (for example, the same gene). Given these DNA sequences, we would like to construct the evolutionary tree, i.e. predict which species are closely related and share a recent common ancestor, and which species are less closely related and diverged earlier.
The character-based tree construction method rests on the principle of Occam’s razor, which states: “when several hypotheses with different degrees of complexity are proposed to explain the same phenomenon, one should choose the simplest hypothesis”. In terms of tree buil...
Sequence alignment using Longest Common Subsequence algorithm
In molecular biology, DNA and proteins can be represented as sequences of letters. DNA sequences consist of A, T, G and C, representing the nucleobases adenine, thymine, guanine and cytosine. Protein sequences are written with 20 different letters, one for each of the 20 amino acids.
Comparing two sequences, either from the same organism or from different organisms, is an important task in molecular biology known as sequence comparison. It helps answer many biological questions, for example:
predicting structure and function of proteins
inferring evolutionary history and relatedness of species
locating common subsequences in genes / proteins to identify common motifs
solving a sub-problem of genome assembly in DNA sequencing
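As a rough illustration of how the Longest Common Subsequence idea applies to sequence comparison, here is a minimal Python sketch of the standard dynamic-programming approach. The function name `lcs` and the example strings are assumptions made for illustration only. The sketch returns just one longest common subsequence, even when several of the same length exist.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                # Matching letters extend the LCS of the shorter prefixes.
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # Otherwise drop the last letter of one of the two prefixes.
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back through the table to recover the letters of one LCS.
    letters = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            letters.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(letters))


if __name__ == "__main__":
    # Two short, made-up DNA fragments.
    print(lcs("AGCAT", "GAC"))
```

The table has (m + 1) x (n + 1) entries, so time and memory both grow with the product of the two sequence lengths. That is fine for short fragments but becomes costly for whole genes or genomes.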
Genes encode proteins and are used to synthesize them; this process is known as gene expression. In higher organisms like humans, thousands of genes are expressed together in different amounts depending on factors such as the type of cell (nerve cell or heart cell), the environment and disease conditions. For example, different types of cancer produce different gene expression patterns in humans. These gene expression patterns under different conditions can be studied using microarray technology.
Microarrays and Gene Expression Profiling
Data from a microarray can be pictured as a rectangular matrix, or grid, in which each cell holds the expression value of a gene under a particular condition. As shown in the figur...
Biodiversity is the sum total of the genetically based variety of all organisms in the biosphere and can be susceptible to natural and/or man-made changes. Human activities such as industry, agriculture, mining, transportation, construction, and habitations play a major role in the reduction of biodiversity.
The five major human impacts on the environment include:
Deforestation: the cutting and clearing of rainforests or related ecosystems and their conversion into less biodiverse ecosystems such as pasture, cropland, or plantations.
Protein structure prediction using homology modeling
What are proteins?
Proteins are large biomolecules responsible for performing most of the functions within an organism's cells, including responding to stimuli, catalyzing reactions, transporting molecules from one place to another and carrying out cell signaling. Just like DNA sequences, protein sequences are strings of molecules, but whereas DNA is built from 4 nucleotides, proteins are built from 20 different molecules called amino acids.
Each linear (1D) protein sequence folds into a 3D structure. This 3D structure determines how the protein responds to various environments and which other molecules it interacts with, and hence is critical to the protein's ability to perform its functions. The 3D structure of a protein is described by providing the coo...
Sequence Motifs, Consensus Sequences and The Motif Finding Problem
Sequence Motifs and their Biological Significance
Sequence motifs are nucleic acid sequences that are widespread across or within genomes and that have, or are speculated to have, certain regulatory or structural biological functions.
Motifs found in different parts of the genome, such as exons, introns and junk DNA, have different functions. Motifs in exons (the coding part of the genome) determine the structure of the protein or label proteins to be sent to certain parts of the cell for processes like phosphorylation. Motifs in introns (part of the non-coding genome) are usually regulatory sequences that determine the level of gene expression and the binding sites of proteins. Satellite DNA, the main component of centromeres and heterochromatin, is an example of a motif found in junk parts of the genome.
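To make the idea of a consensus sequence concrete, here is a small Python sketch that summarizes several equal-length occurrences of a motif by taking the most frequent base at each position. The function name `consensus`, the alphabetical tie-breaking rule and the example sequences are assumptions chosen for illustration; real motif-finding tools typically use richer representations such as position weight matrices.

```python
from collections import Counter


def consensus(instances: list[str]) -> str:
    """Return the most frequent letter at each position of aligned motif instances."""
    if not instances:
        raise ValueError("at least one motif instance is required")
    length = len(instances[0])
    if any(len(seq) != length for seq in instances):
        raise ValueError("all motif instances must have the same length")

    result = []
    # zip(*instances) yields one tuple per alignment column.
    for column in zip(*instances):
        counts = Counter(column)
        # Pick the highest count; break ties alphabetically (an arbitrary choice).
        best = min(counts, key=lambda base: (-counts[base], base))
        result.append(best)
    return "".join(result)


if __name__ == "__main__":
    # Four made-up occurrences of a six-letter motif.
    print(consensus(["TATAAT", "TATGAT", "TACAAT", "GATAAT"]))
```

A consensus string discards how strongly each position is conserved, so it is best read as a compact summary of the motif rather than a complete description of it.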