This list consists of about 10 tutorials for learning bioinformatics. You can think of it as a “Free Online Nano Book”. We’ll cover important bioinformatics topics, learning (a) the biological significance of each problem, and (b) the computational algorithms used to solve it. Everything is 100% free.
Computers are everywhere, and biology is no exception. Bioinformatics is computer science applied to biology. In particular, bioinformatics often deals with the application of algorithms, data structures and machine learning methods to the analysis of DNA, proteins and evolutionary history. More details on the topics we’ll discuss are included below within each section.
Subscribe to add this list to the top of your Home Page. Get started with the first article below.
This section (a single tutorial) provides the biological context for the rest of this course. It describes what DNA and proteins are, and how we can model them as sequences of characters.
Conserved sequences and regulatory motifs are short sequences (say of length 15-30) which occur frequently in the genome. These sequences serve a variety of functions, such as regulating gene expression (and hence how much of a protein is produced) and indicating where genes begin. We’ll see how to find these sequences using brute-force algorithms and randomized algorithms.
- Conserved Sequences: Their Biological Significance, and the K-mer Finding problem
- Sequence Motifs, Consensus Sequences and The Motif Finding Problem
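As a rough illustration of the brute-force idea behind finding frequently occurring short sequences, here is a minimal Python sketch. It assumes the genome is given as a plain string over A, C, G and T, and it only counts exact repeats; the function name `most_frequent_kmers` is made up for this example and is not taken from the tutorials.

```python
from collections import Counter


def most_frequent_kmers(genome, k):
    """Return all length-k substrings that occur most often in genome.

    Brute force: slide a window of width k over the genome and count
    every substring it sees. Assumes exact matches only (no mismatches).
    """
    if k <= 0 or k > len(genome):
        return []
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    best = max(counts.values())
    # Sort so the result does not depend on the order of first occurrence.
    return sorted(kmer for kmer, n in counts.items() if n == best)


print(most_frequent_kmers("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4))
```

Real regulatory motifs usually vary slightly between occurrences, so exact counting like this is only a starting point.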
Recently, we’ve figured out cost-effective ways of sequencing a human genome, i.e. taking a human genome and reading the sequence of roughly 3 billion nucleotides (A, C, G and T) that it comprises. In this section, we’ll see how this is achieved. We’ll also see how we can find similar regions in two different genomes, which allows us to do things like infer evolutionary history and predict protein function.
- Introduction to Genome Assembly
- Sequence alignment using Longest Common Subsequence algorithm
- Synteny Blocks, Genetic Rearrangements and Synteny Block Construction
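To give a feel for how the Longest Common Subsequence idea can be used to compare two sequences, here is a small Python sketch using the standard dynamic programming table. It is a simplified illustration under the assumption that matches score 1 and everything else scores 0; the function name is hypothetical, and the tutorials may present the algorithm differently.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of strings a and b."""
    n, m = len(a), len(b)
    # dp[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back from the bottom-right corner to recover one solution.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(longest_common_subsequence("AACCTTGG", "ACACTGTGA"))
```

The table takes time and memory proportional to the product of the two lengths, which is fine for short sequences but too slow for whole genomes.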
An evolutionary tree, or “tree of life”, is a representation of how life evolved on our planet. It shows us which animals are more closely related to each other (dogs and wolves, humans and chimpanzees), and which ones are not. In this section, we’ll see methods to infer evolutionary trees, given parts of the DNA from different species.
- Evolutionary Trees
- Evolutionary Tree Construction: Neighbor-Joining Algorithm
- Character Based Evolutionary Tree Construction
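As a taste of distance-based tree construction, the sketch below shows only the pair-selection step of neighbor joining: given a matrix of pairwise distances between species, it picks the two that should be joined first. The distance values are made up for illustration, the function name is hypothetical, and a full implementation would go on to merge the pair and repeat.

```python
def neighbor_joining_pair(dist):
    """Pick the pair (i, j) that minimises the neighbor-joining criterion.

    dist is a symmetric matrix (list of lists) with zeros on the diagonal.
    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    """
    n = len(dist)
    totals = [sum(row) for row in dist]
    best_pair, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair, best_q


# Illustrative distances between five hypothetical species (indices 0..4).
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(neighbor_joining_pair(example))
```

Note that the chosen pair is not simply the closest one: species 3 and 4 have the smallest distance, yet the criterion prefers 0 and 1 because it also accounts for how far each species is from all the others.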
In the first tutorial, we’ll see how we can find similar genes using clustering algorithms such as K-means clustering and hierarchical clustering. In the second tutorial, we’ll see how we can detect disease-causing mutations by mapping genes from a diseased human to a reference human genome. For this, we’ll see how to perform exact and inexact string matching efficiently using data structures like tries and suffix trees.
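To make the trie idea concrete, here is a minimal Python sketch of exact matching of several short patterns against a longer text. It assumes sequences contain only letters (so `"$"` is free to use as an end-of-pattern marker) and uses nested dictionaries as trie nodes; the function names are invented for this example, and inexact matching and suffix trees are not covered here.

```python
def build_trie(patterns):
    """Build a trie of nested dicts; "$" marks the end of a pattern."""
    root = {}
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node["$"] = pattern
    return root


def find_matches(text, patterns):
    """Return (start_index, pattern) for every exact occurrence in text."""
    trie = build_trie(patterns)
    hits = []
    for start in range(len(text)):
        node = trie
        pos = start
        # Follow the text down the trie until no branch matches.
        while pos < len(text) and text[pos] in node:
            node = node[text[pos]]
            pos += 1
            if "$" in node:
                hits.append((start, node["$"]))
    return hits


print(find_matches("ACGTACGA", ["ACG", "CGA", "GT"]))
```

Building the trie once lets every pattern be checked in a single pass from each starting position, instead of scanning the text separately for each pattern.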
This is a bonus tutorial. It covers the advanced topic of protein structure prediction, which is currently an area of active research with lots of unsolved and open problems.