Synteny blocks are conserved regions shared between two sets of chromosomes. In other words, they are highly similar (though not necessarily identical) stretches of nucleotides that appear on two different chromosomes.
Let's take the example of mouse and human chromosomes. The genomic similarity between the two is strikingly high, at about 85%. This implies that, at the molecular level, many of the functions performed in human and mouse cells are the same, even though a human looks very different from a mouse on the outside.
If we look specifically at the X chromosome, the similarity is even higher, at 95%. There are 11 stretches of nucleotides (synteny blocks) that occur in both the human and the mouse X chromosome, though at drastically different locations on the chromosome.
Representation of matches between human and mouse chromosomes. Source: Wikipedia
In the image above, we see all the chromosomes present in a human and a mouse. The colors indicate matches found between the chromosomes of the two species. If we compare the X chromosomes, we see that they are very similar in both organisms.
If we look at the nucleotide blocks present in chromosome 15 of the mouse, we see that they are found mostly on chromosomes 5, 8, 12 and 22 in humans. This movement of regions of the genome from one location to another is called a genetic rearrangement.
A genetic rearrangement alters or modifies chromosomal regions through insertion, inversion, relocation or duplication.
Genetic rearrangements can occur on the same chromosome or on different chromosomes. When the rearrangement occurs between different regions of the same chromosome, it is called a uni-chromosomal rearrangement, while those that happen among different chromosomes are called multi-chromosomal rearrangements. The following is an example of a multi-chromosomal rearrangement: the Philadelphia chromosome.
Here we see that the “abl” region from chromosome 9 fuses with the “bcr” region of chromosome 22 to produce a “bcr-abl” region. This rearrangement in the Philadelphia chromosome is highly oncogenic.
Dot plots are a graphical technique often applied in bioinformatics to compare genomes and study rearrangements. They give a visual overview of the genetic rearrangements and similarity between two chromosomes.
Let us take a practical example. We choose two genomes, E. coli and S. enterica, and place them on the x-axis and y-axis respectively.
Algorithm: We first perform k-mer matching on the two genomes, looking up every k-mer of one genome in the other. For each match found, we plot a dot at the location of the match in the two genomes. The k-mer matching is performed on both DNA strands: the forward strand (red) and the complementary strand (blue).
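The matching step can be sketched in a few lines of Python. This is a minimal illustration rather than the exact procedure used to produce the plots in this article: the function names (`reverse_complement`, `index_kmers`, `dot_plot_points`) are hypothetical, the genomes are assumed to be plain strings over A, C, G and T, and a complementary-strand match is recorded at the position of the k-mer on the forward strand of the y-axis genome.

```python
from collections import defaultdict

# Translation table for complementing a DNA string (assumes only A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def index_kmers(seq, k):
    """Map every k-mer of seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def dot_plot_points(genome_x, genome_y, k):
    """Return (forward, reverse) lists of (x, y) dots for shared k-mers.

    forward: the k-mer at y in genome_y also occurs at x in genome_x.
    reverse: the reverse complement of the k-mer at y occurs at x.
    """
    index = index_kmers(genome_x, k)
    forward, reverse = [], []
    for j in range(len(genome_y) - k + 1):
        kmer = genome_y[j:j + k]
        for i in index.get(kmer, ()):
            forward.append((i, j))   # would be drawn in red
        for i in index.get(reverse_complement(kmer), ()):
            reverse.append((i, j))   # would be drawn in blue
    return forward, reverse


if __name__ == "__main__":
    x = "GATTACAGG"
    forward, reverse = dot_plot_points(x, reverse_complement(x), k=4)
    print("forward dots:", forward)
    print("reverse dots:", reverse)
```

Indexing one genome first keeps the lookup for each k-mer of the other genome cheap, at the cost of holding the index in memory. The two lists of points can then be passed to any scatter-plot routine, one color per list.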
In this graph we see 5 major diagonals: a, b, c, d, and e. Two of these (a and e) represent synteny blocks along identical strands, while the other three (b, c and d) show synteny blocks along the complementary strand. Apart from these major diagonals, we find many other dots; these are considered noise and are ignored.
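The article does not say how the noise is separated from the major diagonals. One simple possibility, shown here purely as an assumption, is to keep only dots that line up in long unbroken runs along a diagonal. The function name `diagonal_runs` and the `min_len` threshold are hypothetical; dots are (x, y) pairs, forward-strand dots share a constant y - x, and complementary-strand dots share a constant x + y.

```python
from collections import defaultdict


def diagonal_runs(points, min_len, reverse=False):
    """Group dots into unbroken runs along a diagonal and drop short runs.

    Returns a sorted list of (diagonal_key, first_x, last_x) tuples, where
    diagonal_key is y - x for forward dots and x + y for reverse dots.
    """
    by_diagonal = defaultdict(list)
    for x, y in points:
        key = x + y if reverse else y - x
        by_diagonal[key].append(x)

    runs = []
    for key, xs in by_diagonal.items():
        xs = sorted(set(xs))
        start = prev = xs[0]
        # The trailing None is a sentinel that closes the final run.
        for x in xs[1:] + [None]:
            if x is not None and x == prev + 1:
                prev = x          # still on the same unbroken run
                continue
            if prev - start + 1 >= min_len:
                runs.append((key, start, prev))
            if x is not None:
                start = prev = x  # begin a new run
    return sorted(runs)


if __name__ == "__main__":
    dots = [(0, 0), (1, 1), (2, 2), (3, 3), (7, 1), (10, 12), (11, 13)]
    print(diagonal_runs(dots, min_len=3))
```

A larger `min_len` removes more isolated dots but can also discard short genuine blocks, so the threshold is a trade-off rather than a fixed value.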
Let’s take a look at another example plot, this time from the human and chimpanzee genomes.
The human genome is on the x-axis, and the chimpanzee genome is on the y-axis.
As mentioned before, synteny blocks in dot plots appear as diagonals. An inversion shows up as a diagonal in the opposite direction. Insertions and deletions are seen as breaks in an otherwise continuous diagonal. Diagonals that appear in completely isolated areas are due to relocations on the chromosomes.
More examples of synteny dot plots between chimpanzees and humans are shown below. The first compares the Y chromosomes of humans and chimpanzees (an example of a noisy graph), and the second compares chromosome 21 (an example of a very clean graph).
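To see why an inversion produces a diagonal in the opposite direction, it can help to simulate one. The sketch below is an illustration on made-up data, not real genomes: it builds a random sequence, replaces the segment between two arbitrarily chosen positions (`start` and `end`) with its reverse complement, and collects the shared k-mers. All names and parameter values are assumptions chosen for the example.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def kmer_dots(a, b, k):
    """Return (same_strand, opposite_strand) sets of (pos_in_a, pos_in_b)."""
    positions = {}
    for i in range(len(a) - k + 1):
        positions.setdefault(a[i:i + k], []).append(i)
    same, opposite = set(), set()
    for j in range(len(b) - k + 1):
        kmer = b[j:j + k]
        same.update((i, j) for i in positions.get(kmer, ()))
        opposite.update(
            (i, j) for i in positions.get(reverse_complement(kmer), ())
        )
    return same, opposite


# Synthetic "genome" with one inverted segment between start and end.
rng = random.Random(0)
original = "".join(rng.choice("ACGT") for _ in range(300))
start, end, k = 100, 200, 12
rearranged = (
    original[:start]
    + reverse_complement(original[start:end])
    + original[end:]
)

same, opposite = kmer_dots(original, rearranged, k)

# Inside the inverted segment, x + y is constant (start + end - k),
# so y falls as x rises: the dots form a diagonal in the opposite direction.
inverted = sorted(d for d in opposite if d[0] + d[1] == start + end - k)
print(inverted[:3], "...", inverted[-3:])
```

Outside the inverted segment the dots satisfy x = y and lie on the main diagonal, while the k-mers that straddle the two breakpoints match nowhere, which is what leaves small gaps at the ends of each diagonal.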
In this article, we introduced the concepts of synteny blocks (or conserved regions) and genomic rearrangements. We also learnt an algorithm for creating dot plots based on k-mer matching, which can be used for synteny block construction.