
Bolshoy A., Volkovich Z., Kirzhner V., Barzily Z. Genome Clustering

Springer, 2010. 233 p.
Genome classification, the construction of phylogenetic trees, has become a major approach to studying the evolutionary relatedness of species in their vast diversity. Although modern genome clustering delivers trees that are very similar to those generated by classical means, and the basic terminology is the same, phenotypic traits and habitats are no longer the playground for classification. The sequence space is the playground now. Phenotypic traits are replaced by sequence characteristics, words in particular. As a matter of fact, phenotype and genotype have merged, to the confusion of both classical and modern phylogeneticists.
Accordingly, a completely new vocabulary of stringology, information theory and applied mathematics has taken over. And a new brand of scientists has emerged: those who know the math and, simultaneously, (do?) know biology. The book is written by authors of this new brand. There is no way to test their literacy in biology, as no biologist by training would even try to enter the elite circle of those who master their almost occult language. But the army of informaticians, formal linguists and mathematicians humbly (or aggressively) longing to join modern biology has received an excellent introduction to the field of genome clustering, written by a team of their kin.
The analogy between genomic sequences and texts is both an immediate, simple thought and an open door to the depths of genetic information and the intricacies of its organization. The most fascinating and unique features of these texts are the multiplicity, degeneracy and overlapping of the various codes carried by genetic sequences. In this respect, a mere transfer of techniques used for the analysis of familiar monocode texts to the polycode sequences would be naive. But no one would deny the importance of such a transfer as a starting point, if only to reveal the amazing specifics of the new reality. Another interesting aspect of genomes is the uncertainty of the formal definition of a species. Already in classical genetics this was a stumbling block. Dobzhansky's definition based on fertile progeny, though broadly accepted, does not fit the full diversity of species. In genomics the matter becomes even more complicated, in particular due to horizontal gene transfer. It appears that the species is not an elementary node of evolution. Rather, the gene, or (again uncertain) a DNA segment in general, is the node.
Fundamentally new techniques have to be introduced to cope with this very special language. The monograph is a rather comprehensive outline of the state of the art in the field, and it also introduces some original developments. Its appreciation of the principal differences between the natural sequence language and everything we knew before is an important merit of the book.
Biological Background
Biological Classification
Mathematical Models for the Analysis of Natural-Language Documents
DNA Texts
N-Gram Spectra of the DNA Text
Application of Compositional Spectra to DNA Sequences
Marker-Function Profile-Based Clustering
Genome as a Bag of Genes – The Whole-Genome Phylogenetics
A. Clustering Methods
B. Sequence Complexity
C. DNA Curvature