Statistics and Data Science Seminar

Chenglong Yu
University of Illinois at Chicago
Graphical Representation of Biological Sequences and Its Applications
Abstract: Among existing alignment-free methods for comparing biological sequences, graphical representation provides a simple way to view, sort, and compare gene structures. Its aim is to display DNA or protein sequences graphically so that their similarities and differences can be seen at a glance. Visual comparison alone, however, is not enough for follow-up research, which needs a more accurate comparison. This motivates us to develop applications of graphical representation for biological sequences. I will talk about two contributions in this direction. (1) We construct a protein map using our newly proposed graphical representation for protein sequences. Each protein sequence is represented as a point in this map, and cluster analysis of proteins can be performed by comparing the points. The protein map can be used to mathematically specify the similarity of two proteins and to predict properties of an unknown protein from its amino acid sequence. (2) We construct a novel genome space with biological geometry, which is a subspace of R^N. Each point in this space corresponds to a genome, and the natural distance between two points reflects the biological distance between the corresponding genomes. The genome space will provide a powerful new tool for classifying genomes and analyzing their phylogenetic relationships.
Wednesday, February 15, 2012 at 4:00 PM in SEO 636