The first complete genome sequence was published in 1977 (Nucleotide sequence of bacteriophage φX174 DNA; Sanger F et al. (1977) Nature 265, 687-95), and the first complete genome of a free-living organism became available in 1995 (Haemophilus influenzae; Fleischmann RD et al. (1995) Science 269, 496-512). Since then, the rate at which new sequence data are produced has increased rapidly. Today more than 1,000 complete genomes are known, and thousands are on the way; see GenomesOnline (http://gold.jgi.doe.gov) for a compilation. The task now is to convert this information into as much biological knowledge as possible. Many tools for genome analysis, phylogeny, motif discovery, etc. have been developed, but we still do not fully understand how genomes function. A recent study focusing on the solvent-accessible surface area of DNA showed that this physical property is evolutionarily more constrained than the underlying sequence itself, and that the constrained regions correlate with functional non-coding elements, e.g. enhancers (Parker et al. (2009) Science 324, 389-392).
Here we describe a new and unconventional genome browser called DiProGB that encodes the genome sequence by physico-chemical dinucleotide properties such as stacking energy, melting temperature or twist angle. Analyses can be performed for the + strand, the – strand, and the double strand. DiProGB is closely linked to the dinucleotide property database DiProDB (http://diprodb.leibniz-fli.de). Combining the conventional letter-based approach with the analysis of physico-chemical properties adds a new dimension to genome analysis and can lead to new insights into how genomes function.
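To make the idea of a dinucleotide-encoded sequence concrete, the following is a minimal Python sketch, not DiProGB's actual implementation. It maps each overlapping dinucleotide to a numeric value and does so for the + and – strands. The property used here, `purine_count`, is a trivially computable stand-in chosen so that no measured values need to be invented; a real analysis would substitute a property table taken from DiProDB (e.g. stacking energy or twist angle). All function names are hypothetical, and the sketch assumes an unambiguous A/C/G/T sequence. Double-strand encoding is not shown, because how the two strands are combined is a convention this sketch does not assume.

```python
from typing import Callable, List, Tuple

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def purine_count(dinucleotide: str) -> float:
    """Stand-in property (assumption): number of purines (A or G) in the dinucleotide."""
    return float(sum(base in "AG" for base in dinucleotide))


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def encode(seq: str, prop: Callable[[str], float]) -> List[float]:
    """Map every overlapping dinucleotide of seq to its property value."""
    seq = seq.upper()
    return [prop(seq[i:i + 2]) for i in range(len(seq) - 1)]


def encode_strands(seq: str, prop: Callable[[str], float]) -> Tuple[List[float], List[float]]:
    """Encode the + strand and the - strand of seq.

    The - strand is read 5'->3' on the reverse complement; its values are
    then reversed so that index i refers to the same position as on the + strand.
    """
    plus = encode(seq, prop)
    minus = encode(reverse_complement(seq), prop)[::-1]
    return plus, minus


if __name__ == "__main__":
    plus, minus = encode_strands("AAGC", purine_count)
    print("plus :", plus)
    print("minus:", minus)
```

A sequence of length n yields n - 1 values per strand. Plotting these values against position gives a sequence graph of the kind the browser displays.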
What can be done with DiProGB?
- Visualization of nucleotide sequences as dinucleotide-encoded sequence graphs and real-time manipulation of these graphs.
- Easy localization of annotated features (e.g. genes, tRNAs, CDS, ...).
- Identification of patterns of specific physical properties in DNA or RNA.
- Search for sequence motifs.
- Search for repeats.
- Statistical analyses.
- And more ...