Escherichia coli genome is composed of two distinct types of nucleotide sequences
We calculated correlations of the nucleotide distributions along the E. coli genome. Subsequent cluster analysis of the correlation distributions showed that the genome was composed of two qualitatively different types of nucleotide sequences. The first type exhibited strong correlations of the genomic distributions of A with T and G with C, and high anticorrelations of A with C and G with T. In contrast, the second type was characterized by weak or negligible correlations typical of randomized sequences. Both types of sequences were almost equally abundant in the E. coli genome, and their lengths varied from several hundred nucleotides to about 70 kilobases. The two types were not disjoint with respect to their (G + C) content, but the high correlations and anticorrelations were rather characteristic of (A + T)-rich genomic segments. We offer possible explanations of the mosaic structure of the E. coli genome. © 2000 Academic Press.
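The kind of correlation described above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the window size, the correlation measure, or the clustering step, so the non-overlapping windows, the Pearson coefficient, and the function names `window_counts`, `pearson`, and `nucleotide_correlations` are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def window_counts(seq, window):
    """Count A, C, G, T in consecutive non-overlapping windows.

    Assumption: fixed-size, non-overlapping windows; a trailing
    partial window is ignored.
    """
    seq = seq.upper()
    counts = {base: [] for base in "ACGT"}
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        for base in "ACGT":
            counts[base].append(chunk.count(base))
    return counts


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant series is reported as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def nucleotide_correlations(seq, window=4):
    """Correlate per-window counts for every pair of nucleotides.

    Returns a dict keyed by pair: "AC", "AG", "AT", "CG", "CT", "GT".
    """
    counts = window_counts(seq, window)
    return {
        a + b: pearson(counts[a], counts[b])
        for a, b in combinations("ACGT", 2)
    }


# Toy sequence in which (A + T)-only windows alternate with
# (G + C)-only windows; real genomic data would be far noisier.
toy = "AATTGGCC" * 2
for pair, r in nucleotide_correlations(toy, window=4).items():
    print(pair, round(r, 3))
```

In this toy input the counts of A and T rise and fall together, as do those of G and C, while A and C (and G and T) move in opposite directions, which mirrors the pattern reported for the first sequence type. A randomized sequence would instead be expected to give coefficients near zero, as for the second type.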