Origin of eukaryotic introns: a hypothesis, based on codon distribution statistics in genes, and its implications. |
| |
Authors: | P Senapathy |
| |
Abstract: | A hypothesis for the origin of introns in eukaryotic genes is developed. By computer simulation it was found that the reading-frame lengths in a random nucleotide sequence are distributed in a negative exponential manner and that there exists an upper limit of about 200 codons in the length of the reading frames (RFs). These characteristics suggest that, if primordial DNA contained a random nucleotide sequence, the most primitive cells would have been under selective pressure to eliminate interfering stop codons in order to increase the length of RFs. Further, they indicate that the only possible way that a coding sequence considerably longer than 600 nucleotides could be derived from the short coding sequences occurring in a random sequence would be to splice the short coding sequences together and to eliminate the stretches of sequence containing clusters of in-frame stop codons. Thus, introns are suggested to be those stretches of sequence containing interfering stop codons that were originally earmarked in the first primitive cells for elimination in order to enable the coding of long polypeptides. Because the statistical characteristics of codon distributions in today's eukaryotic DNA sequences closely resemble those of a random sequence, and because the upper limit in the length of RFs (200 codons) in a random sequence corresponds precisely to the observed maximum length of exons in today's eukaryotic genes (600 nucleotides), it is suggested that introns originated in the most primitive unicellular eukaryotes when they evolved from primordial sequences. The data from prokaryotic gene sequences indicate that prokaryotic genes may have been derived originally from primitive unicellular eukaryotic genes by the loss of introns. |
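The kind of simulation the abstract describes can be sketched as follows. This is a minimal illustration, not the author's original program: it assumes uniform base frequencies, the three standard stop codons, and a reading frame defined as a stop-free run of codons in a single frame. The function names `reading_frame_lengths` and `simulate` are hypothetical. Under these assumptions the chance that any codon is a stop is 3/64, so run lengths follow a geometric distribution (the discrete counterpart of a negative exponential) with a mean of about 20 codons, and very long runs become vanishingly rare.

```python
import random

# Standard stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_lengths(seq, frame=0):
    """Return the lengths, in codons, of stop-free runs in one frame.

    Each run ends at a stop codon; the stop itself is not counted.
    A trailing run that is not closed by a stop codon is discarded.
    """
    lengths = []
    run = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            lengths.append(run)
            run = 0
        else:
            run += 1
    return lengths


def simulate(n_nucleotides=300_000, seed=0):
    """Generate a random sequence (assumed uniform over A, C, G, T)
    and return its reading-frame lengths in frame 0."""
    rng = random.Random(seed)
    seq = "".join(rng.choice("ACGT") for _ in range(n_nucleotides))
    return reading_frame_lengths(seq)


if __name__ == "__main__":
    lengths = simulate()
    print("reading frames:", len(lengths))
    print("mean length (codons):", sum(lengths) / len(lengths))
    print("longest (codons):", max(lengths))
```

Tabulating `lengths` as a histogram should show the roughly exponential decay the abstract refers to. The longest run observed depends on the sequence length sampled, so the figure of about 200 codons is best read as a practical limit for sequences of this order of size.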
| |