Similar literature
1.
In this article we analyze some of the structural characteristics of the coding section and the intron of the human chemokine CXC receptor 4 (a 7-transmembrane receptor) pre-mRNA. In the coding sequence the frequencies of the individual nucleotides do not depart significantly from 0.25, while in the intron the frequencies of the As and Gs are significantly lower and higher, respectively, than expected from a random distribution. Analysis of the pattern of association of nucleotides into triplets or couples shows that some triplets or couples occur with frequencies significantly higher or lower than expected when assuming a random association of nucleotides. In particular, in the intron combinations of the same nucleotide are over-represented. Repeats of 7 or more nucleotides occur in both the coding section and the intron with frequencies which exceed the confidence limits for a random distribution. For the coding sequence this is possibly explained by the alternation of relatively similar hydrophobic-coding sections and relatively similar intervening intracellular and extracellular hydrophilic-coding sections. Repeats of 7 or more nucleotides in reverse order and in reverse/complemented order occur in the intron, but not in the coding section, with frequencies which significantly exceed a random distribution. The numerous intronic repeats in reverse/complemented order may be of relevance for the secondary structure of the intron and might be one important element of the integrated splicing code.
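
The comparison of observed couple (dinucleotide) frequencies with those expected from a random association of nucleotides can be illustrated with a small sketch. It assumes the common observed/expected ratio f(xy) / (f(x) f(y)); the abstract does not state which statistic the authors used, and the function name `dinucleotide_odds` is hypothetical.

```python
from collections import Counter

def dinucleotide_odds(seq):
    """Observed/expected ratio for each dinucleotide present in seq.

    Expected frequency assumes independent (random) association of
    neighbouring nucleotides: f(x) * f(y). This is an illustrative
    choice, not necessarily the statistic used in the abstract.
    """
    seq = seq.upper()
    n = len(seq)
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    # Overlapping dinucleotides: positions 0-1, 1-2, 2-3, ...
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1
    return {
        pair: (count / total_pairs) / (base_freq[pair[0]] * base_freq[pair[1]])
        for pair, count in pairs.items()
    }

odds = dinucleotide_odds("AAAACCCC")
```

A ratio above 1 marks a couple that is over-represented relative to random association; a ratio below 1 marks an under-represented one. Deciding whether a departure is significant would need a separate statistical test, which this sketch does not attempt.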

2.
Both mouse and human chemokine receptor CXC motif 5 (CXCR5) genes exhibit one single intron interrupting the coding sequence. The mouse intron is 12053 nucleotides (nt) long; the human intron is 9603 nt long. Sections of the mouse intron significantly align plus/plus with sections of the human intron; the aligned segments are in the same order in mouse as in man and overall cover 13% of the mouse sequence and 17% of the human sequence. The human CXCR5 intron harbors sequences derived from retroviruses (human endogenous retroviruses). The mouse intron comprises very similar sequences. About 70% of the mouse intron sequence is ‘specific’ to this gene, while sequences in the rest of the intron are shared with many other genes located on different chromosomes. In the human intron the coverage by specific sequences is about 87%. Thus, the contribution of transposable elements is significantly higher in mouse (30%) than in man (13%). Intra-intronic plus/minus alignments exist in mouse (10 couples) and man (two couples): these may form stem and loop structures determining the secondary structure of the corresponding pre-mRNAs.

3.
Sequence of full length cDNA for human S-adenosylhomocysteine hydrolase. (Cited 10 times in total: 0 self-citations, 10 citations by others)
Two cDNA clones for human S-adenosylhomocysteine hydrolase isolated from a placental cDNA library were sequenced. Each contained a sequence of 1299 nucleotides encoding a 432 amino-acid protein of MW 47660. Clone 16-1 contained 47 nucleotides 5' of the coding region, and a 780 nucleotide 3' flanking region terminating in a poly A tail. In addition, a 101 nucleotide unprocessed intron interrupted the coding sequence at nucleotide 854 (second base of codon 285). Clone 20-1 contained 43 nucleotides 5' and 742 nucleotides 3' flanking the uninterrupted coding region. Besides the intron, the clones differed in one position of the coding sequence and at two positions of the 3' non-coding region. The cDNAs for human and rat S-adenosylhomocysteine hydrolase were identical at 91.5% of positions in the coding sequence and showed 70% homology in the 3' non-coding regions. Human and rat S-adenosylhomocysteine hydrolases are identical at 97% of amino-acid residues, and the Dictyostelium and human enzymes at 75%.

4.
Formyl peptides are oligopeptides released by Gram-negative bacteria. So far, specific formyl peptide receptors (FPRs) have been described in mammals only. FPRs are seven-transmembrane G-protein-coupled molecules and make up a relatively homogeneous group, although exhibiting different levels of affinity for the ligands. We examined the patterns of conservation/mutation within the FPR group of genes, as studied in 16 mRNAs from different species. Following alignment of the coding sections, those nucleotides identical in at least 15 sequences were assigned a “conservation index” 2; those with 8–14 identities an index 1; those with fewer than 8 identities an index zero. The cumulative average conservation index was 1.36. The autocorrelation function and the power spectrum of the whole series of indexes demonstrated a 3-unit periodicity. This periodicity is explained by the fact that the average conservation indexes of the first, second and third nucleotides of the coding triplets were 1.46, 1.55 (both above the mean), and 1.06 (below the mean), respectively, so that correlations at lag 3 tend to be all positive. In mRNAs, regardless of the position in the coding triplets, T is significantly more frequently conserved (average index = 1.60) than A, C, and G (1.21–1.38). In the nucleotides with conservation index 1 or zero, we recorded the two most frequently represented bases. In 35% of mRNA nucleotides the two most frequently represented bases were C and T; in 28% of cases they were A and G; other couples occurred with lower frequencies. Both mutations may arise following C methylation with subsequent transformation into T (by deamination), either in the template or the coding DNA strand. Thus, we hypothesized that in FPR mRNAs there is an evolutionary trend of transformation from G to A and from C to T, the latter being the more stable of the bases.
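
The conservation-index scheme described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that "identical in at least 15 sequences" refers to the count of the most frequent base in an alignment column, that gap characters are ignored, and that the alignment starts in frame. The function names are hypothetical.

```python
from collections import Counter

def conservation_indexes(aligned, high=15, low=8):
    """Return one conservation index (2, 1 or 0) per alignment column.

    Thresholds follow the abstract for 16 aligned sequences:
    >= 15 identities -> 2, 8-14 identities -> 1, fewer than 8 -> 0.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    indexes = []
    for column in zip(*aligned):
        # Assumption: "identities" = count of the most frequent base; gaps ignored.
        counts = Counter(base for base in column if base != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        indexes.append(2 if top >= high else 1 if top >= low else 0)
    return indexes

def mean_by_codon_position(indexes):
    """Average index at the 1st, 2nd and 3rd codon positions (frame 0 assumed)."""
    means = []
    for offset in range(3):
        values = indexes[offset::3]
        means.append(sum(values) / len(values))
    return means

# Toy alignment of 16 sequences (made-up data, two codons long).
toy = ["ATGAAA"] * 9 + ["ATGCAT"] * 7
toy_indexes = conservation_indexes(toy)
toy_means = mean_by_codon_position(toy_indexes)
```

Averaging the indexes separately for each codon position is what exposes the 3-unit periodicity reported above: a third position that is less conserved than the first two lowers every third value in the series.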

5.
Microsatellite enrichment is an excess of repetitive sequences characteristic of all studied eukaryotes. It is thought to result from the accumulated effects of replication slippage mutations. Enrichment is commonly measured as the ratio of the observed frequency of microsatellites to the frequency expected to result from random association of nucleotides. We have compared enrichment of specific types of microsatellites in coding sequences with those in noncoding sequences across seven eukaryotic clades. The results reveal consistent differences between coding and noncoding regions, in terms of both the quantity of repetitive DNA and the types present. In noncoding regions, all types of microsatellite (mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats) are found in excess, and in all cases, these excesses scale in a similar exponential fashion with the length of the microsatellite. This suggests that all types of noncoding repeats are subject to similar mutational and selective processes. Coding repeats, however, appear to be under much stronger and more specific constraints. Tri- and hexanucleotide repeats are found in consistent and significant excess over a wide range of lengths in both coding and noncoding sequences, but other repeat types are much less frequent in coding regions than in noncoding regions. These findings suggest that the differences between coding and noncoding microsatellite frequencies arise from specific selection against frameshift mutations in coding regions resulting from length changes in nontriplet repeats. Furthermore, the excesses of tri- and hexanucleotide coding repeats appear to be controlled primarily by mutation pressure.

7.
The impact of the somatic hypermutational machinery was examined by analyzing the frequency and distribution of mutations in nonproductive VHDJH rearrangements obtained from individual human peripheral B cells. A strong bias toward nucleotide substitutions within the quadruplet motif RGYW was observed. In addition, there was a comparably increased frequency of mutations of the inverse repeat of RGYW, WRCY. Together, mutations of RGYW/WRCY accounted for 37% of all nucleotide substitutions. No significant strand polarity of the distribution of mutations was evident when nucleotide substitutions of highly mutated quartets and triplets as well as of their inverse repeats were analyzed. Furthermore, detailed analysis of mutations of specific triplets, such as AGC and TAC, provided evidence that they were mutated more frequently when they were included within RGYW and WRCY, respectively. Despite being a target of the mutational machinery, neither RGYW nor WRCY was mutated in the absence of a large number of substitutions of other nucleotides in the same sequence. These results indicate that the mutational machinery targets RGYW sequences for mutations on either DNA strand and do not support the contention that the mutational machinery exhibits DNA strand polarity.
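
Locating the RGYW motif and its reverse complement WRCY can be sketched with regular expressions, using the standard IUPAC codes R = A/G, Y = C/T, W = A/T. The sketch below counts a substitution as a motif hit if it falls at any of the four motif positions; that scoring rule and the function names are assumptions made for illustration, since the abstract does not specify how substitutions were assigned to motifs.

```python
import re

# Lookahead patterns so that overlapping motif occurrences are all found.
RGYW = re.compile(r"(?=([AG]G[CT][AT]))")
WRCY = re.compile(r"(?=([AT][AG]C[CT]))")

def motif_positions(seq):
    """Set of 0-based positions covered by any RGYW or WRCY occurrence."""
    covered = set()
    for pattern in (RGYW, WRCY):
        for match in pattern.finditer(seq.upper()):
            covered.update(range(match.start(), match.start() + 4))
    return covered

def fraction_in_motifs(seq, mutated_positions):
    """Fraction of mutated positions that fall inside RGYW/WRCY motifs."""
    if not mutated_positions:
        return 0.0
    covered = motif_positions(seq)
    hits = sum(1 for pos in mutated_positions if pos in covered)
    return hits / len(mutated_positions)

# Made-up example: AGCT at positions 2-5 matches both RGYW and WRCY.
example_fraction = fraction_in_motifs("CCAGCTCC", [0, 3, 5, 7])
```

Because WRCY is the reverse complement of RGYW, a hit on WRCY in one strand corresponds to a hit on RGYW in the other, which is why comparing the two counts bears on the question of strand polarity.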

8.
Circumsporozoite gene of a Plasmodium falciparum strain from Thailand (Cited 5 times in total: 0 self-citations, 5 citations by others)
The nucleotide and deduced amino acid sequences of the CS gene of a Plasmodium falciparum strain from Thailand (T4) are presented. Comparison with the nucleotide sequences of two other P. falciparum CS genes, 7G8 from Brazil and Wellcome from West Africa, shows that: the coding regions outside the repeats of T4 and 7G8 are co-extensive and lack 30 nucleotides present in the Wellcome strain 5' to the repeats; in this region, T4 also differs at 3 nucleotide positions from the 7G8 and the Wellcome strains; in the region 3' to the repeats, T4 differs at two positions from 7G8 and at two other positions from the Wellcome strain (remarkably, all of these differences result in amino acid substitutions); the structure of the tandem repeats in the CS gene of T4 is, 5' to 3', [NANP-NVDP] x 3, [NANP] x 38, which is different from that of the two other strains. Due to the use of synonymous codons, the repetition of the sequence is more precise at the amino acid level than at the nucleotide level. These features contrast with those observed in the CS genes of other plasmodial species.

9.
Single-stranded conformation polymorphism (SSCP) by capillary electrophoresis was assessed as a screening and typing method for alleles of KIR2DL4. Exon 6 was investigated as this exon was reported to include three polymorphic nucleotides. Exon 6, intron 6 and exon 7 were amplified as a single polymerase chain reaction (PCR) product of 650 bp from genomic DNA. The PCR product was sequenced and analysed by SSCP. Exon 7 was found to be invariant. Only two nucleotides were found to be polymorphic in exon 6 and another three were found in intron 6. Strong linkage disequilibrium was found between the polymorphic nucleotides, resulting in the presence of three alleles in a panel of 20 cell lines. Two alleles differed within intron 6 while the third allele differed at two nucleotides in exon 6. All six possible genotypes were distinguishable by SSCP, provided that information from both the forward and reverse primers was used. Exon 6 of one allele was one nucleotide shorter than that of the other alleles and the resulting frame shift is predicted to produce a truncated cytoplasmic tail due to a premature stop codon four codons into exon 7. SSCP was found to be an efficient method of typing exons 6 and 7 in a panel of 46 bone marrow donors. All three alleles were found to be common and one was in strong linkage disequilibrium with the presence of another KIR sequence, KIR3DS1.

10.
11.

Background

Immunoglobulin rearrangement involves random and imprecise processes that act to both create and constrain diversity. Two such processes are the loss of nucleotides through the action of unknown exonuclease(s) and the addition of P nucleotides. The study of such processes has been compromised by difficulties in reliably aligning immunoglobulin genes and in the partitioning of nucleotides between segment ends, and between N and P nucleotides.

Results

A dataset of 294 human IgM sequences was created and partitioned with the aid of a probabilistic model. Non-random removal of nucleotides is seen between the three IGH gene types, with the IGHV gene averaging removals of 1.2 nucleotides compared to 4.7 for the other gene ends (p < 0.001). Individual IGHV, IGHD and IGHJ gene subgroups also display statistical differences in the level of nucleotide loss. For example, within the IGHJ group, IGHJ3 has average removals of 1.3 nucleotides compared to 6.4 nucleotides for IGHJ6 genes (p < 0.002). Analysis of putative P nucleotides within the IgM and pooled datasets revealed only a single putative P nucleotide motif (GTT at the 3' D-REGION end) to occur at a frequency significantly higher than would be expected from random N nucleotide addition.

Conclusions

The loss of nucleotides due to the action of exonucleases is not random, but is influenced by the nucleotide composition of the genes. P nucleotides do not make a significant contribution to diversity of immunoglobulin sequences. Although palindromic sequences are present in 10% of immunoglobulin rearrangements, most of the 'palindromic' nucleotides are likely to have been inserted into the junction during the process of N nucleotide addition. P nucleotides can only be stated with confidence to contribute to diversity of less than 1% of sequences. Any attempt to identify P nucleotides in immunoglobulins is therefore likely to introduce errors into the partitioning of such sequences.

12.
Polymorphism of intron 4 in HLA-A, -B and -C genes (Cited 3 times in total: 0 self-citations, 3 citations by others)
The sequence database of HLA class I genes focuses on the coding sequences, the exons. Limited information is available on the non-coding sequences of the different class I alleles. In this study we have determined the intron 4 nucleotide sequence of at least one representative of each major allelic group of HLA-A, -B and -C. The intron 4 sequences were determined for 27 HLA-A, 81 HLA-B and 30 HLA-C alleles by allele-specific sequencing, using primers located in adjacent exons and introns. The sequences revealed that the length of intron 4 varies with a minimum of 93 and a maximum of 124 nucleotides as a result of insertions and deletions. There were remarkable similarities and differences within HLA-A, -B and -C, as well as between them. Within HLA-A, a deletion of three nucleotides was detected in several HLA-A alleles. The HLA-B alleles could be divided into two groups, with one group having a deletion of 11 nucleotides compared with the second group. Within HLA-C, all Cw*07 alleles showed remarkable differences from the other Cw alleles. Cw*07 had an insertion of three nucleotides, shared only by the Cw*17 group. Moreover, Cw*07 was found to have an aberrant nucleotide sequence. Differences between HLA-A, -B and -C alleles were also observed. A remarkable finding was the deletion of 20 nucleotides in all HLA-A and -B alleles compared with HLA-C, whereas the HLA-A alleles showed an insertion of one nucleotide and a deletion of three nucleotides compared with HLA-B and -C. Furthermore, 32 different polymorphic positions were detected between HLA-A, -B and -C.

13.
An infectious clone of the Australian geminivirus tobacco yellow dwarf virus (TobYDV) was constructed from virus-specific double-stranded DNA isolated from infected tobacco and used to demonstrate a single-component genome. The nucleotide sequence of TobYDV DNA comprises 2580 nucleotides. TobYDV DNA has three coding regions, two in the virion sense and one in the complementary sense, homologous to those identified for other geminiviruses, particularly those infecting monocotyledonous (monocot) plants. The complementary sense coding region is comprised of two overlapping reading frames, with an intron of 86 nucleotides. Efficient splicing of the mRNA for this coding region was observed in the infected dicotyledonous (dicot) hosts bean and tobacco despite the intron having an A + U content (57%) more typical of geminiviruses of monocot plants. TobYDV encapsidates a small oligonucleotide able to prime synthesis of the complementary DNA strand in vitro. The TobYDV genome organization, low A + U intron, and encapsidated oligonucleotide primer resemble those of the monocot-infecting geminiviruses. These results strongly suggest that TobYDV is a monocot geminivirus which has become adapted to dicot hosts.

14.
Formyl peptides (FPs) released by some bacteria are powerful chemoattractants and activators of granulocytes, monocytes, and macrophages, acting through the members of a subfamily of specific seven-transmembrane G-protein–coupled formyl peptide receptors (FPRs), which are expressed only in mammals. Upon stimulation, granulocytes chemotactically move towards sites of maximal FP concentration, and release different bactericidal lytic enzymes and reactive oxygen species (ROI). In some instances, such as ischemia/reperfusion, the proinflammatory mediators released by the injured tissues and the intestinal bacteria and endotoxins, which may permeate across the damaged mucosal barrier, prime the inflowing granulocytes for an enhanced ROI production, resulting in severe damage to the host tissues. In this investigation 16 representative FPR and FPR-like mRNAs were selected to study the pattern of mutation/conservation of the individual nucleotides (nt) in the coding sequences. Mutations occur in 56.7%, 46.4%, and 87.5 % of cases in the first, second, and third nt, respectively, of the coding triplets. A probabilistic analysis demonstrated a significant nonrandom linkage between mutations in the first and second nt. Furthermore, the triplets that are variously double-mutated in the first two nt code, on average, for more hydrophobic amino acids (AA) in the transmembrane segments and more hydrophilic AA in the external and intracytoplasmic segments, thus preserving the general structure of the receptor. The authors hypothesize that when in one of the first two nt a mutation leading to a nonfunctioning protein product occurred, the mutated gene was eventually eliminated; however, a second mutation occurring in the other previously unmutated nt may have led to a protein product that is compatible with functional activity, although mutated in one (noncritical) AA. 
Such double mutations effecting a “functional repair” have thus survived and are retained among the extant sequences. Moreover, the combined mutation of all three nt in coding triplets occurs with a significantly higher than random frequency and this finding may be interpreted in a similar way.

16.
Selected segments of the nucleotide sequences of the human 18S rRNA and the human formyl peptide receptor 1 mRNA exhibit structural similarities that are unlikely to be due simply to chance. Herein we analyze the structural similarities between the human 18S rRNA gene and the vertebrate chemokine CXC receptor 4 (CXCR4) gene that encodes a class A (rhodopsin-like) seven-transmembrane G-protein coupled receptor belonging to the same superfamily of formyl peptide receptors. The method of study was based on the recording of the positions of the 7-or-more-base oligonucleotide identities encountered in the 18S and CXCR4 genes and the construction of scatter-plots (abscissa-18S; ordinate-CXCR4) displaying the identity points positions. Analysis of the distribution of distances between identity points (abscissa-ordinate in the scatter-plot) demonstrated distinct peaks of frequency around 1200. Series of identities arranged near diagonal lines at 45 degrees in the scatter-plot (quasialignments) were evaluated for their probabilistic level of random occurrence. Results of this analysis demonstrated nonrandom quasialignments between (i) a 900-nt ca. section of the human CXCR4 intron that immediately precedes almost the whole of the coding sequence and the 18S gene from nt 125 to 1025 ca.; and (ii) a 425-nt ca. section of the CXCR4 vertebrate genes, corresponding to nt 137-560 of the coding sequence, and the 18S gene from nt 1300 to 1730 ca. In both instances significant quasialignments are evidenced when CXCR4 nt sequences are shifted to the right by about 1200 nt with respect to the 18S nt sequence, as confirmed by analysis of the abscissa - ordinate differences. Taken together, these results indicate that, at least in humans, a continuous nonrandom quasialignment extends for some 1600 nt, from the second part of the (single) intron to the first part of the coding sequence. 
We hypothesize that the relatively more recent CXCR4 vertebrate gene might be evolutionarily related to the more ancient and highly conserved 18S gene.
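
The scatter-plot method described in this abstract (recording the positions of shared oligonucleotides of 7 or more bases and examining the abscissa minus ordinate differences) can be sketched as follows. This is an illustrative reconstruction under stated assumptions: exact 7-mer matches are indexed, a longer identity simply appears as a run of consecutive points on one diagonal, and the function names and toy sequences are hypothetical.

```python
from collections import Counter, defaultdict

def shared_kmer_points(seq_x, seq_y, k=7):
    """List (x, y) start positions where seq_x and seq_y share an exact k-mer."""
    # Index every k-mer of seq_y by its start positions.
    index = defaultdict(list)
    for j in range(len(seq_y) - k + 1):
        index[seq_y[j:j + k]].append(j)
    points = []
    for i in range(len(seq_x) - k + 1):
        for j in index.get(seq_x[i:i + k], ()):
            points.append((i, j))
    return points

def offset_histogram(points):
    """Count identity points per diagonal, i.e. per (abscissa - ordinate) value."""
    return Counter(x - y for x, y in points)

# Made-up sequences: seq_y is embedded in seq_x after a 3-base shift.
points = shared_kmer_points("TTTACGTTGCA", "ACGTTGCA")
offsets = offset_histogram(points)
```

A peak in the offset histogram corresponds to many identity points lying near one 45-degree diagonal, which is what the abstract calls a quasialignment. Judging whether such a peak is nonrandom requires a probabilistic model that this sketch leaves out.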

17.
18.
The complete nucleotide sequence of the cloned circumsporozoite protein gene of the Plasmodium falciparum Wellcome (West African) isolate has been determined. The sequence shows two striking differences from that of the published Brazilian strain; the total number of tandem 12 base pair repeats is 46 compared to 41, and the 5' coding region contains an additional 30 nucleotides. From Southern blot experiments, two out of four cloned Thai lines also have a similar, higher number of repeats. Heterogeneity in the CSP gene repeat region and in the length of the 5' coding region allows the strains to be classed into three groups, with the Wellcome strain being indistinguishable from the Thai line T9-94.

19.