Similar articles
20 similar articles found
1.
The rat is an important animal model for human diseases and is widely used in physiology. In this article we present a new strategy for gene discovery based on the production of ESTs from serially subtracted and normalized cDNA libraries, and we describe its application for the development of a comprehensive nonredundant collection of rat ESTs. Our new strategy appears to yield substantially more EST clusters per EST sequenced than do previous approaches that did not use serial subtraction. However, multiple rounds of library subtraction resulted in high frequencies of otherwise rare internally primed cDNAs, defining the limits of this powerful approach. To date, we have generated >200,000 3' ESTs from >100 cDNA libraries representing a wide range of tissues and developmental stages of the laboratory rat. Most importantly, we have contributed to approximately 50,000 rat UniGene clusters. We have identified, arrayed, and derived 5' ESTs from >30,000 unique rat cDNA clones. Complete information, including radiation hybrid mapping data, is also maintained locally at http://genome.uiowa.edu/clcg.html. All of the sequences described in this article have been submitted to the dbEST division of the NCBI.

2.
We developed computer-based methods for constructing a nonredundant mouse full-length cDNA library. Our cDNA library construction process comprises assessment of library quality, sequencing the 3' ends of inserts and clustering, and completing a re-array to generate a nonredundant library from a redundant one. After the cDNA libraries are generated, we sequence the 5' ends of the inserts to check the quality of the library; then we determine the sequencing priority of each library. Selected libraries undergo large-scale sequencing of the 3' ends of the inserts and clustering of the tag sequences. After clustering, the nonredundant library is constructed from the original libraries, which have redundant clones. All libraries, plates, clones, sequences, and clusters are uniquely identified, and all information is saved in the database according to this identifier. At press time, our system has been in place for the past two years; we have clustered 939,725 3' end sequences into 127,385 groups from 227 cDNA libraries/sublibraries (see http://genome.gse.riken.go.jp/).
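
The re-array step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how a representative clone is chosen for each cluster, so the per-clone quality score, the function names, and the clone and cluster identifiers are all hypothetical.

```python
def pick_representatives(clone_to_cluster, clone_quality):
    """Keep one clone per cluster: the one with the highest quality score.

    Both arguments are hypothetical inputs; the abstract does not state
    how representatives were actually selected.
    """
    best = {}
    for clone, cluster in clone_to_cluster.items():
        score = clone_quality.get(clone, 0.0)
        if cluster not in best or score > best[cluster][1]:
            best[cluster] = (clone, score)
    return {cluster: clone for cluster, (clone, _) in best.items()}


def redundancy(n_sequences, n_clusters):
    """Average number of sequences per cluster."""
    return n_sequences / n_clusters


# Made-up identifiers, for illustration only.
clusters = {"cloneA": "c1", "cloneB": "c1", "cloneC": "c2"}
quality = {"cloneA": 0.7, "cloneB": 0.9, "cloneC": 0.5}
representatives = pick_representatives(clusters, quality)

# Figures quoted in the abstract: 939,725 3' end sequences in 127,385 groups.
average_per_group = redundancy(939_725, 127_385)
```

With the figures quoted in the abstract, the average works out to a little over seven 3' end sequences per group, which gives a sense of how much redundancy the re-array removes.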

3.
A collection of 90,000 human cDNA clones generated to increase the fraction of "full-length" cDNAs available was analyzed by sequence alignment on the human genome assembly. Five hundred fifty-two gene models not found in LocusLink, with coding regions of at least 300 bp, were defined by using this collection. Exon composition proposed for novel genes showed an average of 4.7 exons per gene. In 20% of the cases, at least half of the exons predicted for new genes coincided with evolutionarily conserved regions defined by sequence comparisons with the pufferfish Tetraodon nigroviridis. Among this subset, CpG islands were observed at the 5' end of 75%. In-frame stop codons upstream of the initiator ATG were present in 49% of the new genes, and 16% contained a coding region comprising at least 50% of the cDNA sequence. This cDNA resource also provided candidate small protein-coding genes, usually not included in genome annotations. In addition, analysis of a sample from this cDNA collection indicates that approximately 380 gene models described in LocusLink could be extended at their 5' end by at least one new exon. Finally, this cDNA resource provided experimental support for annotations based exclusively on predictions, thus representing a resource substantially improving the human genome annotation.

4.
5.
6.
7.
A large-scale effort, termed the Secreted Protein Discovery Initiative (SPDI), was undertaken to identify novel secreted and transmembrane proteins. In the first of several approaches, a biological signal sequence trap in yeast cells was utilized to identify cDNA clones encoding putative secreted proteins. A second strategy utilized various algorithms that recognize features such as the hydrophobic properties of signal sequences to identify putative proteins encoded by expressed sequence tags (ESTs) from human cDNA libraries. A third approach surveyed ESTs for protein sequence similarity to a set of known receptors and their ligands with the BLAST algorithm. Finally, both signal-sequence prediction algorithms and BLAST were used to identify single exons of potential genes from within human genomic sequence. The isolation of full-length cDNA clones for each of these candidate genes resulted in the identification of >1000 novel proteins. A total of 256 of these cDNAs are still novel, including variants and novel genes, per the most recent GenBank release version. The success of this large-scale effort was assessed by a bioinformatics analysis of the proteins through predictions of protein domains, subcellular localizations, and possible functional roles. The SPDI collection should facilitate efforts to better understand intercellular communication, may lead to new understandings of human diseases, and provides potential opportunities for the development of therapeutics.
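
The abstract does not name the signal-sequence algorithms that were used. As a rough illustration of what "recognizing the hydrophobic properties of signal sequences" can mean, the sketch below scans the N-terminal region of a protein with a sliding window and the Kyte-Doolittle hydropathy scale. The window length, the size of the N-terminal region, and the function name are assumptions made for this example, not parameters from the SPDI work.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydropathy(protein, window=8, n_term=30):
    """Highest mean hydropathy of any window within the first n_term residues.

    window and n_term are illustrative defaults, not published settings.
    """
    region = protein.upper()[:n_term]
    if len(region) < window:
        raise ValueError("sequence is shorter than the window")
    scores = [KYTE_DOOLITTLE[aa] for aa in region]
    return max(
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    )


# Made-up sequences: one with a leucine-rich N-terminal stretch, one charged.
hydrophobic_start = max_window_hydropathy("MK" + "L" * 10 + "DEKR" * 5)
charged_start = max_window_hydropathy("MDEKR" * 5)
```

A strongly positive score near the N-terminus is only a crude hint of a signal peptide; real predictors also model the charged n-region and the cleavage site.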

8.
9.
10.
11.
Lo J, Lee S, Xu M, Liu F, Ruan H, Eun A, He Y, Ma W, Wang W, Wen Z, Peng J. Genome Research 2003, 13(3): 455-466
A total of 15,590 unique zebrafish EST clusters from two cDNA libraries have been identified. Most significantly, only 22% (3,437) of the 15,590 unique clusters matched 2,805 (of 15,200) clusters in the Danio rerio UniGene database, indicating that our EST set is complementary to the existing ESTs in the public database and will be invaluable in assisting the annotation of genes based on the upcoming zebrafish genome sequence. BLAST searches showed that 7,824 of our unique clusters matched 6,710 known or predicted proteins in the nonredundant database. A cDNA microarray representing approximately 3,100 unique zebrafish cDNA clusters has been generated and used to profile the gene expression patterns across six different embryonic stages (cleavage, blastula, gastrula, segmentation, pharyngula, and hatching). Analysis of expression data using K-means clustering revealed that genes coding for muscle-specific proteins displayed similar expression patterns, confirming that coordinate gene expression is important for myogenesis. Our results demonstrate that the combination of microarray technology with the zebrafish model system can provide useful information on how genes are coordinated in a genetic network to control zebrafish embryogenesis and can help to identify novel genes that are important for organogenesis.
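
As a minimal sketch of K-means clustering applied to expression profiles measured at six stages, the following uses plain Lloyd's algorithm in pure Python. The profile values are invented for illustration, and the choice of initial centroids (the first k profiles) and the iteration count are assumptions; the abstract does not describe the software or settings the authors used.

```python
import math


def kmeans(profiles, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k profiles."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Invented expression values for four genes across the six stages
# (cleavage, blastula, gastrula, segmentation, pharyngula, hatching).
toy_profiles = [
    [0, 0, 1, 5, 8, 9],  # rises late
    [9, 8, 5, 1, 0, 0],  # falls early
    [0, 1, 1, 6, 8, 9],  # rises late
    [8, 8, 4, 1, 0, 1],  # falls early
]
labels, centroids = kmeans(toy_profiles, k=2)
```

Genes that share a label have similar temporal profiles, which is the sense in which the muscle-specific genes in the abstract "displayed similar expression patterns".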

12.
13.
14.
We constructed a cDNA library of Japanese flounder, Paralichthys olivaceus, leukocytes that were infected with Hirame rhabdovirus (HRV) in order to analyze some of the genes that are induced and expressed by virus infection in the immune system. Four hundred and fifty-two partial sequences representing 300 cDNA clones were obtained from the 5' and/or 3' ends of inserts derived from the Japanese flounder leukocyte cDNA library. About three-quarters of the 300 cDNA clones (217 clones, 72.3%) represented known genes in the public databases, whereas the remaining 83 (27.7%) of the clones did not show any significant homology with the sequences in the public databases. Clones matching known genes were classified into 12 categories according to their function or distribution. Only 40 (18.4%) of the 217 known genes showed homology with fish genes deposited in the database. Thirty (10%) of the clones, encoding 21 different sequences, and representing several categories, were identified as putative biodefense genes or genes associated with the immune response. Nineteen of the 21 putative biodefense or immune response-related cDNAs have not been previously reported in fish genes or cDNAs.

15.
16.
17.
We used an expressed sequence tag approach to initiate a study of the genome of the horn fly, Haematobia irritans (L.) (Diptera: Muscidae). Two normalized cDNA libraries were synthesized from RNA isolated from embryos and first instars from a field population of horn flies. Approximately 10,000 clones were sequenced from both the 5' and 3' directions. Sequence data from each library were assembled into a database of tentative consensus sequences (TCs) and singletons and used to search public protein databases and annotate the sequences. Additionally, the sequences from both the egg and larval libraries were combined into a single database consisting of 16,702 expressed sequence tags (ESTs) assembling into 2,886 TCs and 1,522 singleton entries. Several sequences were identified that may have roles in the horn fly's resistance to insecticides. The availability of this database will facilitate the design of microarray and other experiments to study horn fly gene expression on a larger scale than previously possible. This would include studies designed to investigate metabolic-based insecticide resistance, identify novel antigens for vaccine-based control approaches, and discover new proteins to serve as targets for new pesticide development.

18.
Immunoblot, immunofluorescence, and complement-mediated cytolytic assays revealed that two new monoclonal antibodies raised against a membrane-enriched fraction of Toxoplasma gondii tachyzoites recognize protein P22 on the surface of the parasite. Using these monoclonal antibodies to screen a cDNA expression library in lambda gt11, several clones expressing recombinant fusion proteins were isolated. Subsequent screening of the library with a synthetic oligonucleotide derived from the 5' end of one of these cDNAs permitted the isolation of additional nonexpressing clones containing the entire translated sequence. Blots of parasite RNA and DNA suggested that the corresponding gene occurs as a single copy in the tachyzoite genome. The amino acid sequence deduced from the composite cDNA indicates a primary translation product with a theoretical molecular weight of 18,959. As expected for surface protein P22, the putative polypeptide contains a predicted N-terminal signal sequence and a C-terminal hydrophobic region characteristic of proteins attached to the membrane by a glycophospholipid anchor. Recombinant fusion proteins produced by the expressing clones were recognized on immunoblots by IgG antibodies in the sera of humans with acute and chronic T. gondii infection. Antibodies selected by the fusion protein reacted predominantly with a 22-kDa antigen on immunoblots of parasite lysate.

19.
20.
Cryptosporidium parvum is a protozoan enteropathogen that infects humans and animals and causes a pronounced diarrheal disease that can be life-threatening in immunocompromised hosts. No specific chemo- or immunotherapies exist to treat cryptosporidiosis and little molecular information is available to guide development of such therapies. To accelerate gene discovery and identify genes encoding potential drug and vaccine targets we constructed sporozoite cDNA and genomic DNA sequencing libraries from the Iowa isolate of C. parvum and determined approximately 2000 sequence tags by single-pass sequencing of random clones. Together, the 567 expressed sequence tags (ESTs) and 1507 genome survey sequences (GSSs) totaled one megabase (1 Mb) of unique genomic sequence, indicating that approximately 10% of the 10.4 Mb C. parvum genome has been sequence tagged in this gene discovery expedition. The tags were used to search the public nucleic acid and protein databases via BLAST analyses, and 180 ESTs (32%) and 277 GSSs (18%) exhibited similarity with database sequences at smallest sum probabilities P(N) ≤ 10^-8. Some tags encoded proteins with clear therapeutic potential including S-adenosylhomocysteine hydrolase, histone deacetylase, polyketide/fatty-acid synthases, various cyclophilins, thrombospondin-related cysteine-rich protein and ATP-binding-cassette transporters. Several anonymous ESTs encoded proteins predicted to contain signal peptides or multiple transmembrane spanning segments, suggesting they were destined for membrane-bound compartments, the cell surface or extracellular secretion. One hundred four simple sequence repeats were identified within the nonredundant sequence tag collection, with (TAA)≥6/(TTA)≥6 and (TA)≥10/(AT)≥10 being the most prevalent, occurring 40 and 15 times, respectively. Various cellular RNAs and their genes were also identified including the small and large ribosomal RNAs, five tRNAs, the U2 small nuclear RNA, and the small and large virus-like, double-stranded RNAs. This investigation has demonstrated that survey sequencing is an efficient procedure for gene discovery and genome characterization and has identified and sequence tagged many C. parvum genes encoding potential therapeutic targets.
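
Simple sequence repeats such as a TAA motif repeated at least six times in tandem can be located with a regular expression. The sketch below is one straightforward way to do this and is not the method used in the study, which the abstract does not describe; the function name and the test sequence are made up.

```python
import re


def find_ssrs(sequence, motif, min_repeats):
    """Return (start, repeat_count) for each tandem run of motif.

    Only runs with at least min_repeats copies are reported.
    """
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(motif), min_repeats))
    return [
        (m.start(), len(m.group()) // len(motif))
        for m in pattern.finditer(sequence.upper())
    ]


# Made-up sequence: seven TAA repeats, then a short TA run of four repeats.
toy_sequence = "GG" + "TAA" * 7 + "CC" + "TA" * 4 + "G"
taa_runs = find_ssrs(toy_sequence, "TAA", 6)
ta_runs = find_ssrs(toy_sequence, "TA", 10)
```

Here `taa_runs` holds one hit starting at position 2 with seven repeats, while `ta_runs` is empty because the TA run is shorter than the threshold of ten.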
