Found 20 similar articles (search time: 15 ms)
1.
Scheetz TE Laffin JJ Berger B Holte S Baumes SA Brown R Chang S Coco J Conklin J Crouch K Donohue M Doonan G Estes C Eyestone M Fishler K Gardiner J Guo L Johnson B Keppel C Kreger R Lebeck M Marcelino R Miljkovich V Perdue M Qui L Rehmann J Reiter RS Rhoads B Schaefer K Smith C Sunjevaric I Trout K Wu N Birkett CL Bischof J Gackle B Gavin A Grundstad AJ Mokrzycki B Moressi C O'Leary B Pedretti K Roberts C Robinson NL Smith M Tack D Trivedi N Kucaba T Freeman T Lin JJ Bonaldo MF Casavant TL 《Genome research》2004,14(4):733-741
The rat is an important animal model for human diseases and is widely used in physiology. In this article we present a new strategy for gene discovery based on the production of ESTs from serially subtracted and normalized cDNA libraries, and we describe its application for the development of a comprehensive nonredundant collection of rat ESTs. Our new strategy appears to yield substantially more EST clusters per ESTs sequenced than do previous approaches that did not use serial subtraction. However, multiple rounds of library subtraction resulted in high frequencies of otherwise rare internally primed cDNAs, defining the limits of this powerful approach. To date, we have generated >200,000 3' ESTs from >100 cDNA libraries representing a wide range of tissues and developmental stages of the laboratory rat. Most importantly, we have contributed to approximately 50,000 rat UniGene clusters. We have identified, arrayed, and derived 5' ESTs from >30,000 unique rat cDNA clones. Complete information, including radiation hybrid mapping data, is also maintained locally at http://genome.uiowa.edu/clcg.html. All of the sequences described in this article have been submitted to the dbEST division of the NCBI.
2.
Konno H Fukunishi Y Shibata K Itoh M Carninci P Sugahara Y Hayashizaki Y 《Genome research》2001,11(2):281-289
We developed computer-based methods for constructing a nonredundant mouse full-length cDNA library. Our cDNA library construction process comprises assessment of library quality, sequencing the 3' ends of inserts and clustering, and completing a re-array to generate a nonredundant library from a redundant one. After the cDNA libraries are generated, we sequence the 5' ends of the inserts to check the quality of the library; then we determine the sequencing priority of each library. Selected libraries undergo large-scale sequencing of the 3' ends of the inserts and clustering of the tag sequences. After clustering, the nonredundant library is constructed from the original libraries, which have redundant clones. All libraries, plates, clones, sequences, and clusters are uniquely identified, and all information is saved in the database according to this identifier. At press time, our system has been in place for the past two years; we have clustered 939,725 3' end sequences into 127,385 groups from 227 cDNA libraries/sublibraries (see http://genome.gse.riken.go.jp/).
3.
Porcel BM Delfour O Castelli V De Berardinis V Friedlander L Cruaud C Ureta-Vidal A Scarpelli C Wincker P Schächter V Saurin W Gyapay G Salanoubat M Weissenbach J 《Genome research》2004,14(3):463-471
A collection of 90,000 human cDNA clones generated to increase the fraction of "full-length" cDNAs available was analyzed by sequence alignment on the human genome assembly. Five hundred fifty-two gene models not found in LocusLink, with coding regions of at least 300 bp, were defined by using this collection. Exon composition proposed for novel genes showed an average of 4.7 exons per gene. In 20% of the cases, at least half of the exons predicted for new genes coincided with evolutionarily conserved regions defined by sequence comparisons with the pufferfish Tetraodon nigroviridis. Among this subset, CpG islands were observed at the 5' end of 75%. In-frame stop codons upstream of the initiator ATG were present in 49% of the new genes, and 16% contained a coding region comprising at least 50% of the cDNA sequence. This cDNA resource also provided candidate small protein-coding genes, usually not included in genome annotations. In addition, analysis of a sample from this cDNA collection indicates that approximately 380 gene models described in LocusLink could be extended at their 5' end by at least one new exon. Finally, this cDNA resource provided experimental support for annotations based exclusively on predictions, thus representing a resource substantially improving the human genome annotation.
4.
5.
6.
7.
The secreted protein discovery initiative (SPDI), a large-scale effort to identify novel human secreted and transmembrane proteins: a bioinformatics assessment (total citations: 9; self-citations: 0; citations by others: 9)
Clark HF Gurney AL Abaya E Baker K Baldwin D Brush J Chen J Chow B Chui C Crowley C Currell B Deuel B Dowd P Eaton D Foster J Grimaldi C Gu Q Hass PE Heldens S Huang A Kim HS Klimowski L Jin Y Johnson S Lee J Lewis L Liao D Mark M Robbie E Sanchez C Schoenfeld J Seshagiri S Simmons L Singh J Smith V Stinson J Vagts A Vandlen R Watanabe C Wieand D Woods K Xie MH Yansura D Yi S Yu G Yuan J Zhang M Zhang Z Goddard A Wood WI Godowski P Gray A 《Genome research》2003,13(10):2265-2270
A large-scale effort, termed the Secreted Protein Discovery Initiative (SPDI), was undertaken to identify novel secreted and transmembrane proteins. In the first of several approaches, a biological signal sequence trap in yeast cells was utilized to identify cDNA clones encoding putative secreted proteins. A second strategy utilized various algorithms that recognize features such as the hydrophobic properties of signal sequences to identify putative proteins encoded by expressed sequence tags (ESTs) from human cDNA libraries. A third approach surveyed ESTs for protein sequence similarity to a set of known receptors and their ligands with the BLAST algorithm. Finally, both signal-sequence prediction algorithms and BLAST were used to identify single exons of potential genes from within human genomic sequence. The isolation of full-length cDNA clones for each of these candidate genes resulted in the identification of >1000 novel proteins. A total of 256 of these cDNAs are still novel, including variants and novel genes, per the most recent GenBank release version. The success of this large-scale effort was assessed by a bioinformatics analysis of the proteins through predictions of protein domains, subcellular localizations, and possible functional roles. The SPDI collection should facilitate efforts to better understand intercellular communication, may lead to new understandings of human diseases, and provides potential opportunities for the development of therapeutics.
8.
9.
Normalization and subtraction of cap-trapper-selected cDNAs to prepare full-length cDNA libraries for rapid discovery of new genes (total citations: 21; self-citations: 3; citations by others: 21)
Carninci P Shibata Y Hayatsu N Sugahara Y Shibata K Itoh M Konno H Okazaki Y Muramatsu M Hayashizaki Y 《Genome research》2000,10(10):1617-1630
10.
Development and application of a salmonid EST database and cDNA microarray: data mining and interspecific hybridization characteristics (total citations: 10; self-citations: 0; citations by others: 10)
Rise ML von Schalburg KR Brown GD Mawer MA Devlin RH Kuipers N Busby M Beetz-Sargent M Alberto R Gibbs AR Hunt P Shukin R Zeznik JA Nelson C Jones SR Smailus DE Jones SJ Schein JE Marra MA Butterfield YS Stott JM Ng SH Davidson WS Koop BF 《Genome research》2004,14(3):478-490
11.
15000 unique zebrafish EST clusters and their future use in microarray for profiling gene expression patterns during embryogenesis
Lo J Lee S Xu M Liu F Ruan H Eun A He Y Ma W Wang W Wen Z Peng J 《Genome research》2003,13(3):455-466
A total of 15590 unique zebrafish EST clusters from two cDNA libraries have been identified. Most significantly, only 22% (3437) of the 15590 unique clusters matched 2805 (of 15200) clusters in the Danio rerio UniGene database, indicating that our EST set is complementary to the existing ESTs in the public database and will be invaluable in assisting the annotation of genes based on the upcoming zebrafish genome sequence. A BLAST search showed that 7824 of our unique clusters matched 6710 known or predicted proteins in the nonredundant database. A cDNA microarray representing approximately 3100 unique zebrafish cDNA clusters has been generated and used to profile the gene expression patterns across six different embryonic stages (cleavage, blastula, gastrula, segmentation, pharyngula, and hatching). Analysis of expression data using K-means clustering revealed that genes coding for muscle-specific proteins displayed similar expression patterns, confirming that coordinate gene expression is important for myogenesis. Our results demonstrate that the combination of microarray technology with the zebrafish model system can provide useful information on how genes are coordinated in a genetic network to control zebrafish embryogenesis and can help to identify novel genes that are important for organogenesis.
12.
Fernández C Gregory WF Loke P Maizels RM 《Molecular and biochemical parasitology》2002,122(2):171-180
13.
14.
We constructed a cDNA library of Japanese flounder, Paralichthys olivaceus, leukocytes that were infected with Hirame rhabdovirus (HRV) in order to analyze some of the genes that are induced and expressed by virus infection in the immune system. Four hundred and fifty-two partial sequences representing 300 cDNA clones were obtained from the 5' and/or 3' ends of inserts derived from the Japanese flounder leukocyte cDNA library. About three-quarters of the 300 cDNA clones (217 clones, 72.3%) represented known genes in the public databases, whereas the remaining 83 (27.7%) of the clones did not show any significant homology with the sequences in the public databases. Clones matching known genes were classified into 12 categories according to their function or distribution. Only 40 (18.4%) of the 217 known genes showed homology with fish genes deposited in the database. Thirty (10%) of the clones, encoding 21 different sequences, and representing several categories, were identified as putative biodefense genes or genes associated with the immune response. Nineteen of the 21 putative biodefense or immune response-related cDNAs have not been previously reported in fish genes or cDNAs.
15.
Analysis and functional annotation of an expressed sequence tag collection for tropical crop sugarcane (total citations: 8; self-citations: 0; citations by others: 8)
Vettore AL da Silva FR Kemper EL Souza GM da Silva AM Ferro MI Henrique-Silva F Giglioti EA Lemos MV Coutinho LL Nobrega MP Carrer H França SC Bacci Júnior M Goldman MH Gomes SL Nunes LR Camargo LE Siqueira WJ Van Sluys MA Thiemann OH Kuramae EE Santelli RV Marino CL Targon ML Ferro JA Silveira HC Marini DC Lemos EG Monteiro-Vitorello CB Tambor JH Carraro DM Roberto PG Martins VG Goldman GH de Oliveira RC Truffi D Colombo CA Rossi M de Araujo PG Sculaccio SA Angella A Lima MM de Rosa Júnior VE 《Genome research》2003,13(12):2725-2735
16.
Analysis of 2166 clones from a human colorectal cancer cDNA library by partial sequencing (total citations: 1; self-citations: 0; citations by others: 1)
Frigerio Jean-Marc; Berthezene Patrice; Garrido Patricia; Barthellemy Emilia Ortiz Sandrine; Vasseur Sophie; Sastre Bernard; Seleznieff Igor; Dagorn Jean-Charles; Iovanna Juan Lucio 《Human molecular genetics》1995,4(1):37-43
17.
We used an expressed sequence tag approach to initiate a study of the genome of the horn fly, Haematobia irritans (L.) (Diptera: Muscidae). Two normalized cDNA libraries were synthesized from RNA isolated from embryos and first instars from a field population of horn flies. Approximately 10,000 clones were sequenced from both the 5' and 3' directions. Sequence data from each library were assembled into a database of tentative consensus sequences (TCs) and singletons and used to search public protein databases and annotate the sequences. Additionally, the sequences from both the egg and larval libraries were combined into a single database consisting of 16,702 expressed sequence tags (ESTs) assembling into 2,886 TCs and 1,522 singleton entries. Several sequences were identified that may have roles in the horn fly's resistance to insecticides. The availability of this database will facilitate the design of microarray and other experiments to study horn fly gene expression on a larger scale than previously possible. This would include studies designed to investigate metabolic-based insecticide resistance, identify novel antigens for vaccine-based control approaches, and discover new proteins to serve as targets for new pesticide development.
18.
Cloning, expression, and cDNA sequence of surface antigen P22 from Toxoplasma gondii 总被引:15,自引:0,他引:15
J B Prince K L Auer J Huskinson S F Parmley F G Araujo J S Remington 《Molecular and biochemical parasitology》1990,43(1):97-106
Immunoblot, immunofluorescence, and complement-mediated cytolytic assays revealed that two new monoclonal antibodies raised against a membrane-enriched fraction of Toxoplasma gondii tachyzoites recognize protein P22 on the surface of the parasite. Using these monoclonal antibodies to screen a cDNA expression library in lambda gt11, several clones expressing recombinant fusion proteins were isolated. Subsequent screening of the library with a synthetic oligonucleotide derived from the 5' end of one of these cDNAs permitted the isolation of additional nonexpressing clones containing the entire translated sequence. Blots of parasite RNA and DNA suggested that the corresponding gene occurs as a single copy in the tachyzoite genome. The amino acid sequence deduced from the composite cDNA indicates a primary translation product with a theoretical molecular weight of 18,959. As expected for surface protein P22, the putative polypeptide contains a predicted N-terminal signal sequence and a C-terminal hydrophobic region characteristic of proteins attached to the membrane by a glycophospholipid anchor. Recombinant fusion proteins produced by the expressing clones were recognized on immunoblots by IgG antibodies in the sera of humans with acute and chronic T. gondii infection. Antibodies selected by the fusion protein reacted predominantly with a 22-kDa antigen on immunoblots of parasite lysate.
19.
20.
Cryptosporidium parvum is a protozoan enteropathogen that infects humans and animals and causes a pronounced diarrheal disease that can be life-threatening in immunocompromised hosts. No specific chemo- or immunotherapies exist to treat cryptosporidiosis and little molecular information is available to guide development of such therapies. To accelerate gene discovery and identify genes encoding potential drug and vaccine targets we constructed sporozoite cDNA and genomic DNA sequencing libraries from the Iowa isolate of C. parvum and determined approximately 2000 sequence tags by single-pass sequencing of random clones. Together, the 567 expressed sequence tags (ESTs) and 1507 genome survey sequences (GSSs) totaled one megabase (1 Mb) of unique genomic sequence indicating that approximately 10% of the 10.4 Mb C. parvum genome has been sequence tagged in this gene discovery expedition. The tags were used to search the public nucleic acid and protein databases via BLAST analyses, and 180 ESTs (32%) and 277 GSSs (18%) exhibited similarity with database sequences at smallest sum probabilities P(N) ≤ 10^-8. Some tags encoded proteins with clear therapeutic potential including S-adenosylhomocysteine hydrolase, histone deacetylase, polyketide/fatty-acid synthases, various cyclophilins, thrombospondin-related cysteine-rich protein and ATP-binding-cassette transporters. Several anonymous ESTs encoded proteins predicted to contain signal peptides or multiple transmembrane spanning segments suggesting they were destined for membrane-bound compartments, the cell surface or extracellular secretion. One hundred four simple sequence repeats were identified within the nonredundant sequence tag collection with (TAA)≥6/(TTA)≥6 and (TA)≥10/(AT)≥10 being the most prevalent, occurring 40 and 15 times, respectively.
Various cellular RNAs and their genes were also identified including the small and large ribosomal RNAs, five tRNAs, the U2 small nuclear RNA, and the small and large virus-like, double-stranded RNAs. This investigation has demonstrated that survey sequencing is an efficient procedure for gene discovery and genome characterization and has identified and sequence tagged many C. parvum genes encoding potential therapeutic targets.