Similar literature
 20 similar articles found (search time: 31 ms)
1.
Kim JH  Waterman MS  Li LM 《Genome research》2007,17(7):1101-1110
One of the main goals in genome sequencing projects is to determine a haploid consensus sequence even when clone libraries are constructed from homologous chromosomes. However, it has been noticed that haplotypes can be inferred from genome assemblies by investigating phase conservation in sequenced reads. In this study, we seek to infer haplotypes, a diploid consensus sequence, from the genome assembly of an organism, Ciona intestinalis. The Ciona intestinalis genome is an ideal resource from which haplotypes can be inferred because of the high polymorphism rate (1.2%). The haplotype estimation scheme consists of polymorphism detection and phase estimation. The core step of our method is a Gibbs sampling procedure. The mate-pair information from two-end sequenced clone inserts is exploited to provide long-range continuity. We estimate the polymorphism rate of Ciona intestinalis to be 1.2% and 1.5%, according to two different polymorphism counting schemes. The distribution of heterozygosity number is well fit by a compound Poisson distribution. The N50 length of haplotype segments is 37.9 kb in our assembly, while the N50 scaffold length of the Ciona intestinalis assembly is 190 kb. We also infer diploid gene sequences from haplotype segments. According to our reconstruction, 85.4% of predicted gene sequences are continuously covered by single haplotype segments. Our results indicate 97% accuracy in haplotype estimation, based on a simulated data set. We conduct a comparative analysis with Ciona savignyi, and discover interesting patterns of conserved DNA elements in chordates.  相似文献   
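
The N50 figures quoted in this abstract (37.9 kb for haplotype segments, 190 kb for scaffolds) follow the usual definition: the length L such that segments of length L or longer account for at least half of the total length. A minimal sketch of that calculation is shown here; the function name `n50` and the example lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of segment lengths.

    Segments are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the segment that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical segment lengths in kb (illustration only).
segments_kb = [80, 70, 50, 40, 30, 20, 10]
print(n50(segments_kb))  # 70
```

Because N50 is weighted by length, a few long segments dominate it, which is why it is reported separately for haplotype segments and for scaffolds.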

2.
We report the discovery of a new GABAA receptor alpha5 subunit gene polymorphism close to the polymorphism described by Glatt et al. (GT)5GCGTGC(GT)21. This new polymorphism is of great importance, because it means that non-denaturing acrylamide gels used to separate the different alleles of the polymorphism described by Glatt et al. cannot distinguish an allele with the sequence: (GT)4GCGTGC(GT)n from another allele with the sequence: (GT)4(GCGT)4GC(GT)(n-6). These gel fragments are separated by size, which would be the same in these two cases. An alternative would be to use an analysis method that can detect base changes, for instance, single strand conformation polymorphism (SSCP) or denaturing gradient gel electrophoresis (DGGE).  相似文献   
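
The claim that size-based separation cannot tell the two alleles apart can be checked by writing both repeat structures out. The sketch below builds the two sequences given in the abstract for the same n and compares their lengths; the helper names are hypothetical, and n = 21 is used only as an example value.

```python
def allele_simple(n):
    """(GT)4 GCGTGC (GT)n, the structure with a single interruption."""
    return "GT" * 4 + "GCGTGC" + "GT" * n


def allele_expanded(n):
    """(GT)4 (GCGT)4 GC (GT)(n-6), the newly described structure."""
    if n < 6:
        raise ValueError("n must be at least 6 for this allele")
    return "GT" * 4 + "GCGT" * 4 + "GC" + "GT" * (n - 6)


a = allele_simple(21)
b = allele_expanded(21)
# Same length, different sequence: both are 14 + 2n bases long.
print(len(a), len(b), a == b)  # 56 56 False
```

Both alleles are 14 + 2n bases long, so a gel that resolves fragments by size alone shows a single band, whereas sequence-sensitive methods such as SSCP or DGGE can in principle separate them.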

3.
Studies of microsatellite evolution based on marker data almost inherently suffer from an ascertainment bias because there is selection for the most mutable and polymorphic loci during marker development. To circumvent this bias we took advantage of whole-genome shotgun sequence data from three unrelated chicken individuals that, when aligned to the genome reference sequence, give sequence information on two chromosomes from about one-fourth (375,000) of all microsatellite loci containing di- through pentanucleotide repeat motifs in the chicken genome. Polymorphism is seen at loci with as few as five repeat units, and the proportion of dimorphic loci then increases to 50% for sequences with approximately 10 repeat units, to reach a maximum of 75%-80% for sequences with 15 or more repeat units. For any given repeat length, polymorphism increases with decreasing GC content of repeat motifs for dinucleotides, nonhairpin-forming trinucleotides, and tetranucleotides. For trinucleotide repeats that are likely to form hairpin structures, polymorphism increases with increasing GC content, indicating that the relative stability of hairpins affects the rate of replication slippage. For any given repeat length, polymorphism is significantly lower for imperfect than for perfect repeats, and repeat interruptions occur in >15% of loci. However, interruptions are not randomly distributed within repeat arrays but are preferentially located toward the ends. There is a negative correlation between microsatellite abundance and single nucleotide polymorphism (SNP) density, providing large-scale genomic support for the hypothesis that equilibrium microsatellite distributions are governed by a balance between the rate of replication slippage and the rate of point mutation.

4.
To construct an infrastructure for genome-wide association studies of common diseases or drug sensitivities, we have been systematically exploring common variants by resequencing genomic regions containing genes in DNA from 24 Japanese individuals. We have analyzed a total of 154 Mb, corresponding to approximately 5% of the human genome, and so far have identified 174,269 single-nucleotide polymorphisms and 16,293 insertion/deletion polymorphisms within gene regions, i.e., one polymorphism in 807 bp on average. Our data are freely available via our web site (http://snp.ims.u-tokyo.ac.jp) and will facilitate studies to identify genes associated with susceptibility to common diseases and genes involved in sensitivity to therapeutic drugs.  相似文献   
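
The average spacing quoted in this abstract can be reproduced from its own numbers by dividing the sequence analyzed by the total number of polymorphisms. The sketch below does that arithmetic; the function name is hypothetical, and the rounded total of 154 Mb is the only length the abstract gives, so the result differs slightly from the reported 807 bp, presumably because the authors used the unrounded total.

```python
def mean_spacing_bp(total_bp, n_polymorphisms):
    """Average number of base pairs per polymorphism."""
    if n_polymorphisms <= 0:
        raise ValueError("need at least one polymorphism")
    return total_bp / n_polymorphisms


total_bp = 154_000_000   # 154 Mb, as rounded in the abstract
snps = 174_269           # single-nucleotide polymorphisms
indels = 16_293          # insertion/deletion polymorphisms

spacing = mean_spacing_bp(total_bp, snps + indels)
print(f"{spacing:.1f}")  # 808.1
```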

5.
Dense coverage of the rice genome with polymorphic DNA markers is an invaluable tool for DNA marker-assisted breeding, positional cloning, and a wide range of evolutionary studies. We have aligned drafts of two rice subspecies, indica and japonica, and analyzed levels and patterns of genetic diversity. After filtering multiple copy and low quality sequence, 408,898 candidate DNA polymorphisms (SNPs/INDELs) were discerned between the two subspecies. These filters have the consequence that our data set includes only a subset of the available SNPs (in particular excluding large numbers of SNPs that may occur between repetitive DNA alleles) but increase the likelihood that this subset is useful: Direct sequencing suggests that 79.8% +/- 7.5% of the in silico SNPs are real. The SNP sample in our database is not randomly distributed across the genome. In fact, 566 rice genomic regions had unusually high (328 contigs/48.6 Mb/13.6% of genome) or low (237 contigs/64.7 Mb/18.1% of genome) polymorphism rates. Many SNP-poor regions were substantially longer than most SNP-rich regions, covering up to 4 Mb, and possibly reflecting introgression between the respective gene pools that may have occurred hundreds of years ago. Although 46.2% +/- 8.3% of the SNPs differentiate other pairs of japonica and indica genotypes, SNP rates in rice were not predictive of evolutionary rates for corresponding genes in another grass species, sorghum. The data set is freely available at http://www.plantgenome.uga.edu/snp.  相似文献   

6.
The EnsMart system (www.ensembl.org/EnsMart) provides a generic data warehousing solution for fast and flexible querying of large biological data sets and integration with third-party data and tools. The system consists of a query-optimized database and interactive, user-friendly interfaces. EnsMart has been applied to Ensembl, where it extends its genomic browser capabilities, facilitating rapid retrieval of customized data sets. A wide variety of complex queries, on various types of annotations, for numerous species are supported. These can be applied to many research problems, ranging from SNP selection for candidate gene screening, through cross-species evolutionary comparisons, to microarray annotation. Users can group and refine biological data according to many criteria, including cross-species analyses, disease links, sequence variations, and expression patterns. Both tabulated list data and biological sequence output can be generated dynamically, in HTML, text, Microsoft Excel, and compressed formats. A wide range of sequence types, such as cDNA, peptides, coding regions, UTRs, and exons, with additional upstream and downstream regions, can be retrieved. The EnsMart database can be accessed via a public Web site, or through a Java application suite. Both implementations and the database are freely available for local installation, and can be extended or adapted to 'non-Ensembl' data sets.  相似文献   

7.
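Explanation of title terms: Tissue-engineered bone: artificial bone produced by seeding functionally relevant seed cells, cultured in vitro, into natural or synthetic scaffold materials, adding growth factors, culturing the construct in vitro for a period of time, and then transplanting it into the body to promote tissue repair and bone regeneration. The three elements of tissue-engineered bone formation are scaffold material, osteoblasts, and growth factors. Bioceramics: surface-bioactive ceramics usually contain hydroxyl groups and can be made porous, so that biological tissue can grow into them and bond firmly to their surface; bioresorbable ceramics can be partially or completely resorbed and can induce the growth of new bone in vivo. Bioactive ceramics are osteoconductive: they act as a scaffold on whose surface bone formation takes place, and they can also serve as a shell for various substances or as a filler for bone defects. Bioceramics include hydroxyapatite ceramics and tricalcium phosphate ceramics.  Background: Many kinds of scaffold materials are currently used to repair bone defects, but a single type of material can hardly meet the requirements for bone tissue engineering scaffolds. Combining several single materials into a composite by a suitable method, taking the advantages and disadvantages of each material into account, has been a research focus in recent years. Objective: To construct a nano-hydroxyapatite/chitosan/polycaprolactone ternary composite scaffold and to characterize it. Methods: Porous nano-hydroxyapatite/chitosan/polycaprolactone ternary composite scaffolds were prepared by 3D printing and characterized by X-ray diffraction, water absorption, compressive strength, in vitro degradation, pore size analysis, and scanning electron microscopy. Results and conclusion: (1) X-ray diffraction showed that the crystal peak pattern of the porous ternary composite scaffold was similar to the standard powder diffraction card for hydroxyapatite, indicating that the components of the scaffold are combined through physical interaction, which does not affect the biological function of hydroxyapatite. (2) The water absorption of the ternary composite scaffold was 18.28%, indicating good hydrophilicity; the maximum load the scaffold could bear was 1 415 N; and its in vitro degradation rate was comparable to the rate of bone formation. (3) Under the microscope, the internal pores of the scaffold were square, with a pore size of 250 µm, uniform in size and regularly distributed. (4) Under the scanning electron microscope, the fibers composed of chitosan and polycaprolactone were arranged in an orderly grid, hydroxyapatite particles were evenly distributed on the fiber surface, and the ternary composite showed a uniform, loose microporous structure. (5) These results show that a nano-hydroxyapatite/chitosan/polycaprolactone ternary composite scaffold can be prepared successfully by 3D printing; it has moderate compressive strength, a certain porosity, and a suitable degradation rate and water absorption, laying a foundation for the repair of bone defects. ORCID: 0000-0002-6321-9160 (Yu Hedong, 余和东) Publication focus of the Chinese Journal of Tissue Engineering Research: biomaterials; bone biomaterials; oral biomaterials; nanomaterials; sustained-release materials; material compatibility; tissue engineering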

8.
In studies from this laboratory, we localized the regions on the H chain of botulinum neurotoxin A (BoNT/A) that are recognized by anti-BoNT/A antibodies (Abs) and block the activity of the toxin in vivo. These Abs were obtained from cervical dystonia patients who had been treated with BoNT/A and had become unresponsive to the treatment, as well as blocking Abs raised in mouse, horse, and chicken. We also localized the regions involved in BoNT/A binding to mouse brain synaptosomes (snp). Comparison of spatial proximities in the three-dimensional structure of the Ab-binding regions and the snp-binding regions showed that, except for one, the Ab-binding regions either coincide or overlap with the snp regions. It should be fully expected that protective Abs, when bound to the toxin at sites that coincide or overlap with snp binding, would prevent the toxin from binding to the nerve synapse and therefore block toxin entry into the neuron. Thus, analysis of the locations of the Ab-binding and the snp-binding regions provides a molecular rationale for the ability of protecting Abs to block BoNT/A action in vivo.

9.
Dindel: accurate indel calls from short-read data   Total citations: 1 (self-citations: 0, citations by others: 1)
Small insertions and deletions (indels) are a common and functionally important type of sequence polymorphism. Most of the focus of studies of sequence variation is on single nucleotide variants (SNVs) and large structural variants. In principle, high-throughput sequencing studies should allow identification of indels just as SNVs. However, inference of indels from next-generation sequence data is challenging, and so far methods for identifying indels lag behind methods for calling SNVs in terms of sensitivity and specificity. We propose a Bayesian method to call indels from short-read sequence data in individuals and populations by realigning reads to candidate haplotypes that represent alternative sequence to the reference. The candidate haplotypes are formed by combining candidate indels and SNVs identified by the read mapper, while allowing for known sequence variants or candidates from other methods to be included. In our probabilistic realignment model we account for base-calling errors, mapping errors, and also, importantly, for increased sequencing error indel rates in long homopolymer runs. We show that our method is sensitive and achieves low false discovery rates on simulated and real data sets, although challenges remain. The algorithm is implemented in the program Dindel, which has been used in the 1000 Genomes Project call sets.  相似文献   

10.
Physical map-assisted whole-genome shotgun sequence assemblies   Total citations: 2 (self-citations: 0, citations by others: 2)
We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the layout phase of WGS assemblies. This process is facilitated by FASSI, a stand-alone application that calculates BAC end and BAC overlap length constraints from clone fingerprint map contigs created by the FPC package. FASSI is designed to work with the assembly tool PCAP, but its output can be formatted to work with other WGS assembly algorithms able to use length constraints for individual clones. The FASSI method is simple to implement, potentially cost-effective, and has resulted in the increase of scaffold contiguity for both the Drosophila melanogaster and Cryptococcus gattii genomes when compared to a control assembly without map-derived constraints. A 6.5-fold coverage draft DNA sequence of the Pan troglodytes (chimpanzee) genome was assembled using map-derived constraints and resulted in a 26.1% increase in scaffold contiguity.  相似文献   

11.
Whole-genome assembly is now used routinely to obtain high-quality draft sequence for the genomes of species with low levels of polymorphism. However, genome assembly remains extremely challenging for highly polymorphic species. The difficulty arises because two divergent haplotypes are sequenced together, making it difficult to distinguish alleles at the same locus from paralogs at different loci. We present here a method for assembling highly polymorphic diploid genomes that involves assembling the two haplotypes separately and then merging them to obtain a reference sequence. Our method was developed to assemble the genome of the sea squirt Ciona savignyi, which was sequenced to a depth of 12.7 x from a single wild individual. By comparing finished clones of the two haplotypes we determined that the sequenced individual had an extremely high heterozygosity rate, averaging 4.6% with significant regional variation and rearrangements at all physical scales. Applied to these data, our method produced a reference assembly covering 157 Mb, with N50 contig and scaffold sizes of 47 kb and 989 kb, respectively. Alignment of ESTs indicates that 88% of loci are present at least once and 81% exactly once in the reference assembly. Our method represented loci in a single copy more reliably and achieved greater contiguity than a conventional whole-genome assembly method.  相似文献   

12.
The genomic region encompassing complement factor H (CFH) is thought to be important in determining susceptibility to inflammatory diseases such as age-related macular degeneration, but only limited polymorphism has been described. After applying the genomic matching technique to three-generation families and an ethnically diverse reference panel we have demonstrated that the polymorphism resembles that found in the major histocompatibility complex. The different ancestral haplotypes carry either T or C at T1277C but also other more polymorphic alleles over a region of 2 Mb. Thus the association between age-related macular degeneration and T1277 or Y402 actually reflects multiple linked polymorphisms including an indel that cannot be dissected from any direct effect of Y402 and may be more important. We show for the first time that simple algorithms can identify genomic sequence elements that appear to be more useful haplospecific markers than single nucleotide polymorphism or microsatellites.  相似文献   

13.
Major histocompatibility complex class I polymorphism in Asiatic lions   Total citations: 1 (self-citations: 0, citations by others: 1)
Asiatic lions (Panthera leo persica), whose only natural habitat in the world is the Gir forest sanctuary of Gujarat State in India, are highly endangered and are considered to be highly inbred with narrow genetic diversity. An objective assessment of genetic diversity in their immune loci will help in assessing their survivability and may provide vital clues in designing strategies for their scientific management and conservation. We analyzed the comparative sequence polymorphism at exon 2 and exon 3 of major histocompatibility complex (MHC) class I in three groups of lions, i.e. wild Asiatic (from Gir forest), captive-bred Asiatic (from zoological parks in India), and Afro-Asiatic hybrid groups (from zoological parks in India) through polymerase chain reaction-assisted sequence-based typing. The two exons were amplified, cloned, sequenced, and analyzed for polymorphism at nucleotide and putative translated product level. The analysis revealed extensive sequence polymorphism not only between clones derived from different lions but also between clones derived from a single lion. Furthermore, the wild Asiatic lions of Gir forest exhibited abundant sequence polymorphism at MHC class I comparable with that of Afro-Asiatic hybrid lions and significantly higher than that of captive-bred Asiatic lions. We hypothesize that Asiatic lions of Gir forest are not highly inbred as thought earlier and they possess abundant sequence polymorphism at MHC class I loci. During this study, 52 new sequences of the multigene MHC class I family were also identified among Asiatic lions.

14.
Summary: With the advent of modern genomic sequencing technology the ability to obtain new sequence data and to acquire allelic polymorphism data from a broad range of samples has become routine. In this regard, our investigations have started with the most polymorphic of genetic regions fundamental to the immune response in the major histocompatibility complex (MHC). Starting with the completed human MHC genomic sequence, we have developed a resource of methods and information that provide ready access to a large portion of human and nonhuman primate MHCs. This resource consists of a set of primer pairs or amplicons that can be used to isolate about 15% of the 4.0 Mb MHC. Essentially similar studies are now being carried out on a set of immune response loci to broaden the usefulness of the data and tools developed. A panel of 100 genes involved in the immune response have been targeted for single nucleotide polymorphism (SNP) discovery efforts that will analyze 120 Mb of sequence data for the presence of immune‐related SNPs. The SNP data provided from the MHC and from the immune response panel has been adapted for use in studies of evolution, MHC disease associations, and clinical transplantation.  相似文献   

15.
A DNA mutation detection protocol able to identify and characterize a previously unknown change in a given sequence in a rapid, efficient, sensitive, and inexpensive manner is required to take advantage of the resources now available to researchers through the genome sequencing projects. We have developed a method based on base-specific cleavage of polymerase chain reaction (PCR) products and then separation of the fragments by matrix-assisted laser desorption ionization-mass spectrometry (MALDI-MS), which can meet these criteria. Differences are seen as the presence, absence, or mass change of peaks corresponding to fragments affected by the base difference. This technique is shown through the detection of a polymorphism in the 3' untranslated region of IL12p40 from a double-stranded PCR product, and the detection of a single nucleotide polymorphism between two mouse strains. The sensitivity of the technique can be increased with the use of postsource decay, which enables differentiation of two fragments of identical mass but different sequence. The level of specificity and the rapid sample analysis time lend this technique to the mass screening of individuals for sequence changes and, in combination with MS sequencing methods, could be used to facilitate rapid resequencing of DNA.  相似文献   

16.
Mechanical properties of three-dimensional (3D) scaffolds can be appropriately modulated through novel fabrication techniques like 3D fiber deposition (3DF), by varying the scaffold's pore size and shape. Dynamic stiffness, in particular, can be considered as an important property to optimize the scaffold structure for its ultimate in vivo application to regenerate a natural tissue. Experimental data from dynamic mechanical analysis (DMA) reveal a dependence of the dynamic stiffness of the scaffold on the intrinsic mechanical and physicochemical properties of the material used, and on the overall porosity and architecture of the construct. The aim of this study was to assess the relationship between the aforementioned parameters, through a mathematical model, which was derived from the experimental mechanical data. As an example of how mechanical properties can be tailored to match the natural tissue to be replaced, the dynamic stiffness of articular bovine cartilage and porcine knee meniscus cartilage was measured and related to the dynamic stiffness of the modeled 3DF scaffolds. The dynamic stiffness of 3DF scaffolds from poly(ethylene oxide terephthalate)-poly(butylene terephthalate) (PEOT/PBT) copolymers was measured with DMA. With increasing porosity, the dynamic stiffness was found to decrease in an exponential manner. The influence of the scaffold architecture (or pore shape) and of the molecular network properties of the copolymers was expressed as a scaffold characteristic coefficient alpha, which modulates the porosity effect. This model was validated through an FEA numerical simulation performed on the structures that were experimentally tested. The relative deviation between the experimental and the finite element model was less than 15% for all of the constructs with a dynamic stiffness higher than 1 MPa. Therefore, we conclude that the mathematical model introduced can be used to predict the dynamic stiffness of a porous PEOT/PBT scaffold, and to choose the biomechanically optimal structure for tissue engineering applications.

17.
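Abstract: This review introduces some basic concepts of natural selection. During human evolution, natural selection leaves molecular signatures in the genome. We focus on how these signatures can be used to identify natural selection, especially positive selection, because genomic regions that have undergone positive selection must have important functions.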

18.
We have demonstrated previously that noncoding sequences of genes are a robust source of polymorphisms between mouse species when tested using single-strand conformation polymorphism (SSCP) analysis, and that these polymorphisms are useful for genetic mapping. In this report we demonstrate that presumptive 3′-untranslated region sequence obtained from expressed sequence tags (ESTs) can be analyzed in a similar fashion, and we have used this approach to map 262 loci using an interspecific backcross. These results demonstrate SSCP analysis of genes or ESTs is a simple and efficient means for the genetic localization of transcribed sequences, and is furthermore an approach that is applicable to any system for which there is sufficient sequence polymorphism.  相似文献   

19.
An MCMC algorithm for haplotype assembly from whole-genome sequence data   Total citations: 1 (self-citations: 0, citations by others: 1)
In comparison to genotypes, knowledge about haplotypes (the combination of alleles present on a single chromosome) is much more useful for whole-genome association studies and for making inferences about human evolutionary history. Haplotypes are typically inferred from population genotype data using computational methods. Whole-genome sequence data represent a promising resource for constructing haplotypes spanning hundreds of kilobases for an individual. In this article, we propose a Markov chain Monte Carlo (MCMC) algorithm, HASH (haplotype assembly for single human), for assembling haplotypes from sequenced DNA fragments that have been mapped to a reference genome assembly. The transitions of the Markov chain are generated using min-cut computations on graphs derived from the sequenced fragments. We have applied our method to infer haplotypes using whole-genome shotgun sequence data from a recently sequenced human individual. The high sequence coverage and presence of mate pairs result in fairly long haplotypes (N50 length ~ 350 kb). Based on comparison of the sequenced fragments against the individual haplotypes, we demonstrate that the haplotypes for this individual inferred using HASH are significantly more accurate than the haplotypes estimated using a previously proposed greedy heuristic and a simple MCMC method. Using haplotypes from the HapMap project, we estimate the switch error rate of the haplotypes inferred using HASH to be quite low, ~1.1%. Our Markov chain Monte Carlo algorithm represents a general framework for haplotype assembly that can be applied to sequence data generated by other sequencing technologies. The code implementing the methods and the phased individual haplotypes can be downloaded from (http://www.cse.ucsd.edu/users/vibansal/HASH/).  相似文献   
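
The switch error rate mentioned in this abstract is commonly defined as the fraction of adjacent heterozygous-site pairs at which the inferred phase flips relative to a reference haplotype. The sketch below follows that common definition; it is an assumption that the paper computes the rate in the same way, and the function name and the 0/1 haplotype strings are hypothetical.

```python
def switch_error_rate(truth, inferred):
    """Fraction of adjacent site pairs where the inferred phase flips.

    Both arguments are equal-length strings of '0'/'1' alleles at
    heterozygous sites, describing one haplotype of the pair. A fully
    complemented haplotype is the same phasing and counts as zero switches.
    """
    if len(truth) != len(inferred):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        raise ValueError("need at least two sites")
    # agree[i] is True when the inferred allele matches the truth at site i.
    agree = [t == p for t, p in zip(truth, inferred)]
    # A switch occurs wherever agreement changes between neighbouring sites.
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(truth) - 1)


# Hypothetical example: the phase flips once, after the fourth site.
rate = switch_error_rate("0101100110", "0101011001")
print(f"{rate:.3f}")  # 0.111
```

Counting flips rather than mismatched sites means that a single phasing mistake in the middle of a long haplotype is penalized once, not once for every site downstream of it.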

20.
Several cDNA clones comprising the entire coding sequence of the rainbow trout (Oncorhynchus mykiss) major histocompatibility complex (Mhc) class II B gene have been isolated from different sources. A single B gene appears to be transcribed in the rainbow trout and it encodes a 247 amino acid long polypeptide, which is of similar size to mammalian, avian, amphibian, and other teleost β chains. The amino acid sequence identity to mammalian, amphibian, and avian class II β chains is only about 30%. Despite the low similarity, a striking pattern of conservation is observed, both in the putative peptide-binding domain and in the Ig-like domain. Most of the conserved residues are located in the Ig-like domain and in the transmembrane segment. The majority of polymorphic residues reside in the β1 domain, with the greatest variability found in the amino-terminal half of the domain. The sequence data are compatible with a rather limited polymorphism of a single, expressed Mhc class II B gene.
