Found 19 similar documents (search time: 109 ms)
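To address key technical problems in vertical search engine research, this paper proposes an approach to building a vertical search engine that combines ontology-based filtering with text mining. It first reviews ontologies and text mining, the technologies underlying the work, and discusses the role of each. It then describes the key techniques for building a vertical search engine: an intelligent crawler based on ontology filtering, web page analysis and information extraction combined with text mining, and the construction of the indexer and the query processor. Finally, the approach is validated by building a prototype vertical search engine for university graduate recruitment.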

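Semantic search brings Semantic Web technology into search engines to improve the results of current search engines, and it has attracted wide attention in recent years. This article introduces the foundations of the field, including the state of research and commonly used methods, and then classifies and analyzes semantic search in depth. Semantic search falls into two main categories: enhanced semantic search built on traditional search, and knowledge-based semantic search built on ontology reasoning. The article also identifies open problems in semantic search research and offers a summary and outlook for future work.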

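This paper studies ontologies, ontology matching, NBC text classification, and OWL-S. OWL-S organizes grid resources as services and represents and describes them with service ontologies, which can not only describe the semantics of a service but also support appropriate reasoning. To address the heterogeneity of OWL-S service ontologies, the paper uses the element values and text content of OWL and OWL-S to analyze semantic matching between ontologies along several dimensions, including ontology structure, function, and textual information. It gives the corresponding semantic equivalence matching rules and an NBC-based text-classification algorithm for semantic similarity matching, providing a basis for implementing service ontology sharing, interaction, and integration in the semantic grid.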

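Because ontologies can eliminate conceptual confusion and enable knowledge reuse, their quality is very important for applications of Semantic Web technology. To improve ontology quality, much work has focused on conceptual modeling, while ontology representation, an equally important aspect, has been neglected. Ontologies are currently represented using terms, but the same term may have many different meanings, which leads to unclear or incorrect interpretation in ontology-based applications. To solve this problem, the paper represents the ontology with senses defined in WordNet rather than with terms, because a sense has exactly one meaning. Ontology clarification is defined as the process of automatically disambiguating a target term using the ontology elements surrounding it and the words near it in the documents it annotates. The semantic similarity between each candidate sense of the target term and its neighboring words is computed, and the sense with the highest semantic relatedness is selected as the correct sense. Experiments show that the algorithm performs well. Compared with the best disambiguation algorithms, concept precision is almost twice noun precision, and property precision is almost three times verb precision. The experiments also show that the algorithm is very effective in a semi-automatic ontology cleaning process.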
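The sense-selection step described in this abstract (score each candidate sense against the neighboring words and keep the highest-scoring one) can be sketched roughly as follows. This is only an illustration, not the authors' algorithm: the sense inventory `SENSES`, the sense identifiers, and the gloss-overlap `similarity` function are all hypothetical stand-ins for WordNet senses and for whatever semantic similarity measure the paper actually uses.

```python
from typing import Dict, List, Set

# Hypothetical toy sense inventory: term -> {sense id -> gloss words}.
# A real system would draw candidate senses from WordNet instead.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "bank": {
        "bank#finance": {"money", "deposit", "loan", "institution"},
        "bank#river": {"river", "water", "slope", "land"},
    }
}


def similarity(gloss: Set[str], neighbor: str) -> float:
    """Assumed stand-in measure: 1.0 if the neighbor occurs in the gloss."""
    return 1.0 if neighbor in gloss else 0.0


def disambiguate(target: str, neighbors: List[str],
                 senses: Dict[str, Dict[str, Set[str]]] = SENSES) -> str:
    """Return the candidate sense with the highest total similarity
    to the neighboring words (ontology elements or document context)."""
    candidates = senses[target]
    return max(
        candidates,
        key=lambda sense: sum(similarity(candidates[sense], n) for n in neighbors),
    )


if __name__ == "__main__":
    print(disambiguate("bank", ["loan", "money", "account"]))
    print(disambiguate("bank", ["river", "fishing"]))
```

Swapping `similarity` for a graph-based or corpus-based measure leaves the selection logic unchanged, which is the main reason for keeping the two functions separate in this sketch.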

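This paper proposes a search engine model based on natural language understanding. Its core techniques include semantic analysis of user queries at three levels (keywords, question form, and question focus) together with feature vector extraction; a feature base for Web page content built on this idea; and an algorithm for ranking the returned documents. A search engine was built on the Lucene full-text indexing toolkit, and queries were run against the feature words already in the base, yielding a precision of 86.7%. The experiments show that the model largely achieves understanding of query phrases and significantly improves search engine precision.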

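Based on the characteristics of model knowledge, a model description ontology and a task-solving ontology are defined to provide semantic support for models. Model semantics are divided into descriptive semantics and behavioral semantics. Using similarity judgments on descriptive semantics, models can predict potential conflicts. Then, following the definition of the task-solving ontology, models interact through behavioral semantics to negotiate their behavior. During execution, a model must request resources for each operation, so models negotiate resource requests according to the model description ontology and descriptive semantics, which yields a mutually conflict-free sequence of operations and eliminates conflicts. Finally, experimental analysis verifies the effectiveness of the algorithm.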

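Ontologies provide a formal way to describe the concepts, terms, and relations of a specific domain. Although ontologies are powerful for knowledge representation, they have one shortcoming: they cannot express uncertain or imprecise information, which is critical in Semantic Web and multimedia applications. To address the representation of fuzzy information in ontologies, this paper extends the ontology language OWL DL with fuzzy logic, gives its formal syntax and semantics, and uses an example to illustrate the flexibility of the approach in expressive power.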

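Existing grid portals do not adequately meet users' personalized needs: they lack semantic descriptions of user requirements and of resources, and they cope poorly with dynamic changes in user needs. Grid portals can use Semantic Web technology to enhance information sharing and interaction among community users. Building on a domain management model, this paper proposes the concept of a semantic community, constructs a semantic grid portal with a user semantic model on top of domain servers, and improves task scheduling by replacing job descriptions with a task ontology.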

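Organizing knowledge resources by knowledge points facilitates the acquisition, sharing, distribution, and access of knowledge. However, the traditional tree structure is weak at describing the overall relationships among pieces of knowledge, which hinders searching for and locating knowledge resources in a distributed environment. The Semantic Web is a technology that can describe complex relationships between ontologies in detail and is inherently distributed. An ordinary Semantic Web, however, is not itself organized by knowledge points. This paper extends the Semantic Web so that it is suited to describing knowledge resources organized by knowledge points. An application case shows that the Semantic Web extended in this way can effectively describe the relationships among knowledge resources, makes knowledge easier to understand and use, and makes searching for and locating knowledge more convenient.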

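Query expansion adds similar or related terms to the initial query in order to reduce the mismatch in expression between the query and the relevant documents, and so improve retrieval performance. This paper exploits the semantic expressiveness of semantic units and the relationships between them to add query terms or phrases that are closely related semantically to the initial query, representing the user's intent more fully. The algorithm's time complexity is O(L): it depends only on the length L of the search request and not on the size of the semantic unit representation base, which makes it practical for search engines with strict real-time requirements.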
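One way such an O(L) bound could arise is sketched below, under stated assumptions: the semantic unit base is held in a hash map with expected constant-time lookup, and each query term contributes at most a fixed number of related terms. The abstract does not describe the paper's actual data structure, so `SEMANTIC_UNITS`, `expand_query`, and `max_related` are hypothetical names chosen for illustration.

```python
from typing import Dict, List

# Hypothetical semantic unit base: term -> closely related terms.
# Only the O(1) expected lookup matters for the complexity argument;
# the entries themselves are invented for this example.
SEMANTIC_UNITS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "price": ["cost"],
}


def expand_query(terms: List[str],
                 units: Dict[str, List[str]] = SEMANTIC_UNITS,
                 max_related: int = 2) -> List[str]:
    """Append up to max_related related terms after each query term.

    One hash lookup per query term and a bounded number of additions
    per term give O(L) work for a query of L terms, independent of
    how many entries the base holds.
    """
    expanded: List[str] = []
    seen = set()
    for term in terms:
        for candidate in [term] + units.get(term, [])[:max_related]:
            if candidate not in seen:  # skip duplicates, keep first-seen order
                seen.add(candidate)
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query(["car", "price", "red"]))
```

Terms with no entry in the base (such as "red" here) pass through unchanged, so the expanded query always contains the original one.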

Greenberg GN. Toxicology 2002, 173(1-2): 145-152
The Internet's global reach offers powerful new tools to professionals in Occupational and Environmental Health (OEH). The World Wide Web includes extensive free and commercially available reference materials on toxicology, regulatory issues, environmental epidemiology and prevention programs. Much of this especially useful content is inaccessible to general Web-based search engines. Effective use of the Web requires discovery of, and familiarity with, sites housing query engines for technical databases. Although the Web's structure and capacity are so dynamic that any listing is incomplete, introductions to many resources are provided in this article. The Internet also offers professionals electronic access to one another, for collegial discourse. Electronic mailing lists provide assembly points for collaboration and guidance about technical issues. Several specialty forums for OEH professionals are also discussed.

Carole Goble is Professor in the Department of Computer Science in the University of Manchester, from where she graduated. Her research interests are centred on the accessibility of information, primarily through the use of ontologies for the representation and classification of metadata. She works in many application areas, and in particular Life Sciences. The Information Management Group that she co-leads is renowned for its work on ontology languages (OIL, DAML+OIL, OWL), reasoning systems (FaCT) and their practical application to real problems. Her work on the application of ontologies to biology and bioinformatics has been particularly influential. She currently has a leading role in two major international initiatives: the Semantic Web and the Grid. She has combined these into the Semantic Grid, co-chairing the Semantic Grid Research Group in the Global Grid Forum standards organisation and directing a major UK BioGrid research pilot, myGrid. She chaired the first Semantic Web track of the World Wide Web Conference in 2002. She serves on many boards and programme committees including the OntoWeb Thematic Network executive management board, the international Semantic Web Science Association and the EU/NSF joint ad hoc committee Semantic Web Services Initiative. She is an Editor-in-Chief of the new Elsevier journal Journal of Web Semantics and is co-founder of a start-up company, Network Inference, specialising in technologies for the Semantic Web.

Greenberg G. Toxicology 2002, 178(3): 263
The Internet's global reach offers powerful new tools to professionals in Occupational and Environmental Health (OEH). The World Wide Web includes extensive free and commercially available reference materials on toxicology, regulatory issues, environmental epidemiology and prevention programs. Much of this especially useful content is inaccessible to general Web-based search engines. Effective use of the Web requires discovery of, and familiarity with, sites housing query engines for technical databases. Although the Web's structure and capacity are so dynamic that any listing is incomplete, introductions to many resources are provided in this article. The Internet also offers professionals electronic access to one another, for collegial discourse. Electronic mailing lists provide assembly points for collaboration and guidance about technical issues. Several specialty forums for OEH professionals are also discussed.

OBJECTIVE: (1) To give an overview of research tools, techniques, and resources that are available on the Internet; and (2) to identify valid, pharmacy-related information that will reduce uncertainty in the problem-solving activities of practitioners. DATA SOURCES: The World Wide Web. STUDY SELECTION: Examples cited in the article were evaluated according to the criteria offered in the text as a prerequisite for their inclusion. DATA SYNTHESIS: Functional aspects of the Internet include communication, commerce, and content. Because a lack of control has led to mixed information quality, the use of Internet-based information for patient care and professional decision making should be subject to rigorous screening criteria. Pharmacists can use Web browsers combined with excellent search engines and search techniques to identify quality resources, including primary, secondary, and tertiary literature, whether fee-based or free, which can be sought actively or delivered on a schedule directly to the pharmacist's desktop. CONCLUSION: The Internet can be an immensely helpful research tool, if used appropriately. Whether the pharmacist is actively searching or passively receiving useful updates, the Internet can function as a value-added asset to any pharmacy practice.

PURPOSE: The quality and reliability of Internet-based arthritis information were studied. METHODS: The search terms "arthritis," "osteoarthritis," and "rheumatoid arthritis" were entered into the AOL, MSN, Yahoo, Google, and Lycos search engines. The Web sites for the first 40 matches generated by each search engine were grouped by URL suffix and evaluated on the basis of four categories of criteria: disease and medication information content, Web-site navigability, required literacy level, and currentness of information. Ratings were assigned by using an assessment tool derived from published literature (maximum score of 15 points). RESULTS: Of the 600 arthritis Web sites identified, only 69 were unique and included in the analysis. Fifty-seven percent were .com sites, 20% .org sites, 7% .gov sites, 6% .edu sites, and 10% other sites. Total scores for individual sites reviewed ranged from 3 to 14. Eighty percent of .gov sites, 75% of .edu sites, 29% of other sites, 36% of .com sites, and 21% of .org sites were within the top tertile of scores. No Web site met the criterion for being understandable to people with no more than a sixth-grade reading ability. The .gov sites scored significantly higher overall than .com sites, .org sites, and other sites. The .edu sites also scored relatively well. CONCLUSION: The quality of arthritis information on the Internet varied widely. Sites with URLs having suffixes of .gov and .edu were ranked higher than other types of sites.

One of the biggest challenges in pharmaceutical development is finding drug candidates with a desired activity or efficacy balanced with low toxicity or side-effects. Despite the enormous effort and cost required to get drugs to market, numerous drugs have been abandoned due to unanticipated, untoward effects. Clearly, technologies are needed that can identify safe and effective pharmaceutical candidates early in the drug pipeline. Our laboratory has developed a computer modeling technology that can be used to screen small-molecule ligands for certain desired activities as well as certain toxicities. The technology utilizes modeling of the intercalation of molecules into DNA to create efficacy and toxicity search engines and is grounded by two fundamental observations. First, intercalation has been shown to be integral in the action of drugs that act in concert with nuclear enzymes, e.g., topoisomerases. Second, evidence is mounting that intercalation facilitated by nuclear receptors bound to natural ligands is a critical part of their genomic mode of action. To date, two classes of search engines have been created, i.e., those that can be used to identify: (1) efficacious molecules, e.g., antibiotics, estrogens, androgens, glucocorticoids, thyroid drugs, antidepressants, antihistamines, and sedatives, and (2) toxic molecules, e.g., certain carcinogens and genotoxins. Here, we describe the creation of two prototype search engines (the estrogen efficacy and arene oxide genotoxicity search engines) and illustrative results of searches of three-dimensional databases. Of particular interest is the specificity of the search engines and their capacity to identify widely different, and in some cases obscure, structures having the same activities. Taken as a whole, future drug discovery research is likely to focus on methods to assess DNA intercalation as a salient feature of selecting safe and effective drug candidates.

INTRODUCTION: Over 39,000 diabetic patients are surgically treated for trauma and orthopaedic injuries annually in the UK, yet the effects of diabetic medications on the skeletal system are an under-researched and under-acknowledged field. AREAS COVERED: This review covers all English-language reports of novel experimental data investigating the effects of the main classes of diabetic drugs on the skeletal system, specifically their effects on fracture healing, located through the literature search engines Medline and Web of Science. EXPERT OPINION: Post-surgical glycaemic control is paramount in insulin-controlled type 1 diabetic patients. Data on pharmacological control compounds used in type 2 diabetes are limited. Reports to date indicate that thiazolidinediones exert anti-osteogenic effects, in contrast to the observed osteogenic effects of biguanides. Ongoing research is desirable to guide future clinical recommendations.

We apply robust classification algorithms to high-dimensional genomic data to find biomarkers, by analyzing variable importance, that enable a better diagnosis of disease, an earlier intervention, or a more effective assignment of therapies. The goal is to use variable importance ranking to isolate a set of important genes that can be used to classify life-threatening diseases with respect to prognosis or type, in order to maximize efficacy or minimize toxicity in personalized treatment of such diseases. We propose a ranking method and present several other methods to select a set of important genes to use as genomic biomarkers, and we evaluate the performance of the selection procedures in patient classification by cross-validation. The various selection algorithms are applied to published high-dimensional genomic data sets using several well-known classification methods. For each data set, we report the set of genes, selected on the basis of variable importance, that performed best in classification. The classification algorithm with the proposed ranking method is shown to be competitive with other selection methods for discovering genomic biomarkers underlying both adverse and efficacious outcomes, for improving individualized treatment of patients with life-threatening diseases.

