The absence of longitudinal data limits the accuracy of high-throughput clinical phenotyping for identifying type 2 diabetes mellitus subjects期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

按检索

The absence of longitudinal data limits the accuracy of high-throughput clinical phenotyping for identifying type 2 diabetes mellitus subjects

Authors:	Wei-Qi Wei Cynthia L Leibson Jeanine E Ransom Abel N Kho Christopher G Chute

Institution:	1. Institute for Health Informatics, University of Minnesota, Twin Cities, MN, United States;2. Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, United States;3. Division of Epidemiology, Mayo Clinic, Rochester, MN, United States;4. Divisions of General Internal Medicine and Biomedical Informatics, Northwestern University Feinberg School of Medicine, Chicago, IL, United States

Abstract:	PurposeTo evaluate the impact of insufficient longitudinal data on the accuracy of a high-throughput clinical phenotyping (HTCP) algorithm for identifying (1) patients with type 2 diabetes mellitus (T2DM) and (2) patients with no diabetes.MethodsRetrospective study conducted at Mayo Clinic in Rochester, Minnesota. Eligible subjects were Olmsted County residents with ≥1 Mayo Clinic encounter in each of three time periods: (1) 2007, (2) from 1997 through 2006, and (3) before 1997 (N = 54,283). Diabetes relevant electronic medical record (EMR) data about diagnoses, laboratories, and medications were used. We employed the HTCP algorithm to categorize individuals as T2DM cases and non-diabetes controls. Considering the full 11 years (1997–2007) as the gold standard, we compared gold-standard categorizations with those using data for 10 subsequent intervals, ranging from 1998–2007 (10-year data) to 2007 (1-year data). Positive predictive values (PPVs) and false-negative rates (FNRs) were calculated. McNemar tests were used to determine whether categorizations using shorter time periods differed from the gold standard. Statistical significance was defined as P < 0.05.ResultsWe identified 2770 T2DM cases and 21,005 controls when the algorithm was applied using 11-year data. Using 2007 data alone, PPVs and FNRs, respectively, were 70% and 25% for case identification and 59% and 67% for control identification. All time frames differed significantly from the gold standard, except for the 10-year period.ConclusionsThe accuracy of the algorithm reduced remarkably as data were limited to shorter observation periods. This impact should be considered carefully when designing/executing HTCP algorithms.

Keywords:
本文献已被 ScienceDirect 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏