Similar Literature (20 results)
1.
The potential for research involving biospecimens can be hindered by the prohibitive cost of performing laboratory assays on individual samples. To mitigate this cost, strategies such as randomly selecting a portion of specimens for analysis or randomly pooling specimens prior to performing laboratory assays may be employed. These techniques, while effective in reducing cost, are often accompanied by a considerable loss of statistical efficiency. We propose a novel pooling strategy based on the k-means clustering algorithm to reduce laboratory costs while maintaining a high level of statistical efficiency when predictor variables are measured on all subjects, but the outcome of interest is assessed in pools. We perform simulations motivated by the BioCycle study to compare this k-means pooling strategy with current pooling and selection techniques under simple and multiple linear regression models. While all of the methods considered produce unbiased estimates and confidence intervals with appropriate coverage, pooling under k-means clustering provides the most precise estimates, closely approximating results from the full data and losing minimal precision as the total number of pools decreases. The benefits of k-means clustering evident in the simulation study are then applied to an analysis of the BioCycle dataset. In conclusion, when the number of lab tests is limited by budget, pooling specimens based on k-means clustering prior to performing lab assays can be an effective way to save money with minimal information loss in a regression setting. Copyright © 2014 John Wiley & Sons, Ltd.
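As a rough illustration of the idea (not the authors' code), the sketch below clusters subjects on their predictors with k-means, forms pool-level averages of both predictors and outcome, and fits ordinary least squares on the pools; all variable names and parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, n_pools = 500, 50
X = rng.normal(size=(n, 2))                      # predictors known for everyone
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=1.0, size=n)

# Pool subjects with similar predictor profiles.
labels = KMeans(n_clusters=n_pools, n_init=10, random_state=0).fit_predict(X)

# The assay is run once per pool, so only pooled outcomes are observed.
X_pool = np.vstack([X[labels == k].mean(axis=0) for k in range(n_pools)])
y_pool = np.array([y[labels == k].mean() for k in range(n_pools)])

fit = sm.OLS(y_pool, sm.add_constant(X_pool)).fit()
print(fit.params)   # close to (1.0, 0.5, -0.3) despite 10x fewer assays
```

Weighting pools by their size would recover a little more efficiency when pool sizes are unequal; the unweighted fit keeps the sketch minimal.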

2.
In planning large longitudinal field trials, one is often faced with a choice between a cohort design and a cross-sectional design, with attendant issues of precision, sample size, and bias. To provide a practical method for assessing these trade-offs quantitatively, we present a unifying statistical model that embraces both designs as special cases. The model takes account of continuous and discrete endpoints, site differences, and random cluster and subject effects of both a time-invariant and a time-varying nature. We provide a comprehensive design equation, relating sample size to precision for cohort and cross-sectional designs, and show that the follow-up cost and selection bias attending a cohort design may outweigh any theoretical advantage in precision. We provide formulae for the minimum number of clusters and subjects. We relate this model to the recently published prevalence model for COMMIT, a multi-site trial of smoking cessation programmes. Finally, we tabulate parameter estimates for some physiological endpoints from recent community-based heart-disease prevention trials, work an example, and discuss the need for compiling such estimates as a basis for informed design of future field trials.
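The paper's comprehensive design equation is not reproduced in the abstract, but the familiar design-effect inflation conveys the core trade-off between sample size and precision in clustered designs. A minimal sketch, assuming equal cluster sizes and a simple two-sample comparison:

```python
from scipy.stats import norm

def subjects_per_arm(delta, sigma, m, icc, alpha=0.05, power=0.80):
    """Subjects per arm to detect mean difference `delta` with cluster size m
    and intracluster correlation `icc`, via the usual design-effect inflation."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    n_independent = 2 * (z * sigma / delta) ** 2   # unclustered two-sample formula
    deff = 1 + (m - 1) * icc                        # design effect
    return n_independent * deff

n = subjects_per_arm(delta=5.0, sigma=20.0, m=100, icc=0.01)
print(round(n), "subjects per arm, i.e.", round(n / 100), "clusters of 100")
```

Even a small intracluster correlation roughly doubles the required sample here, which is the kind of precision penalty the paper's design equation quantifies alongside cohort follow-up cost and selection bias.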

3.
Since its introduction into the biomedical literature, statistical significance testing (SST) has caused much debate. The aim of this perspective article is to review frequent fallacies and misuses of SST in the biomedical field and a potential way out of them. Two frequentist schools of statistical inference merged to form SST as it is practised nowadays: the Fisher and the Neyman-Pearson schools. The P-value is both reported quantitatively and checked against the α-level to produce a qualitative dichotomous measure (significant/nonsignificant). However, a P-value mixes the estimated effect size with its estimated precision, and it is not possible to measure these two things with one single number. For the valid interpretation of SSTs, a variety of presumptions and requirements have to be met. We point here to four of them: adequate study size, a correct statistical model, a correct causal model, and absence of bias and confounding. It has been said that the P-value is perhaps the most misunderstood statistical concept in clinical research. As in the social sciences, the tyranny of SST remains highly prevalent in the biomedical literature even after decades of warnings. The ubiquitous misuse of SST threatens scientific discoveries and may even impede scientific progress. In the worst case, it may harm patients who are ultimately treated incorrectly because of improper handling of P-values. For a proper interpretation of study results, both the estimated effect size and its estimated precision are necessary ingredients.
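A toy numerical illustration of why a P-value alone cannot stand in for the effect size and its precision: the two simulated studies below reach similar significance with radically different effects.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
small_effect_big_n = rng.normal(0.05, 1.0, size=20000)   # trivial effect, n huge
big_effect_small_n = rng.normal(0.80, 1.0, size=25)      # large effect, n small

for name, x in [("big n / tiny effect", small_effect_big_n),
                ("small n / large effect", big_effect_small_n)]:
    t, p = stats.ttest_1samp(x, 0.0)
    se = x.std(ddof=1) / np.sqrt(len(x))
    lo, hi = x.mean() - 1.96 * se, x.mean() + 1.96 * se
    print(f"{name}: mean={x.mean():.3f}, 95% CI=({lo:.3f}, {hi:.3f}), p={p:.4f}")
```

Both results are "significant," but only the estimate with its confidence interval reveals that one effect is clinically negligible.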

4.
Motivated by high-throughput profiling studies in biomedical research, variable selection methods have been a focus for biostatisticians. In this paper, we consider semiparametric varying-coefficient accelerated failure time models for right censored survival data with high-dimensional covariates. Instead of adopting the traditional regularization approaches, we offer a novel sparse boosting (SparseL2Boosting) algorithm to conduct model-based prediction and variable selection. One main advantage of this new method is that we do not need to perform the time-consuming selection of tuning parameters. Extensive simulations are conducted to examine the performance of our sparse boosting feature selection techniques. We further illustrate our methods using a lung cancer data analysis.
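The SparseL2Boosting algorithm itself handles right-censored survival times and varying coefficients; the sketch below only shows the generic componentwise L2-boosting mechanics it builds on, applied to uncensored linear regression with hypothetical data.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
    n, p = X.shape
    beta = np.zeros(p)
    resid = y - y.mean()
    for _ in range(n_steps):
        # Fit each covariate to the current residuals; keep the best one.
        coefs = X.T @ resid / (X ** 2).sum(axis=0)
        sse = ((resid[:, None] - X * coefs) ** 2).sum(axis=0)
        j = np.argmin(sse)
        beta[j] += nu * coefs[j]          # shrunken update of one coordinate
        resid -= nu * coefs[j] * X[:, j]
    return beta                            # near-zero entries mark unselected variables

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 20))
y = 2 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=100)
print(np.round(componentwise_l2_boost(X, y), 2))   # large weights at indices 0 and 3
```

Each iteration updates only the single best-fitting coordinate by a shrunken amount, which is what yields sparse solutions without a separate tuning-parameter search.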

5.
The most common data structures in biomedical studies are matched and unmatched designs. Data structures resulting from a hybrid of the two can create challenges for statistical inference, and the question arises whether to use parametric or nonparametric methods on the hybrid structure. The Early Treatment for Retinopathy of Prematurity study was a multicenter clinical trial sponsored by the National Eye Institute whose design produced data requiring a statistical method of a hybrid nature. An infant in this multicenter randomized clinical trial had high-risk prethreshold retinopathy of prematurity that was eligible for treatment in one or both eyes at entry into the trial. During follow-up, recognition visual acuity was assessed for both eyes. Data from both eyes (matched) and from only one eye (unmatched) were eligible to be used in the trial. The new hybrid nonparametric method is a meta-analysis based on combining the Hodges–Lehmann estimates of treatment effects from the Wilcoxon signed rank and rank sum tests. For comparison, we used the classic meta-analysis with the t-test method to combine estimates of treatment effects from the paired and two-sample t-tests. We used simulations to calculate the empirical size and power of the test statistics, as well as the bias, mean square error, and confidence interval width of the corresponding estimators. The proposed method provides an effective tool to evaluate data from clinical trials and similar comparative studies. Copyright © 2013 John Wiley & Sons, Ltd.
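A minimal sketch of the hybrid idea, with hypothetical helpers hl_paired, hl_two_sample, and boot_se: compute Hodges–Lehmann estimates from the matched differences and the unmatched samples separately, then combine them inverse-variance style using bootstrap standard errors.

```python
import numpy as np

def hl_paired(diff):
    """HL estimate for paired data: median of Walsh averages of the differences."""
    d = np.asarray(diff)
    i, j = np.triu_indices(len(d))
    return np.median((d[i] + d[j]) / 2)

def hl_two_sample(x, y):
    """HL shift estimate for two independent samples: median of pairwise differences."""
    return np.median(np.subtract.outer(np.asarray(x), np.asarray(y)))

def boot_se(est, args, n_boot=500, seed=0):
    rng = np.random.default_rng(seed)
    reps = [est(*[a[rng.integers(0, len(a), len(a))] for a in args])
            for _ in range(n_boot)]
    return np.std(reps, ddof=1)

rng = np.random.default_rng(3)
d = rng.normal(0.5, 1.0, 40)                                # matched (both-eye) differences
x, y = rng.normal(0.5, 1.0, 30), rng.normal(0.0, 1.0, 35)   # unmatched eyes

est_m, est_u = hl_paired(d), hl_two_sample(x, y)
se_m, se_u = boot_se(hl_paired, (d,)), boot_se(hl_two_sample, (x, y))
w_m, w_u = 1 / se_m**2, 1 / se_u**2
combined = (w_m * est_m + w_u * est_u) / (w_m + w_u)
print(f"matched={est_m:.3f}, unmatched={est_u:.3f}, combined={combined:.3f}")
```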

6.
While recent genomic surveys reveal growing numbers of di-allelic copy number variations, it is genes with multiallelic (>2) copy numbers that have shown association with distinct phenotypes. Current high-throughput laboratory methods are restricted to enumerating total gene copy numbers (GCNs) per individual rather than the "genotype," i.e. the gene copies per chromosome. Thus, association studies of multiallelic GCNs have been limited to comparison of median copies in different groups. Our new nonparametric statistical approach is based on GCN information within a trio-based study design. We present theoretical derivation of the statistics and results of simulation studies that show the robustness of our approach and its power under several genetic models. Genet. Epidemiol. 34:2–6, 2010. © 2009 Wiley-Liss, Inc.

7.
In systems biology, there is great interest in identifying new genes not previously reported to be associated with biological pathways related to various functions and diseases. Identification of these new pathway-modulating genes not only promotes understanding of pathway regulation mechanisms but also allows identification of novel therapeutic targets. Recently, the biomedical literature has been considered a valuable resource for investigating pathway-modulating genes. While the majority of currently available approaches are based on the co-occurrence of genes within an abstract, these approaches have been reported to show only sub-optimal performance because 70% of abstracts contain information on only a single gene. To overcome this limitation, we propose a novel statistical framework based on the concept of the ontology fingerprint, which uses the gene ontology to extract information from large biomedical literature data. The proposed framework simultaneously identifies pathway-modulating genes and facilitates interpreting the functions of these new genes. We also propose a computationally efficient posterior inference procedure based on a Metropolis–Hastings-within-Gibbs sampler for parameter updates and the poor man's reversible jump Markov chain Monte Carlo approach for model selection. We evaluate the proposed statistical framework with simulation studies, experimental validation, and an application to studies of pathway-modulating genes in yeast. The R implementation of the proposed model is available at https://dongjunchung.github.io/bayesGO/. Copyright © 2017 John Wiley & Sons, Ltd.
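The bayesGO model itself is involved, but the flavor of a Metropolis–Hastings-within-Gibbs sampler can be shown on a toy normal model, where the mean gets a conjugate Gibbs update and the scale a random-walk MH step; this illustrates only the computational scheme, not the paper's model.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)
y = rng.normal(2.0, 1.5, 100)
n, mu, log_s = len(y), 0.0, 0.0
draws = []
for it in range(5000):
    # Gibbs step: with a flat prior, mu | sigma, y is normal around the sample mean.
    s = np.exp(log_s)
    mu = rng.normal(y.mean(), s / np.sqrt(n))
    # MH step for log(sigma): random-walk proposal, flat prior on log(sigma).
    prop = log_s + rng.normal(0, 0.2)
    log_ratio = (norm.logpdf(y, mu, np.exp(prop)).sum()
                 - norm.logpdf(y, mu, np.exp(log_s)).sum())
    if np.log(rng.uniform()) < log_ratio:
        log_s = prop
    draws.append((mu, np.exp(log_s)))
post = np.array(draws[1000:])                 # drop burn-in
print(post.mean(axis=0))                      # near the true (2.0, 1.5)
```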

8.
The recent development of high-throughput sequencing technologies calls for powerful statistical tests to detect rare genetic variants associated with complex human traits. Sampling related individuals in sequencing studies offers advantages over sampling only unrelated individuals, including improved protection against sequencing error, the ability to use imputation to make more efficient use of sequence data, and the possibility of a power boost from more observed copies of extremely rare alleles among relatives. With related individuals, familial correlation needs to be accounted for to ensure correct control of the type I error rate and to improve power. Recognizing the limitations of existing rare-variant association tests for family data, we propose MONSTER (Minimum P-value Optimized Nuisance parameter Score Test Extended to Relatives), a robust rare-variant association test that generalizes the SKAT-O method for independent samples. MONSTER uses a mixed-effects model that accounts for covariates and additive polygenic effects. To obtain a powerful test, MONSTER adaptively adjusts to the unknown configuration of effects at rare-variant sites. MONSTER also offers an analytical way of assessing P-values, which is desirable because permutation is not straightforward to conduct in related samples. In simulation studies, we demonstrate that MONSTER effectively accounts for family structure, is computationally efficient, and compares very favorably, in terms of power, to previously proposed tests that allow related individuals. We apply MONSTER to an analysis of high-density lipoprotein cholesterol in the Framingham Heart Study, where we are able to replicate association with three genes.

9.
The main role of high-throughput microarrays today is the screening of relevant genes from a large pool of candidates. For prioritizing genes for subsequent studies, gene ranking based on the strength of association with the phenotype is a relevant statistical output. In this article, we propose sample size calculations based on gene ranking and selection using the non-parametric Mann–Whitney–Wilcoxon statistic in microarray experiments. The non-parametric statistic is expected to make gene ranking robust to deviations from normality and to possible scale changes when subsequent studies use different platforms, such as polymerase chain reaction-based assays. An application to a data set from a clinical study of lymphoma is given. Copyright © 2009 John Wiley & Sons, Ltd.
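A simulation sketch of the design question, with illustrative effect sizes and gene counts: for a candidate per-group sample size, how often does ranking by the Mann–Whitney–Wilcoxon statistic place all truly differential genes among the top k?

```python
import numpy as np
from scipy.stats import mannwhitneyu

def prob_top_k(n_per_group, n_genes=200, n_signal=5, shift=1.0,
               k=10, n_sim=100, seed=0):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sim):
        a = rng.normal(size=(n_genes, n_per_group))
        b = rng.normal(size=(n_genes, n_per_group))
        b[:n_signal] += shift                       # first genes are differential
        u = np.array([mannwhitneyu(b[g], a[g]).statistic for g in range(n_genes)])
        top = np.argsort(-u)[:k]                    # rank genes by the MWW statistic
        hits += np.isin(np.arange(n_signal), top).all()
    return hits / n_sim

for n in (5, 10, 20):
    print(n, "per group:", prob_top_k(n))
```

The smallest n at which this recovery probability reaches a target (say 0.9) is a ranking-based sample size in the spirit of the paper's proposal.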

10.
For ethical reasons, as few animals as possible should be used in biomedical research, though not so few as to fail to detect biologically important effects or to necessitate repeating experiments. We describe biostatistical approaches that can contribute either to reducing the number of animals in single experiments or to increasing the quality of studies so that fewer subsequent studies (and thus animals) will be needed. The approaches described concern different phases of experimentation: planning the experimental design and calculating the sample size, controlling variability, choosing the response variable, postulating the statistical hypothesis to be tested, choosing the procedure for analysing data, and interpreting and suitably presenting the results.
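For the sample-size step, a standard power calculation (here via statsmodels, with an assumed effect size that would in practice come from pilot data) makes the ethics trade-off concrete: reducing variability raises the standardized effect size and cuts the animals required.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Smallest biologically important difference, in standard-deviation units.
n_per_group = analysis.solve_power(effect_size=1.0, alpha=0.05,
                                   power=0.80, alternative='two-sided')
print(f"about {n_per_group:.1f} animals per group")   # roughly 17

# Halving residual variability (e.g. via blocking) doubles the effect size
# in SD units and sharply reduces the animals required.
print(analysis.solve_power(effect_size=2.0, alpha=0.05, power=0.80))  # roughly 5-6
```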

11.
Vaccine 2017; 35(23): 3082–3088
A current barrier to the standardized evaluation of respiratory syncytial virus (RSV) vaccine candidates is the wide variety of virus neutralization assay formats currently in use for assessing immunogenicity. Assay formats vary widely in labor intensiveness, duration, and sample throughput. Furthermore, the cell lines and virus strains used are not consistent among formats. The purpose of this multi-laboratory study was to assess the variability across a diverse array of assay formats that quantitate RSV neutralizing antibodies. Using a common specimen panel, the degree of overall agreement among existing assays was evaluated to inform on the need for harmonization of assay results. A total of 12 laboratories participated in the blinded survey study by testing a panel composed of 57 samples chosen to span the reportable titer range of the assays. An independent statistical analysis was conducted to measure overall agreement of assay results. This analysis showed that precision was consistently high, whereas agreement varied widely among assays. To examine whether agreement could be improved, we conducted a harmonization exercise using a variety of sample types as pseudo standards. The results showed that the level of agreement could be improved, and provided information on the suitability of samples for developing an international standard.
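A toy sketch of the harmonization exercise with simulated titers: re-expressing each laboratory's readings as fold differences from its own reading of a pseudo standard removes systematic assay shifts and tightens between-lab agreement. The data and magnitudes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n_labs, n_samples = 12, 57
true_log_titer = rng.normal(3.0, 0.8, n_samples)
lab_bias = rng.normal(0.0, 0.5, n_labs)              # systematic per-assay shift
log_titers = (true_log_titer + lab_bias[:, None]
              + rng.normal(0, 0.1, (n_labs, n_samples)))

standard = 0                                          # index of the pseudo standard
normalized = log_titers - log_titers[:, [standard]]   # fold difference vs standard

print("between-lab SD, raw:       ", log_titers.std(axis=0, ddof=1).mean().round(3))
print("between-lab SD, normalized:", normalized.std(axis=0, ddof=1).mean().round(3))
```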

12.
Design of studies using DNA microarrays
DNA microarrays are assays that simultaneously provide information about expression levels of thousands of genes and are consequently finding wide use in biomedical research. In order to control the many sources of variation and the many opportunities for misanalysis, DNA microarray studies require careful planning. Different studies have different objectives, and important aspects of design and analysis strategy differ for different types of studies. We review several types of objectives of studies using DNA microarrays and address issues such as selection of samples, levels of replication needed, allocation of samples to dyes and arrays, sample size considerations, and analysis strategies.

13.
Prospective randomized clinical trials addressing biomarkers are time consuming and costly, but they are necessary for regulatory agencies to approve new therapies with predictive biomarkers. For this reason, the recent literature contains many proposed trial designs and comparisons of their efficiency. We compare statistical efficiency between the marker-stratified design and the marker-based precision medicine design for testing and estimating four quantities of clinical interest: the treatment effect in each of the marker-positive and marker-negative cohorts, the marker-by-treatment interaction, and the marker's clinical utility. As may be expected, the stratified design is more efficient than the precision medicine design; what is perhaps surprising is how low the relative efficiency of the precision medicine design can be. We quantify the relative efficiency as a function of design factors including the marker-positive prevalence rate, the marker assay and classification sensitivity and specificity, and the treatment randomization ratio, and we examine how the relative efficiency trends with these design parameters when testing the different hypotheses. We advocate using the stratified design over the precision medicine design in clinical trials with predictive biomarkers.

14.
Common clinical studies assess the quality of prognostic factors, such as gene expression signatures, clinical variables, or environmental factors, and cluster patients into various risk groups. Typical examples include cancer clinical trials where patients are clustered into high- or low-risk groups. When applied to survival data, such groups are intended to represent patients with similar survival odds and to guide selection of the most appropriate therapy. The relevance of such risk groups, and of the related prognostic factors, is typically assessed through the computation of a hazard ratio. We first stress three limitations of assessing risk groups through the hazard ratio: (1) it may promote the definition of arbitrarily unbalanced risk groups; (2) an apparently optimal group hazard ratio can be largely inconsistent with the p-value commonly associated with it; and (3) marginal changes in risk group proportions may lead to very different hazard ratio values. These issues can lead to inappropriate comparisons between prognostic factors. We then propose the balanced hazard ratio to solve these issues. The new performance metric retains an intuitive interpretation and is just as simple to compute. We also show how the balanced hazard ratio leads to a natural cut-off choice for defining risk groups from continuous risk scores. The proposed methodology is validated through controlled experiments for which a prescribed cut-off value is defined by design. Further results are reported on several cancer prognosis studies, and the methodology could be applied more generally to assess the quality of any prognostic marker. Copyright © 2015 John Wiley & Sons, Ltd.
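Limitation (1) is easy to reproduce by simulation: scanning cut-offs on a continuous risk score and keeping the split with the largest hazard ratio rewards extreme, unbalanced groups. A sketch (requires the lifelines package; the data-generating model is invented, and this illustrates the problem rather than the balanced hazard ratio itself):

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(5)
n = 400
score = rng.normal(size=n)
time = rng.exponential(np.exp(-0.5 * score))          # higher score, shorter survival
event = np.ones(n, dtype=int)                          # no censoring, for simplicity

for q in (0.5, 0.9, 0.97):                             # median vs extreme cut-offs
    df = pd.DataFrame({"T": time, "E": event,
                       "group": (score > np.quantile(score, q)).astype(int)})
    cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
    hr = float(np.exp(cph.params_["group"]))
    print(f"cut at q={q}: {df.group.sum()} vs {n - df.group.sum()} patients, HR={hr:.2f}")
```

The most extreme split yields the largest hazard ratio while defining a tiny, clinically unhelpful high-risk group, which is exactly what a balanced metric is meant to penalize.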

15.
With the accumulation of data on genetic variants and biomedical phenotypes in the genome era, statistical identification of pleiotropy is of growing interest for dissecting and understanding genetic correlations between complex traits. We propose a novel method for estimating and testing the pleiotropic effect of a genetic variant on two quantitative traits. Based on a covariance decomposition and estimation, our method quantifies pleiotropy as the portion of the between-trait correlation explained by the same genetic variant. Unlike most multiple-trait methods, which assess potential pleiotropy (i.e., whether a variant contributes to at least one trait), our method formulates a statistic that tests exact pleiotropy (i.e., whether a variant contributes to both traits). We developed two approaches for this test (a regression approach and a bootstrapping approach) and investigated their statistical properties in comparison with other potential pleiotropy tests. Our simulations show that the regression approach produces correct P-values under both the complete null (a variant has no effect on either trait) and the incomplete null (a variant has an effect on only one of the two traits), but requires large sample sizes to achieve good power, whereas the bootstrapping approach has better power but produces conservative P-values under the complete null. We demonstrate our method for detecting exact pleiotropy using a real GWAS dataset. Our method provides an easy-to-implement tool for measuring, testing, and understanding the pleiotropic effect of a single variant on the correlation architecture of two complex traits.
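A minimal sketch of the covariance-decomposition idea under a simple additive model y1 = b1·g + e1, y2 = b2·g + e2, where the share of trait covariance attributable to the variant is b1·b2·Var(g)/Cov(y1, y2); the bootstrap interval here is only illustrative of the paper's bootstrapping approach.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
g = rng.binomial(2, 0.3, n).astype(float)             # SNP genotype, MAF 0.3
shared = rng.normal(size=n)                           # non-genetic shared factor
y1 = 0.3 * g + shared + rng.normal(size=n)
y2 = 0.2 * g + shared + rng.normal(size=n)

def pleio_share(g, y1, y2):
    b1 = np.cov(g, y1)[0, 1] / np.var(g, ddof=1)      # per-trait marginal effects
    b2 = np.cov(g, y2)[0, 1] / np.var(g, ddof=1)
    return b1 * b2 * np.var(g, ddof=1) / np.cov(y1, y2)[0, 1]

est = pleio_share(g, y1, y2)
boots = []
for _ in range(500):
    idx = rng.integers(0, n, n)
    boots.append(pleio_share(g[idx], y1[idx], y2[idx]))
print(f"share={est:.3f}, bootstrap 95% CI=({np.quantile(boots, .025):.3f}, "
      f"{np.quantile(boots, .975):.3f})")
```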

16.
There is mounting urgency regarding the mental health of gay, bisexual and other men who have sex with men (GBM). We examined how GBM understand the relationship between HIV and their mental health given the increasing biomedicalisation of HIV prevention and care. Our Grounded Theory analysis drew on qualitative interviews with 24 GBM living in Toronto, Canada, including both HIV-negative and HIV-positive men. Participants understood biomedical advances, such as undetectable viral load and pre-exposure prophylaxis (PrEP), as providing some relief from HIV-related distress. However, they offered ambivalent perspectives on the biomedicalisation of HIV. Some considered non-HIV-specific stressors (e.g. unemployment, racial discrimination) more significant than HIV-related concerns; these men described HIV-related distress as being under control owing to biomedical advances, or as negligible compared with non-HIV-specific stressors. Others emphasised the ongoing mental health implications of HIV (e.g. enduring risk and stigma). We describe a tension between optimistic responses to biomedicine's ability to ease the psychosocial burdens associated with HIV and the inability of biomedicine to address the social and economic determinants driving the dual epidemics of HIV and mental distress amongst GBM. We argue for more socio-material analysis, rather than further sexual-behavioural analysis, of GBM mental health disparities.

17.
Buck Louis GM, Schisterman EF, Sweeney AM, Wilcosky TC, Gore-Langton RE, Lynch CD, Boyd Barr D, Schrader SM, Kim S, Chen Z, Sundaram R, on behalf of the LIFE Study. Designing prospective cohort studies for assessing reproductive and developmental toxicity during sensitive windows of human reproduction and development – the LIFE Study. Paediatric and Perinatal Epidemiology 2011; 25: 413–424. The relationship between the environment and human fecundity and fertility remains virtually unstudied from a couple-based perspective in which longitudinal exposure data and biospecimens are captured across sensitive windows. In response, we completed the LIFE Study with methodology intended to empirically evaluate purported methodological challenges: (i) implementation of population-based sampling frameworks suitable for recruiting couples planning pregnancy; (ii) obtaining environmental data across sensitive windows of reproduction and development; (iii) home-based biospecimen collection; and (iv) development of a data management system for hierarchical exposome data. We used two sampling frameworks (a fish/wildlife licence registry and a direct marketing database) for 16 targeted counties with presumed environmental exposures to persistent organochlorine chemicals to recruit 501 couples planning pregnancies for prospective longitudinal follow-up while trying to conceive and throughout pregnancy. Enrolment rates varied from <1% of the targeted population (n = 424 423) to 42% of eligible couples who were successfully screened; 84% of the targeted population could not be reached, while 36% refused screening. Among enrolled couples, ~85% completed daily journals while trying; 82% of pregnant women completed daily early pregnancy journals, and 80% completed monthly pregnancy journals. All couples provided baseline blood/urine samples; 94% of men provided one or more semen samples and 98% of women provided one or more saliva samples. Women successfully used urinary fertility monitors to identify ovulation, as well as home pregnancy test kits. Couples can be recruited for preconception cohorts and will comply with intensive data collection across sensitive windows. However, appropriately sized sampling frameworks are critical, given the small percentage of contacted couples found eligible and reportedly planning pregnancy at any point in time.

18.
Protein biomarkers found in plasma are commonly used for cancer screening and early detection. Measurements of such markers are often based on assays that cannot return accurate values below a limit of detection. The ROC curve is the most popular statistical tool for evaluating a continuous biomarker. However, when limits of detection exist, the empirical ROC curve fails to provide a valid estimate over the whole spectrum of the false positive rate (FPR), so crucial information about the marker's performance at high-sensitivity and/or high-specificity values is not revealed. In this paper, we address this problem and propose methods for constructing ROC curve estimates for all possible FPR values. We explore flexible parametric methods, transformations to normality, and robust kernel-based and spline-based approaches. We evaluate our methods through simulations and illustrate them in colorectal and pancreatic cancer data.
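One flexible parametric route can be sketched as follows: fit normal distributions to the diseased and healthy marker values by maximum likelihood, treating readings below the limit of detection (LOD) as left-censored, then report the binormal ROC curve over the full FPR range. The model and data below are illustrative assumptions, not the paper's exact estimators.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def censored_normal_mle(x, lod):
    """MLE of (mu, sigma) when values below `lod` are only known to be < lod."""
    obs, n_cens = x[x >= lod], np.sum(x < lod)
    def nll(theta):
        mu, log_s = theta
        s = np.exp(log_s)
        ll = norm.logpdf(obs, mu, s).sum() + n_cens * norm.logcdf(lod, mu, s)
        return -ll
    res = minimize(nll, x0=[obs.mean(), np.log(obs.std() + 1e-6)])
    return res.x[0], np.exp(res.x[1])

rng = np.random.default_rng(7)
lod = 0.0
healthy = rng.normal(0.0, 1.0, 300)     # roughly half fall below the LOD
diseased = rng.normal(1.2, 1.2, 300)
mu0, s0 = censored_normal_mle(healthy, lod)
mu1, s1 = censored_normal_mle(diseased, lod)

fpr = np.linspace(1e-4, 1 - 1e-4, 200)
tpr = norm.sf((norm.isf(fpr) * s0 + mu0 - mu1) / s1)   # binormal ROC curve
print("AUC ~", np.trapz(tpr, fpr).round(3))
```

Because the fitted curve is parametric, it extends to FPR values the empirical curve cannot reach when observations are censored at the LOD.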

19.
Medical Education 2010; 44: 953–961. Context: The practice of medicine involves many stakeholders (participant groups such as patients, doctors and trainees). Based on their respective goals, perceptions and understandings, and on what is being measured, these stakeholders may have dramatically different views of the same event. There are many ways to characterise what occurred in a clinical encounter, including an oral presentation (faculty perspective), a written note (trainee perspective), and the patient's perspective. In the present study, we employed two established theories as frameworks to assess the extent to which different views of the same clinical encounter (a three-component, Year 2 medical student objective structured clinical examination [OSCE] station) are similar to or differ from one another. Methods: We performed univariate comparisons between the individual items on each of the three components of the OSCE: the standardised patient (SP) checklist (patient perspective), the post-encounter form (trainee perspective), and the oral presentation rating form (faculty perspective). Confirmatory factor analysis (CFA) of the three-component station was used to assess the fit of the three-factor (three-viewpoint) model. We also compared tercile performance across these three views as a form of extreme-groups analysis. Results: The CFA yielded a measurement model with reasonable fit. Moderate correlations between the three components of the station were observed, and individual trainee performance, as measured by tercile score, varied across components. Conclusions: Our work builds on research in fields outside medicine, with results yielding small to moderate correlations between different perspectives (and measurements) of the same event (SP checklist, post-encounter form and oral presentation rating form). We believe that obtaining multiple perspectives of the same encounter provides a more valid measure of a student's clinical performance.

20.
In this article, we propose a new generalization of the Weibull distribution, which includes the exponentiated Weibull distribution introduced by Mudholkar and Srivastava (IEEE Trans. Reliab. 1993; 42: 299–302) as a special case. We refer to the new family of distributions as the beta-Weibull distribution. We investigate the potential usefulness of the beta-Weibull distribution for modeling censored survival data from biomedical studies. Several other generalizations of the standard two-parameter Weibull distribution are compared with regard to maximum likelihood inference for the cumulative incidence function, under the setting of competing risks. These Weibull-based parametric models are fit to a breast cancer data set from the National Surgical Adjuvant Breast and Bowel Project. In terms of statistical significance of the treatment effect and model adequacy, all generalized models lead to similar conclusions, suggesting that the beta-Weibull family is a reasonable candidate for modeling survival data. Copyright © 2009 John Wiley & Sons, Ltd.
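The beta-G construction behind the beta-Weibull family can be sketched directly from scipy building blocks: its CDF is the regularized incomplete beta function evaluated at the Weibull CDF, so setting b = 1 recovers the exponentiated Weibull and a = b = 1 recovers the standard Weibull.

```python
import numpy as np
from scipy.stats import beta, weibull_min
from scipy.special import beta as beta_fn

def beta_weibull_cdf(x, a, b, shape, scale):
    # Beta CDF composed with the Weibull CDF: the beta-G construction.
    return beta.cdf(weibull_min.cdf(x, shape, scale=scale), a, b)

def beta_weibull_pdf(x, a, b, shape, scale):
    F = weibull_min.cdf(x, shape, scale=scale)
    f = weibull_min.pdf(x, shape, scale=scale)
    return f * F ** (a - 1) * (1 - F) ** (b - 1) / beta_fn(a, b)

x = np.linspace(0.01, 5, 5)
print(beta_weibull_cdf(x, a=1, b=1, shape=1.5, scale=2.0))
print(weibull_min.cdf(x, 1.5, scale=2.0))   # identical: a = b = 1 recovers Weibull
```

The two extra shape parameters a and b are what give the family the added flexibility for fitting censored survival data.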
