Similar Documents
 20 similar documents retrieved (search time: 31 ms)
1.
In this paper we use simulations to compare the performance of new goodness-of-fit tests based on weighted statistical processes to three currently available tests: the Hosmer-Lemeshow decile-of-risk test; the Pearson chi-square, and the unweighted sum-of-squares tests. The simulations demonstrate that all tests have the correct size. The power for all tests to detect lack-of-fit due to an omitted quadratic term with a sample of size 100 is close to or exceeds 50 per cent to detect moderate departures from linearity and is over 90 per cent for these same alternatives for sample size 500. All tests have low power with sample size 100 to detect lack-of-fit due to an omitted interaction between a dichotomous and continuous covariate, while the power exceeds 80 per cent to detect extreme interaction with a sample size of 500. The power is low to detect any alternative link function with sample size 100 and for most alternative links for sample size 500. Only in the case of sample size 500 and an extremely asymmetric link function is the power over 80 per cent. The results from these simulations show that no single test, new or current, performs best in detecting lack-of-fit due to an omitted covariate or incorrect link function. However, one of the new weighted tests has power comparable to other tests in all settings simulated and had the highest power in the difficult case of an omitted interaction term. We illustrate the tests within the context of a model for factors associated with abstinence from drug use in a randomized trial of residential treatment programmes. We conclude the paper with a summary and specific recommendations for practice.  相似文献   
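As a companion to this abstract, here is a minimal Python sketch of the classic Hosmer-Lemeshow decile-of-risk statistic it refers to (not the new weighted-process tests proposed in the paper). Function and variable names are illustrative, and edge cases such as risk groups with all-zero expected counts are ignored.

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y, p_hat, n_groups=10):
    """Decile-of-risk goodness-of-fit statistic for a fitted logistic model:
    group subjects by deciles of fitted probability, then compare observed
    and expected event counts within each group."""
    order = np.argsort(p_hat)
    y, p_hat = np.asarray(y)[order], np.asarray(p_hat)[order]
    groups = np.array_split(np.arange(len(y)), n_groups)  # ~equal-size risk groups
    stat = 0.0
    for g in groups:
        obs = y[g].sum()          # observed events in the group
        exp = p_hat[g].sum()      # expected events under the model
        n_g = len(g)
        p_bar = exp / n_g
        stat += (obs - exp) ** 2 / (n_g * p_bar * (1 - p_bar))
    df = n_groups - 2             # conventional degrees of freedom for the test
    return stat, chi2.sf(stat, df)
```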

2.
Many researchers are considering the use of transmission/disequilibrium tests (TDT) for trios of genotypes (father, mother, child) as a method for localizing genes associated with complex diseases. We evaluate the effect of random errors (allele changes) in trios on the power to detect linkage. For a marker in the simulated data set, one allele is associated with the fictitious disease in a certain subpopulation. For the data as given (no errors), our power to detect linkage using the multiallelic TDT (TDTmhet) is 68% (critical p-value set at 0.0001). We introduce errors into trios at various rates (1%, 5%, or 10%), remove only trios displaying mendelian inconsistencies, and recalculate power to detect linkage. Our principal finding is that there is power loss to detect linkage with the TDTmhet when errors are introduced. We observe power losses of 8%, 16%, and 48% for error rates of 1%, 5%, and 10%, respectively. To determine the source of the power loss, we perform Monte Carlo simulations. At the 1% and 5% rates, we conclude that power loss is due primarily to loss in sample size. At the 10% rate, we observe substantial power loss due to error introduction in addition to sample size reduction. We also determine, given a particular error rate, the probability that we detect errors if we use only mendelian consistency as a check. We find that the mean detection rates for the data sets with 1%, 5%, or 10% error rates are 58%, 60%, and 62%, respectively. As a result, the apparent error rate appears to be almost half the true error rate. Based on these results, we recommend that researchers maintain error rates below 5% when using the TDTmhet for linkage, use additional methods beyond mendelian consistency checks when searching for errors in their data, and modify sample size calculations when accounting for errors in their genotype data.  相似文献   
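For orientation, the sketch below implements the simpler biallelic TDT (a McNemar-type test on transmissions from heterozygous parents) rather than the multiallelic TDTmhet used in the paper; the counts in the example are invented for illustration.

```python
from scipy.stats import chi2

def biallelic_tdt(b, c):
    """Biallelic transmission/disequilibrium test.
    b: heterozygous parents transmitting allele A1
    c: heterozygous parents transmitting allele A2
    Under no linkage/association, (b - c)^2 / (b + c) ~ chi-square(1)."""
    stat = (b - c) ** 2 / (b + c)
    return stat, chi2.sf(stat, df=1)

# Example: 120 of 200 informative transmissions carry the candidate allele.
stat, p = biallelic_tdt(120, 80)
print(f"TDT = {stat:.2f}, p = {p:.4f}")   # TDT = 8.00
```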

3.
Most methods for calculating sample size use the relative risk (RR) to indicate the strength of the association between exposure and disease. For measuring the public health importance of a possible association, the population attributable fraction (PAF)--the proportion of disease incidence in a population that is attributable to an exposure--is more appropriate. We determined sample size and power for detecting a specified PAF in both cohort and case-control studies and compared the results with those obtained using conventional estimates based on the relative risk. When an exposure is rare, a study that has little power to detect a small RR often has adequate power to detect a small PAF. On the other hand, for common exposures, even a relatively large study may have inadequate power to detect a small PAF. These comparisons emphasize the importance of selecting the most pertinent measure of association, either relative risk or population attributable fraction, when calculating power and sample size.  相似文献   
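The comparison in this abstract rests on the standard relationship between the population attributable fraction, exposure prevalence, and relative risk. The sketch below (illustrative, not the authors' exact formulas) converts a target PAF into the RR a study must be able to detect, which can then be fed into any conventional sample size calculation.

```python
def paf_from_rr(rr, p_exposed):
    """Population attributable fraction from relative risk and exposure prevalence."""
    return p_exposed * (rr - 1) / (1 + p_exposed * (rr - 1))

def rr_from_paf(paf, p_exposed):
    """Relative risk implied by a target PAF at a given exposure prevalence."""
    return 1 + paf / (p_exposed * (1 - paf))

# The same PAF of 5% implies very different relative risks for rare
# versus common exposures, which drives the power comparisons above.
for p in (0.01, 0.10, 0.50):
    print(p, round(rr_from_paf(0.05, p), 2))
# prevalence 0.01 -> RR ~ 6.3; 0.10 -> RR ~ 1.5; 0.50 -> RR ~ 1.1
```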

4.
In case-control studies, the results about how the exposure distribution affects sample size are well known. This paper extends previous results by incorporating the effect of a confounder into the calculation of sample size for a desired size and power of a statistical test. The paper also includes a quantitative discussion on the influence of the joint distribution for exposure to a putative cause and a confounder on required sample sizes. The results show that, to detect a specified alternative for a given size and power, the required sample size decreases as either the variance of exposure or the effect of exposure on disease increases. The required sample size, however, increases as either the variance of the confounder or the effect of the confounder on disease increases. Generally, the higher the absolute value of the simple correlation between the exposure and the confounder, the larger the required sample size.

5.
With its potential to discover a much greater amount of genetic variation, next‐generation sequencing is fast becoming an emergent tool for genetic association studies. However, the cost of sequencing all individuals in a large‐scale population study is still high in comparison to most alternative genotyping options. While the ability to identify individual‐level data is lost (without bar‐coding), sequencing pooled samples can substantially lower costs without compromising the power to detect significant associations. We propose a hierarchical Bayesian model that estimates the association of each variant using pools of cases and controls, accounting for the variation in read depth across pools and sequencing error. To investigate the performance of our method across a range of number of pools, number of individuals within each pool, and average coverage, we undertook extensive simulations varying effect sizes, minor allele frequencies, and sequencing error rates. In general, the number of pools and pool size have dramatic effects on power while the total depth of coverage per pool has only a moderate impact. This information can guide the selection of a study design that maximizes power subject to cost, sample size, or other laboratory constraints. We provide an R package (hiPOD: hierarchical Pooled Optimal Design) to find the optimal design, allowing the user to specify a cost function, cost, and sample size limitations, and distributions of effect size, minor allele frequency, and sequencing error rate.  相似文献   

6.
The positive approach to negative results in toxicology studies
Negative results in toxicology studies are often as noteworthy as are results that detect significant toxicological effects. The results of 49.1% of all t tests published in Ecotoxicology and Environmental Safety in 1985 and 1986 were negative. However, despite the importance and prevalence of negative results in toxicology studies, they are frequently misinterpreted. Negative results from statistical tests that have poor statistical power can only be considered to be inconclusive. Because toxicology studies often use small sample sizes, such studies often have poor power to detect small, but biologically significant, effects. Toxicologists may improve the power of their tests by improving experiment design, increasing alpha, increasing sample size, or limiting the analysis to detection of large differences among samples. Selection of both sample size and alpha level should take considerations of statistical power into account.  相似文献   
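The remedies listed in this abstract (larger samples, a larger alpha, or targeting only large effects) can be explored with a routine two-sample t-test power calculation. The sketch below uses statsmodels with illustrative numbers; the sample sizes and effect size are assumptions, not values from the paper.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power of a typical small toxicology comparison: n = 5 per group,
# looking for a one-standard-deviation difference at alpha = 0.05.
power_small = analysis.solve_power(effect_size=1.0, nobs1=5, alpha=0.05, ratio=1.0)
print(f"power with n=5/group: {power_small:.2f}")

# Sample size per group needed for 80% power against the same effect.
n_needed = analysis.solve_power(effect_size=1.0, power=0.80, alpha=0.05, ratio=1.0)
print(f"n per group for 80% power: {n_needed:.1f}")

# Relaxing alpha (one of the abstract's suggestions) buys back some power.
power_alpha10 = analysis.solve_power(effect_size=1.0, nobs1=5, alpha=0.10, ratio=1.0)
print(f"power with n=5/group, alpha=0.10: {power_alpha10:.2f}")
```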

7.
Repeated measures are common in clinical trials and epidemiological studies. Designing studies with repeated measures requires reasonably accurate specifications of the variances and correlations to select an appropriate sample size. Underspecifying the variances leads to a sample size that is inadequate to detect a meaningful scientific difference, while overspecifying the variances results in an unnecessarily large sample size. Both lead to wasting resources and placing study participants in unwarranted risk. An internal pilot design allows sample size recalculation based on estimates of the nuisance parameters in the covariance matrix. We provide the theoretical results that account for the stochastic nature of the final sample size in a common class of linear mixed models. The results are useful for designing studies with repeated measures and balanced design. Simulations examine the impact of misspecification of the covariance matrix and demonstrate the accuracy of the approximations in controlling the type I error rate and achieving the target power. The proposed methods are applied to a longitudinal study assessing early antiretroviral therapy for youth living with HIV.  相似文献   

8.
Genome-wide association studies (GWAS) have been widely used to identify genetic effects on complex diseases or traits. Most currently used methods are based on separate single-nucleotide polymorphism (SNP) analyses. Because this approach requires correction for multiple testing to avoid excessive false-positive results, it suffers from reduced power to detect weak genetic effects under limited sample size. To increase the power to detect multiple weak genetic factors and reduce false-positive results caused by multiple tests and dependence among test statistics, a modified forward multiple regression (MFMR) approach is proposed. Simulation studies show that MFMR has higher power than the Bonferroni and false discovery rate procedures for detecting moderate and weak genetic effects, and MFMR retains an acceptable false-positive rate even if causal SNPs are correlated with many SNPs due to population stratification or other unknown reasons. Genet. Epidemiol. 33:518–525, 2009. © 2009 Wiley-Liss, Inc.

9.
The univariate analysis of categorical twin data can be performed using either structural equation modeling (SEM) or logistic regression. This paper presents a comparison between these two methods using a simulation study. Dichotomous and ordinal (three category) twin data are simulated under two different sample sizes (1,000 and 2,000 twin pairs) and according to different additive genetic and common environmental models of phenotypic variation. The two methods are found to be generally comparable in their ability to detect a “correct” model under the specifications of the simulation. Both methods lack power to detect the right model for dichotomous data when the additive genetic effect is low (between 10 and 20%) or medium (between 30 and 40%); the ordinal data simulations produce similar results except for the additive genetic model with medium or high heritability. Neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large and the sample size included 2,000 twin pairs. The SEM method was found to have better power than logistic regression when there is a medium (30%) or high (50%) additive genetic effect and a modest common environmental effect. Conversely, logistic regression performed better than SEM in correctly detecting additive genetic effects with simulated ordinal data (for both 1,000 and 2,000 pairs) that did not contain modest common environmental effects; in this case the SEM method incorrectly detected a common environmental effect that was not present. © 1996 Wiley-Liss, Inc.  相似文献   

10.
Increasing the sample size based on unblinded interim result may inflate the type I error rate and appropriate statistical adjustments may be needed to control the type I error rate at the nominal level. We briefly review the existing approaches which allow early stopping due to futility, or change the test statistic by using different weights, or adjust the critical value for final test, or enforce rules for sample size recalculation. The implication of early stopping due to futility and a simple modification to the weighted Z-statistic approach are discussed. In this paper, we show that increasing the sample size when the unblinded interim result is promising will not inflate the type I error rate and therefore no statistical adjustment is necessary. The unblinded interim result is considered promising if the conditional power is greater than 50 per cent or equivalently, the sample size increment needed to achieve a desired power does not exceed an upper bound. The actual sample size increment may be determined by important factors such as budget, size of the eligible patient population and competition in the market. The 50 per cent-conditional-power approach is extended to a group sequential trial with one interim analysis where a decision may be made at the interim analysis to stop the trial early due to a convincing treatment benefit, or to increase the sample size if the interim result is not as good as expected. The type I error rate will not be inflated if the sample size may be increased only when the conditional power is greater than 50 per cent. If there are two or more interim analyses in a group sequential trial, our simulation study shows that the type I error rate is also well controlled.  相似文献   
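The "promising" rule discussed in this abstract hinges on conditional power at the interim analysis. Below is a minimal sketch of the usual conditional-power-under-the-current-trend calculation for a one-sided normal test (alpha = 0.025 assumed); it illustrates the 50 per cent threshold but is not the authors' exact procedure.

```python
from scipy.stats import norm

def conditional_power(z_interim, info_frac, alpha=0.025):
    """Conditional power at an interim analysis, assuming the current trend
    continues. z_interim: observed one-sided z-statistic; info_frac: fraction
    t of the planned information already observed."""
    z_alpha = norm.ppf(1 - alpha)
    t = info_frac
    drift = z_interim / t ** 0.5             # estimated drift under the current trend
    mean_remaining = drift * (1 - t) ** 0.5  # expected contribution of future data
    return norm.sf((z_alpha - t ** 0.5 * z_interim) / (1 - t) ** 0.5 - mean_remaining)

# Example: halfway through the trial with an interim z of 1.2.
cp = conditional_power(z_interim=1.2, info_frac=0.5)
print(f"conditional power: {cp:.2f}")
# Under the rule described in the abstract, the interim result is "promising"
# (and the sample size may be increased) only if cp > 0.5.
```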

11.
For some trials, simple but subtle assumptions can have a profound impact on the size of the trial. A case in point is a vaccine lot consistency (or equivalence) trial. Standard sample size formulas used for designing lot consistency trials rely on only one component of variation, namely, the variation in antibody titers within lots. The other component, the variation in the means of titers between lots, is assumed to be equal to zero. In reality, some amount of variation between lots, however small, will be present even under the best manufacturing practices. Using data from a published lot consistency trial, we demonstrate that when the between-lot variation is only 0.5 per cent of the total variation, the increase in the sample size is nearly 300 per cent when compared with the size assuming that the lots are identical. The increase in the sample size is so pronounced that in order to maintain power one is led to consider a less stringent criterion for demonstration of lot consistency. The appropriate sample size formula that is a function of both components of variation is provided. We also discuss the increase in the sample size due to correlated comparisons arising from three pairs of lots as a function of the between-lot variance.  相似文献   

12.
Vaccine 2017; 35(50): 6934–6937
Background: Patients undergoing primary total hip arthroplasty (THA) would be a worthy population for anti-staphylococcal vaccines. The objective is to assess sample size for significant vaccine efficacy (VE) in a randomized clinical trial (RCT). Methods: Data from a surveillance network of surgical site infection in France between 2008 and 2011 were used. The outcome was S. aureus SSI (SASSI) within 30 days after surgery. Statistical power was estimated by simulations repeated for theoretical VE ranging from 20% to 100% and for sample sizes from 250 to 8000 individuals per arm. Results: 18,688 patients undergoing THA were included; 66 (0.35%) SASSI occurred. For a 1% SASSI rate, the sample size would be at least 1316 patients per arm to detect significant VE of 80% with 80% power. Conclusion: Simulations with real-life data from surveillance of hospital-acquired infections allow estimation of power for RCT and sample size to reach the required power.
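A minimal sketch of the kind of simulation described here: draw infection counts in vaccine and placebo arms under an assumed baseline risk and VE, test each simulated trial, and report the fraction of significant results as power. The baseline risk, VE, and per-arm size echo the abstract; the use of Fisher's exact test and the rest of the code are assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.stats import fisher_exact

def simulated_power(n_per_arm, baseline_risk, ve, n_sim=2000, alpha=0.05, seed=1):
    """Estimate power of a two-arm RCT for a rare binary outcome by simulation."""
    rng = np.random.default_rng(seed)
    risk_vaccine = baseline_risk * (1 - ve)   # VE taken as relative risk reduction
    hits = 0
    for _ in range(n_sim):
        cases_placebo = rng.binomial(n_per_arm, baseline_risk)
        cases_vaccine = rng.binomial(n_per_arm, risk_vaccine)
        table = [[cases_vaccine, n_per_arm - cases_vaccine],
                 [cases_placebo, n_per_arm - cases_placebo]]
        _, p = fisher_exact(table)
        hits += p < alpha
    return hits / n_sim

# Roughly on the scale of the abstract's result: ~1% baseline SSI risk,
# VE = 80%, about 1316 patients per arm.
print(simulated_power(n_per_arm=1316, baseline_risk=0.01, ve=0.80))
```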

13.
Objective: Simple guidelines for calculating efficient sample sizes in cluster randomized trials with unknown intraclass correlation (ICC) and varying cluster sizes. Methods: A simple equation is given for the optimal number of clusters and sample size per cluster. Here, optimal means maximizing power for a given budget or minimizing total cost for a given power. The problems of cluster size variation and specification of the ICC of the outcome are solved in a simple yet efficient way. Results: The optimal number of clusters goes up, and the optimal sample size per cluster goes down, as the ICC goes up or as the cluster-to-person cost ratio goes down. The available budget, desired power, and effect size affect only the number of clusters and not the sample size per cluster, which is between 7 and 70 for a wide range of cost ratios and ICCs. Power loss because of cluster size variation is compensated by sampling 10% more clusters. The optimal design for an ICC halfway through the range of realistic ICC values is a good choice for the first stage of a two-stage design. The second stage is needed only if the first stage shows the ICC to be higher than assumed. Conclusion: Efficient sample sizes for cluster randomized trials are easily computed, provided the cost per cluster and cost per person are specified.
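The sketch below implements the textbook budget-constrained allocation for cluster randomized trials (optimal cluster size depends only on the ICC and the cluster-to-person cost ratio, and the budget then fixes the number of clusters). These are standard expressions, not necessarily the exact equation in the paper, and the costs and budget are illustrative.

```python
import math

def optimal_cluster_design(budget, cost_per_cluster, cost_per_person, icc):
    """Budget-constrained design for a cluster randomized trial.
    Optimal subjects per cluster depends only on the ICC and the cost ratio;
    the budget then determines how many clusters can be afforded."""
    n_per_cluster = math.sqrt((cost_per_cluster / cost_per_person) * (1 - icc) / icc)
    k_clusters = budget / (cost_per_cluster + n_per_cluster * cost_per_person)
    design_effect = 1 + (n_per_cluster - 1) * icc
    effective_n = k_clusters * n_per_cluster / design_effect
    return n_per_cluster, k_clusters, effective_n

for icc in (0.01, 0.05, 0.10):
    n, k, eff = optimal_cluster_design(budget=100_000, cost_per_cluster=500,
                                       cost_per_person=50, icc=icc)
    print(f"ICC={icc:.2f}: ~{n:.0f} per cluster, ~{k:.0f} clusters, effective n ~{eff:.0f}")
# Consistent with the abstract: a higher ICC pushes toward fewer subjects
# per cluster and more clusters.
```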

14.
BACKGROUND: Primary care research often involves clustered samples in which subjects are randomized at a group level but analyzed at an individual level. Analyses that do not take this clustering into account may report significance where none exists. This article explores the causes, consequences, and implications of cluster data. METHODS: Using a case study with accompanying equations, we show that clustered samples are not as statistically efficient as simple random samples. RESULTS: Similarity among subjects within preexisting groups or clusters reduces the variability of responses in a clustered sample, which erodes the power to detect true differences between study arms. This similarity is expressed by the intracluster correlation coefficient, or ρ (rho), which compares the within-group variance with the between-group variance. Rho is used in equations along with the cluster size and the number of clusters to calculate the effective sample size (ESS) in a clustered design. The ESS should be used to calculate power in the design phase of a clustered study. Appropriate accounting for similarities among subjects in a cluster almost always results in a net loss of power, requiring increased total subject recruitment. Increasing the number of clusters enhances power more efficiently than does increasing the number of subjects within a cluster. CONCLUSIONS: Primary care research frequently uses clustered designs, whether consciously or unconsciously. Researchers must recognize and understand the implications of clusters to avoid costly sample size errors.
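The effective sample size described here is the standard design-effect adjustment. A small sketch with illustrative numbers:

```python
def effective_sample_size(n_clusters, cluster_size, rho):
    """Effective sample size of a clustered design: total n divided by the
    design effect 1 + (m - 1) * rho, where m is the cluster size and rho is
    the intracluster correlation coefficient."""
    design_effect = 1 + (cluster_size - 1) * rho
    return n_clusters * cluster_size / design_effect

# 20 practices with 30 patients each and rho = 0.05:
# 600 recruited subjects behave like roughly 245 independent ones.
print(round(effective_sample_size(20, 30, 0.05)))

# The same 600 subjects spread over twice as many, half-size clusters
# yield a larger effective sample size (~353), illustrating why adding
# clusters helps power more than enlarging clusters.
print(round(effective_sample_size(40, 15, 0.05)))
```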

15.
This paper considers quantitatively the extent to which the interaction or confounding effects of covariates may influence the design of case-control studies with particular reference to sample requirements and the role of matching. For the most part, attention is confined to a dichotomous exposure variable, and a single dichotomous covariate. Adjustment for confounding variables appears to have little effect on the power of a study unless they are strongly (odds ratio of 5 or more) related to both the disease and the exposure of interest, and only in similar circumstances will matching be of appreciable value. Matching also makes only a small improvement in the power to detect interaction effects, except under fairly extreme conditions. Both to control confounding and to detect interaction, the effect of matching may sometimes be to reduce the power of a study. The difference in power between matched and unmatched studies diminishes rapidly as the control-to-case ratio is increased. The implications of interaction effects for sample size requirements are more important. If one aim of a study is to detect interactions, the size of the study will have to be at least four times larger than if attention were confined to detecting main effects of the same magnitude. These conclusions are based on a quantitative evaluation of a wide range of possible situations.  相似文献   
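The "at least four times larger" conclusion follows from the variance of a difference-of-differences. The sketch below illustrates the factor of four under a simplified balanced 2×2 layout with equal cell variances (an assumption for illustration; the paper itself works with case-control odds ratios).

```python
# Balanced design: N subjects split over the 2 x 2 cells of exposure x covariate,
# with outcome variance sigma^2 per subject in every cell.
N, sigma2 = 400, 1.0
cell_n = N / 4

# Main effect of exposure: difference of two means, each based on N/2 subjects.
var_main = sigma2 / (N / 2) + sigma2 / (N / 2)        # = 4 * sigma2 / N

# Interaction: difference of differences, a contrast of four cell means.
var_interaction = 4 * sigma2 / cell_n                 # = 16 * sigma2 / N

print(var_interaction / var_main)   # 4.0 -> detecting an interaction of the same
                                    # magnitude needs ~4x the sample for equal power
```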

16.
‘Qualitative’ or ‘crossover’ interactions arise when a new treatment, compared with a control treatment, is beneficial in some subsets of patients and harmful in other subsets. We present a new range test for crossover interactions and compare it with the likelihood ratio test developed by Gail and Simon. The range test has greater power when the new treatment is harmful in only a few subsets, whereas the likelihood ratio test has greater power when the new treatment is harmful in several subsets. We provide power tables for both tests to facilitate sample size calculations for designing experiments to detect qualitative interactions and for interpreting the results of clinical trials.  相似文献   

17.
The efficiency of the popular case-control design in gene-longevity association studies needs to be verified because, different from a binary trait, longevity represents only the extreme end of the continuous life span distribution without a clear cutoff for defining the phenotype. In this paper, the authors use the current Danish life tables to simulate individual life span by using a variety of scenarios and assess the empirical power for different sample sizes when cases are defined as centenarians or as nonagenarians. Results show that, although using small samples of centenarians (several hundred) provides power to detect only common alleles with large effects (a >20% reduction in hazard rate), large samples of centenarians (>1,000) achieve power to capture genes responsible for minor effects (5%-10% hazard reduction depending on the mode of inheritance). Although the method provides good power for rare alleles with multiplicative or dominant effects, it performs poorly for rare recessive alleles. Power is drastically reduced when nonagenarians are considered cases, with a more than 5-fold difference in the size of the case sample required to achieve comparable power as that found with centenarians.  相似文献   

18.
BACKGROUND: The reliability of biomarkers profoundly impacts validity of their use in epidemiology and can have serious implications for study power and the ability to find true associations. We assessed reliability of plasma carotenoid levels over time and how it could influence study power through sample size and effect-size. METHODS: Plasma carotenoid levels were measured in a cohort study of 1323 women participating in the control arm of the Women's Healthy Eating and Living Study. We compared mean plasma levels at baseline, year 1, and year 4 of the study for alpha-carotene, beta-carotene, lycopene, lutein, and beta-cryptoxanthin. Reliability of these levels over time was assessed by Spearman correlations and intraclass correlation. RESULTS: We found limited variation in mean levels between any 2 time points. Variation did not exceed 8% for lycopene, lutein, and beta-cryptoxanthin, 15% for alpha-carotene, and 18% for beta-carotene. Spearman correlations for individual carotenoids over time varied between 0.50 and 0.80, with lycopene having the lowest correlation. Intraclass correlations ranged from 0.47 to 0.66 for carotenoids. CONCLUSION: Intraclass correlations for plasma carotenoids over a period of several years are acceptable for epidemiologic studies. However, such variation is enough to decrease statistical power and increase the sample size needed to detect a given effect.  相似文献   
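One way to quantify the abstract's conclusion: if the biomarker's reliability (intraclass correlation) is R and only the exposure is measured with error, an observed correlation is attenuated by roughly the square root of R, so the sample size needed to detect the attenuated association grows by roughly 1/R. The sketch below uses the standard Fisher-z approximation and a hypothetical true correlation; it is illustrative, not the authors' calculation.

```python
import numpy as np
from scipy.stats import norm

def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect a correlation r (Fisher z test)."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return ((z_a + z_b) / np.arctanh(r)) ** 2 + 3

true_r = 0.30                       # hypothetical true exposure-outcome correlation
for icc in (0.47, 0.66, 1.00):      # the reported ICC range, plus perfect reliability
    observed_r = true_r * np.sqrt(icc)   # attenuation when only exposure has error
    print(icc, round(n_for_correlation(observed_r)))
# Lower ICC -> smaller observable correlation -> larger required sample size.
```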

19.
An adaptive sample size adjustment method for clinical trials
Objective: To introduce an adaptive sample size adjustment method for clinical trials and to examine the type I error rate and power of the statistical analysis performed after sample size adjustment. Methods: Monte Carlo simulation was used to study how the size of n1 affects the final sample size Nf and to estimate the bias of the final variance estimate; simulation was also used to evaluate the type I error rate and power of the statistical analysis after sample size adjustment. Results: (1) The simulations show that the final sample size Nf obtained with this adjustment method is very close to its true value N0, especially when the sample size adjustment is performed at π = 0.4. (2) The simulations also show that the corrected t-test applied after sample size adjustment not only controls the type I error rate α effectively but also fully achieves the target power (1 − β). Conclusion: Although the results for this adjustment method were obtained under a general two-sample one-sided t-test, the method can also be applied to clinical trials with superiority or non-inferiority designs.
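A minimal sketch of the kind of recalculation the abstract (translated above) describes: re-estimate the per-arm sample size for a one-sided two-sample comparison by plugging the interim pooled variance estimate into the usual normal-approximation formula. This is the generic textbook expression with illustrative numbers, not necessarily the authors' exact rule.

```python
import math
from scipy.stats import norm

def recalculated_n_per_arm(s1_squared, delta, alpha=0.025, power=0.90):
    """Sample size re-estimation for a one-sided two-sample comparison,
    using the first-stage pooled variance estimate s1^2 in place of the
    planning value. delta is the clinically relevant mean difference."""
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(power)
    return math.ceil(2 * (z_a + z_b) ** 2 * s1_squared / delta ** 2)

# Planning assumed sigma^2 = 1.0; the interim data suggest more variability,
# so the final sample size Nf is adjusted upward.
print(recalculated_n_per_arm(s1_squared=1.0, delta=0.5))
print(recalculated_n_per_arm(s1_squared=1.4, delta=0.5))
```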

20.
Monitoring clinical trials often requires examining the interim findings to see if the sample size originally specified in the protocol will provide the required power against the null hypothesis when the alternative hypothesis is true, and to increase the sample size if necessary. This paper presents a new method, based on the overall response rate, for carrying out interim power evaluations when the observations have binomial distributions, without unblinding the treatment assignments or materially affecting the type I error rate. Simulation study results confirm the performance of the method.  相似文献   
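A minimal sketch of a blinded re-estimation of the general kind described here for binomial outcomes: take the pooled (overall) response rate from the interim data, keep the originally hypothesized treatment difference, and recompute the per-arm sample size with the standard two-proportion formula. This is a generic approach with illustrative rates, not the paper's specific method.

```python
import math
from scipy.stats import norm

def blinded_reestimated_n(overall_rate, delta, alpha=0.05, power=0.80):
    """Per-arm sample size recomputed from the blinded overall response rate.
    Assumes the originally hypothesized difference delta still holds, so the
    two arm-specific rates straddle the pooled rate."""
    p1 = overall_rate + delta / 2
    p2 = overall_rate - delta / 2
    p_bar = overall_rate
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / delta ** 2)

# The protocol assumed a pooled rate of 0.30; the blinded interim data show 0.40,
# so more patients per arm are needed to keep power for the same delta.
print(blinded_reestimated_n(overall_rate=0.30, delta=0.15))
print(blinded_reestimated_n(overall_rate=0.40, delta=0.15))
```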
