Similar Literature (20 records)
1.
Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists have questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has limitations with small sample sizes. We used a pooled method in the nonparametric bootstrap test that may overcome the problems related to small samples in hypothesis testing. The present study compared the nonparametric bootstrap test with pooled resampling against the corresponding parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means than the unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test, while maintaining the type I error probability under all conditions except the Cauchy and extremely variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. The nonparametric bootstrap paired t-test also performed better than the alternatives, and the nonparametric bootstrap test provided a benefit over the exact Kruskal–Wallis test. We suggest using the nonparametric bootstrap test with pooled resampling for comparing paired or unpaired means and for validating one-way analysis of variance results for non-normal data in small-sample studies. Copyright © 2017 John Wiley & Sons, Ltd.
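A minimal sketch of the pooled-resampling idea for two unpaired means (illustrative only; the authors' exact resampling scheme is not reproduced here, and the function name and lognormal example are ours). Pooling both samples before resampling enforces the null hypothesis of equal means:

```python
import numpy as np
from scipy import stats

def pooled_bootstrap_t_test(x, y, n_boot=10_000, seed=0):
    """Two-sided bootstrap test for a difference in means using
    pooled resampling: both bootstrap samples are drawn from the
    combined data, which enforces the null of equal means."""
    rng = np.random.default_rng(seed)
    t_obs, _ = stats.ttest_ind(x, y, equal_var=False)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_boot):
        bx = rng.choice(pooled, size=len(x), replace=True)
        by = rng.choice(pooled, size=len(y), replace=True)
        t_b, _ = stats.ttest_ind(bx, by, equal_var=False)
        if abs(t_b) >= abs(t_obs):
            count += 1
    return (count + 1) / (n_boot + 1)   # add-one smoothing avoids zero p-values

# Example: small skewed (lognormal) samples
rng = np.random.default_rng(1)
x = rng.lognormal(0.0, 1.0, size=8)
y = rng.lognormal(0.5, 1.0, size=8)
print(pooled_bootstrap_t_test(x, y))
```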

2.
Clinical trials with multiple primary time-to-event outcomes are common. The use of multiple endpoints creates challenges in the evaluation of power and the calculation of sample size during trial design, particularly for time-to-event outcomes. We present methods for calculating the power and sample size of randomized superiority clinical trials with two correlated time-to-event outcomes. We do this under both independent and dependent censoring, for three censoring scenarios: (i) both events are non-fatal; (ii) one event is fatal (semi-competing risk); and (iii) both are fatal (competing risk). We derive the bivariate log-rank test in all three censoring scenarios and investigate the behavior of power and the required sample sizes. Separate evaluations are conducted for two inferential goals: evaluation of whether the test intervention is superior to the control on (1) all of the endpoints (multiple co-primary) or (2) at least one endpoint (multiple primary). Copyright © 2017 John Wiley & Sons, Ltd.
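Under a bivariate normal approximation to the two standardized log-rank statistics, the power for both inferential goals reduces to bivariate normal probabilities. A sketch assuming hypothetical means mu1, mu2 under the alternative and correlation rho; note that no multiplicity adjustment is applied for the at-least-one goal here:

```python
import numpy as np
from scipy import stats

def bivariate_logrank_power(mu1, mu2, rho, alpha=0.025):
    """Approximate power for two correlated one-sided log-rank tests:
    (Z1, Z2) ~ bivariate normal with unit variances and correlation rho;
    mu1, mu2 are the means of the statistics under the alternative."""
    c = stats.norm.ppf(1 - alpha)          # common one-sided critical value
    mvn = stats.multivariate_normal(mean=[mu1, mu2],
                                    cov=[[1.0, rho], [rho, 1.0]])
    p_both_below = mvn.cdf([c, c])         # P(Z1 <= c, Z2 <= c)
    p1 = stats.norm.cdf(c - mu1)           # P(Z1 <= c)
    p2 = stats.norm.cdf(c - mu2)           # P(Z2 <= c)
    co_primary = 1 - p1 - p2 + p_both_below   # reject on BOTH endpoints
    multiple_primary = 1 - p_both_below       # reject on AT LEAST ONE endpoint
    return co_primary, multiple_primary

print(bivariate_logrank_power(mu1=2.8, mu2=2.8, rho=0.5))
```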

3.
The standard procedure to assess genetic equilibrium is a χ2 test of goodness-of-fit. As with any statistical procedure of that type, the null hypothesis is that the distribution underlying the data agrees with the model. Thus, a significant result indicates incompatibility of the observed data with the model, which is clearly at variance with the aim in the majority of applications: to exclude the existence of gross violations of the equilibrium condition. In current practice, this basic logical difficulty is sidestepped by raising the significance bound for the P-value (e.g., from 5% to 10%) and inferring compatibility of the data with Hardy–Weinberg equilibrium (HWE) from an insignificant result. Unfortunately, such direct inversion of a statistical testing procedure fails to produce a valid test of the hypothesis of interest, namely, that the data are in sufficiently good agreement with the model under which the P-value is calculated. We present a logically unflawed solution to the problem of establishing (approximate) compatibility of an observed genotype distribution with HWE. The test is available in one- and two-sided versions. For both versions, we provide tools for exact power calculation. We demonstrate the merits of the new approach through comparison with the traditional χ2 goodness-of-fit test in 2 × 60 genotype distributions from 43 published genetic studies of complex diseases where departure from HWE was noted in either the case or control sample. In addition, we show that the new test is useful for the analysis of genome-wide association studies. Genet. Epidemiol. 33:569–580, 2009. © 2009 Wiley-Liss, Inc.
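For contrast, a sketch of the traditional χ2 goodness-of-fit test for HWE at a biallelic marker, i.e. the procedure whose direct inversion the paper argues against (the genotype counts in the example are hypothetical):

```python
import numpy as np
from scipy import stats

def hwe_chisq(n_aa, n_ab, n_bb):
    """Classical chi-square goodness-of-fit test of Hardy-Weinberg
    equilibrium for a biallelic genotype distribution. A small P-value
    indicates incompatibility with HWE; an insignificant result does
    NOT establish compatibility, which is the paper's point."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)        # estimated allele frequency of A
    expected = np.array([n * p**2, 2 * n * p * (1 - p), n * (1 - p)**2])
    observed = np.array([n_aa, n_ab, n_bb])
    chi2 = ((observed - expected) ** 2 / expected).sum()
    p_value = stats.chi2.sf(chi2, df=1)    # df = 3 categories - 1 - 1 estimated parameter
    return chi2, p_value

print(hwe_chisq(n_aa=120, n_ab=60, n_bb=20))
```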

4.
The power of a chi-square test, and thus the required sample size, is a function of the noncentrality parameter, which can be obtained as the limiting expectation of the test statistic under an alternative hypothesis. Herein, we apply this principle to derive simple expressions for two tests that are commonly applied to discrete ordinal data. The Wilcoxon rank sum test for the equality of distributions in two groups is algebraically equivalent to the Mann–Whitney test; the Kruskal–Wallis test applies to multiple groups. These tests are equivalent to a Cochran–Mantel–Haenszel mean score test using rank scores for a set of C discrete categories. Although various authors have assessed the power function of the Wilcoxon and Mann–Whitney tests, herein it is shown that the power of these tests with discrete observations, that is, with tied ranks, is readily provided by the power function of the corresponding Cochran–Mantel–Haenszel mean score test for two and R > 2 groups. These expressions yield results virtually identical to those derived previously for rank scores and also apply to other score functions. The Cochran–Armitage test for trend assesses whether there is a monotonically increasing or decreasing trend in the proportions with a positive outcome or response over the C ordered categories of an ordinal independent variable, for example, dose. Herein, it is shown that the power of this test is a function of the slope of the response probabilities over the ordinal scores assigned to the groups, which yields simple expressions for the power of the test. Copyright © 2011 John Wiley & Sons, Ltd.
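The underlying principle in miniature: the critical value comes from the central chi-square and the power from the noncentral chi-square (the noncentrality value in the example is hypothetical):

```python
from scipy import stats

def chisq_power(noncentrality, df=1, alpha=0.05):
    """Power of a chi-square test with the given noncentrality
    parameter: P(noncentral chi-square > central critical value)."""
    crit = stats.chi2.ppf(1 - alpha, df)
    return stats.ncx2.sf(crit, df, noncentrality)

# The noncentrality typically scales linearly with n, so the required
# sample size solves power(n * psi_1) = 1 - beta for the per-observation
# noncentrality psi_1 implied by the alternative.
print(chisq_power(noncentrality=7.85))   # approx 0.80 for df=1, alpha=0.05
```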

5.
The most common data structures in biomedical studies are matched or unmatched designs; data structures resulting from a hybrid of the two can create challenges for statistical inference. The question may arise whether to use parametric or nonparametric methods on the hybrid data structure. The Early Treatment for Retinopathy of Prematurity study was a multicenter clinical trial sponsored by the National Eye Institute, and its design produced data requiring a statistical method of a hybrid nature. An infant in this multicenter randomized clinical trial had high-risk prethreshold retinopathy of prematurity that was eligible for treatment in one or both eyes at entry into the trial. During follow-up, recognition visual acuity was assessed for both eyes. Data from both eyes (matched) and from only one eye (unmatched) were eligible to be used in the trial. The new hybrid nonparametric method is a meta-analysis based on combining the Hodges–Lehmann estimates of treatment effects from the Wilcoxon signed rank and rank sum tests. As a comparator, we used the classic meta-analysis with the t-test method to combine estimates of treatment effects from the paired and two-sample t-tests. We used simulations to calculate the empirical size and power of the test statistics, as well as the bias, mean squared error, and confidence interval width of the corresponding estimators. The proposed method provides an effective tool to evaluate data from clinical trials and similar comparative studies. Copyright © 2013 John Wiley & Sons, Ltd.
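A sketch of the two Hodges–Lehmann estimators and a fixed-effect inverse-variance combination. How the variances are obtained (e.g., from the rank tests' confidence intervals, as in the paper, or from a bootstrap) is left to the caller, so the `combine` inputs below are assumptions:

```python
import numpy as np

def hl_paired(d):
    """Hodges-Lehmann estimate from paired differences (matched eyes):
    the median of the Walsh averages (d_i + d_j) / 2, i <= j."""
    d = np.asarray(d, float)
    walsh = [(d[i] + d[j]) / 2 for i in range(len(d)) for j in range(i, len(d))]
    return np.median(walsh)

def hl_two_sample(x, y):
    """Hodges-Lehmann shift estimate (unmatched eyes):
    the median of all pairwise differences x_i - y_j."""
    return np.median(np.subtract.outer(np.asarray(x, float), np.asarray(y, float)))

def combine(est_matched, var_matched, est_unmatched, var_unmatched):
    """Fixed-effect inverse-variance combination of the matched and
    unmatched treatment-effect estimates into one hybrid estimate."""
    w1, w2 = 1 / var_matched, 1 / var_unmatched
    return (w1 * est_matched + w2 * est_unmatched) / (w1 + w2)
```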

6.
In randomised controlled trials of treatments for late-stage cancer, it is common for control arm patients to receive the experimental treatment around the point of disease progression. This treatment switching can dilute the estimated treatment effect on overall survival and affect the assessment of a treatment's benefit in health economic evaluations. The rank-preserving structural failure time model of Robins and Tsiatis (Comm. Stat. 20:2609–2631) offers a potential solution to this problem and is typically implemented using the logrank test. However, in the presence of substantial switching, this test can have low power because the hazard ratio is not constant over time. Schoenfeld (Biometrika 68:316–319) showed that when the hazard ratio is not constant, weighted versions of the logrank test become optimal. We present a weighted logrank test statistic for the late-stage cancer trial context, given the treatment switching pattern and working assumptions about the underlying hazard function in the population. Simulations suggest that the weighted approach can lead to large efficiency gains in either an intention-to-treat or a causal rank-preserving structural failure time model analysis compared with the unweighted approach. Furthermore, violation of the working assumptions used in the derivation of the weights only affects the efficiency of the estimates and does not induce bias or inflate the type I error rate. The weighted logrank test statistic should therefore be considered for use as part of a careful secondary, exploratory analysis of trial data affected by substantial treatment switching. © 2015 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
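A generic weighted logrank statistic as a sketch. The paper derives specific weights from the switching pattern and working hazard assumptions; here the weight function is simply a user-supplied argument, and weight_fn(t) = 1 recovers the ordinary logrank test:

```python
import numpy as np

def weighted_logrank(time, event, group, weight_fn=lambda t: 1.0):
    """Two-sample weighted logrank statistic.
    time: event/censoring times; event: 1 = event, 0 = censored;
    group: 0/1 arm indicator; weight_fn: weight applied at each
    distinct event time. Returns an approximately N(0, 1) statistic."""
    time, event, group = map(np.asarray, (time, event, group))
    num, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()                                    # total at risk
        n1 = (at_risk & (group == 1)).sum()                  # at risk in arm 1
        d = ((time == t) & (event == 1)).sum()               # events at t
        d1 = ((time == t) & (event == 1) & (group == 1)).sum()
        e1 = d * n1 / n                                      # expected events, arm 1
        v = d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1) if n > 1 else 0.0
        w = weight_fn(t)
        num += w * (d1 - e1)
        var += w**2 * v                                      # hypergeometric variance
    return num / np.sqrt(var)
```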

7.
The Wilcoxon–Mann–Whitney (WMW) test is often used to compare the means or medians of two independent, possibly nonnormal distributions. For this problem, the true significance level of the large-sample approximate version of the WMW test is known to be sensitive to differences in the shapes of the distributions. Based on a wide-ranging simulation study, our paper shows that the lack of robustness of this test is more serious than previously thought. In particular, small differences in variances and moderate degrees of skewness can produce large deviations from the nominal type I error rate. This is further exacerbated when the two distributions have different degrees of skewness. Other rank-based methods such as the Fligner–Policello (FP) test and the Brunner–Munzel (BM) test perform similarly, although the BM test is generally better. By considering the WMW test as a two-sample t-test on ranks, we explain the results by noting some undesirable properties of the rank transformation. In practice, the ranked samples should be examined and found to sufficiently satisfy reasonable symmetry and variance homogeneity before the test results are interpreted. Copyright © 2009 John Wiley & Sons, Ltd.
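The rank-transformation view is easy to make concrete: rank the combined sample, then run an ordinary two-sample t-test on the ranks (a minimal sketch; midranks are used for ties). Inspecting the two sets of ranks for symmetry and variance homogeneity is then natural:

```python
import numpy as np
from scipy import stats

def rank_t_test(x, y):
    """Large-sample WMW viewed as a two-sample t-test on the
    (mid)ranks of the combined sample."""
    combined = np.concatenate([x, y])
    ranks = stats.rankdata(combined)        # midranks handle ties
    rx, ry = ranks[:len(x)], ranks[len(x):]
    return stats.ttest_ind(rx, ry)          # compare with stats.mannwhitneyu(x, y)
```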

8.
When conducting a meta-analysis of standardized mean differences (SMDs), it is common to use Cohen's d, or its variants, which require equal variances in the two arms of each study. While interpretation of these SMDs is simple, this alone should not be used as a justification for assuming equal variances. Until now, researchers have either used an F-test for each individual study or perhaps even conveniently ignored such tools altogether. In this paper, we propose a meta-analysis of ratios of sample variances to assess whether the equal-variances assumption is justified prior to a meta-analysis of SMDs. Quantile–quantile plots, an omnibus test for equal variances, or an overall meta-estimate of the ratio of variances can all be used to formally justify the use of less common methods when evidence of unequal variances is found. The methods in this paper are simple to implement, and the validity of the approaches is reinforced by simulation studies and an application to a real data set. Copyright © 2016 John Wiley & Sons, Ltd.
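A sketch of one such overall meta-estimate, based on the large-sample variance of a log variance ratio; the paper's exact estimator and weighting may differ, and the study inputs in the example are hypothetical:

```python
import numpy as np
from scipy import stats

def meta_log_variance_ratio(s1_sq, n1, s2_sq, n2):
    """Fixed-effect meta-estimate of the log ratio of arm variances,
    using Var[log(s1^2 / s2^2)] ~ 2/(n1 - 1) + 2/(n2 - 1).
    A pooled 95% CI covering 0 supports the equal-variance assumption
    behind Cohen's d."""
    s1_sq, n1 = np.asarray(s1_sq, float), np.asarray(n1, float)
    s2_sq, n2 = np.asarray(s2_sq, float), np.asarray(n2, float)
    y = np.log(s1_sq / s2_sq)               # per-study log variance ratio
    v = 2 / (n1 - 1) + 2 / (n2 - 1)         # large-sample variance
    w = 1 / v
    est = np.sum(w * y) / np.sum(w)
    se = np.sqrt(1 / np.sum(w))
    z = stats.norm.ppf(0.975)
    return est, (est - z * se, est + z * se)

# Hypothetical two-study example: arm variances and sample sizes
print(meta_log_variance_ratio([1.2, 0.9], [30, 50], [1.0, 1.1], [28, 45]))
```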

9.
Repeated measurement designs have been widely used in various randomized controlled trials for evaluating long-term intervention efficacy. For some clinical trials, the primary research question is a comparison of two treatments at a fixed time point, using a t-test. Although simple, robust, and convenient, this type of analysis fails to utilize a large amount of the collected information. Alternatively, the mixed-effects model is commonly used for repeated measurement data. It models all available data jointly and allows explicit assessment of the overall treatment effects across the entire time spectrum. In this paper, we propose an analytic strategy for longitudinal clinical trial data in which the mixed-effects model is coupled with a model selection scheme. The proposed test statistics not only make full use of all available data but also utilize the information from the optimal model deemed appropriate for the data. The performance of the proposed method under various setups, including different missing-data mechanisms, is evaluated via extensive Monte Carlo simulations. Our numerical results demonstrate that the proposed analytic procedure is more powerful than the t-test when the primary interest is to test for the treatment effect at the last time point. Simulations also reveal that the proposed method outperforms the usual mixed-effects model for testing the overall treatment effects across time. In addition, the proposed framework is more robust and flexible in dealing with missing data compared with several competing methods. The utility of the proposed method is demonstrated by analyzing a clinical trial on the cognitive effect of testosterone in geriatric men with low baseline testosterone levels. Copyright © 2015 John Wiley & Sons, Ltd.

10.
The problem of testing symmetry about zero has a long and rich history in the statistical literature. We introduce a new test that sequentially discards observations whose absolute value is below increasing thresholds defined by the data. McNemar's statistic is obtained at each threshold, and the largest is used as the test statistic. We obtain the exact distribution of this maximally selected McNemar statistic and provide tables of critical values and a program for computing p-values. Power is compared with the t-test, the Wilcoxon signed rank test, and the sign test. The new test, MM, is slightly less powerful than the t-test and Wilcoxon signed rank test for symmetric normal distributions with nonzero medians, and substantially more powerful than all three tests for asymmetric mixtures of normal random variables with or without zero medians. The motivation for this test derives from the need to appraise the safety profile of new medications. If pre- and post-treatment safety measures are obtained, then under the null hypothesis the variables are exchangeable and the distribution of their difference is symmetric about a zero median. Large pre–post differences are the major concern of a safety assessment; the discarded small observations are not particularly relevant to safety and can reduce power to detect important asymmetry. The new test was applied to data from an on-road driving study performed to determine whether a hypnotic, a drug used to promote sleep, has next-day residual effects. Copyright © 2012 John Wiley & Sons, Ltd.
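A sketch of the maximally selected McNemar procedure. The paper derives the exact null distribution; the sign-flip simulation below is a stand-in that exploits the exchangeability of signs under symmetry about zero:

```python
import numpy as np

def max_mcnemar(d, n_sim=10_000, seed=0):
    """Maximally selected McNemar statistic for symmetry about zero.
    Thresholds are the sorted absolute differences; at each threshold,
    observations below it are discarded and McNemar's chi-square
    (b - c)^2 / (b + c) is computed from the signs of the remainder.
    Returns (statistic, approximate sign-flip p-value)."""
    d = np.asarray(d, float)
    d = d[d != 0]

    def statistic(vals):
        best = 0.0
        for thr in np.sort(np.abs(vals)):
            kept = vals[np.abs(vals) >= thr]
            b, c = (kept > 0).sum(), (kept < 0).sum()
            if b + c > 0:
                best = max(best, (b - c) ** 2 / (b + c))
        return best

    obs = statistic(d)
    rng = np.random.default_rng(seed)
    null = np.array([statistic(np.abs(d) * rng.choice([-1, 1], size=len(d)))
                     for _ in range(n_sim)])
    return obs, (1 + np.sum(null >= obs)) / (n_sim + 1)
```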

11.
Gene expression (GE) levels have important biological and clinical implications and are regulated by copy number alterations (CNAs). Modeling the regulatory relationships between GEs and CNAs facilitates the understanding of disease biology and can also have value in translational medicine. The expression level of a gene can be regulated by its cis-acting as well as trans-acting CNAs, and the set of trans-acting CNAs is usually not known, which poses a high-dimensional selection and estimation problem. Most existing studies share a common limitation in that they cannot accommodate long-tailed distributions or contamination of GE data. In this study, we develop a high-dimensional robust regression approach to infer the regulatory relationships between GEs and CNAs. A high-dimensional regression model is used to accommodate the effects of both cis-acting and trans-acting CNAs. A density power divergence loss function is used to accommodate long-tailed GE distributions and contamination. Penalization is adopted for regularized estimation and selection of relevant CNAs. The proposed approach is effectively realized using a coordinate descent algorithm. Simulation shows that it has competitive performance compared with the nonrobust benchmark and the robust LAD (least absolute deviation) approach. We analyze TCGA (The Cancer Genome Atlas) data on cutaneous melanoma and study GE–CNA regulation in the RAP (regulation of apoptosis) pathway, which further demonstrates the satisfactory performance of the proposed approach.

12.
Standard meta-analytic theory assumes that study outcomes are normally distributed with known variances. However, methods derived from this theory are often applied to effect sizes having skewed distributions with estimated variances. Both shortcomings can be largely overcome by first applying a variance-stabilizing transformation. Here we concentrate on study outcomes with Student t-distributions and show that we can better estimate parameters of fixed or random effects models with confidence intervals using stable weights, or with profile approximate likelihood intervals following stabilization. We achieve even better coverage with a finite-sample bias correction. Further, a simple t-interval provides very good coverage of an overall effect size without estimation of the inter-study variance. We illustrate the methodology on two meta-analytic studies from the medical literature: the effect of salt reduction on systolic blood pressure and the effect of opioids for the relief of breathlessness. Substantial simulation studies compare traditional methods with those newly proposed. The theoretical results can be applied to other study outcomes for which an effective variance stabilizer is available. Copyright © 2012 John Wiley & Sons, Ltd.

13.
The sandwich estimator in the generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small-sample properties of GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z-test should be avoided in the analysis of CRTs with few clusters, even when bias-corrected sandwich estimators are used. With a t-distribution approximation, the Kauermann and Carroll (KC) correction can keep the test size at nominal levels even when the number of clusters is as low as 10, and is robust to moderate variation in cluster sizes. However, in cases with large variation in cluster sizes, the Fay and Graubard (FG) correction should be used instead. Furthermore, we derive a formula to calculate the power and the minimum total number of clusters needed when using the t-test and KC correction in CRTs with binary outcomes. The power levels predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that, with appropriate control of type I error rates under small sample sizes, the GEE approach is recommended in CRTs with binary outcomes because of its fewer assumptions and robustness to misspecification of the covariance structure. Copyright © 2014 John Wiley & Sons, Ltd.

14.
Testing the association between single-nucleotide polymorphism (SNP) effects and a response is often carried out through kernel machine methods based on least squares, such as the sequence kernel association test (SKAT). However, these least-squares procedures are designed for a normally distributed conditional response, which may not apply. Other robust procedures such as the quantile regression kernel machine (QRKM) restrict the choice of the loss function and only allow inference on conditional quantiles. We propose a general and robust kernel association test that allows a flexible choice of the loss function, makes no distributional assumptions, and has SKAT and QRKM as special cases. We evaluate our proposed robust association test (RobKAT) across various data distributions through a simulation study. When errors are normally distributed, RobKAT controls the type I error and shows power comparable with SKAT. In all other distributional settings investigated, our robust test has similar or greater power than SKAT. Finally, we apply our robust testing method to data from the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) clinical trial to detect associations between selected genes, including the major histocompatibility complex (MHC) region on chromosome six, and neurotropic herpesvirus antibody levels in schizophrenia patients. RobKAT detected significant associations with four SNP sets (HST1H2BJ, MHC, POM12L2, and SLC17A1), three of which were undetected by SKAT.

15.
Many different methods have been proposed for the analysis of cluster-randomized trials (CRTs) over the last 30 years. However, the evaluation of methods for overdispersed count data has been based mostly on the comparison of results using empirical data, i.e., when the true model parameters are not known. In this study, we assess via simulation the performance of five methods for the analysis of counts in situations similar to real community-intervention trials. We used the negative binomial distribution to simulate overdispersed counts of CRTs with two study arms, allowing the period of time under observation to vary among individuals. We assessed different sample sizes, degrees of clustering, and degrees of cluster-size imbalance. The compared methods are: (i) the two-sample t-test of cluster-level rates, (ii) generalized estimating equations (GEE) with empirical covariance estimators, (iii) GEE with model-based covariance estimators, (iv) generalized linear mixed models (GLMM), and (v) Bayesian hierarchical models (Bayes-HM). Variation in sample size and clustering led to differences between the methods in terms of coverage, significance, power, and random-effects estimation. GLMM and Bayes-HM performed better in general, with Bayes-HM producing less dispersed results for random-effects estimates, although upwardly biased when clustering was low. GEE showed higher power but anticonservative coverage and elevated type I error rates. Imbalance affected the overall performance of the cluster-level t-test and the GEE's coverage in small samples. Important effects arising from accounting for overdispersion are illustrated through the analysis of a community-intervention trial on Solar Water Disinfection in rural Bolivia. Copyright © 2009 John Wiley & Sons, Ltd.
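A sketch of the data-generating step and of method (i), the cluster-level t-test. All parameter values are hypothetical, and the paper's exact simulation mechanism may differ:

```python
import numpy as np
from scipy import stats

def simulate_crt_arm(n_clusters, mean_size, rate, dispersion, rng):
    """Simulate one arm of a CRT: overdispersed individual event counts
    from a negative binomial, with person-time varying per individual,
    aggregated to cluster-level rates."""
    cluster_rates = []
    for _ in range(n_clusters):
        m = rng.poisson(mean_size) + 1                 # cluster size
        t = rng.uniform(0.5, 1.5, size=m)              # person-time under observation
        mu = rate * t                                  # expected counts
        p = dispersion / (dispersion + mu)             # NB(n=dispersion, p) has mean mu
        counts = rng.negative_binomial(dispersion, p)
        cluster_rates.append(counts.sum() / t.sum())   # cluster-level rate
    return np.array(cluster_rates)

rng = np.random.default_rng(7)
arm0 = simulate_crt_arm(15, 40, rate=0.8, dispersion=1.5, rng=rng)
arm1 = simulate_crt_arm(15, 40, rate=0.5, dispersion=1.5, rng=rng)
# Method (i): two-sample t-test on cluster-level rates
print(stats.ttest_ind(arm0, arm1))
```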

16.
Confidence interval (CI) construction for the proportion/rate difference in paired binary data has become a standard procedure in many clinical trials and medical studies. When the sample size is small and incomplete data are present, asymptotic CIs may be dubious, and exact CIs are not yet available. In this article, we propose exact and approximate unconditional test-based methods for constructing CIs for the proportion/rate difference in the presence of incomplete paired binary data. Approaches based on one- and two-sided Wald tests are considered. Unlike asymptotic CI estimators, exact unconditional CI estimators always guarantee coverage probabilities at or above the pre-specified confidence level. Our empirical studies further show that (i) approximate unconditional CI estimators usually yield a shorter expected confidence width (ECW), with their coverage probabilities well controlled around the pre-specified confidence level; and (ii) the ECWs of the unconditional two-sided-test-based CI estimators are generally narrower than those of the unconditional one-sided-test-based CI estimators. Moreover, the ECWs of asymptotic CIs are not necessarily narrower than those of two-sided-test-based exact unconditional CIs. Two real examples are used to illustrate our methodologies. Copyright © 2008 John Wiley & Sons, Ltd.

17.
Standard methods for two-sample tests, such as the t-test and the Wilcoxon rank sum test, may lead to incorrect type I errors when applied to longitudinal or clustered data. Recent alternative two-sample tests for clustered data often require certain assumptions on the correlation structure and/or noninformative cluster size. In this paper, based on a novel pseudolikelihood for correlated data, we propose a score test that requires neither knowledge of the correlation structure nor the assumption that data are missing at random. The proposed score test can capture differences in the mean and variance between two groups simultaneously. We use projection theory to derive the limiting distribution of the test statistic, in which the covariance matrix can be empirically estimated. We conduct simulation studies to evaluate the proposed test and compare it with existing methods. To illustrate the usefulness of the proposed test, we use it to compare self-reported weight loss data from a friends' referral group with data from an Internet self-joining group.

18.
Allelic expression (AE) imbalance between the two alleles of a gene can be used to detect cis-acting regulatory SNPs (rSNPs) in individuals heterozygous for a transcribed SNP (tSNP). In this paper, we propose three tests for AE analysis focusing on phase-unknown data and any degree of linkage disequilibrium (LD) between the rSNP and tSNP: a test based on the minimum P-value of a one-sided F test and a two-sided t test (proposed previously for phase-unknown data), a test that combines the F and t tests, and a mixture-model-based test. We compare these three tests with the F and t tests and an existing regression-based test for phase-known data. We show that the ranking of the tests based on power depends most strongly on the magnitude of the LD between the rSNP and tSNP. For phase-unknown data, we find that under a range of scenarios, our proposed tests have higher power than the F and t tests when LD between the rSNP and tSNP is moderate (roughly 0.2 to 0.8). We further demonstrate that the presence of a second ungenotyped rSNP almost never invalidates the proposed tests or substantially changes their power rankings. For the detection of cis-acting regulatory SNPs using phase-unknown AE data, we recommend the F test when the rSNP and tSNP are in or near linkage equilibrium (LD below 0.2), the t test when the two SNPs are in strong LD (LD above 0.7), and the mixture-model-based test for intermediate LD levels (0.2 to 0.7). Genet. Epidemiol. 35:515–525, 2011. © 2011 Wiley-Liss, Inc.

19.
A noniterative sample size procedure is proposed for a general hypothesis test based on the t distribution, by modifying and extending Guenther's approach for the one-sample and two-sample t tests. The generalized procedure is employed to determine the sample size for treatment comparisons using the analysis of covariance (ANCOVA) and the mixed-effects model for repeated measures in randomized clinical trials. The sample size is calculated by adding a few simple correction terms to the sample size from the normal approximation, to account for the nonnormality of the t statistic and lower-order variance terms, which are functions of the covariates in the model; it does not, however, require specifying the covariate distribution. The noniterative procedure is suitable for superiority tests, noninferiority tests, and a special case of the tests for equivalence or bioequivalence, and generally yields the exact or nearly exact sample size estimate after rounding to an integer. The method for calculating the exact power of the two-sample t test with unequal variances in superiority trials is extended to equivalence trials. We also derive accurate power formulae for the ANCOVA and the mixed-effects model for repeated measures; the formula for ANCOVA is exact for normally distributed covariates. Numerical examples demonstrate the accuracy of the proposed methods, particularly in small samples.
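The flavor of such a correction in the simplest case, a one-sided two-sample t-test: take the normal-approximation sample size and add a small term for the nonnormality of the t statistic (a sketch of the classical Guenther-type adjustment, not the paper's generalized procedure):

```python
from math import ceil
from scipy import stats

def n_two_sample_t(delta, alpha=0.025, power=0.80):
    """Noniterative per-group sample size for a one-sided two-sample
    t-test. delta: standardized mean difference (mu1 - mu0) / sigma."""
    za, zb = stats.norm.ppf(1 - alpha), stats.norm.ppf(power)
    n_normal = 2 * ((za + zb) / delta) ** 2   # normal approximation
    return ceil(n_normal + za**2 / 4)         # simple correction term

print(n_two_sample_t(delta=0.5))   # 64, matching the standard exact answer here
```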

20.
Although the P value from a Wilcoxon–Mann–Whitney test is often reported for randomized experiments, it is rarely accompanied by a causal effect estimate and its confidence interval. The natural parameter for the Wilcoxon–Mann–Whitney test is the Mann–Whitney parameter, φ, which measures the probability that a randomly selected individual in the treatment arm will have a larger response than a randomly selected individual in the control arm (plus an adjustment for ties). We show that the Mann–Whitney parameter may be framed as a causal parameter and that it is not equal to a closely related and nonidentifiable causal effect, ψ, the probability that a randomly selected individual will have a larger response under treatment than under control (plus an adjustment for ties). We review the paradox, first expressed by Hand, that the ψ parameter may imply that the treatment is worse (or better) than control while the Mann–Whitney parameter shows the opposite. Unlike the Mann–Whitney parameter, ψ is nonidentifiable from a randomized experiment. We review some nonparametric assumptions that rule out Hand's paradox through bounds on ψ and use bootstrap methods to make inferences on those bounds. We explore the relationship of the proportional odds parameter to Hand's paradox, showing that the paradox may occur for proportional odds parameters between 1/9 and 9. Thus, large effects are needed to ensure that if treatment appears better by the Mann–Whitney parameter, then treatment improves responses in most individuals. We demonstrate these issues using a vaccine trial.
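A sketch of the plug-in estimate of the Mann–Whitney parameter (denoted φ above), including the tie adjustment; unlike the within-individual ψ, this quantity is identifiable from the two randomized arms:

```python
import numpy as np

def mann_whitney_phi(treat, control):
    """Estimate phi = P(Y > X) + 0.5 * P(Y = X) for a randomly chosen
    treatment response Y and control response X. Values above 1/2
    suggest treatment responses tend to be larger."""
    treat, control = np.asarray(treat), np.asarray(control)
    diffs = np.subtract.outer(treat, control)   # all treatment-control pairs
    greater = (diffs > 0).sum()
    ties = (diffs == 0).sum()
    return (greater + 0.5 * ties) / (len(treat) * len(control))

print(mann_whitney_phi([3, 5, 7, 9], [2, 4, 6, 8]))   # 0.625
```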
