Similar articles
 20 similar articles found (search time: 31 ms)
1.
Kang SH  Chen JJ 《Statistics in medicine》2000,19(16):2089-2100
This paper investigates an approximate unconditional test for non-inferiority between two independent binomial proportions. The P-value of the approximate unconditional test is evaluated using the maximum likelihood estimate of the nuisance parameter. In this paper, we clarify some differences in defining the rejection regions between the approximate unconditional and conventional conditional or unconditional exact test. We compare the approximate unconditional test with the asymptotic test and unconditional exact test by Chan (Statistics in Medicine, 17, 1403-1413, 1998) with respect to the type I error and power. In general, the type I errors and powers are in the decreasing order of the asymptotic, approximate unconditional and unconditional exact tests. In many cases, the type I errors are above the nominal level from the asymptotic test, and are below the nominal level from the unconditional exact test. In summary, when the non-inferiority test is formulated in terms of the difference between two proportions, the approximate unconditional test is the most desirable, because it is easier to implement and generally more powerful than the unconditional exact test and its size rarely exceeds the nominal size. However, when a test between two proportions is formulated in terms of the ratio of two proportions, such as a test of efficacy, more caution should be exercised in selecting a test procedure. The performance of the tests depends on the sample size and the range of plausible values of the nuisance parameter. Published in 2000 by John Wiley & Sons, Ltd.

2.
Kang SH  Ahn CW 《Statistics in medicine》2008,27(14):2524-2535
Asymptotic tests such as the Pearson chi-square test are unreliable for testing the homogeneity of two binomial probabilities in extremely unbalanced cases. Two exact tests (conditional and unconditional) are available as alternatives and can be implemented easily in StatXact 6.0. In equal sample cases it is well known that the unconditional exact test is more powerful than the conditional exact test. However, in this paper, we show that the opposite result holds in extremely unbalanced cases. The reason is that the peaks of the type I error occur at the extremes of the nuisance parameter when the imbalance among the sample sizes becomes severe. After we show that the conditional exact test is more powerful than the unconditional exact test in extremely unbalanced cases whose sample ratio is greater than 20, we compare the conditional exact test with the Berger and Boos approach (J. Amer. Stat. Assoc. 1994; 89:1012-1016) in which the supremum is taken over a confidence interval for the nuisance parameter. The Berger and Boos approach turns out to be slightly more powerful than the conditional exact test in extremely unbalanced data. A real example is provided.
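As a rough illustration of the unconditional exact approach this abstract refers to, the Python sketch below computes a two-sided p-value for homogeneity of two binomial proportions by maximising the tail probability over the common success probability (the nuisance parameter). The pooled z statistic, the grid search, and all function names are assumptions made for illustration; this is not the authors' code or the StatXact algorithm, and the Berger and Boos restriction of the supremum to a confidence interval is not implemented.

```python
from math import comb, sqrt

def pooled_z(x1, n1, x2, n2):
    """Absolute pooled-variance z statistic for H0: p1 == p2 (0 when undefined)."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    return 0.0 if var == 0 else abs(x1 / n1 - x2 / n2) / sqrt(var)

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def unconditional_exact_p(x1, n1, x2, n2, grid=200):
    """Two-sided unconditional exact p-value, maximised over a grid of the
    common success probability. The grid is an approximation to the supremum."""
    t_obs = pooled_z(x1, n1, x2, n2)
    # Tables at least as extreme as the observed one (small tolerance for ties).
    region = [(i, j) for i in range(n1 + 1) for j in range(n2 + 1)
              if pooled_z(i, n1, j, n2) >= t_obs - 1e-12]
    best = 0.0
    for step in range(1, grid):
        p = step / grid
        tail = sum(binom_pmf(i, n1, p) * binom_pmf(j, n2, p) for i, j in region)
        best = max(best, tail)
    return best

# Hypothetical data: 5/5 successes in one group, 0/5 in the other.
print(unconditional_exact_p(5, 5, 0, 5))
```

The enumeration costs on the order of (n1 + 1)(n2 + 1) terms per grid point, so this sketch is only practical for small samples.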

3.
The asymptotic Pearson's chi‐squared test and Fisher's exact test have long been the most widely used tests of association in 2×2 tables. Unconditional tests preserve the significance level and generally are more powerful than Fisher's exact test for moderate to small samples, but previously were disadvantaged by being computationally demanding. This disadvantage is now moot, as software to facilitate unconditional tests has been available for years. Moreover, Fisher's exact test with mid‐p adjustment gives about the same results as an unconditional test. Consequently, several better tests are available, and the choice of a test should depend only on its merits for the application involved. Unconditional tests and the mid‐p approach ought to be used more than they now are. The traditional Fisher's exact test should practically never be used. Copyright © 2009 John Wiley & Sons, Ltd.
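To make the difference between the traditional Fisher's exact test and its mid‐p adjustment concrete, here is a minimal Python sketch. It assumes the two-sided p-value is defined by probability ordering and that the mid‐p value subtracts half the probability of the observed table, which is one common convention among several; the function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(a, row1, row2, col1):
    """P(top-left cell = a) when both margins of the 2x2 table are fixed."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)

def fisher_p_values(a, b, c, d):
    """Two-sided Fisher exact p-value and a mid-p variant for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = hypergeom_pmf(a, row1, row2, col1)
    probs = [hypergeom_pmf(k, row1, row2, col1) for k in range(lo, hi + 1)]
    # Two-sided p-value: total probability of tables no more likely than the observed one.
    exact = sum(p for p in probs if p <= p_obs * (1 + 1e-12))
    # Mid-p (assumed convention): count only half of the observed table's probability.
    mid_p = exact - 0.5 * p_obs
    return exact, mid_p

# Hypothetical table: 3 of 4 successes in one group, 1 of 4 in the other.
exact, mid_p = fisher_p_values(3, 1, 1, 3)
print(exact, mid_p)
```

Because the mid‐p value is never larger than the exact p-value, the mid‐p test rejects at least as often, which is the source of its reduced conservatism.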

4.
This paper considers methods for testing for superiority or non-inferiority in active-control trials with binary data, when the relative treatment effect is expressed as an odds ratio. Three asymptotic tests for the log-odds ratio based on the unconditional binary likelihood are presented, namely the likelihood ratio, Wald and score tests. All three tests can be implemented straightforwardly in standard statistical software packages, as can the corresponding confidence intervals. Simulations indicate that the three alternatives are similar in terms of the Type I error, with values close to the nominal level. However, when the non-inferiority margin becomes large, the score test slightly exceeds the nominal level. In general, the highest power is obtained from the score test, although all three tests are similar and the observed differences in power are not of practical importance.
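Of the three tests named in this abstract, the Wald test is the simplest to write down. The Python sketch below assumes a one-sided non-inferiority hypothesis H0: OR <= margin against H1: OR > margin with margin < 1, and uses the usual large-sample standard error of the log odds ratio; the direction of the hypothesis, the function name, and the example counts are assumptions, and the likelihood ratio and score tests are not shown.

```python
from math import erf, log, sqrt

def wald_noninferiority(a, b, c, d, margin):
    """Wald test on the log odds ratio.
    a, b: successes and failures on the new treatment;
    c, d: successes and failures on the active control.
    All four counts must be non-zero for the standard error to exist."""
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = (log_or - log(margin)) / se
    p_value = 0.5 * (1 - erf(z / sqrt(2)))  # upper-tail standard normal probability
    return z, p_value

# Hypothetical trial: 50/100 responders in each arm, non-inferiority margin OR = 0.5.
print(wald_noninferiority(50, 50, 50, 50, 0.5))
```

Setting margin to 1 turns the same calculation into a one-sided superiority test.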

5.
The problem of testing non-inferiority in a 2 × 2 matched-pairs sample is considered. Two exact unconditional tests based on the standard and the confidence interval p-values are proposed. Although tests of non-inferiority have two nuisance parameters under the null hypothesis, the exact tests are defined by reducing the dimension of nuisance parameter space from two to one using the monotonicity of the distribution. The exact sizes and powers of these tests and the existing asymptotic test are considered. The exact tests are found to be accurate in view of their size property. In addition, the exact test based on the confidence interval p-value is more powerful than the other exact test. It is shown that the asymptotic test is inaccurate, that is, its size exceeds the claimed nominal level alpha. We therefore recommend caution in using the asymptotic test for the problem of testing non-inferiority, particularly when sample sizes are small or moderately large.

6.
Recent work has shown that there may be disadvantages in the use of the chi-square-like goodness-of-fit tests for the logistic regression model proposed by Hosmer and Lemeshow that use fixed groups of the estimated probabilities. A particular concern with these grouping strategies based on estimated probabilities (fitted values) is that groups may contain subjects with widely different values of the covariates. It is possible to demonstrate situations where one set of fixed groups shows the model fits while the test rejects fit using a different set of fixed groups. We compare the performance by simulation of these tests to tests based on smoothed residuals proposed by le Cessie and Van Houwelingen and Royston, a score test for an extended logistic regression model proposed by Stukel, the Pearson chi-square and the unweighted residual sum-of-squares. These simulations demonstrate that all but one of Royston's tests have the correct size. An examination of the performance of the tests when the correct model has a quadratic term but a model containing only the linear term has been fit shows that the Pearson chi-square, the unweighted sum-of-squares, the Hosmer–Lemeshow decile of risk, the smoothed residual sum-of-squares and Stukel's score test, have power exceeding 50 per cent to detect moderate departures from linearity when the sample size is 100 and have power over 90 per cent for these same alternatives for samples of size 500. All tests had no power when the correct model had an interaction between a dichotomous and continuous covariate but only the continuous covariate model was fit. Power to detect an incorrectly specified link was poor for samples of size 100. For samples of size 500 Stukel's score test had the best power but it only exceeded 50 per cent to detect an asymmetric link function. The power of the unweighted sum-of-squares test to detect an incorrectly specified link function was slightly less than Stukel's score test. We illustrate the tests within the context of a model for factors associated with low birth weight. © 1997 by John Wiley & Sons, Ltd. Stat. Med., Vol. 16, 965–980 (1997).

7.
Confidence interval (CI) construction with respect to proportion/rate difference for paired binary data has become a standard procedure in many clinical trials and medical studies. When the sample size is small and incomplete data are present, asymptotic CIs may be dubious and exact CIs are not yet available. In this article, we propose exact and approximate unconditional test‐based methods for constructing CI for proportion/rate difference in the presence of incomplete paired binary data. Approaches based on one‐ and two‐sided Wald's tests will be considered. Unlike asymptotic CI estimators, exact unconditional CI estimators always guarantee their coverage probabilities at or above the pre‐specified confidence level. Our empirical studies further show that (i) approximate unconditional CI estimators usually yield shorter expected confidence width (ECW) with their coverage probabilities being well controlled around the pre‐specified confidence level; and (ii) the ECWs of the unconditional two‐sided‐test‐based CI estimators are generally narrower than those of the unconditional one‐sided‐test‐based CI estimators. Moreover, ECWs of asymptotic CIs may not necessarily be narrower than those of two‐sided‐based exact unconditional CIs. Two real examples will be used to illustrate our methodologies. Copyright © 2008 John Wiley & Sons, Ltd.

8.
When designing a study that may generate a set of sparse 2 × 2 tables, or when confronted with ‘negative’ results upon exact analysis of such tables, we need to compute the power of exact tests. In this paper we provide an efficient approach for computing exact unconditional power for four exact tests on the common odds ratio in a series of 2 × 2 tables. These tests are the traditional exact test; a test based on a probability ordering of the sample space; and two tests based on ordering the sample space according to distance from the mean, or median. For each test, we consider both a conservative version and a mid-P adjusted version. We explore three computational options for power determination: exact power computation, calculation of exact upper and lower bounds for power, and Monte Carlo confidence bounds for power. We present an interactive program implementing these options. For study design, the program may be run several times to arrive at a sample configuration with adequate power.

9.
In planning experiments having two groups of equal but small size, investigators face the uncertainty of power calculations that rely on asymptotic methods. This paper presents a method for determining power for two-sided tests. I compare two randomization tests, Fisher's exact test (FET), and the mid-P (MID), with the uncorrected chi-square test (CHI). Results show power as a function of relative risk for these methods, and assess their relative power and type I error rates. MID is shown to have intermediate power between CHI, which is the most powerful, and FET, the least powerful. Situations are shown in which CHI and MID occasionally exceed the nominal level of α.
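For two small groups of equal size, power can be computed exactly by enumerating every possible outcome rather than relying on asymptotic formulas. The Python sketch below does this for the two-sided Fisher's exact test and a mid-P variant; the probability-ordering definition of the two-sided p-value, the mid-P convention, and the function names are assumptions for illustration and may differ from the paper's method. The uncorrected chi-square test is omitted.

```python
from math import comb

def fisher_two_sided(x1, x2, n):
    """Two-sided Fisher exact p-value and mid-P for two groups of size n."""
    total = x1 + x2
    lo, hi = max(0, total - n), min(n, total)
    denom = comb(2 * n, total)
    probs = {k: comb(n, k) * comb(n, total - k) / denom for k in range(lo, hi + 1)}
    p_obs = probs[x1]
    p_val = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-12))
    return p_val, p_val - 0.5 * p_obs

def exact_power(n, p1, p2, alpha=0.05, mid_p=False):
    """Probability of rejecting H0: p1 == p2, summed over all (x1, x2) outcomes."""
    power = 0.0
    for x1 in range(n + 1):
        for x2 in range(n + 1):
            p_val, p_mid = fisher_two_sided(x1, x2, n)
            if (p_mid if mid_p else p_val) <= alpha:
                # Joint binomial probability of this outcome under (p1, p2).
                power += (comb(n, x1) * p1**x1 * (1 - p1) ** (n - x1)
                          * comb(n, x2) * p2**x2 * (1 - p2) ** (n - x2))
    return power

# Hypothetical design: 10 subjects per group, event rates 0.1 versus 0.9.
print(exact_power(10, 0.1, 0.9), exact_power(10, 0.1, 0.9, mid_p=True))
```

Calling `exact_power` with p1 equal to p2 gives the actual type I error rate of each test at that common rate, which is how the conservatism of FET relative to MID can be checked.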

10.
Goodness-of-fit tests for ordinal response regression models   Total citations: 1 (self-citations: 0, citations by others: 1)
It is well documented that the commonly used Pearson chi-square and deviance statistics are not adequate for assessing goodness-of-fit in logistic regression models when continuous covariates are modelled. In recent years, several methods have been proposed which address this shortcoming in the binary logistic regression setting or assess model fit differently. However, these techniques have typically not been extended to the ordinal response setting and few techniques exist to assess model fit in that case. We present the modified Pearson chi-square and deviance tests that are appropriate for assessing goodness-of-fit in ordinal response models when both categorical and continuous covariates are present. The methods have good power to detect omitted interaction terms and reasonable power to detect failure of the proportional odds assumption or modelling the wrong functional form of a continuous covariate. These tests also provide immediate information as to where a model may not fit well. In addition, the methods are simple to understand and implement, and are non-specific. That is, they do not require prespecification of a type of lack-of-fit to detect.

11.
In this paper, we discuss statistical inference for a 2 × 2 table under inverse sampling, where the total number of cases is fixed by design. We demonstrate that the exact unconditional distributions of some relevant statistics differ from the distributions under conventional sampling, where the sample size is fixed by design. This permits us to define a simple unconditional alternative to Fisher's exact test. We provide an asymptotic argument including simulations to demonstrate that there is little power loss associated with the alternative test when the expected event rates are very small. We then apply the method to design a clinical trial in cataract surgery, where a rare side effect occurs in one in 1000 patients. The objective of the trial is to demonstrate that adjuvant treatment with an antibiotic will reduce this risk to one in 2000. We use an inverse sampling design and demonstrate how to set this up in a sequential manner. Particularly simple stopping rules can be defined when using the unconditional alternative to Fisher's exact test. Copyright © 2015 John Wiley & Sons, Ltd.

12.
BACKGROUND: Biomedical investigators often use unsuitable statistical techniques for analysing the 2 × 2 tables that result from their experimental observations. This is because they are confused by the conflicting, and sometimes inaccurate, advice they receive from statistical texts or statistical consultants. METHODS: These consist of a review of published work, and the use of five different statistical procedures to analyse a 2 × 2 table, executed by StatXact 8.0, Testimate 6.0, Stata 10.0, SAS 9.1 and SPSS 16.0. DISCUSSION AND CONCLUSIONS: It is essential to classify a 2 × 2 table before embarking on its analysis. A useful classification is into (i) Independence trials (doubly conditioned). These almost never occur in biomedical research because they involve predetermining the column and row totals in a 2 × 2 table. The Fisher exact test is the best method for analysing these trials. (ii) Comparative trials (singly conditioned). These correspond to the usual experimental design in biomedical work, in which a sample of convenience is randomized into two treatment groups, so that the group (column) totals are fixed in advance. The proper tests of significance are exact tests on the odds ratio, on the ratio of proportions (relative risk and risk ratio) or on the difference between proportions. (iii) Double dichotomy trials (unconditional). In these, a genuine random sample is taken from a defined population. Thus, neither column nor row totals are fixed in advance. The only practicable test is Pearson's χ² test. In analysing any of the above trials, exact tests are to be much preferred to asymptotic (approximate) tests. The different commercial software packages use different algorithms for exact tests, and can give different outcomes in terms of P-values and confidence intervals. The most useful are StatXact and Testimate.

13.
Assessing goodness-of-fit in logistic regression models can be problematic, in that commonly used deviance or Pearson chi-square statistics do not have approximate chi-square distributions, under the null hypothesis of no lack of fit, when continuous covariates are modelled. We present two easy to implement test statistics similar to the deviance and Pearson chi-square tests that are appropriate when continuous covariates are present. The methodology uses an approach similar to that incorporated by the Hosmer and Lemeshow goodness-of-fit test in that observations are classified into distinct groups according to fitted probabilities, allowing sufficient cell sizes for chi-square testing. The major difference is that the proposed tests perform this grouping within the cross-classification of all categorical covariates in the model and, in some situations, allow for a more powerful assessment of where model predicted and observed counts may differ. A variety of simulations are performed comparing the proposed tests to the Hosmer-Lemeshow test.

14.
In ophthalmologic studies, each subject usually contributes important information for each of two eyes and the values from the two eyes are generally highly correlated. Previous studies showed that test procedures for binary paired data that ignore the presence of intraclass correlation could lead to inflated significance levels. Furthermore, it is possible that asymptotic versions of these procedures that take the intraclass correlation into account could also produce unacceptably high type I error rates when the sample size is small or the data structure is sparse. We propose two alternatives for these situations, namely the exact unconditional and approximate unconditional procedures. According to our simulation results, the exact procedures usually produce extremely conservative empirical type I error rates. That is, the corresponding type I error rates could greatly underestimate the pre-assigned nominal level (e.g. (empirical type I error rate/nominal type I error rate) < 0.8). On the other hand, the approximate unconditional procedures usually yield empirical type I error rates close to the pre-chosen nominal level. We illustrate our methodologies with a data set from a retinal detachment study.

15.
In testing the independence of the row and column variables in a two-way contingency table, the standard Pearson and likelihood ratio chi-square statistics often have low power, especially as the dimensions of the table increase. In this paper, one degree of freedom tests of independence based on likelihood ratio and score statistics from an association model for tables with nominal row and column classifications are described. The score statistic is especially easy to use, since it can be expressed in closed form, is simple to compute, and has size and power properties which are only slightly inferior to those of the more complicated likelihood ratio statistic.

16.
Paired dichotomous data may arise in clinical trials such as pre-/post-test comparison studies and equivalence trials. Reporting parameter estimates (e.g. odds ratio, rate difference and rate ratio) along with their associated confidence interval estimates becomes a necessity in many medical journals. Various asymptotic confidence interval estimators have long been developed for differences in correlated binary proportions. Nevertheless, these asymptotic methods may have poor coverage properties in small samples. In this article, we investigate several alternative confidence interval estimators for the difference between binomial proportions based on small-sample paired data. Specifically, we consider exact and approximate unconditional confidence intervals for rate difference via inverting a score test. The exact unconditional confidence interval guarantees the coverage probability, and it is recommended if strict control of coverage probability is required. However, the exact method tends to be overly conservative and computationally demanding. Our empirical results show that the approximate unconditional score confidence interval estimators based on inverting the score test demonstrate reasonably good coverage properties even in small-sample designs, and yet they are relatively easy to implement computationally. We illustrate the methods using real examples from a pain management study and a cancer study.

17.
In a series of articles, Gart and Nam construct the efficient score tests and confidence intervals with or without skewness correction for stratified comparisons of binomial proportions on the risk difference, relative risk, and odds ratio effect metrics. However, the stratified score methods and their properties are not well understood. We rederive the efficient score tests, which reveals their theoretical relationship with the contrast-based score tests, and provides a basis for adapting the method by using other weighting schemes. The inverse variance weight is optimal for a common treatment effect in large samples. We explore the behavior of the score approach in the presence of extreme outcomes when either no or all subjects in some strata are responders, and provide guidance on the choice of weights in the analysis of rare events. The score method is recommended for studies with a small number of moderate or large sized strata. A general framework is proposed to calculate the asymptotic power and sample size for the score test in superiority, noninferiority and equivalence clinical trials, or case-control studies. We also describe a nearly exact procedure that underestimates the exact power, but the degree of underestimation can be controlled to a negligible level. The proposed methods are illustrated by numerical examples.

18.
Various expressions have appeared for sample size calculation based on the power function of McNemar's test for paired or matched proportions, especially with reference to a matched case-control study. These differ principally with respect to the expression for the variance of the statistic under the alternative hypothesis. In addition to the conditional power function, I identify and compare four distinct unconditional expressions. I show that the unconditional calculation of Schlesselman for the matched case-control study can be expressed as a first-order unconditional calculation as described by Miettinen. Corrections to Schlesselman's unconditional expression presented by Fleiss and Levin and by Dupont, which use different models to describe exposure association among matched cases and controls, are also equivalent to a first-order unconditional calculation. I present a simplification of these corrections that directly provides the underlying table of cell probabilities, from which one can perform any of the alternative sample size calculations. Also, I compare the four unconditional sample size expressions relative to the exact power function. The conclusion is that Miettinen's first-order expression tends to underestimate sample size, while his second-order expression is usually fairly accurate, though possibly slightly anti-conservative. A multinomial-based expression presented by Connor, among others, is also fairly accurate and is usually slightly conservative. Finally, a local unconditional expression of Mitra, among others, tends to be excessively conservative.

19.
This paper discusses some general methods for determining approximate power, sample size, and smallest detectable effect for studies of multiple risk factors. These methods are based on standard large-sample formulae for determining the power of chi-square tests, and emphasis is given to determinations for Pearson χ² tests in multiway contingency tables. The methods are illustrated in application to the design of a clinical trial of the preventive effect of alpha-tocopherol, ascorbic acid and beta-carotene on colon polyp recurrence, and a case-control study of the joint effect of smoking and asbestos exposure on lung cancer incidence.

20.
In mapping diseases of complex aetiology, conventional linkage approaches narrow the location of the disease susceptibility locus to quite a large region so that candidate gene association studies are then necessary to further isolate these genes. However, even in the simplest scenario where the candidate locus is bi-allelic, two statistical tests with various correcting factors have been proposed: a chi-square 1 df test (counting chromosomes) which may be slightly conservative and a 2 df chi-square test (counting genotypes) which may lack power because of the extra degree of freedom. This paper introduces a better and more powerful alternative which turns out to be a compromise between the two existing statistical tests. The asymptotic distribution of this test statistic is determined and the efficacy of the 3 tests is compared under different genetic models by simulation. Genet. Epidemiol. 15:135–146,1998. © 1998 Wiley-Liss, Inc.
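The two existing tests contrasted in this abstract can be sketched in a few lines of Python. The code below assumes unadjusted Pearson chi-square statistics on a 2 × 2 table of allele (chromosome) counts and a 2 × 3 table of genotype counts, with no correcting factors; the function names and example counts are hypothetical, and the compromise test proposed in the paper is not implemented.

```python
from math import erfc, exp, sqrt

def pearson_chi2(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows.
    Assumes every row and column total is positive."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat

def allele_and_genotype_tests(cases, controls):
    """cases, controls: genotype counts in the order (AA, Aa, aa)."""
    def alleles(g):
        # Each subject contributes two chromosomes.
        return [2 * g[0] + g[1], g[1] + 2 * g[2]]
    chi_allele = pearson_chi2([alleles(cases), alleles(controls)])
    chi_geno = pearson_chi2([list(cases), list(controls)])
    p_allele = erfc(sqrt(chi_allele / 2))  # chi-square upper tail, 1 df
    p_geno = exp(-chi_geno / 2)            # chi-square upper tail, 2 df
    return (chi_allele, p_allele), (chi_geno, p_geno)

# Hypothetical counts: 100 cases and 100 controls.
print(allele_and_genotype_tests((30, 50, 20), (20, 50, 30)))
```

With these hypothetical counts both statistics happen to equal 4, yet the 2 df p-value is larger than the 1 df one, which illustrates the cost of the extra degree of freedom.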
