Similar Articles
20 similar articles found.
1.
OBJECTIVE: The objective of the paper was to design a computer algorithm to calculate sample sizes for estimating proportions incorporating clustered sampling units using a beta-binomial model when information concerning the intraclass correlation is not available. STUDY DESIGN AND SETTING: A computer algorithm was written in FORTRAN and evaluated for a hypothetical sample size situation. RESULTS: The developed algorithm was able to incorporate clustering in estimated sample sizes through the specification of a beta distribution to account for within-cluster correlation. In a hypothetical example, the usual normal approximation method for estimation of a proportion ignoring the clustered sampling design resulted in a calculated sample size of 107, whereas the developed algorithm suggested that 208 sampling units would be necessary. CONCLUSION: It is important to incorporate cluster adjustment in sample size calculations when designing epidemiologic studies for estimation of disease burden and other population proportions in the situation of correlated data even when information concerning the intraclass correlation is not available. Beta-binomial models can be used to account for clustering, and design effects can be estimated by generating beta distributions that encompass within-cluster correlation.
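A minimal sketch of the design-effect idea in Python (the paper's implementation was FORTRAN): for a Beta(a, b) mixing distribution the intraclass correlation is rho = 1/(a + b + 1), which implies a design effect of 1 + (m - 1)rho for clusters of size m. The helper function and all parameter values are illustrative; they happen to land near the abstract's 107 vs. 208 example but are not taken from the paper.

```python
import math
from scipy.stats import norm

def cluster_sample_size(p, d, m, a, b, conf=0.95):
    """Sample size for estimating a proportion p to within +/- d,
    inflating the usual normal-approximation size by a design effect
    implied by a Beta(a, b) within-cluster mixing distribution."""
    z = norm.ppf(1 - (1 - conf) / 2)
    n0 = z**2 * p * (1 - p) / d**2          # ignores clustering
    rho = 1.0 / (a + b + 1.0)               # beta-binomial intraclass correlation
    deff = 1.0 + (m - 1) * rho              # design effect for cluster size m
    return math.ceil(n0), math.ceil(n0 * deff)

# illustrative values only: p = 0.5 to within +/- 9.5%, clusters of 10
print(cluster_sample_size(0.5, 0.095, 10, 2.0, 6.5))   # -> (107, 208)
```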

2.
Methods for estimating the size of a closed population often consist of fitting some model (e.g. a log-linear model) to data with a missing cell corresponding to the members of the population missed by all reporting sources. Although the use of the asymptotic standard error is the usual method for forming confidence intervals for the population total, the sample sizes are not always large enough to produce valid confidence intervals. We propose a method for forming confidence intervals based upon changes in a goodness-of-fit statistic associated with changes in trial values of the population total.
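A sketch of the proposed interval in the simplest case, two reporting sources under an independence model, profiling the multinomial deviance over trial totals N. The paper's setting is a general log-linear model; the counts below are invented.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import chi2

def profile_loglik(N, n11, n10, n01):
    """Multinomial log-likelihood of a two-source independence model at
    trial total N, with capture probabilities profiled out; the missed
    cell is n00 = N - (n11 + n10 + n01)."""
    n_obs = n11 + n10 + n01
    p1, p2 = (n11 + n10) / N, (n11 + n01) / N
    cells = np.array([n11, n10, n01, N - n_obs], dtype=float)
    probs = np.array([p1 * p2, p1 * (1 - p2), (1 - p1) * p2,
                      (1 - p1) * (1 - p2)])
    return (gammaln(N + 1) - gammaln(cells + 1).sum()
            + (cells * np.log(probs)).sum())

n11, n10, n01 = 60, 90, 80                  # invented source-overlap counts
Ns = np.arange(n11 + n10 + n01, 1500)
ll = np.array([profile_loglik(N, n11, n10, n01) for N in Ns])
keep = 2 * (ll.max() - ll) <= chi2.ppf(0.95, df=1)   # deviance cutoff
print("95%% profile-deviance CI for N: [%d, %d]" % (Ns[keep].min(), Ns[keep].max()))
```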

3.
Clinical trial simulations were conducted to assess power and sample size requirements for a population pharmacokinetic (PK) substudy of a phase III clinical trial. The simulations were based on a population PK model developed from phase I healthy volunteer data. A sparse sampling design was employed, taking into account the practical desire not to keep patients at the study sites for extended periods of time for blood sampling. It was expected that the sparse sampling design would not support fitting the same model developed in healthy volunteers due to the narrow range of sampling times. Therefore, a model with fewer parameters and variance components was fit to simulated data from the proposed design to assess the bias in the estimates of the population mean PK parameters and variance components. Results indicate that the proposed design employing the simple model can provide accurate mean estimates of oral drug clearance (CL) and the apparent steady-state volume of distribution (V(ss)). However, the simulation results also suggest that the size and power of the likelihood ratio test for subpopulation differences in CL are inflated when using the simple model.
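A condensed sketch of the simulation setup, assuming a one-compartment oral model with log-normal between-subject variability and proportional residual error. The sparse sampling times and all parameter values are invented, and the model-fitting and power steps of the actual study are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def conc(t, CL, V, ka, dose=100.0):
    """One-compartment oral-absorption concentration at time t."""
    ke = CL / V
    return dose * ka / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

n_subj = 200
times = np.array([1.0, 4.0, 12.0])              # sparse design: 3 samples/patient
CL = 5.0 * np.exp(rng.normal(0, 0.3, n_subj))   # ~30% CV between subjects
V = 50.0 * np.exp(rng.normal(0, 0.2, n_subj))
y = conc(times, CL[:, None], V[:, None], ka=1.5)
y *= np.exp(rng.normal(0, 0.15, y.shape))       # proportional residual error

# a reduced model would now be fitted to each simulated trial to check bias
# in mean CL and V(ss) and the size/power of the likelihood ratio test
print(y.shape, y[0].round(2))
```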

4.
Powerful array-based single-nucleotide polymorphism-typing platforms have recently heralded a new era in which genome-wide studies are conducted with increasing frequency. A genetic polymorphism associated with population pharmacokinetics (PK) is typically analyzed using nonlinear mixed-effect models (NLMM). Applying NLMM to large-scale data, such as those generated by genome-wide studies, raises several issues related to the assumption of random effects: (i) computation time: it takes a long time to compute the marginal likelihood; (ii) convergence of iterative calculation: an adaptive Gauss–Hermite quadrature is generally used to estimate NLMM, but iterative calculations may not converge in complex models; and (iii) random-effects misspecification leads to slightly inflated type-I error rates. As an alternative, effective approach to resolving these issues, in this article we propose a generalized estimating equation (GEE) approach for analyzing population PK data. In general, GEE analysis does not account for interindividual variability in PK parameters; therefore, the usual GEE estimators cannot be interpreted straightforwardly, and their validity has not been justified. Here, we propose valid inference methods for using GEE even under conditions of interindividual variability and provide theoretical justifications of the proposed GEE estimators for population PK data. In numerical evaluations by simulation, the proposed GEE approach exhibited high computational speed and stability relative to the NLMM approach. Furthermore, the NLMM analysis was sensitive to misspecification of the random-effects distribution, whereas the proposed GEE inference is valid for any distributional form. We provide an illustration using data from a genome-wide pharmacogenomic study of an anticancer drug.
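A sketch of the GEE alternative on simulated long-format data, using statsmodels' GEE with an exchangeable working correlation. The linear mean model and every variable name are illustrative stand-ins, not the paper's PK-specific estimating equations.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, visits = 100, [0.5, 1.0, 2.0, 4.0]
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n), len(visits)),
    "time": np.tile(visits, n),
    "genotype": np.repeat(rng.integers(0, 2, n), len(visits)),
})
subj_re = rng.normal(0, 0.4, n)        # induces within-subject correlation
df["log_conc"] = (3.0 - 0.4 * df["time"] - 0.3 * df["genotype"]
                  + subj_re[df["subject"]] + rng.normal(0, 0.3, len(df)))

# marginal model with an exchangeable working correlation: no random-effects
# distribution is specified or integrated out, unlike an NLMM fit
fit = sm.GEE.from_formula("log_conc ~ time + genotype", groups="subject",
                          data=df, cov_struct=sm.cov_struct.Exchangeable(),
                          family=sm.families.Gaussian()).fit()
print(fit.params)
```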

5.
For a pharmacokinetic (PK) study under a 2×2 crossover design involving both test and reference drugs, we propose a mixed-effects model for the drug concentration-time profiles obtained from subjects who receive different drugs at different periods. In the proposed model, the drug concentrations repeatedly measured from the same subject at different time points follow a multivariate generalized gamma distribution, and the concentration-time profiles are described by a compartmental PK model with between-subject and within-subject variations. We then suggest a bioequivalence test based on the estimated bioavailability parameters of the proposed mixed-effects model. The results of a Monte Carlo study further show that the proposed model-based bioequivalence test not only maintains its level better but is also more powerful for detecting bioequivalence of the two drugs than the conventional test based on a non-compartmental analysis or one based on a mixed-effects model with a normal error variable. The application of the proposed model and test is illustrated using data sets from two PK studies.
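For orientation, a sketch of the conventional comparator: average bioequivalence by two one-sided tests (TOST) on log(AUC), expressed through the equivalent 90% confidence interval. Period and sequence effects of a full 2×2 crossover analysis are ignored, and the proposed generalized-gamma mixed-effects test itself is not reproduced.

```python
import numpy as np
from scipy import stats

def tost_be(log_auc_test, log_auc_ref, level=0.90):
    """Average bioequivalence via the 90% CI on the within-subject
    difference of log(AUC); BE is concluded when the back-transformed
    CI lies inside (0.80, 1.25). Sequence/period effects are ignored."""
    d = np.asarray(log_auc_test) - np.asarray(log_auc_ref)
    n = len(d)
    se = d.std(ddof=1) / np.sqrt(n)
    t = stats.t.ppf(1 - (1 - level) / 2, df=n - 1)
    lo, hi = np.exp(d.mean() - t * se), np.exp(d.mean() + t * se)
    return lo, hi, bool(lo > 0.80 and hi < 1.25)

rng = np.random.default_rng(0)
log_ref = rng.normal(4.0, 0.5, 24)
log_test = log_ref + rng.normal(0.05, 0.15, 24)   # true ratio ~ exp(0.05)
print(tost_be(log_test, log_ref))
```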

6.
In clinical data analysis, the restricted maximum likelihood (REML) method has been commonly used for estimating variance components in the linear mixed effects model. Under REML estimation, however, it is not straightforward to compare several linear mixed effects models with different mean and covariance structures. In particular, few approaches have been proposed for comparing linear mixed effects models with different mean structures under REML estimation. We propose an approach using the extended information criterion (EIC), a bootstrap-based extension of AIC, for comparing linear mixed effects models with different mean and covariance structures under REML estimation. We present simulation studies and applications to two actual clinical data sets.
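The EIC idea in miniature, shown for an ordinary linear model for brevity: the analytic AIC penalty is replaced by a bootstrap estimate of the optimism of the maximized log-likelihood. In the paper's setting the resampling must respect the REML likelihood of the mixed model; everything below is an illustrative simplification.

```python
import numpy as np

def eic_linear(y, X, n_boot=200, seed=0):
    """EIC for a Gaussian linear model: -2*loglik(theta_hat; data) plus
    twice a bootstrap estimate of the optimism (the bias AIC replaces
    analytically by the parameter count)."""
    rng = np.random.default_rng(seed)
    n = len(y)

    def fit(yy, XX):
        beta, *_ = np.linalg.lstsq(XX, yy, rcond=None)
        return beta, np.mean((yy - XX @ beta) ** 2)

    def ll(yy, XX, beta, s2):
        r = yy - XX @ beta
        return -0.5 * n * np.log(2 * np.pi * s2) - 0.5 * (r @ r) / s2

    beta_hat, s2_hat = fit(y, X)
    optimism = []
    for _ in range(n_boot):
        i = rng.integers(0, n, n)                    # resample (X, y) rows
        b, s2 = fit(y[i], X[i])
        optimism.append(ll(y[i], X[i], b, s2) - ll(y, X, b, s2))
    return -2 * ll(y, X, beta_hat, s2_hat) + 2 * float(np.mean(optimism))

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(80), rng.normal(size=(80, 2))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(0, 1, 80)
print(eic_linear(y, X[:, :2]), eic_linear(y, X))    # compare two mean structures
```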

7.
Population pharmacokinetic (PK) and pharmacodynamic (PD) studies evaluate drug concentration profiles and pharmacological effects over time when standard drug dosage regimens are assigned. They constitute a scientific basis for determining the optimal dosage of a new drug. Population PK/PD analyses can be performed on relatively few measures per patient, enabling the study of a sizable sample of patients who take the drug over a possibly long period of time. We expose the problem of bias in PK/PD estimators in the presence of partial compliance with assigned treatment as it occurs in practice. We propose to solve this by recording accurate data on a number of previous dose timings and using timing-explicit hierarchical non-linear models for analysis. In practice, we rely on electronic measures of an ambulatory patient's drug dosing history. Especially for non-linear PD estimation, we found that not only can bias be reduced, but higher precision can also be retrieved from the same number of data points when irregular drug intake times occur in well-controlled studies. We apply methods proposed by Mentré et al. to investigate the information matrix for hierarchical non-linear models. This confirms that a substantial gain in precision can be expected due to irregular drug intakes. Intuitively, this is explained by the fact that regular takers experience a relatively small range of concentrations, which makes it hard to estimate any deviation from linearity in the effect model. We conclude that estimators of PK/PD parameters can benefit greatly from information that enters through greater variation in the drug exposure process.
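A sketch of why recorded dose timings matter: superposing a one-compartment oral model over the intake times actually logged by an electronic monitor versus a nominal schedule. Parameter values are invented; the paper embeds such profiles in hierarchical non-linear models.

```python
import numpy as np

def conc_profile(t, dose_times, dose=100.0, CL=5.0, V=50.0, ka=1.5):
    """Concentration at time(s) t by superposing one-compartment oral
    doses taken at the recorded times (doses after t contribute zero)."""
    ke = CL / V
    dt = np.clip(np.atleast_1d(t)[:, None] - np.asarray(dose_times), 0.0, None)
    return (dose * ka / (V * (ka - ke))
            * (np.exp(-ke * dt) - np.exp(-ka * dt))).sum(axis=1)

# same measurement time, regular vs. irregular real-world intake
print(conc_profile([72.0], dose_times=[0.0, 24.0, 48.0]))   # nominal schedule
print(conc_profile([72.0], dose_times=[0.0, 30.0, 55.0]))   # delayed doses
```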

8.
We propose a new, less costly, design to test the equivalence of digital versus analogue mammography in terms of sensitivity and specificity. Because breast cancer is a rare event among asymptomatic women, the sample size for testing equivalence of sensitivity is larger than that for testing equivalence of specificity. Hence calculations of sample size are based on sensitivity. With the proposed design it is possible to achieve the same power as a completely paired design by increasing the number of less costly analogue mammograms and not giving the more expensive digital mammograms to some randomly selected subjects who are negative on the analogue mammogram. The key idea is that subjects who are negative on the analogue mammogram are unlikely to have cancer and hence contribute less information for estimating sensitivity than subjects who are positive on the analogue mammogram. To ascertain disease state among subjects not biopsied, we propose another analogue mammogram at a later time determined by a natural history model. The design differs from a double sampling design because it compares two imperfect tests instead of combining information from a perfect and imperfect test.
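A back-of-envelope sketch of why sensitivity drives the study size: with a crude unpaired approximation (the paper's paired design is more efficient), the number of cancers needed to bound the sensitivity difference must be divided by the low prevalence to get the number of women screened. All numbers are invented.

```python
import math
from scipy.stats import norm

def screening_size(se=0.80, delta=0.10, alpha=0.05, power=0.80, prev=0.005):
    """Cancer cases needed to show two sensitivities agree to within
    +/- delta (crude two-proportion approximation), and the screening
    cohort implied by a low disease prevalence."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    n_cases = 2 * se * (1 - se) * (z / delta) ** 2
    return math.ceil(n_cases), math.ceil(n_cases / prev)

print(screening_size())   # (cases required, women screened)
```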

9.
Lim J, Wang X, Lee S, Jung SH. Statistics in Medicine 2008; 27(19): 3833-3846.
We propose a distribution-free procedure, an analogue of the DIP test in non-parametric regression, to test whether the means of responses are constant over time in repeated measures data. Unlike existing tests, the proposed procedure requires only minimal assumptions on the distributions of both the random effects and the errors. We study the asymptotic reference distribution of the test statistic analytically and propose a permutation procedure to approximate the finite-sample reference distribution. The size and power of the proposed test are illustrated and compared with competitors through several simulation studies. We find that it performs well for small samples, regardless of model specification. Finally, we apply our test to a data example comparing the effect of fatigue under two different methods of cardiopulmonary resuscitation.
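A sketch of the permutation device with a deliberately simple statistic, the variance of the time-point means, permuting time labels within each subject. The paper's DIP-type statistic and its asymptotic theory are not reproduced.

```python
import numpy as np

def perm_test_constant_mean(Y, n_perm=2000, seed=0):
    """Permutation p-value for H0: mean response is constant over time.
    Y is (subjects x timepoints); permuting time labels within subjects
    needs no distributional assumptions on errors or random effects."""
    rng = np.random.default_rng(seed)
    stat = lambda M: M.mean(axis=0).var()   # spread of the time-point means
    obs = stat(Y)
    hits = sum(stat(np.apply_along_axis(rng.permutation, 1, Y)) >= obs
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
Y = rng.normal(0, 1, (25, 5)) + np.linspace(0, 0.8, 5)   # drifting means
print(perm_test_constant_mean(Y))
```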

10.
Methods for sample size calculations in ROC studies often assume independent normal distributions for test scores among the diseased and nondiseased populations. We consider sample size requirements under the default two-group normal model when the data distribution for the diseased population is either skewed or multimodal. For these two common scenarios we investigate the potential for robustness of calculated sample sizes under the mis-specified normal model and we compare to sample sizes calculated under a more flexible nonparametric Dirichlet process mixture model. We also highlight the utility of flexible models for ROC data analysis and their importance to study design. When nonstandard distributional shapes are anticipated, our Bayesian nonparametric approach allows investigators to determine a sample size based on the use of more appropriate distributional assumptions than are generally applied. The method also provides researchers a tool to conduct a sensitivity analysis to sample size calculations that are based on a two-group normal model. We extend the proposed approach to comparative studies involving two continuous tests. Our simulation-based procedure is implemented using the WinBUGS and R software packages and example code is made available.
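A sketch of the sensitivity-analysis idea: checking by Monte Carlo how the CI half-width of the empirical AUC behaves when diseased scores are skewed rather than normal. The gamma shape, shift, and sample sizes are invented, and the Dirichlet process mixture machinery is not reproduced.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

def auc_halfwidth(n, n_sim=500):
    """Monte Carlo 95% CI half-width of the empirical AUC with skewed
    (gamma) diseased scores and normal nondiseased scores, n per group."""
    aucs = np.empty(n_sim)
    for s in range(n_sim):
        x0 = rng.normal(0.0, 1.0, n)             # nondiseased
        x1 = 0.5 + rng.gamma(2.0, 1.0, n)        # diseased, right-skewed
        u = mannwhitneyu(x1, x0, alternative="greater").statistic
        aucs[s] = u / (n * n)                    # empirical AUC
    return 1.96 * aucs.std()

for n in (50, 100, 200):
    print(n, round(auc_halfwidth(n), 3))
```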

11.
We focus on the Fisher information matrix used for design evaluation and optimization in nonlinear mixed effects multiple response models. We evaluate the appropriateness of its expression computed by linearization, as proposed for a single response model. Using a pharmacokinetic–pharmacodynamic (PKPD) example, we first compare the approximated Fisher information matrix with one derived from the observed information matrix in a large simulation using the stochastic approximation expectation–maximization (SAEM) algorithm. The expression of the Fisher information matrix for multiple responses is also evaluated by comparison with the empirical information obtained through a replicated simulation study using the first-order linearization estimation methods implemented in the NONMEM software (first-order (FO), first-order conditional estimate (FOCE)) and the SAEM algorithm in the MONOLIX software. The predicted errors given by the approximated information matrix are close to those given by the information matrix obtained without linearization using SAEM and to the empirical ones obtained with FOCE and SAEM. The simulation study also illustrates the accuracy of both the FOCE and SAEM estimation algorithms when jointly modelling multiple responses, and the major limitations of the FO method. This study highlights the appropriateness of the approximated Fisher information matrix for multiple responses, which is implemented in PFIM 3.0, an extension of the R function PFIM dedicated to design evaluation and optimization. It also emphasizes the use of this computing tool for designing population multiple response studies, for instance in PKPD studies or in PK studies including the modelling of the PK of a drug and its active metabolite.
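The core computation in one simplified case: Fisher information for the fixed effects of a nonlinear model with iid residual error, built from a numerically linearized sensitivity matrix. PFIM's mixed-effects version adds random-effect variance components to the residual covariance; the monoexponential model and values below are invented.

```python
import numpy as np

def fim_linearized(theta, times, sigma2):
    """Fisher information for the fixed effects of a nonlinear model
    with iid residual variance sigma2, via numerical linearization:
    F = J' J / sigma2, where J is the sensitivity matrix df/dtheta."""
    def f(th):
        CL, V = th
        return 100.0 / V * np.exp(-CL / V * times)   # iv bolus monoexponential
    J = np.empty((len(times), len(theta)))
    for j, th_j in enumerate(theta):
        d = np.zeros_like(theta); d[j] = 1e-5 * th_j
        J[:, j] = (f(theta + d) - f(theta - d)) / (2 * d[j])
    return J.T @ J / sigma2

F = fim_linearized(np.array([5.0, 50.0]), np.linspace(0.5, 12.0, 6), sigma2=0.25)
print(np.sqrt(np.diag(np.linalg.inv(F))))            # predicted standard errors
```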

12.
We investigate sample size determination for Cochran's test for stratified case-control studies when samples of cases and controls are allocated to maximize the asymptotic efficiency of Cochran's test subject to fixed total cost, with cost per control varying by stratum. We consider two situations typical of strata-matched case-control studies: when one samples both cases and controls, and when cases are given and one samples only controls. In each situation we develop and study an asymptotic method for finding the sample size required for a specified power under the optimum allocation proposed by Nam and Fears. Also, for the second situation, we investigate an asymptotic method for determining the common ratio, k, in one-to-k strata-matched case-control studies for a given power, without cost consideration. When cases are given, neither the optimum nor the standard control sample size has a closed form; we present numerical methods for calculating these sample sizes and illustrate them with examples. We find that the reduction in total cost under the optimum allocation, compared with standard allocation, becomes more pronounced as the differences in stratum-specific costs of sampling controls increase.
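Not the Nam and Fears optimum for Cochran's test, which has no closed form, but the closely related classical rule it echoes: under a fixed budget, allocating controls proportionally to W_h * S_h / sqrt(c_h) minimizes variance when per-unit costs differ by stratum. Weights, SDs, costs, and budget below are invented.

```python
import numpy as np

def cost_optimal_allocation(W, S, c, budget):
    """Neyman-type allocation under a budget: n_h proportional to
    W_h * S_h / sqrt(c_h), scaled so that sum(c_h * n_h) = budget."""
    a = W * S / np.sqrt(c)
    k = budget / np.sum(c * a)
    return np.floor(k * a).astype(int)

W = np.array([0.5, 0.3, 0.2])        # stratum weights
S = np.array([0.50, 0.45, 0.40])     # stratum standard deviations
c = np.array([10.0, 25.0, 60.0])     # cost per sampled control
print(cost_optimal_allocation(W, S, c, budget=5000.0))
```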

13.
The recent successes of genome-wide association studies (GWAS) based on large sample sizes motivate combining independent datasets to obtain larger samples and thereby increase statistical power. Analysis methods that can accommodate different study designs, such as family-based and case-control designs, are of general interest. However, population stratification can cause spurious association in population-based association analyses. For family-based association analysis that infers missing parental genotypes from allele frequencies estimated in the entire sample, the parental mating-type probabilities may not be correctly estimated in the presence of population stratification. Therefore, any approach to combining family and case-control data should also properly account for population stratification. Although several methods have been proposed to accommodate family-based and case-control data, all have restrictions. Most of them require sampling from a homogeneous population, which may not be a reasonable assumption for data from a large consortium. One of the methods, FamCC, can account for population stratification and uses nuclear families with an arbitrary number of siblings, but requires parental genotype data, which are often unavailable for late-onset diseases. We extended the family-based test Association in the Presence of Linkage (APL) to combine family and case-control data (CAPL). CAPL can accommodate case-control data and families with multiple affected siblings and missing parents in the presence of population stratification. We used simulations to demonstrate that CAPL is a valid test both in a homogeneous population and in the presence of population stratification. We also showed that CAPL can have more power than other methods that combine family and case-control data.
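For orientation, the simplest family-based test whose robustness to stratification methods like APL and CAPL build on: the classic transmission disequilibrium test. Counts are invented; CAPL's handling of missing parents and combined case-control data goes far beyond this sketch.

```python
from scipy.stats import chi2

def tdt(b, c):
    """Transmission disequilibrium test: b and c count heterozygous
    parents transmitting vs. not transmitting the risk allele to an
    affected child; robust to population stratification because each
    family serves as its own control."""
    stat = (b - c) ** 2 / (b + c)
    return stat, chi2.sf(stat, df=1)

print(tdt(b=60, c=38))   # (statistic, p-value)
```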

14.
Calculating sample sizes required to achieve a specified level of precision when estimating population parameters is a common statistical task. As consumer surveys become increasingly common for nursing homes, home care agencies, other service providers, and state and local administrative agencies, standard methods to calculate sample size may not be adequate. Standard methods typically assume a normal approximation and require the specification of a plausible value of the unknown population trait. This paper presents a strategy to estimate sample sizes for small finite populations and when a range of possible population values is specified. This sampling strategy is hierarchical, employing first a hypergeometric sampling model, which directly addresses the finite population concern. This level is then coupled with a beta-binomial distribution for the number of population elements possessing the characteristic of interest. This second level addresses the concern that the population trait may range over an interval of values. The utility of this strategy is illustrated using a study of resident satisfaction in nursing homes.
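A sketch of the two-level idea: a Beta-Binomial(N, a, b) prior expresses uncertainty about the number K of population members with the trait, and the finite-population-corrected CI half-width is averaged over it. The stopping criterion and all numbers are illustrative, not the paper's exact procedure.

```python
import numpy as np
from scipy.stats import betabinom

def finite_pop_sample_size(N, a, b, target_halfwidth, z=1.96):
    """Smallest n for which the prior-averaged, finite-population-
    corrected normal CI half-width for the proportion K/N meets the
    target; K ~ Beta-Binomial(N, a, b) captures trait uncertainty."""
    k = np.arange(N + 1)
    prior = betabinom.pmf(k, N, a, b)
    p = k / N
    for n in range(2, N + 1):
        half = z * np.sqrt(p * (1 - p) / n * (N - n) / (N - 1))
        if prior @ half <= target_halfwidth:
            return n
    return N

# e.g. a 300-resident nursing home, satisfaction guessed around 20-40%
print(finite_pop_sample_size(N=300, a=3.0, b=7.0, target_halfwidth=0.05))
```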

15.
Quadratic inference functions (QIF) methodology is an important alternative to the generalized estimating equations (GEE) method in the longitudinal marginal model, as it offers higher estimation efficiency than GEE when the correlation structure is misspecified. The focus of this paper is on sample size determination and power calculation for QIF based on the Wald test in a marginal logistic model with covariates of treatment, time, and treatment-time interaction. We make three contributions in this paper: (i) we derive sample size and power formulas for QIF and compare their performance with those given by GEE; (ii) we propose an optimal scheme of sample size determination, in the sense of minimal average risk, to overcome the difficulty of the unknown true correlation matrix; and (iii) we study properties of both the QIF and GEE sample size formulas in relation to the number of follow-up visits and find that QIF gives more robust sample sizes than GEE. Using numerical examples, we illustrate that, without sacrificing statistical power, the QIF design leads to sample size savings and hence lower study cost in comparison with the GEE analysis. We conclude that the QIF analysis is appealing for longitudinal studies.
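Not the paper's logistic formulas with a treatment-time interaction, but the simplest closed-form sample size of the same GEE family, for a continuous outcome averaged over m visits under an exchangeable working correlation; it shows how the number of follow-up visits enters, which is the sensitivity the paper studies. All values are invented.

```python
import math
from scipy.stats import norm

def gee_n_per_group(delta, sigma, m, rho, alpha=0.05, power=0.80):
    """Closed-form GEE-style sample size per group for comparing two
    group means averaged over m repeated visits with an exchangeable
    working correlation rho; the variance carries the familiar design
    effect 1 + (m - 1) * rho."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    var = sigma**2 * (1 + (m - 1) * rho) / m
    return math.ceil(2 * z**2 * var / delta**2)

for visits in (3, 5, 8):   # diminishing returns from extra visits
    print(visits, gee_n_per_group(delta=0.4, sigma=1.0, m=visits, rho=0.3))
```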

16.
Family-based genetic association studies of related individuals provide opportunities to detect genetic variants that complement studies of unrelated individuals. Most statistical methods for family association studies of common variants are single-marker based, testing one SNP at a time. In this paper, we consider testing the effect of an SNP set, e.g., the SNPs in a gene, in family studies, for both continuous and discrete traits. Specifically, we propose a generalized estimating equation (GEE) based kernel association test, a variance component based testing method, to test for association between a phenotype and multiple variants in an SNP set jointly using family samples. The proposed approach allows for both continuous and discrete traits, and the correlation among family members is taken into account through the use of an empirical covariance estimator. We derive the theoretical distribution of the proposed statistic under the null and develop analytical methods to calculate the p-values. We also propose an efficient resampling method for correcting small-sample bias in family studies. The proposed method easily incorporates covariates and SNP-SNP interactions. Simulation studies show that the proposed method properly controls type I error rates under both random and ascertained sampling schemes in family studies. We demonstrate through simulation studies that our approach has superior performance for association mapping compared to the single-marker based minimum p-value GEE test of an SNP-set effect over a range of scenarios. We illustrate the application of the proposed method using data from the Cleveland Family GWAS Study.
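The statistic in miniature for unrelated individuals: a variance-component score statistic with a linear kernel over the SNP set, with a permutation null in place of the analytic mixture-of-chi-squares p-value. The paper's contribution, an empirical covariance estimator that keeps the same Q valid for correlated family samples, is not reproduced here.

```python
import numpy as np

def kernel_assoc_test(y, G, n_perm=1000, seed=0):
    """Score-type SNP-set test: Q = r' K r with K = G G' (linear kernel)
    and r the residuals from the null model. The permutation p-value
    assumes exchangeable (unrelated) subjects."""
    rng = np.random.default_rng(seed)
    r = y - y.mean()                 # null model: intercept only
    K = G @ G.T
    q_obs = r @ K @ r
    q_perm = np.empty(n_perm)
    for i in range(n_perm):
        rp = rng.permutation(r)
        q_perm[i] = rp @ K @ rp
    return (np.sum(q_perm >= q_obs) + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
G = rng.binomial(2, 0.1, (300, 12)).astype(float)   # 12 SNPs in the set
y = 0.4 * G[:, 2] + rng.normal(0, 1, 300)           # one causal SNP
print(kernel_assoc_test(y, G))
```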

17.
Diagnostic tests rarely provide perfect results. The misclassification induced by imperfect sensitivities and specificities of diagnostic tests must be accounted for when planning prevalence studies or investigations into the properties of new tests. Previous work has shown that applying a single imperfect test to estimate prevalence can often result in very large sample size requirements, and that sometimes even an infinite sample size is insufficient for precise estimation because the problem is non-identifiable. Adding a second test can sometimes reduce the sample size substantially, but infinite sample sizes can still occur as the problem remains non-identifiable. We investigate the further improvement possible when three diagnostic tests are applied. We first develop the methods required for studies in which three conditionally independent tests are available, using different Bayesian criteria. We then apply these criteria to prototypic scenarios, showing that large sample size reductions can occur compared to when only one or two tests are used. As the problem is now identifiable, infinite sample sizes cannot occur except in pathological situations. Finally, we relax the conditional independence assumption, demonstrating that in this once again non-identifiable situation, sample sizes may grow substantially and possibly be infinite. We apply our methods to the planning of two infectious disease studies, the first designed to estimate the prevalence of Strongyloides infection, and the second relating to estimating the sensitivity of a new test for tuberculosis transmission. The much smaller sample sizes typically required when three rather than one or two tests are used should encourage researchers to plan their studies using more than two diagnostic tests whenever possible. User-friendly software is available for both the design and analysis stages, greatly facilitating the use of these methods.
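One piece of context the sample-size results rest on: with a single imperfect test, the corrected prevalence estimator divides by (Se + Sp - 1), so as the test approaches uninformativeness the variance explodes and no sample size suffices; this is the non-identifiability that adding further tests repairs. The numbers are invented.

```python
def corrected_prevalence(pos, n, se, sp):
    """Rogan-Gladen correction: true prevalence from apparent prevalence
    given known sensitivity/specificity; blows up as se + sp -> 1."""
    return ((pos / n) + sp - 1.0) / (se + sp - 1.0)

print(corrected_prevalence(pos=120, n=1000, se=0.85, sp=0.95))  # ~0.0875
```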

18.
Next-generation sequencing technologies are making it possible to study the role of rare variants in human disease. Many studies balance statistical power with cost-effectiveness by (a) sampling from phenotypic extremes and (b) utilizing a two-stage design. Two-stage designs include a broad-based discovery phase and selection of a subset of potential causal genes/variants to be further examined in independent samples. We evaluate three parameters: first, the gain in statistical power due to extreme sampling to discover causal variants; second, the informativeness of initial (Phase 1) association statistics for selecting genes/variants for follow-up; third, the impact of extreme versus random sampling in (Phase 2) replication. We present a quantitative method to select individuals from the phenotypic extremes of a binary trait, and simulate disease association studies under a variety of sample sizes and sampling schemes. First, we find that while studies sampling from the extremes have excellent power to discover rare variants, they have limited power to associate them with phenotype, suggesting high false-negative rates for upcoming studies. Second, consistent with previous studies, we find that the effect sizes estimated in these studies are expected to be systematically larger than the overall population effect size; in a well-cited lipids study, we estimate the reported effect to be twofold larger. Third, replication studies require large samples from the general population to have sufficient power; extreme sampling could reduce the required sample size as much as fourfold. Our observations offer practical guidance for the design and interpretation of studies that utilize extreme sampling. Genet. Epidemiol. 35: 236-246, 2011.
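A sketch of the first comparison: Monte Carlo power for associating a rare variant with a quantitative trait when n subjects are drawn at random versus from the two tails of a tenfold larger screened pool. Effect size, MAF, significance threshold, and pool size are all invented.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

def assoc_power(extreme, n=500, maf=0.02, beta=0.5, alpha=5e-4, n_sim=300):
    """Power of a carrier-vs-noncarrier t-test under random sampling or
    sampling n/2 subjects from each phenotypic extreme of a 10n pool."""
    hits = 0
    for _ in range(n_sim):
        pool = 10 * n if extreme else n
        g = rng.binomial(2, maf, pool)             # rare-variant genotypes
        y = beta * g + rng.normal(0.0, 1.0, pool)  # quantitative trait
        if extreme:
            order = np.argsort(y)
            keep = np.r_[order[: n // 2], order[-(n // 2):]]
            g, y = g[keep], y[keep]
        if g.max() > 0 and ttest_ind(y[g > 0], y[g == 0]).pvalue < alpha:
            hits += 1
    return hits / n_sim

print("random:", assoc_power(False), " extremes:", assoc_power(True))
```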

19.
Cluster randomized designs are frequently employed in pragmatic clinical trials, which test interventions in the full spectrum of everyday clinical settings in order to maximize applicability and generalizability. In this study, we propose to directly incorporate pragmatic features into power analysis for cluster randomized trials with count outcomes. The pragmatic features considered include an arbitrary randomization ratio, overdispersion, random variability in cluster size, and unequal lengths of follow-up over which the count outcome is measured. The proposed method is based on the generalized estimating equation (GEE) approach and is advantageous in that the sample size formula retains a closed form, facilitating its implementation in pragmatic trials. We theoretically explore the impact of various pragmatic features on sample size requirements. An efficient jackknife algorithm is presented to address the variance underestimation of the GEE sandwich estimator when the number of clusters is small. We assess the performance of the proposed sample size method through extensive simulation, and an application to a real clinical trial is presented.
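A sketch that treats the pragmatic features as simulation knobs: arbitrary randomization ratio, gamma-frailty overdispersion, random cluster sizes, and unequal follow-up, with a cluster-level t-test on log rates standing in for the paper's GEE-with-jackknife analysis. All parameter values are invented.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

def crt_power(k_ctrl=20, ratio=1.5, rate0=0.4, rate_ratio=0.7,
              mean_size=50, size_cv=0.4, kappa=0.2, n_sim=500):
    """Monte Carlo power for a cluster randomized trial with count
    outcomes: unequal arms (ratio), overdispersion via gamma frailty
    (variance kappa), random cluster sizes, and variable follow-up."""
    k_trt = int(np.ceil(ratio * k_ctrl))
    hits = 0
    for _ in range(n_sim):
        logrates = []
        for k, lam in ((k_ctrl, rate0), (k_trt, rate0 * rate_ratio)):
            m = np.maximum(2, np.round(rng.normal(mean_size, size_cv * mean_size, k)))
            fu = rng.uniform(0.5, 2.0, k)               # follow-up years
            frailty = rng.gamma(1.0 / kappa, kappa, k)  # mean 1, variance kappa
            y = rng.poisson(lam * frailty * m * fu)
            logrates.append(np.log((y + 0.5) / (m * fu)))
        hits += ttest_ind(logrates[0], logrates[1]).pvalue < 0.05
    return hits / n_sim

print(crt_power())
```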

20.
Analysis of population-based case–control studies with complex sampling designs is challenging because the sample selection probabilities (and, therefore, the sample weights) depend on the response variable and covariates. Commonly, the design-consistent (weighted) estimators of the parameters of the population regression model are obtained by solving (sample) weighted estimating equations. Weighted estimators, however, are known to be inefficient when the weights are highly variable, as is typical for case–control designs. In this paper, we propose two alternative estimators that have higher efficiency and smaller finite sample bias compared with the weighted estimator. Both methods incorporate the information included in the sample weights by modeling the sample expectation of the weights conditional on design variables. We discuss benefits and limitations of each of the two proposed estimators, emphasizing efficiency and robustness. We compare the finite sample properties of the two new estimators and traditionally used weighted estimators with the use of simulated data under various sampling scenarios. We apply the methods to the U.S. Kidney Cancer Case-Control Study to identify risk factors.
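The baseline the paper improves on, sketched with statsmodels: a design-weighted (pseudo-likelihood) logistic fit with a robust sandwich covariance. The data here are actually a simple random sample, so the weights serve only to illustrate the fitting call; the proposed efficiency gains come from additionally modeling the expectation of the weights given design variables, which is not shown.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-2.0 + 0.8 * x))))

# inverse-probability design weights: cases are heavily oversampled in a
# case-control design, so sampled cases carry small weights
w = np.where(y == 1, 1, 20)

fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial(),
             freq_weights=w).fit(cov_type="HC0")
print(fit.params, fit.bse)   # weighted estimates with sandwich SEs
```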
