Similar Literature
20 similar records found.
1.
In the examination of the association between vaccines and rare adverse events after vaccination in postlicensure observational studies, it is challenging to define appropriate risk windows because prelicensure RCTs provide little insight on the timing of specific adverse events. Past vaccine safety studies have often used prespecified risk windows based on prior publications, biological understanding of the vaccine, and expert opinion. Recently, a data‐driven approach was developed to identify appropriate risk windows for vaccine safety studies that use the self‐controlled case series design. This approach employs both the maximum incidence rate ratio and the linear relation between the estimated incidence rate ratio and the inverse of average person time at risk, given a specified risk window. In this paper, we present a scan statistic that can identify appropriate risk windows in vaccine safety studies using the self‐controlled case series design while taking into account the dependence of time intervals within an individual and while adjusting for time‐varying covariates such as age and seasonality. This approach uses the maximum likelihood ratio test based on fixed‐effects models, which have been used, in addition to conditional Poisson models, for analyzing data from the self‐controlled case series design. Copyright © 2013 John Wiley & Sons, Ltd.
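As a rough illustration of the scan idea (without the paper's age and seasonality adjustment), the Python sketch below conditions on each case's total event count, which reduces the unadjusted self-controlled case series likelihood to a binomial, maximizes it over the incidence rate ratio for each candidate window, and scans for the maximum likelihood-ratio statistic. The inputs `event_days`, `vax_day`, and `obs_len` are hypothetical; because the statistic is maximized over windows, its null distribution should come from simulation rather than a chi-squared table.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

def sccs_loglik(rho, k, n, e, c):
    """Simple (unadjusted) SCCS log-likelihood for one candidate risk window.
    Conditioning on each case's total event count n gives a binomial model:
    an event falls in the risk window with probability e*rho / (e*rho + c),
    where e and c are risk and control person-time and rho is the IRR."""
    p = e * rho / (e * rho + c)
    return np.sum(binom.logpmf(k, n, p))

def scan_risk_windows(event_days, vax_day, obs_len, candidate_windows):
    """Scan candidate window lengths; return (window, IRR, LR statistic)
    for the window maximizing the likelihood-ratio statistic."""
    best = None
    for w in candidate_windows:
        k = np.array([np.sum((d >= v) & (d < v + w))
                      for d, v in zip(event_days, vax_day)])
        n = np.array([d.size for d in event_days])
        e = np.minimum(vax_day + w, obs_len) - vax_day   # time at risk
        c = obs_len - e                                  # control time
        fit = minimize_scalar(lambda r: -sccs_loglik(r, k, n, e, c),
                              bounds=(1e-6, 1e3), method="bounded")
        lr = 2.0 * (-fit.fun - sccs_loglik(1.0, k, n, e, c))
        if best is None or lr > best[2]:
            best = (w, fit.x, lr)
    return best
```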

2.
3.
Multisample U‐statistics encompass a wide class of test statistics that allow the comparison of 2 or more distributions. U‐statistics are especially powerful because they can be applied to both numeric and nonnumeric data, e.g., ordinal and categorical data where a pairwise similarity or distance‐like measure between categories is available. However, when comparing the distribution of a variable across 2 or more groups, observed differences may be due to confounding covariates. For example, in a case‐control study, the distribution of exposure in cases may differ from that in controls entirely because of variables that are related to both exposure and case status and are distributed differently among case and control participants. We propose to use individually reweighted data (ie, using the stratification score for retrospective data or the propensity score for prospective data) to construct adjusted U‐statistics that can test the equality of distributions across 2 (or more) groups in the presence of confounding covariates. Asymptotic normality of our adjusted U‐statistics is established and a closed form expression of their asymptotic variance is presented. The utility of our approach is demonstrated through simulation studies, as well as in an analysis of data from a case‐control study conducted among African‐Americans, comparing whether the similarity in haplotypes (ie, sets of adjacent genetic loci inherited from the same parent) occurring in a case and a control participant differs from the similarity in haplotypes occurring in 2 control participants.
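To make the reweighting concrete, here is a minimal Python sketch of a weighted two-sample U-statistic, using inverse-probability weights from a fitted logistic model as a simple stand-in for the stratification/propensity scores described in the abstract. The kernel, data, and weight form are illustrative assumptions; inference in this sketch would use permutation rather than the paper's closed-form asymptotic variance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_two_sample_u(x, y, w_x, w_y, kernel):
    """Weighted two-sample U-statistic with kernel h(x_i, y_j)."""
    num = 0.0
    for xi, wi in zip(x, w_x):
        for yj, wj in zip(y, w_y):
            num += wi * wj * kernel(xi, yj)
    return num / (w_x.sum() * w_y.sum())

# Toy example: compare an exposure between cases (z=1) and controls (z=0),
# where group membership and exposure both depend on confounders C.
rng = np.random.default_rng(0)
n = 400
C = rng.normal(size=(n, 2))
z = rng.binomial(1, 1 / (1 + np.exp(-C[:, 0])))   # group depends on C
expo = rng.normal(loc=0.5 * C[:, 0], size=n)      # exposure depends on C only

score = LogisticRegression().fit(C, z).predict_proba(C)[:, 1]
w = np.where(z == 1, 1 / score, 1 / (1 - score))  # inverse-probability weights

h = lambda a, b: float(a > b) - float(a < b)      # Mann-Whitney-type kernel
u = weighted_two_sample_u(expo[z == 1], expo[z == 0],
                          w[z == 1], w[z == 0], h)
print(u)  # near 0 after weighting even though raw distributions differ
```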

4.
The calculation of empirical p-values for genome-wide non-parametric linkage tests continues to present significant computational challenges for many complex disease mapping studies. The gold standard approach is to use gene dropping to simulate null genome scans. Unfortunately, this approach is too computationally expensive for many data sets of interest. An alternative, more efficient method for sampling null genome scans is to pre-calculate pools of family-specific statistics and then resample from these replicate pools to generate "pseudo-replicate" genome scans. In this study, we use simulations to explore properties of the replicate pool p-value estimator pRP and show that it provides an excellent approximation to the traditional gene-dropping estimator for significantly less computational effort. While the computational efficiency of the replicate pool estimator is noticeable in almost all data sets, by applying the replicate pool method to several previously characterized data sets we show that savings in computational effort can be especially significant (on the order of 10,000-fold or more) when one or more large families are analyzed. We also estimate replicate pool p-values for the schizophrenia data described by Abecasis et al. and show that pRP closely approximates gene-drop p-values for all linkage peaks reported for this study. Lastly, we expand upon Song et al.'s previous work by deriving a conservative estimator of the variance for pRP that can easily be computed in practical settings. We have implemented the replicate pool method along with our variance estimator in a new program called Pseudo, which is the first widely available automated implementation of the replicate pool method.
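A minimal sketch of the replicate pool idea in Python: each family's statistic pool is pre-simulated once (by gene dropping, not shown), and pseudo-replicate scan statistics are assembled by drawing one value per family. The summation across families is a simplifying assumption; real non-parametric linkage statistics combine family contributions with weights.

```python
import numpy as np

rng = np.random.default_rng(1)

def replicate_pool_pvalue(observed_total, family_pools, n_pseudo=100_000):
    """family_pools: list of 1-D arrays, each a pool of null per-family
    statistics from a single round of gene-dropping simulations.
    A pseudo-replicate scan statistic is the sum of one draw per family."""
    total = np.zeros(n_pseudo)
    for pool in family_pools:
        total += rng.choice(pool, size=n_pseudo, replace=True)
    return (np.sum(total >= observed_total) + 1) / (n_pseudo + 1)

# toy example: 100 families, each with a pool of 500 pre-simulated statistics
pools = [rng.normal(size=500) for _ in range(100)]
print(replicate_pool_pvalue(observed_total=30.0, family_pools=pools))
```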

5.
Scan statistics are used in public health applications to detect increases in rates or clusters of disease indicated by an unusually large number of events. Most of the work has been for the retrospective case, in which a single set of historical data is to be analyzed. A modification of this retrospective scan statistic has been recommended for use when incidences of an event are recorded as they occur over time (prospectively) to determine whether the underlying incidence rate has increased, preferably as soon as possible after such an increase. In this paper, we investigate the properties of the scan statistic when used in prospective surveillance of the incidence rate under the assumption of independent Bernoulli observations. We show how to evaluate the expected number of Bernoulli observations needed to generate a signal that the incidence rate has increased. We compare the performance of the prospective scan statistic method with that obtained using the Bernoulli-based cumulative sum (CUSUM) technique. We show that the latter tends to be more effective in detecting sustained increases in the rate, but the scan method may be preferred in some applications due to its simplicity and can be used with relatively little loss of efficiency.
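The comparison can be simulated directly. The Python sketch below estimates the average number of Bernoulli observations to signal for both rules: the scan signals when any window of the last w observations contains at least k events, and the CUSUM accumulates log-likelihood-ratio increments for a shifted rate p1 against baseline p0. All parameter values (w, k, h, the rates) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def run_length_scan(p, w, k, max_n=10**6):
    """Observations until a window of length w contains >= k events."""
    window = np.zeros(w, dtype=int)
    count, t = 0, 0
    while t < max_n:
        x = int(rng.random() < p)
        count += x - window[t % w]   # replace the oldest observation
        window[t % w] = x
        t += 1
        if t >= w and count >= k:
            return t
    return max_n

def run_length_cusum(p, p0, p1, h, max_n=10**6):
    """Bernoulli CUSUM: signal when the LLR statistic crosses h."""
    llr1, llr0 = np.log(p1 / p0), np.log((1 - p1) / (1 - p0))
    s, t = 0.0, 0
    while t < max_n:
        s = max(0.0, s + (llr1 if rng.random() < p else llr0))
        t += 1
        if s >= h:
            return t
    return max_n

# average time to signal under an elevated rate 0.02 vs baseline 0.01
scan = np.mean([run_length_scan(0.02, w=200, k=9) for _ in range(200)])
cusum = np.mean([run_length_cusum(0.02, p0=0.01, p1=0.02, h=4.0) for _ in range(200)])
print(scan, cusum)
```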

6.
An approximation for the distribution of the scan statistic
The scan statistic evaluates whether an apparent cluster of disease in time is due to chance. The statistic employs a 'moving window' of length w and finds the maximum number of cases revealed through the window as it scans or slides over the entire time period T. Computation of the probability of observing a certain size cluster, under the hypothesis of a uniform distribution, is infeasible when N, the total number of events, is large, and w is of moderate or small size relative to T. We give an approximation that is an asymptotic upper bound, easy to compute, and, for the purposes of hypothesis testing, more accurate than other approximations presented in the literature. The approximation applies both when N is fixed, and when N has a Poisson distribution. We illustrate the procedure on a data set of trisomic spontaneous abortions observed in a two year period in New York City.
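A sketch of the computation, with the closed form written as we recall the Wallenstein-Neff style approximation for N uniform points and window proportion w (treat the exact constants as an assumption to be checked against the paper), alongside a Monte Carlo check:

```python
import numpy as np
from scipy.stats import binom

def scan_pvalue_approx(k, N, w):
    """Approximate P(scan statistic >= k) for N uniform points on (0,1)
    and window length w (a proportion of the total period, T = 1).
    Form follows the Wallenstein-Neff approximation as we recall it;
    verify the constants against the paper before relying on them."""
    b = binom.pmf(k, N, w)
    tail = binom.sf(k - 1, N, w)          # P(Bin(N, w) >= k)
    return (k / w - N - 1) * b + 2 * tail

def scan_pvalue_mc(k, N, w, reps=20000, seed=3):
    """Monte Carlo check: the maximum window count is attained by a
    window starting at an event time."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        t = np.sort(rng.random(N))
        j = np.searchsorted(t, t + w, side="left")   # first event >= t_i + w
        if (j - np.arange(N)).max() >= k:
            hits += 1
    return hits / reps

print(scan_pvalue_approx(8, N=50, w=0.05), scan_pvalue_mc(8, 50, 0.05))
```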

7.
In a meta-analysis, we assemble a sample of independent, nonidentically distributed p-values. Fisher's combination procedure provides a chi-squared test of whether the p-values were sampled from the null uniform distribution. After rejecting the null uniform hypothesis, we are faced with the problem of how to combine the assembled p-values. We first derive a distribution for the p-values. The distribution is parameterized by the standardized mean difference (SMD) and the sample size. It includes the uniform as a special case. The maximum likelihood estimate (MLE) of the SMD can then be obtained from the independent, nonidentically distributed p-values. The MLE can be interpreted as a weighted average of the study-specific estimate of the effect size with a shrinkage. The method is broadly applicable to p-values obtained in the maximum likelihood framework. Simulation studies show that our method can effectively estimate the effect size with as few as 6 p-values in the meta-analyses. We also present a Bayes estimator for SMD and a method to account for publication bias. We demonstrate our methods on several meta-analyses that assess the potential benefits of citicoline for patients with memory disorders or patients recovering from ischemic stroke.
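A minimal z-test version of the idea, under our simplifying assumptions (two arms of size n_i per study, noncentrality theta_i = SMD * sqrt(n_i / 2), two-sided p-values): the density of a two-sided p-value is f(p) = [phi(z_p - theta) + phi(z_p + theta)] / (2 phi(z_p)) with z_p = Phi^{-1}(1 - p/2), which reduces to the uniform at SMD = 0. The paper's parameterization may differ (e.g., t-tests); the sketch below maximizes this likelihood numerically.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def neg_loglik(delta, pvals, n_per_arm):
    """Negative log-likelihood of two-sided p-values from z-tests with
    noncentrality theta_i = delta * sqrt(n_i / 2) (two arms of size n_i)."""
    z = norm.isf(np.asarray(pvals) / 2)          # z_p = Phi^{-1}(1 - p/2)
    theta = delta * np.sqrt(np.asarray(n_per_arm) / 2)
    dens = (norm.pdf(z - theta) + norm.pdf(z + theta)) / (2 * norm.pdf(z))
    return -np.sum(np.log(dens))

def smd_mle(pvals, n_per_arm):
    # search over non-negative SMD values; bounds are an assumption
    fit = minimize_scalar(neg_loglik, args=(pvals, n_per_arm),
                          bounds=(0.0, 5.0), method="bounded")
    return fit.x

# toy meta-analysis: 6 studies, true SMD = 0.4
rng = np.random.default_rng(4)
n = np.array([20, 35, 50, 40, 25, 60])
z = rng.normal(0.4 * np.sqrt(n / 2), 1.0)
p = 2 * norm.sf(np.abs(z))
print(smd_mle(p, n))   # should land near 0.4
```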

8.
An omnibus permutation test of the overall null hypothesis can be used to assess the association of an entire ensemble of genetic markers with disease in case-control studies. In this approach, p-values for univariate marker-specific Armitage trend tests are combined to form a scalar statistic, which is then used in a permutation test to determine an overall p-value. Two previously described competing methods utilize either a standard two-sample Hotelling's T² statistic or a global U statistic that is a weighted sum of univariate U statistics. In contrast to Hotelling's test, omnibus tests are much less sensitive to missing data, and utilize all available data. In contrast to the global U test, omnibus tests do not require that the direction of the effects of the individual markers on the risk of disease be correctly specified in advance; in fact, any combination of one- and two-sided univariate tests can be used. Simulations show that, even under circumstances favoring the competing tests (no missing data; direction of effects known), omnibus permutation tests based on Fisher's combining function or the Anderson-Darling statistic typically have power comparable to or greater than Hotelling's and the global U tests.
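A compact Python sketch of the Fisher-combining variant: per-marker Armitage trend p-values (using the identity that the trend statistic with dose scores equals n times the squared correlation between genotype dose and case status) are combined as -2 * sum(log p), and the overall p-value comes from permuting case-control labels. The toy data and effect sizes are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def trend_pvalues(G, y):
    """Armitage trend test per marker: statistic = n * corr(g, y)^2 ~ chi2(1).
    G: (n, m) genotype dosages 0/1/2 with possible NaN; y: 0/1 case status.
    Missing genotypes are simply skipped, using all available data."""
    m = G.shape[1]
    p = np.empty(m)
    for j in range(m):
        ok = ~np.isnan(G[:, j])
        r = np.corrcoef(G[ok, j], y[ok])[0, 1]
        p[j] = chi2.sf(ok.sum() * r**2, df=1)
    return p

def omnibus_fisher(G, y, n_perm=2000, rng=np.random.default_rng(5)):
    obs = -2 * np.sum(np.log(trend_pvalues(G, y)))
    perm = np.array([-2 * np.sum(np.log(trend_pvalues(G, rng.permutation(y))))
                     for _ in range(n_perm)])
    return (np.sum(perm >= obs) + 1) / (n_perm + 1)

# toy data: 300 subjects, 25 markers, mild signal in the first 3 markers
rng = np.random.default_rng(6)
y = np.repeat([0, 1], 150)
G = rng.binomial(2, 0.3, size=(300, 25)).astype(float)
G[y == 1, :3] = np.minimum(G[y == 1, :3] + rng.binomial(1, 0.2, (150, 3)), 2)
print(omnibus_fisher(G, y))
```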

9.
Haplotype sharing analysis is a well‐established option for the investigation of the etiology of complex diseases. The statistical power of haplotype association methods depends strongly on how the information of unobserved haplotypes can be captured by multilocus genotypes. In this study we combine an entropy‐based marker selection algorithm (EMS), with a haplotype sharing‐based Mantel statistics into a new algorithm. Genetic markers are iteratively selected by their multilocus linkage disequilibrium (LD), which is assessed by a normalized entropy difference. The initial marker set is gradually enlarged to increase the available information on the amount of sharing around a potential susceptibility marker. Markers are rejected from joint phasing if they do not increase the multilocus LD. In simulated candidate gene studies, the Mantel statistics combined with the new EMS performs as well or better at detecting the disease single nucleotide polymorphism—or in indirect association analysis its flanking markers—than the Mantel statistics without selection of markers prior to haplotype estimation and the Mantel statistics using sliding windows of size five. It is therefore appealing to apply our selection approach for haplotype‐based association analysis, since marker selection driven by the observed data avoids both the arbitrary choice of markers when using a fixed window size, as well as the estimation of haplotype block structure. Genet. Epidemiol. 34: 354–363, 2010. © 2010 Wiley‐Liss, Inc.
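For intuition, here is a Python sketch of one common normalized entropy difference for multilocus LD, (H_E - H_O)/H_E, where H_E is the entropy expected under linkage equilibrium (the sum of marginal allele entropies) and H_O is the observed joint haplotype entropy, together with a greedy EMS-style growth step. Whether this exact normalization matches the paper's is an assumption; the selection rule (reject markers that do not increase multilocus LD) follows the abstract.

```python
import numpy as np

def entropy(counts):
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def normalized_entropy_difference(haps):
    """haps: (n, m) array of phased allele codes. Multilocus LD measured
    as (H_E - H_O)/H_E, zero under linkage equilibrium."""
    h_marg = sum(entropy(np.bincount(haps[:, j])) for j in range(haps.shape[1]))
    _, joint_counts = np.unique(haps, axis=0, return_counts=True)
    return (h_marg - entropy(joint_counts)) / h_marg

def greedy_select(haps, start, candidates):
    """EMS-style growth: add a candidate marker only if it increases the
    multilocus normalized entropy difference of the selected set."""
    sel = list(start)
    for c in candidates:
        if (normalized_entropy_difference(haps[:, sel + [c]]) >
                normalized_entropy_difference(haps[:, sel])):
            sel.append(c)
    return sel
```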

10.
Although a standard genome‐wide significance level has been accepted for the testing of association between common genetic variants and disease, the era of whole‐genome sequencing (WGS) requires a new threshold. The allele frequency spectrum of sequence‐identified variants is very different from common variants, and the identified rare genetic variation is usually jointly analyzed in a series of genomic windows or regions. In nearby or overlapping windows, these test statistics will be correlated, and the degree of correlation is likely to depend on the choice of window size, overlap, and the test statistic. Furthermore, multiple analyses may be performed using different windows or test statistics. Here we propose an empirical approach for estimating genome‐wide significance thresholds for data arising from WGS studies, and we demonstrate that the empirical threshold can be efficiently estimated by extrapolating from calculations performed on a small genomic region. Because analysis of WGS may need to be repeated with different choices of test statistics or windows, this prediction approach makes it computationally feasible to estimate genome‐wide significance thresholds for different analysis choices. Based on UK10K whole‐genome sequence data, we derive genome‐wide significance thresholds ranging between 2.5 × 10⁻⁸ and 8 × 10⁻⁸ for our analytic choices in window‐based testing, and thresholds of 0.6 × 10⁻⁸–1.5 × 10⁻⁸ for a combined analytic strategy of testing common variants using single‐SNP tests together with rare variants analyzed with our sliding‐window test strategy.
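One way to read the extrapolation idea, sketched in Python under our own assumptions (the paper's estimator may differ): infer the effective number of independent tests in the small region from the permutation distribution of the minimum p-value, scale that number to the genome by base-pair length, and convert back to a per-test threshold.

```python
import numpy as np

def genomewide_threshold(minp_reps, region_bp, genome_bp=2.8e9, alpha=0.05):
    """minp_reps: replicate minimum p-values over the small region under the
    null (e.g., from permutation). The alpha-quantile is the per-test threshold
    controlling FWER within the region; an effective number of independent
    tests M_eff is backed out, scaled to the genome, and converted back."""
    t_region = np.quantile(minp_reps, alpha)
    m_eff_region = np.log(1 - alpha) / np.log(1 - t_region)
    m_eff_genome = m_eff_region * genome_bp / region_bp
    return 1 - (1 - alpha) ** (1 / m_eff_genome)

# stand-in: 1,000 replicates of the min p over a 10 Mb region that behaves
# like 5,000 independent tests; yields a threshold on the order of 4e-8
rng = np.random.default_rng(10)
minp = np.min(rng.random((1000, 5000)), axis=1)
print(genomewide_threshold(minp, region_bp=10e6))
```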

11.
Heinze G. Statistics in Medicine 2006; 25(24): 4216-4226
In logistic regression analysis of small or sparse data sets, results obtained by classical maximum likelihood methods cannot be generally trusted. In such analyses it may even happen that the likelihood meets the convergence criteria while at least one parameter estimate diverges to ±infinity. This situation has been termed 'separation', and it typically occurs whenever no events are observed in one of the two groups defined by a dichotomous covariate. More generally, separation is caused by a linear combination of continuous or dichotomous covariates that perfectly separates events from non-events. Separation implies infinite or zero maximum likelihood estimates of odds ratios, which are usually considered unrealistic. I provide some examples of separation and near-separation in clinical data sets and discuss some options to analyse such data, including exact logistic regression analysis and a penalized likelihood approach. Both methods supply finite point estimates in case of separation. Profile penalized likelihood confidence intervals for parameters show excellent behaviour in terms of coverage probability and provide higher power than exact confidence intervals. General advantages of the penalized likelihood approach are discussed.
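A minimal Python sketch of the Firth-type penalized likelihood fit (Jeffreys-prior penalty) via hat-diagonal-adjusted Newton iterations; the unpenalized information is used as the Newton curvature, a common simplification. For real analyses, established implementations such as the logistf package for R are preferable.

```python
import numpy as np

def firth_logistic(X, y, n_iter=100, tol=1e-8):
    """Firth-penalized logistic regression. Newton steps on the modified
    score sum_i (y_i + h_i*(1/2 - pi_i) - pi_i) x_i = 0, where h_i are
    diagonals of the hat matrix H = W^(1/2) X (X'WX)^(-1) X' W^(1/2).
    Gives finite estimates even under complete separation."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        pi = 1 / (1 + np.exp(-(X @ beta)))
        W = pi * (1 - pi)
        XtWX_inv = np.linalg.inv(X.T @ (W[:, None] * X))
        h = np.einsum("ij,jk,ik->i", X, XtWX_inv, X) * W   # hat diagonals
        step = XtWX_inv @ (X.T @ (y + h * (0.5 - pi) - pi))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# complete separation: x perfectly predicts y; plain ML diverges, Firth does not
X = np.column_stack([np.ones(8), np.array([-3.0, -2, -1, -0.5, 0.5, 1, 2, 3])])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(firth_logistic(X, y))   # finite intercept and slope
```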

12.
BACKGROUND: We show how information collected on retrospective occurrence times may be combined with prospective occurrence times in the analysis of recurrent events from cohort studies. METHODS: We demonstrate how the observed data can be expanded from one to two records per participant and account for the within-individual dependence when estimating variances. We illustrate our methods using data from the Women's Interagency HIV Study, which recorded 384 retrospective and 352 prospective occurrences of pneumonia in 9478 retrospective and 7857 prospective person-years among 2610 adult women. RESULTS: The hazard of non-Pneumocystis carinii pneumonia among the 2056 HIV-1 infected women was 2.24 times (95% confidence limits: 1.74, 2.89) that of the 554 uninfected women, independent of age. This hazard ratio was homogeneous across retrospective and prospective occurrences (P for interaction = 0.96) and combining occurrence types increased the precision by reducing the standard error by about a fourth. CONCLUSIONS: As expected, HIV-1 infection increases the hazard of pneumonia, with more precise inference obtained by combining information available on bidirectional occurrences. The proposed method for the analysis of bidirectional occurrence times will improve precision when the estimated associations are homogeneous across occurrence types, or may provide added insight into either the data collection or disease process when the estimated associations are heterogeneous.
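The data layout and variance handling can be sketched in Python with lifelines (assuming a recent version supporting `formula` and `cluster_col`): each participant contributes one retrospective and one prospective record, the interaction term tests homogeneity across occurrence types, and the clustered (robust) variance accounts for within-woman dependence. The simulated data and effect size are illustrative only.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(11)
n = 200
hiv = rng.binomial(1, 0.5, n)
rows = []
for i in range(n):
    for direction in ("retrospective", "prospective"):
        # exponential time-to-pneumonia, shorter if HIV+; censor at 5 years
        t = rng.exponential(4.0 / (1 + 1.2 * hiv[i]))
        rows.append({"id": i, "direction": direction, "hiv": hiv[i],
                     "time": min(t, 5.0), "pneumonia": int(t < 5.0)})
df = pd.DataFrame(rows)   # two records per participant

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="pneumonia",
        formula="hiv * direction",   # interaction tests homogeneity
        cluster_col="id")            # robust variance for within-woman dependence
print(cph.summary[["coef", "se(coef)", "p"]])
```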

13.
Brazil is the world's largest consumer of pesticides. Epidemiological studies have shown an association between maternal exposure to pesticides and adverse pregnancy events. An ecological study was conducted to investigate potential relations between per capita pesticide consumption and adverse events in live born infants in micro-regions in the South of Brazil (1996-2000). The data were obtained from the Brazilian Institute of Geography and Statistics (IBGE) and the Health Information Department of the Unified National Health System (DATASUS). Micro-regions were grouped into quartiles of pesticide consumption, and prevalence ratios (PR) were calculated. Linear trend p-values were obtained with the chi-square test. Premature birth (gestational age < 22 weeks) and low 1 and 5-minute Apgar score (< 8) in both boys and girls showed a significantly higher PR in the upper quartile of pesticide consumption. No significant differences were observed for low birth weight. The findings suggest that prenatal pesticide exposure is a risk factor for adverse pregnancy events such as premature birth and inadequate maturation.
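The two computations named in the abstract are simple to reproduce. Below is a Python sketch with entirely illustrative counts (not the study's data): prevalence ratios against the lowest quartile, and a Cochran-Armitage-style chi-square test for linear trend across quartile scores.

```python
import numpy as np
from scipy.stats import norm

# illustrative counts: adverse events / live births by quartile of pesticide use
events = np.array([120, 150, 180, 240])
births = np.array([10000, 10000, 10000, 10000])
prev = events / births

pr = prev / prev[0]                    # prevalence ratio vs lowest quartile
print("PR:", np.round(pr, 2))

# chi-square test for linear trend (Cochran-Armitage, scores 1..4)
score = np.array([1, 2, 3, 4])
n, e = births.sum(), events.sum()
pbar = e / n
num = np.sum(score * (events - births * pbar))
den = pbar * (1 - pbar) * (np.sum(births * score**2)
                           - np.sum(births * score)**2 / n)
z = num / np.sqrt(den)
print("trend p-value:", 2 * norm.sf(abs(z)))
```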

14.
In some controlled clinical trials in dental research, multiple failure time data from the same patient are frequently observed, resulting in clustered multiple failure times. Moreover, the treatments are often delivered by more than one operator and thus the multiple failure times are clustered according to a multilevel structure when the operator effects are assumed to be random. In practice, it is often too expensive or even impossible to monitor the study subjects continuously, but they are examined periodically at some regular pre-scheduled visits. Hence, discrete or grouped clustered failure time data are collected. The aim of this paper is to illustrate the use of the Markov chain Monte Carlo (MCMC) approach and non-informative prior in a Bayesian framework to mimic the maximum likelihood (ML) estimation in a frequentist approach in multilevel modelling of clustered grouped survival data. A three-level model with additive variance components model for the random effects is considered in this paper. Both the grouped proportional hazards model and the dynamic logistic regression model are used. The approximate intra-cluster correlation of the log failure times can be estimated when the grouped proportional hazards model is used. The statistical package WinBUGS is adopted to estimate the parameters of interest based on the MCMC method. The models and method are applied to a data set obtained from a prospective clinical study on a cohort of Chinese school children in which atraumatic restorative treatment (ART) restorations were placed on permanent teeth with carious lesions. Altogether 284 ART restorations were placed by five dentists and clinical status of the ART restorations was evaluated annually for 6 years after placement, thus clustered grouped failure times of the restorations were recorded. Results based on the grouped proportional hazards model revealed that clustering effect among the log failure times of the different restorations from the same child was fairly strong (corr(child)=0.55) but the effects attributed to the dentists could be regarded as negligible (corr(dentist)=0.03). Gender and the location of the restoration were found to have no effects on the failure times and no difference in failure times was found between small restorations placed on molars and non-molars. Large restorations placed on molars were found to have shorter failure times compared to small restorations. The estimates of the baseline parameters were increasing, indicating increasing hazard rates from interval 1 to 6. Results based on the logistic regression models were similar. In conclusion, the MCMC approach with non-informative priors in a Bayesian framework can easily be applied, using the software WinBUGS, to mimic ML estimation in multilevel modelling of clustered grouped survival data.
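The paper uses WinBUGS; as a rough modern analogue, here is a PyMC sketch of the grouped proportional hazards structure with child and dentist random effects, where each restoration-by-interval record fails with probability 1 - exp(-exp(eta)) (the complementary log-log link that corresponds to grouped proportional hazards). The data, priors, and the assumption that every restoration contributes all six interval records are illustrative; in real data a restoration exits after its failure interval.

```python
import numpy as np
import pymc as pm

# toy layout: 284 restorations, 6 yearly intervals, child / dentist cluster ids
rng = np.random.default_rng(9)
n, J = 284, 6
child = rng.integers(0, 100, n)
dentist = rng.integers(0, 5, n)
x = rng.integers(0, 2, n)                   # e.g. large restoration on a molar
rid = np.repeat(np.arange(n), J)            # one record per restoration-interval
interval = np.tile(np.arange(J), n)
fail = rng.binomial(1, 0.1, n * J)          # placeholder failure indicators

with pm.Model():
    alpha = pm.Normal("alpha", 0.0, 10.0, shape=J)        # baseline per interval
    beta = pm.Normal("beta", 0.0, 10.0)
    su = pm.HalfNormal("sigma_child", 1.0)
    sv = pm.HalfNormal("sigma_dentist", 1.0)
    u = pm.Normal("u", 0.0, su, shape=100)                # child random effects
    v = pm.Normal("v", 0.0, sv, shape=5)                  # dentist random effects
    eta = alpha[interval] + beta * x[rid] + u[child[rid]] + v[dentist[rid]]
    p = 1.0 - pm.math.exp(-pm.math.exp(eta))              # cloglog grouped PH
    pm.Bernoulli("fail", p=p, observed=fail)
    idata = pm.sample(1000, tune=1000)
```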

15.
We describe a hierarchical regression modeling approach to selection of a subset of markers from the first stage of a genomewide association scan to carry forward to subsequent stages for testing on an independent set of subjects. Rather than simply selecting a subset of most significant marker-disease associations at some cutoff chosen to maximize the cost efficiency of a multistage design, we propose a prior model for the true noncentrality parameters of these associations composed of a large mass at zero and a continuous distribution of nonzero values. The prior probability of nonzero values and their prior means can be functions of various covariates characterizing each marker, such as their location relative to genes or evolutionary conserved regions, or prior linkage or association data. We propose to take the top ranked posterior expectations of the noncentrality parameters for confirmation in later stages of a genomewide scan. The statistical performance of this approach is compared with the traditional p-value ranking by simulation studies. We show that the ranking by posterior expectations performs better at selecting the true positive association than a simple ranking of p-values if at least some of the prior covariates have predictive value.
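A minimal spike-and-slab version of the posterior expectation, sketched in Python under our own distributional assumptions (z_m ~ N(lambda_m, 1); prior mass 1 - pi_m at zero, Gaussian slab otherwise): the posterior mean shrinks each z toward zero unless the marker's covariate-dependent prior odds and its observed statistic both support a signal. The annotation covariate and all numeric settings are illustrative.

```python
import numpy as np
from scipy.stats import norm

def posterior_ncp(z, prior_prob, prior_mean, tau2):
    """E[noncentrality | z] under a spike-and-slab prior:
    lambda = 0 w.p. 1 - prior_prob, else N(prior_mean, tau2);
    z | lambda ~ N(lambda, 1). prior_prob / prior_mean may depend on
    marker covariates (annotation, prior linkage data, ...)."""
    f0 = norm.pdf(z, 0.0, 1.0)                       # marginal under the spike
    f1 = norm.pdf(z, prior_mean, np.sqrt(1.0 + tau2))  # marginal under the slab
    w = prior_prob * f1 / (prior_prob * f1 + (1 - prior_prob) * f0)
    slab_mean = (tau2 * z + prior_mean) / (1.0 + tau2)
    return w * slab_mean

# toy stage-1 scan: 10,000 markers, 20 true signals; annotation doubles prior odds
rng = np.random.default_rng(7)
m = 10_000
truth = np.zeros(m)
truth[:20] = rng.normal(3.0, 0.5, 20)
z = rng.normal(truth, 1.0)
annotated = rng.random(m) < 0.1
pi = np.where(annotated, 0.004, 0.002)               # covariate-dependent prior
rank_post = np.argsort(-posterior_ncp(z, pi, 3.0, 0.25))
rank_p = np.argsort(-np.abs(z))                      # p-value ranking
print("true signals carried forward in top 200:",
      np.sum(rank_post[:200] < 20), "(posterior) vs",
      np.sum(rank_p[:200] < 20), "(p-value)")
```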

16.
The effect of worker's location, orientation, and activity on exposure
The impact of a worker's location, orientation, and activity was studied in an experimental room (2.86 m × 2.35 m × 2.86 m) at known flow rates of 5.5 m³/min and 3.3 m³/min. A person in the room, wearing a full-facepiece, air-supplied respirator represented a worker. Propylene tracer gas was emitted at a constant rate from a 1-m pedestal at the center of the room and a continuous air sample was drawn from a point midway between the worker's mouth and nose. Breathing zone concentration (BZC) was monitored at 12 worker locations within the room for a stationary worker. At each location, BZCs were measured separately for four worker orientations: east, west, south, and north. BZCs of a walking worker were also monitored along the path defined by the 12 worker locations used in the stationary experiments. In a separate set of experiments, area concentration was monitored to see whether the worker's activity disturbed the contaminant concentrations at a fixed sampling point located behind the source looking from the direction of air inlet (location: 1.34 m, 1.20 m, 0.45 m). The following average differences in BZC over the 12 fixed locations were observed: 43% higher for near-field than for far-field locations; 20% higher when the worker was facing the source than when facing away (p-values for all four conditions: < 0.033); and 30% higher for a moving worker than for a stationary worker (p-values for all four conditions: < 0.01). When the worker was walking, the concentration at the fixed area sampling point was generally lower than the area concentration when the worker was absent or stationary in the room, possibly due to greater mixing of room air by the worker's movement. Because a worker's activities may be irregular and complicated, incorporating them as parameters in mathematical models is often not feasible. Instead, these findings may be used to assess uncertainty or adjust exposure estimates from simple models.

17.

Background

The spatial and space-time scan statistics are commonly applied for the detection of geographical disease clusters. Monte Carlo hypothesis testing is typically used to test whether the geographical clusters are statistically significant as there is no known way to calculate the null distribution analytically. In Monte Carlo hypothesis testing, simulated random data are generated multiple times under the null hypothesis, and the p-value is r/(R + 1), where R is the number of simulated random replicates of the data and r is the rank of the test statistic from the real data compared to the same test statistics calculated from each of the random data sets. A drawback to this powerful technique is that each additional digit of p-value precision requires ten times as many replicated datasets, and the additional processing can lead to excessive run times.

Results

We propose a new method for obtaining more precise p-values with a given number of replicates. The collection of test statistics from the random replicates is used to estimate the true distribution of the test statistic under the null hypothesis by fitting a continuous distribution to these observations. The choice of distribution is critical, and for the spatial and space-time scan statistics, the extreme value Gumbel distribution performs very well while the gamma, normal and lognormal distributions perform poorly. From the fitted Gumbel distribution, we show that it is possible to estimate the analytical p-value with great precision even when the test statistic is far out in the tail beyond any of the test statistics observed in the simulated replicates. In addition, Gumbel-based rejection probabilities have smaller variability than Monte Carlo-based rejection probabilities, suggesting that the proposed approach may result in greater power than the true Monte Carlo hypothesis test for a given number of replicates.
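The fitting step is straightforward to sketch in Python with SciPy: fit a Gumbel distribution to the replicate test statistics and read the p-value from the fitted upper tail, which remains informative even when the observed statistic exceeds every replicate (where the plain Monte Carlo p-value bottoms out at 1/(R + 1)). The numbers below are synthetic stand-ins for real scan statistics.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)

# stand-in for R = 999 scan statistics computed on null-simulated data sets
null_stats = rng.gumbel(loc=10.0, scale=1.5, size=999)
observed = 24.0                       # beyond every replicate

# standard Monte Carlo p-value: r / (R + 1), floored at 1/1000 here
p_mc = (np.sum(null_stats >= observed) + 1) / (len(null_stats) + 1)

# Gumbel-based p-value: fit the replicates, read the analytical tail
loc, scale = stats.gumbel_r.fit(null_stats)
p_gumbel = stats.gumbel_r.sf(observed, loc, scale)
print(p_mc, p_gumbel)                 # ~0.001 vs ~1e-4: far more precise in the tail
```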

Conclusions

For large data sets, it is often advantageous to replace computer intensive Monte Carlo hypothesis testing with this new method of fitting a Gumbel distribution to random data sets generated under the null, in order to reduce computation time and obtain much more precise p-values and slightly higher statistical power.

18.
In vaccine safety studies, subjects are considered at increased risk for adverse events for a period of time after vaccination known as the risk window. To our knowledge, risk windows for vaccine safety studies have tended to be pre-defined and not to use information from the current study. Inaccurate specification of the risk window can result in either including the true control period in the risk window or including some of the risk window in the control period, which can introduce bias. We propose a data-based approach for identifying the optimal risk windows for self-controlled case series studies of vaccine safety. The approach involves fitting conditional Poisson regression models to obtain incidence rate ratio estimates for different risk window lengths. For a specified risk window length (L), the average time at risk, T(L), is calculated. When the specified risk window is shorter than the true one, the incidence rate ratio decreases as 1/T(L) increases, but with no explicit functional relationship. When the specified risk window is longer than the true one, the incidence rate ratio increases linearly in 1/T(L). Theoretically, the risk window with the maximum incidence rate ratio is the optimal risk window. Because of the sparse-data problem, however, we recommend using both the maximum incidence rate ratio and the linear relationship that holds when the specified risk window is longer than the true one to identify the optimal risk windows. Both simulation studies and vaccine safety data applications show that our proposed approach is effective in identifying medium and long risk windows.
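The profiling loop can be sketched in Python, reusing the same simplified (unadjusted) binomial form of the self-controlled case series likelihood as in the sketch under item 1 in place of the paper's conditional Poisson regression: for each candidate length L, record the estimated incidence rate ratio and 1/T(L), take the L with maximum IRR, and check the linear trend of IRR in 1/T(L) over the longer candidates. Inputs are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

def sccs_irr(event_days, vax_day, obs_len, L):
    """IRR from the simple (unadjusted) SCCS binomial likelihood for a
    post-vaccination risk window of length L."""
    k = np.array([np.sum((d >= v) & (d < v + L))
                  for d, v in zip(event_days, vax_day)])
    n = np.array([d.size for d in event_days])
    e = np.minimum(vax_day + L, obs_len) - vax_day
    nll = lambda r: -np.sum(binom.logpmf(k, n, e * r / (e * r + obs_len - e)))
    return minimize_scalar(nll, bounds=(1e-6, 1e3), method="bounded").x

def window_profile(event_days, vax_day, obs_len, lengths):
    """Profile IRR(L) against 1/T(L): the candidate with maximum IRR is the
    theoretical optimum; IRR should grow roughly linearly in 1/T(L) once
    L exceeds the true window, which helps under sparse data."""
    out = []
    for L in lengths:
        irr = sccs_irr(event_days, vax_day, obs_len, L)
        T = np.mean(np.minimum(vax_day + L, obs_len) - vax_day)  # avg time at risk
        out.append((L, irr, 1.0 / T))
    return out
```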

19.
A new method for estimating incidence and prevalence is developed, which only requires observation of occurrence of health-related events within a given time window. Occurrence data are often easy to collect but estimating incidence and prevalence of a disease from such data is non-trivial, since the true disease status is not directly observed for all at the beginning of the study period. Our method for overcoming this problem is based on an idea first presented in 'The waiting time distribution as a graphical approach to epidemiologic measures of drug utilization' (Epidemiology 1997; 8:666-670). The fundamental idea is to analyse the waiting time from start of the window to the first event of each individual, and we formalize this by establishing a parametric likelihood which allows ordinary maximum likelihood analysis and explicit modelling of censoring. The developed method is used to analyse incidence and prevalence of hypertension in a Danish cohort of 70+-year-olds. A simulation study on the finite sample properties of the method is reported, which indicates that the method gives a quite robust and cost-effective alternative to ordinary surveys and follow-up studies for estimating incidence and prevalence.

20.
We have developed a method, called Meta‐STEPP (subpopulation treatment effect pattern plot for meta‐analysis), to explore treatment effect heterogeneity across covariate values in the meta‐analysis setting for time‐to‐event data when the covariate of interest is continuous. Meta‐STEPP forms overlapping subpopulations from individual patient data containing similar numbers of events with increasing covariate values, estimates subpopulation treatment effects using standard fixed‐effects meta‐analysis methodology, displays the estimated subpopulation treatment effect as a function of the covariate values, and provides a statistical test to detect possibly complex treatment‐covariate interactions. Simulation studies show that this test has adequate type‐I error control as well as power when reasonable window sizes are chosen. When applied to eight breast cancer trials, Meta‐STEPP suggests that chemotherapy is less effective for tumors with high estrogen receptor expression compared with those with low expression. Copyright © 2016 John Wiley & Sons, Ltd.
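The two building blocks can be sketched in Python: form overlapping subpopulations from covariate-sorted patients so that each contains roughly a target number of events and shares a fixed number of events with its neighbour, then pool per-trial log hazard ratios within each subpopulation by inverse-variance fixed-effects weighting. The event counts per window are illustrative parameters; within each subpopulation, each trial's `log_hr` and `var` would come from its own Cox fit (not shown).

```python
import numpy as np

def overlapping_subpops(event_flag, order, n_events_per_sub=80, n_events_overlap=40):
    """Form STEPP-style overlapping subpopulations from patients sorted by
    covariate value (indices in `order`): each holds ~n_events_per_sub events
    and shares n_events_overlap events with its neighbour."""
    idx = np.asarray(order)
    ev_pos = np.flatnonzero(event_flag[idx])   # event positions in covariate order
    subs, start = [], 0
    while start < len(ev_pos):
        stop = min(start + n_events_per_sub, len(ev_pos))
        subs.append(idx[ev_pos[start]:ev_pos[stop - 1] + 1])
        if stop == len(ev_pos):
            break
        start += n_events_per_sub - n_events_overlap
    return subs

def fixed_effect_pool(log_hr, var):
    """Inverse-variance fixed-effects pooling of per-trial estimates
    within one subpopulation; returns (pooled log HR, pooled variance)."""
    w = 1.0 / np.asarray(var)
    return np.sum(w * np.asarray(log_hr)) / np.sum(w), 1.0 / np.sum(w)
```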
