Dealing with missing, abnormal and incoherent data in E3N cohort study |
| |
Authors: | Garcia-Acosta S, Clavel-Chapelon F |
| |
Affiliation: | INSERM U521, Institut Gustave-Roussy, rue Camille Desmoulins, 94805 Villejuif, France. |
| |
Abstract: | BACKGROUND: The E3N Study ('Etude Epidémiologique auprès de femmes de la Mutuelle Générale de l'Education Nationale') is a cohort study of cancer risk factors in 100,000 women. Even if the incidence of problematic (missing, incoherent, etc.) data is low, any multivariate analysis restricted to subjects with complete data would rely on too small a sample, which would not necessarily be representative of the studied population. Results could thus be biased. METHODS: Our dealing with problematic data includes RESULTS: We counted the number of individuals for whom an analysis of 19 variables could be undertaken. The management of missing data made an additional one fourth of the cohort usable, i.e. 74.6% of individuals instead of 50.5%. Moreover, for 89.0% of subjects, at most one variable (out of the 19 studied) has a missing value. CONCLUSIONS: The main difficulty lies not so much in the choice and implementation of methods to deal with problematic data as in identifying the process that generated them. Most of what was gained was due to the simplest methods: the cold-deck and deductive methods. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|
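The abstract credits most of the gain in usable subjects to the deductive and cold-deck methods. The following is a minimal sketch of what those two techniques generally look like, not the authors' implementation. The variable names (`ever_pregnant`, `n_children`, `height_cm`), the deduction rule, the toy records, and the use of an earlier questionnaire as the cold-deck source are all assumptions made for illustration.

```python
def deductive_fill(record):
    """Deduce a missing value from other answers in the same record."""
    filled = dict(record)
    # Hypothetical rule: a woman who reports never being pregnant has no children.
    if filled.get("n_children") is None and filled.get("ever_pregnant") == 0:
        filled["n_children"] = 0
    return filled


def cold_deck_fill(record, external):
    """Replace missing values with values taken from an external source
    (assumed here: an earlier questionnaire answered by the same subject)."""
    filled = dict(record)
    for key, value in filled.items():
        if value is None and external.get(key) is not None:
            filled[key] = external[key]
    return filled


def complete_fraction(records):
    """Share of records with no missing value on any variable."""
    complete = sum(all(v is not None for v in r.values()) for r in records)
    return complete / len(records)


# Toy data (invented for illustration; None marks a missing value).
records = [
    {"id": 1, "ever_pregnant": 1, "n_children": 2, "height_cm": 165},
    {"id": 2, "ever_pregnant": 0, "n_children": None, "height_cm": 170},
    {"id": 3, "ever_pregnant": 1, "n_children": 1, "height_cm": None},
    {"id": 4, "ever_pregnant": None, "n_children": None, "height_cm": 160},
]
# Hypothetical earlier questionnaire, keyed by subject id.
earlier = {3: {"height_cm": 158}}

repaired = [
    cold_deck_fill(deductive_fill(r), earlier.get(r["id"], {}))
    for r in records
]

before = complete_fraction(records)
after = complete_fraction(repaired)
print(f"complete before: {before:.0%}, after: {after:.0%}")
```

In this toy example subject 4 stays incomplete because neither a rule nor an external value applies, which mirrors the abstract's point that the hard part is understanding why a value is missing rather than applying the fill itself.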