Predicting features of breast cancer with gene expression patterns |
| |
Authors: | Xuesong Lu; Xin Lu; Zhigang C. Wang; J. Dirk Iglehart; Xuegong Zhang; Andrea L. Richardson |
| |
Affiliation: | (1) Bioinformatics Division, TNLIST and Department of Automation, Tsinghua University, Beijing, 100084, China; (2) Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA; (3) Department of Biostatistics, Dana-Farber Cancer Institute, Boston, MA, USA; (4) Present address: Department of Family and Preventive Medicine, University of California San Diego, San Diego, CA 92093, USA; (5) Department of Surgery, Brigham and Women’s Hospital, Boston, MA, USA; (6) Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA; (7) Department of Pathology, Brigham and Women’s Hospital, 75 Francis Street, Boston, MA 02115, USA |
| |
Abstract: | Data from gene expression arrays hold an enormous amount of biological information. We sought to determine if global gene expression in primary breast cancers contained information about biologic, histologic, and anatomic features of the disease in individual patients. Microarray data from the tumors of 129 patients were analyzed for the ability to predict biomarkers [estrogen receptor (ER) and HER2], histologic features [grade and lymphatic-vascular invasion (LVI)], and stage parameters (tumor size and lymph node metastasis). Multiple statistical predictors were used and the prediction accuracy was determined by cross-validation error rate; multidimensional scaling (MDS) allowed visualization of the predicted states under study. Models built from gene expression data accurately predict ER and HER2 status, and divide tumor grade into high-grade and low-grade clusters; intermediate-grade tumors are not a unique group. In contrast, gene expression data are inaccurate at predicting tumor size, lymph node status, or LVI. The best model for prediction of nodal status included tumor size, LVI status, and pathologically defined tumor subtype (based on combinations of ER, HER2, and grade); the addition of microarray-based prediction to this model failed to improve the prediction accuracy. Global gene expression supports a binary division of ER, HER2, and grade, clearly separating tumors into two categories; intermediate values for these bio-indicators do not define intermediate tumor subsets. Results are consistent with a model of regional metastasis that depends on inherent biologic differences in metastatic propensity between breast cancer subtypes, upon which time and chance then operate. Xuesong Lu and Xin Lu contributed equally to this work. |
| |
Keywords: | Breast cancer; Computational molecular biology; Gene expression profiling; Metastasis |
This article is indexed in PubMed, SpringerLink, and other databases. |
|