Cancer Data Sciences Program

The DF/HCC Cancer Data Sciences Program fosters a broad range of research into statistical, computational, and mathematical questions that arise in cancer investigations. Program members engage in interdisciplinary and inter-programmatic projects across population, clinical, and basic cancer research. These projects support DF/HCC’s broad spectrum of research activities, most of which are data-intensive and require scientific input from Cancer Data Sciences Program members.

The Program seeks to directly advance both cancer research and treatment. Program members either initiate cancer research, for example by developing mathematical models of carcinogenesis or decision analyses of treatment options, or participate as team members in others’ projects, collaborating on the design, analysis, and data storage of clinical, population, and basic science studies. Research into software, databases, and quantitative methodologies is instrumental to the Program. Currently, the Program has more than 50 members.

Recent Publications

  • Marco E, Meuleman W, Huang J, Glass K, Pinello L, Wang J, Kellis M, Yuan GC. Multi-scale chromatin state annotation using a hierarchical hidden Markov model. Nat Commun 2017; 8:15011. PubMed
  • Kim JJ, Burger EA, Sy S, Campos NG. Optimal Cervical Cancer Screening in Women Vaccinated Against Human Papillomavirus. Journal of the National Cancer Institute 2017; 109:1-9. PubMed
  • Alver BH, Kim KH, Lu P, Wang X, Manchester HE, Wang W, Haswell JR, Park PJ, Roberts CW. The SWI/SNF chromatin remodelling complex is required for maintenance of lineage specific enhancers. Nat Commun 2017; 8:14648. PubMed
  • Liu LL, Brumbaugh J, Bar-Nur O, Smith Z, Stadtfeld M, Meissner A, Hochedlinger K, Michor F. Probabilistic Modeling of Reprogramming to Induced Pluripotent Stem Cells. Cell Rep 2016; 17:3395-3406. PubMed
  • Taylor-Weiner A, Zack T, O'Donnell E, Guerriero JL, Bernard B, Reddy A, Han GC, AlDubayan S, Amin-Mansour A, Schumacher SE, Litchfield K, Turnbull C, Gabriel S, Beroukhim R, Getz G, Carter SL, Hirsch MS, Letai A, Sweeney C, Van Allen EM. Genomic evolution and chemoresistance in germ-cell tumours. Nature 2016; 540:114-118. PubMed
  • Love MI, Hogenesch JB, Irizarry RA. Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation. Nat Biotechnol 2016. PubMed
  • Kim TH, Saadatpour A, Guo G, Saxena M, Cavazza A, Desai N, Jadhav U, Jiang L, Rivera MN, Orkin SH, Yuan GC, Shivdasani RA. Single-Cell Transcript Profiles Reveal Multilineage Priming in Early Progenitors Derived from Lgr5(+) Intestinal Stem Cells. Cell Rep 2016; 16:2053-60. PubMed
  • Li B, Li T, Pignon JC, Wang B, Wang J, Shukla SA, Dou R, Chen Q, Hodi FS, Choueiri TK, Wu C, Hacohen N, Signoretti S, Liu JS, Liu XS. Landscape of tumor-infiltrating T cell repertoire of human cancers. Nat Genet 2016; 48:725-32. PubMed
  • Kaplinsky J, Arnaout R. Robust estimates of overall immune-repertoire diversity from high-throughput measurements on samples. Nat Commun 2016; 7:11881. PubMed
  • Kim J, Mouw KW, Polak P, Braunstein LZ, Kamburov A, Tiao G, Kwiatkowski DJ, Rosenberg JE, Van Allen EM, D'Andrea AD, Getz G. Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors. Nat Genet 2016. PubMed
  • Barrera LA, Vedenko A, Kurland JV, Rogers JM, Gisselbrecht SS, Rossin EJ, Woodard J, Mariani L, Kock KH, Inukai S, Siggers T, Shokri L, Gordân R, Sahni N, Cotsapas C, Hao T, Yi S, Kellis M, Daly MJ, Vidal M, Hill DE, Bulyk ML. Survey of variation in human transcription factors reveals prevalent DNA binding changes. Science 2016; 351:1450-4. PubMed
  • Du Z, Sun T, Hacisuleyman E, Fei T, Wang X, Brown M, Rinn JL, Lee MG, Chen Y, Kantoff PW, Liu XS. Integrative analyses reveal a long noncoding RNA-mediated sponge regulatory network in prostate cancer. Nat Commun 2016; 7:10982. PubMed
  • Haradhvala NJ, Polak P, Stojanov P, Covington KR, Shinbrot E, Hess JM, Rheinbay E, Kim J, Maruvka YE, Braunstein LZ, Kamburov A, Hanawalt PC, Wheeler DA, Koren A, Lawrence MS, Getz G. Mutational Strand Asymmetries in Cancer Genomes Reveal Mechanisms of DNA Damage and Repair. Cell 2016; 164:538-49. PubMed
  • Shukla SA, Rooney MS, Rajasagi M, Tiao G, Dixon PM, Lawrence MS, Stevens J, Lane WJ, Dellagatta JL, Steelman S, Sougnez C, Cibulskis K, Kiezun A, Hacohen N, Brusic V, Wu CJ, Getz G. Comprehensive analysis of cancer-associated somatic mutations in class I HLA genes. Nat Biotechnol 2015. PubMed
  • Kim JJ, Tosteson AN, Zauber AG, Sprague BL, Stout NK, Alagoz O, Trentham-Dietz A, Armstrong K, Pruitt SL, Rutter CM. Cancer Models and Real-world Data: Better Together. Journal of the National Cancer Institute 2015. PubMed
  • Van Allen EM, Miao D, Schilling B, Shukla SA, Blank C, Zimmer L, Sucker A, Hillen U, Foppen MH, Goldinger SM, Utikal J, Hassel JC, Weide B, Kaehler KC, Loquai C, Mohr P, Gutzmer R, Dummer R, Gabriel S, Wu CJ, Schadendorf D, Garraway LA. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science 2015. PubMed
  • Janiszewska M, Liu L, Almendro V, Kuang Y, Paweletz C, Sakr RA, Weigelt B, Hanker AB, Chandarlapaty S, King TA, Reis-Filho JS, Arteaga CL, Park SY, Michor F, Polyak K. In situ single-cell analysis identifies heterogeneity for PIK3CA mutation and HER2 amplification in HER2-positive breast cancer. Nat Genet 2015; 47:1212-9. PubMed
  • Rogers JM, Barrera LA, Reyon D, Sander JD, Kellis M, Joung JK, Bulyk ML. Context influences on TALE-DNA binding revealed by quantitative profiling. Nat Commun 2015; 6:7440. PubMed
  • Ho JW, Jung YL, Liu T, Alver BH, Lee S, Ikegami K, Sohn KA, Minoda A, Tolstorukov MY, Appert A, Parker SC, Gu T, Kundaje A, Riddle NC, Bishop E, Egelhofer TA, Hu SS, Alekseyenko AA, Rechtsteiner A, Asker D, Belsky JA, Bowman SK, Chen QB, Chen RA, Day DS, Dong Y, Dose AC, Duan X, Epstein CB, Ercan S, Feingold EA, Ferrari F, Garrigues JM, Gehlenborg N, Good PJ, Haseley P, He D, Herrmann M, Hoffman MM, Jeffers TE, Kharchenko PV, Kolasinska-Zwierz P, Kotwaliwale CV, Kumar N, Langley SA, Larschan EN, Latorre I, Libbrecht MW, Lin X, Park R, Pazin MJ, Pham HN, Plachetka A, Qin B, Schwartz YB, Shoresh N, Stempor P, Vielle A, Wang C, Whittle CM, Xue H, Kingston RE, Kim JH, Bernstein BE, Dernburg AF, Pirrotta V, Kuroda MI, Noble WS, Tullius TD, Kellis M, MacAlpine DM, Strome S, Elgin SC, Liu XS, Lieb JD, Ahringer J, Karpen GH, Park PJ. Comparative analysis of metazoan chromatin organization. Nature 2014; 512:449-52. PubMed
  • Siggers T, Reddy J, Barron B, Bulyk ML. Diversification of transcription factor paralogs via noncanonical modularity in C2H2 zinc finger DNA binding. Mol Cell 2014; 55:640-8. PubMed
  • He Y, Landrum MB, Zaslavsky AM. Combining information from two data sources with misreporting and incompleteness to assess hospice-use among cancer patients: a multiple imputation approach. Stat Med 2014. PubMed
  • Bernau C, Riester M, Boulesteix AL, Parmigiani G, Huttenhower C, Waldron L, Trippa L. Cross-study validation for the assessment of prediction algorithms. Bioinformatics 2014; 30:i105-12. PubMed
  • Van Allen EM, Wagle N, Stojanov P, Perrin DL, Cibulskis K, Marlow S, Jane-Valbuena J, Friedrich DC, Kryukov G, Carter SL, McKenna A, Sivachenko A, Rosenberg M, Kiezun A, Voet D, Lawrence M, Lichtenstein LT, Gentry JG, Huang FW, Fostel J, Farlow D, Barbie D, Gandhi L, Lander ES, Gray SW, Joffe S, Janne P, Garber J, MacConaill L, Lindeman N, Rollins B, Kantoff P, Fisher SA, Gabriel S, Getz G, Garraway LA. Whole-exome sequencing and clinical interpretation of formalin-fixed, paraffin-embedded tumor samples to guide precision cancer medicine. Nat Med 2014; 20:682-8. PubMed
  • Riester M, Wei W, Waldron L, Culhane AC, Trippa L, Oliva E, Kim SH, Michor F, Huttenhower C, Parmigiani G, Birrer MJ. Risk prediction for late-stage ovarian cancer by meta-analysis of 1525 patient samples. Journal of the National Cancer Institute 2014. PubMed
  • Waldron L, Haibe-Kains B, Culhane AC, Riester M, Ding J, Wang XV, Ahmadifar M, Tyekucheva S, Bernau C, Risch T, Ganzfried BF, Huttenhower C, Birrer M, Parmigiani G. Comparative meta-analysis of prognostic gene signatures for late-stage ovarian cancer. Journal of the National Cancer Institute 2014. PubMed
  • Jung YL, Luquette LJ, Ho JW, Ferrari F, Tolstorukov M, Minoda A, Issner R, Epstein CB, Karpen GH, Kuroda MI, Park PJ. Impact of sequencing depth in ChIP-seq experiments. Nucleic Acids Res 2014. PubMed
  • Aschard H, Vilhjálmsson BJ, Greliche N, Morange PE, Trégouët DA, Kraft P. Maximizing the power of principal-component analysis of correlated phenotypes in genome-wide association studies. Am J Hum Genet 2014. PubMed
  • Hefti MM, Hu R, Knoblauch NW, Collins LC, Haibe-Kains B, Tamimi RM, Beck AH. Estrogen receptor negative/progesterone receptor positive breast cancer is not a reproducible subtype. Breast Cancer Res 2014; 15:R68. PubMed
  • Gorfine M, Hsu L, Parmigiani G. Frailty Models for Familial Risk with Application to Breast Cancer. Journal of the American Statistical Association 2014; 108:1205-1215. PubMed
  • Parast L, Tian L, Cai T. Landmark Estimation of Survival and Treatment Effect in a Randomized Clinical Trial. Journal of the American Statistical Association 2014; 109:384-394. PubMed
  • Wang Y, Schrag D, Brooks GA, Dominici F. National trends in pancreatic cancer outcomes and pattern of care among Medicare beneficiaries, 2000 through 2010. Cancer 2014. PubMed
  • Almendro V, Cheng YK, Randles A, Itzkovitz S, Marusyk A, Ametller E, Gonzalez-Farre X, Muñoz M, Russnes HG, Helland A, Rye IH, Borresen-Dale AL, Maruyama R, van Oudenaarden A, Dowsett M, Jones RL, Reis-Filho J, Gascon P, Gönen M, Michor F, Polyak K. Inference of tumor evolution during chemotherapy by computational modeling and in situ analysis of genetic and phenotypic cellular diversity. Cell Rep 2014; 6:514-27. PubMed
  • Leder K, Pitter K, Laplant Q, Hambardzumyan D, Ross BD, Chan TA, Holland EC, Michor F. Mathematical modeling of PDGF-driven glioblastoma reveals optimized radiation dosing schedules. Cell 2014; 156:603-16. PubMed
  • Brastianos PK, Taylor-Weiner A, Manley PE, Jones RT, Dias-Santagata D, Thorner AR, Lawrence MS, Rodriguez FJ, Bernardo LA, Schubert L, Sunkavalli A, Shillingford N, Calicchio ML, Lidov HG, Taha H, Martinez-Lage M, Santi M, Storm PB, Lee JY, Palmer JN, Adappa ND, Scott RM, Dunn IF, Laws ER, Stewart C, Ligon KL, Hoang MP, Van Hummelen P, Hahn WC, Louis DN, Resnick AC, Kieran MW, Getz G, Santagata S. Exome sequencing identifies BRAF mutations in papillary craniopharyngiomas. Nat Genet 2014; 46:161-5. PubMed
  • Lawrence MS, Stojanov P, Mermel CH, Robinson JT, Garraway LA, Golub TR, Meyerson M, Gabriel SB, Lander ES, Getz G. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 2014; 505:495-501. PubMed
  • Lohr JG, Stojanov P, Carter SL, Cruz-Gordillo P, Lawrence MS, Auclair D, Sougnez C, Knoechel B, Gould J, Saksena G, Cibulskis K, McKenna A, Chapman MA, Straussman R, Levy J, Perkins LM, Keats JJ, Schumacher SE, Rosenberg M, Getz G, Golub TR. Widespread genetic heterogeneity in multiple myeloma: implications for targeted therapy. Cancer Cell 2014; 25:91-101. PubMed
  • He HH, Meyer CA, Hu SS, Chen MW, Zang C, Liu Y, Rao PK, Fei T, Xu H, Long H, Liu XS, Brown M. Refined DNase-seq protocol and data analysis reveals intrinsic bias in transcription factor footprint identification. Nat Methods 2013; 11:73-8. PubMed
  • Haibe-Kains B, El-Hachem N, Birkbak NJ, Jin AC, Beck AH, Aerts HJ, Quackenbush J. Inconsistency in large pharmacogenomic studies. Nature 2013; 504:389-93. PubMed
  • Kaplan N, Dekker J. High-throughput genome scaffolding from in vivo DNA interaction frequency. Nat Biotechnol 2013; 31:1143-7. PubMed
  • Naumova N, Imakaev M, Fudenberg G, Zhan Y, Lajoie BR, Mirny LA, Dekker J. Organization of the mitotic chromosome. Science 2013; 342:948-53. PubMed
  • Wang S, Sun H, Ma J, Zang C, Wang C, Wang J, Tang Q, Meyer CA, Zhang Y, Liu XS. Target analysis by integration of transcriptome and ChIP-seq data with BETA. Nat Protoc 2013; 8:2502-15. PubMed
  • Ionita-Laza I, Lee S, Makarov V, Buxbaum JD, Lin X. Sequence kernel association tests for the combined effect of rare and common variants. Am J Hum Genet 2013; 92:841-53. PubMed
  • Parmigiani G, Boca S, Ding J, Trippa L. Statistical tools and R software for cancer driver probabilities. Methods Mol Biol 2013; 1101:113-34. PubMed
  • Kim TM, Laird PW, Park PJ. The landscape of microsatellite instability in colorectal and endometrial cancer genomes. Cell 2013; 155:858-68. PubMed
  • Ogburn EL, Vanderweele TJ. Bias attenuation results for nondifferentially mismeasured ordinal and coarsened confounders. Biometrika 2013; 100:241-248. PubMed
  • Lawrence M, Huber W, Pagès H, Aboyoun P, Carlson M, Gentleman R, Morgan MT, Carey VJ. Software for computing and annotating genomic ranges. PLoS Comput Biol 2013; 9:e1003118. PubMed
  • Giobbie-Hurder A, Gelber RD, Regan MM. Challenges of guarantee-time bias. J Clin Oncol 2013; 31:2963-9. PubMed
  • Gisselbrecht SS, Barrera LA, Porsch M, Aboukhalil A, Estep PW, Vedenko A, Palagi A, Kim Y, Zhu X, Busser BW, Gamble CE, Iagovitina A, Singhania A, Michelson AM, Bulyk ML. Highly parallel assays of tissue-specific enhancers in whole Drosophila embryos. Nat Methods 2013; 10:774-80. PubMed
  • Partridge AH, Gelber S, Piccart-Gebhart MJ, Focant F, Scullion M, Holmes E, Winer EP, Gelber RD. Effect of age on breast cancer outcomes in women with human epidermal growth factor receptor 2-positive breast cancer: results from a herceptin adjuvant trial. J Clin Oncol 2013; 31:2692-8. PubMed
  • Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, Carter SL, Stewart C, Mermel CH, Roberts SA, Kiezun A, Hammerman PS, McKenna A, Drier Y, Zou L, Ramos AH, Pugh TJ, Stransky N, Helman E, Kim J, Sougnez C, Ambrogio L, Nickerson E, Shefler E, Cortés ML, Auclair D, Saksena G, Voet D, Noble M, DiCara D, Lin P, Lichtenstein L, Heiman DI, Fennell T, Imielinski M, Hernandez B, Hodis E, Baca S, Dulak AM, Lohr J, Landau DA, Wu CJ, Melendez-Zajgla J, Hidalgo-Miranda A, Koren A, McCarroll SA, Mora J, Lee RS, Crompton B, Onofrio R, Parkin M, Winckler W, Ardlie K, Gabriel SB, Roberts CW, Biegel JA, Stegmaier K, Bass AJ, Garraway LA, Meyerson M, Golub TR, Gordenin DA, Sunyaev S, Lander ES, Getz G. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 2013; 499:214-8. PubMed
  • Du Z, Fei T, Verhaak RG, Su Z, Zhang Y, Brown M, Chen Y, Liu XS. Integrative genomic analyses reveal clinically relevant long noncoding RNAs in human cancer. Nat Struct Mol Biol 2013; 20:908-13. PubMed
  • Zhao R, Michor F. Patterns of proliferative activity in the colonic crypt determine crypt stability and rates of somatic evolution. PLoS Comput Biol 2013; 9:e1003082. PubMed
  • Glass K, Huttenhower C, Quackenbush J, Yuan GC. Passing messages between biological networks to refine predicted interactions. PLoS ONE 2013; 8:e64832. PubMed