- Undergraduate: BE, Automatic Control, Shanghai Jiao Tong University, Shanghai, China
- Post Graduate: MS, Pattern Recognition and Intelligent Systems, Shanghai Jiao Tong University, Shanghai, China
- PhD, Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA
Research and Academic Interests
- Bioinformatics, systems biology, cancer genomics, radiogenomics, and drug development: Biomarker discovery, genomic regulation network, integrative genomic data analysis, tumor heterogeneity, drug target identification, drug toxicity/efficacy prediction.
- Machine learning, statistical pattern recognition, signal/image processing, and data visualization: Clustering, classification, feature selection/extraction, optimization, blind source separation, data projection.
Peer-Reviewed Journal Publications:
- Y. Zhu, N. Wang, D.J. Miller, Y. Wang, Convex analysis of mixtures for separating non-negative well-grounded sources, Scientific Reports, vol. 6, Article number: 38350, 2016.
- H. Li, Y. Zhu, E. Burnside, E. Huang, K. Drukker, K. Hoadley, C. Fan, S. Conzen, M. Zuley, J. Net, E. Sutton, G. Whitman, E. Morris, C. Perou, Y. Ji, and M. Giger, Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA Dataset, NPJ Breast Cancer, 2, Article number: 16012, 2016.
- H. Li, Y. Zhu, E.S. Burnside, K. Drukker, K.A. Hoadley, C. Fan, S.D. Conzen, L. Lan, M. Zuley, G. Whitman, E.J. Sutton, J.M. Net, M. Ganott, K.R. Brandt, E. Bonaccio, A. Rao, C. Jaffe, E. Huang, J.B. Freymann, J. Kirby, E. Morris, C.M. Perou, Y. Ji, M.L. Giger, MRI radiomics signatures for predicting the risk of breast cancer recurrence as given by research versions of gene assays of MammaPrint, Oncotype DX, and PAM50, Radiology, 2016 May 5:152110
- Y. Zhu, H. Li, W. Guo, K. Drukker, L. Lan, M.L. Giger, Y. Ji, Deciphering genomic underpinnings of quantitative MRI-based radiomic phenotypes of invasive breast carcinoma, Scientific Reports, vol. 5, Article Number: 17787, 2015
- E.S. Burnside, K. Drukker, H. Li, E. Bonaccio, M. Zuley, M. Ganott, J.M. Net, E.J. Sutton, K.R. Brandt, G. Whitman, C.H. Le-Petross, S.D. Conzen, L. Lan, Y. Ji, Y. Zhu, C. Jaffe, E. Huang, J. Kirby, J.B. Freymann, E. Morris, M.L. Giger, Using computer-extracted image phenotypes from tumors on breast MRI to predict stage. Cancer, 2015 Nov 30. doi: 10.1002/cncr.29791
- S. Sengupta, K. Gulukota, Y. Zhu, C. Ober, K. Naughton, W. Wentworth-Sheilds, Y. Ji, Ultra-fast local-haplotype variant calling using paired-end DNA-sequencing data reveals somatic mosaicism in tumor and normal blood samples, Nucleic Acids Research, 2015 Sep 29. pii: gkv953
- W. Guo, H. Li, Y. Zhu, L. Lan, S. Yang, K. Drukker, M.L. Giger, Y. Ji, Prediction of clinical phenotypes in invasive breast carcinomas from the integration of radiomics and genomics data, Journal of Medical Imaging, vol. 2, no. 4, 041007, 2015.
- Y. Zhu, Y. Xu, D.L. Helseth, K. Gulukota, S. Yang, L.L. Pesce, R. Mitra, P. Müller, S. Sengupta, W. Guo, J.C. Silverstein, I. Foster, N. Parsad, K.P. White, Y. Ji, Zodiac: A comprehensive depiction of genetic interactions in cancer by integrating TCGA data, Journal of the National Cancer Institute, vol. 107, no. 8, djv129, 2015.
- Y. Xu, Y. Zhu, P. Müller, R. Mitra, and Y. Ji, Characterizing cancer-specific networks by integrating TCGA data, Cancer Informatics, vol. 2014:13(S2), pp. 125–131, 2014.
- M. Brehme, C. Voisine, T. Rolland, S. Wachi, J. Soper, Y. Zhu, K. Orton, A. Villella, D. Garza, M. Vidal, H. Ge, R.I. Morimoto, A chaperome subnetwork safeguards proteostasis in aging and neurodegenerative disease, Cell Reports, vol. 9, no. 3, pp. 1135–1150, 2014.
- Y. Zhu, P. Qiu, Y. Ji, TCGA-Assembler: open-source software for retrieving and processing TCGA data, Nature Methods, vol. 11, no. 6, pp. 599-600, 2014.
- R. Mitra, P. Müller, Y. Ji, Y. Zhu, G. Mills, Y. Lu. A Bayesian hierarchical model for inference across related reverse phase protein arrays experiments, Journal of Applied Statistics, vol. 41, no. 11, pp. 2483-2492, 2014.
- J. Lee, P. Müller, Y. Zhu, Y. Ji, A nonparametric Bayesian model for local clustering with application to proteomics, Journal of the American Statistical Association, vol. 108, no. 503, pp. 775-788, 2013.
- Y. Zhu, H. Li, D.J. Miller, Z. Wang, J. Xuan, R. Clarke, E.P. Hoffman, and Y. Wang, caBIG™ VISDA: modeling, visualization, and discovery for cluster analysis of genomic data, BMC Bioinformatics, vol. 9, 383, 2008.
- Y. Zhu, Z. Wang, D.J. Miller, R. Clarke, J. Xuan, E.P. Hoffman, and Y. Wang, A ground truth based comparative study on clustering of gene expression data, Frontiers in Bioscience, vol. 13, pp. 3839-3849, 2008.
- J. Yang, X. Ling, Y. Zhu, and Z. Zheng, A face detection and recognition system in color image series, Mathematics and Computers in Simulation, vol. 77, no. 5-6, pp. 531-539, 2008.
- J. Wang, H. Li, Y. Zhu, M. Yousef, M. Nebozhyn, M. Showe, L. Showe, J. Xuan, R. Clarke, and Y. Wang, VISDA: An open-source caBIG™ analytical tool for data clustering and beyond, Bioinformatics, vol. 23, no. 15, pp. 2024-2027, 2007.
- Z. Zheng, J. Yang, and Y. Zhu, Initialization enhancer for non-negative matrix factorization, Engineering Applications of Artificial Intelligence, vol. 20, no. 1, pp. 101-110, 2007.
- M. Bakay, Z. Wang, G. Melcon, L. Schiltz, J. Xuan, P. Zhao, V. Sartorelli, J. Seo, E. Pegoraro, C. Angelini, B. Shneiderman, D. Escolar, Y. Chen, S.T. Winokur, L.M. Pachman, C. Fan, R. Mandler, Y. Nevo, E. Gordon, Y. Zhu, Y. Dong, Y. Wang, and E.P. Hoffman, Nuclear envelope dystrophies show a transcriptional fingerprint suggesting disruption of Rb–MyoD pathways in muscle regeneration, Brain, vol. 129, no. 4, pp. 996-1013, 2006.
- Z. Zheng, J. Yang, and Y. Zhu, Face detection and recognition using color sequential images, Journal of Research and Practice in Information Technology, vol. 38, no. 2, 2006.
Peer-Reviewed Conference Publications:
- Y. Zhu, T. Chan, E.P. Hoffman, Y. Wang, Gene expression dissection by non-negative well-grounded source separation, Proc. IEEE Intl. Workshop on Machine Learning for Signal Processing, Cancún, Mexico, October 2008.
- J. Molineaux, J. Xuan, T. Gong, Y. Zhu, E.P. Hoffman, R. Clarke and Y. Wang, ModVis: an information visualization tool for gene module discovery, Proc. 19th Intl. Conf. Computer Applications in Industry and Engineering, pp. 24-31, Las Vegas, NV, November 2006.
- Y. Feng, Z. Wang, Y. Zhu, J. Xuan, D.J. Miller, R. Clarke, E.P. Hoffman, and Y. Wang, Learning the tree of phenotypes using genomic data and VISDA, Proc. Sixth IEEE Symposium on Bioinformatics and Bioengineering, pp. 165-170, Washington DC, October 2006.
- Y. Zhu, Z. Wang, J. Xuan, E.P. Hoffman, and Y. Wang, Phenotypic-specific gene module discovery using diagnostic tree and VISDA, Proc. IEEE Intl. Conf. Engineering in Medicine and Biology Society, pp. 5767-5770, New York City, NY, August 2006.
- T. Gong, Y. Zhu, J. Xuan, H. Li, R. Clarke, E.P. Hoffman, and Y. Wang, Latent variable and nICA modeling of pathway gene module composite, Proc. IEEE Intl. Conf. Engineering in Medicine and Biology Society, pp. 5872-5875, New York City, NY, August 2006.
- T. Gong, J. Xuan, Y. Zhu, H. Li, R. Clarke, E.P. Hoffman, and Y. Wang, Composite gene module discovery using non-negative independent component analysis, Proc. IEEE/NLM Life Science Systems and Applications Workshop, pp. 1-2, Bethesda, MD, July 2006.
TCGA-Assembler is an open-source, freely available tool that automatically downloads, assembles, and processes public data from The Cancer Genome Atlas (TCGA), facilitating downstream analysis by relieving investigators of the burden of data preparation. TCGA-Assembler includes two modules. Module A acquires public TCGA data from the TCGA Data Coordinating Center and assembles individual data files into locally stored data tables. Module B performs various manipulations on the data tables to prepare them for downstream analysis.
Supporting Paper 1: Y. Zhu, P. Qiu, Y. Ji, TCGA-Assembler: Open-Source Software for Retrieving and Processing TCGA Data, Nature Methods, vol. 11, no. 6, 2014.
Zodiac provides a comprehensive depiction of cancer genomic interactions inferred by Bayesian Graphical Models (BGM) from multimodal data in The Cancer Genome Atlas (TCGA), including gene expression, protein expression, DNA methylation, and copy number. Zodiac consists of two main components: 1) a large database containing nearly 200 million regulation maps of intragenic interactions of each gene and intergenic interactions between each pair of genes, and 2) analytics tools that perform high-quality inference on data-enhanced networks based on either multimodal TCGA data or users' in-house data.
Supporting Paper 1: Y. Zhu, Y. Xu, D.L. Helseth, K. Gulukota, S. Yang, L.L. Pesce, R. Mitra, P. Müller, S. Sengupta, W. Guo, J.C. Silverstein, I. Foster, N. Parsad, K.P. White, Y. Ji, Zodiac: A Comprehensive Depiction of Genetic Interactions in Cancer by Integrating TCGA Data, Journal of the National Cancer Institute, vol. 107, no. 8, djv129, 2015.
- VIsual Statistical Data Analyzer (VISDA)
VISDA is an analytical tool for cluster modeling, visualization, and discovery. Statistically principled and visually insightful, VISDA exploits the human gift for pattern recognition and allows users to discover hidden cluster structure within high-dimensional and complex biomedical data sets. The unique features of VISDA include its hybrid algorithm, robust performance, and "tree of phenotypes". With global and local biomarker identification and prediction functionalities, VISDA allows users across the cancer research community to analyze their genomic/proteomic data to define new cancer subtypes based on gene expression patterns, to construct hierarchical trees of multiclass cancer phenotypic composites, or to discover correlations between cancer statistics and risk factors.
Supporting Paper 1: Y. Zhu, H. Li, D.J. Miller, Z. Wang, J. Xuan, R. Clarke, E.P. Hoffman, and Y. Wang, caBIG® VISDA: modeling, visualization, and discovery for cluster analysis of genomic data, BMC Bioinformatics, vol. 9, 383, 2008.
Supporting Paper 2: J. Wang, H. Li, Y. Zhu, M. Yousef, M. Nebozhyn, M. Showe, L. Showe, J. Xuan, R. Clarke, and Y. Wang, VISDA: An open-source caBIG® analytical tool for data clustering and beyond, Bioinformatics, vol. 23, no. 15, pp. 2024-2027, 2007.