Research

Computational Biology and Bioinformatics

High-throughput genomic technologies, including DNA and RNA sequencing, are evolving rapidly. New types of complex data are being generated quickly and require new methods for quality control and analysis. We currently focus on developing and applying methods for identifying genomic alterations in cancer, quantifying the mutagenic effect of carcinogens, and characterizing cellular heterogeneity using single-cell RNA sequencing. We have developed the CELDA (CEllular Latent Dirichlet Allocation) framework, which can be used to identify hidden transcriptional states and cellular populations in count-based single-cell RNA-seq data. A beta version of this software is available on GitHub.

Identifying Early Drivers of Lung Cancer

Lung adenocarcinomas and lung squamous cell carcinomas are the most common types of lung cancer and remain major causes of death worldwide despite advances in smoking cessation, early detection, and targeted and immunological therapies. Many patients have lung cancers that do not harbor a known activating mutation and therefore cannot be given targeted therapies. In collaboration with labs from Dana-Farber Cancer Institute, the Broad Institute, and The Cancer Genome Atlas (TCGA) consortium, we analyze next-generation sequencing data to identify novel drivers of lung tumorigenesis. We hope that targeting these genes with novel therapies will reduce overall lung cancer mortality. In collaboration with the Spira/Lenburg lab at BUSM, we are also identifying the genomic alterations in premalignant lesions for squamous cell carcinoma, with the ultimate goal of defining strategies for early detection.

Therapeutic Development and Pathogenesis of COPD

Chronic Obstructive Pulmonary Disease (COPD) is the fourth leading cause of death in the world. Our understanding of the molecular mechanisms responsible for the initiation and progression of this disease is limited. By examining expression differences between individuals with and without COPD, or differences within a person along a gradient of disease, we hope to elucidate the molecular mechanisms that are responsible for disease initiation. Using publicly available resources such as the Connectivity Map, we are also applying gene expression data to predict novel therapeutics for the treatment of COPD.