Liliana Florea

Associate Professor
McKusick-Nathans Institute of Genetic Medicine
Department of Medicine and Department of Computer Science
Faculty Manager, Computational Biology Consulting Core
1900 E. Monument, Welch 113
Johns Hopkins University School of Medicine
Baltimore, MD 21205
Ph: (443) 287-5624
E-mail: florea [at]

Lab | Core | CV | Publications

I am interested in developing and applying computational techniques to model and solve problems in biology and genetic medicine, in particular by leveraging the sequence data generated by next generation sequencing technologies. Application areas include: next generation sequencing data analysis; genome analysis and comparison; cDNA-to-genome alignment; gene and alternative splicing annotation; RNA editing; microbial comparative genomics; miRNA genomics; and computational vaccine design. Read more about these, and about what is new in the lab, below.

What is new?


Our newly released online Coursera course Command Line Tools for Genomic Data Science is designed for biologists and computational scientists alike who wish to enter the field of 'big data' genomic science. The course covers topics ranging from basic Unix commands, to biological data types and formats, to tools and practical workflows for alignment, sequence variation and transcriptomics. For some of our earlier teaching at GWU, please see the curriculum here.


Next generation sequencing data analysis

In recent years, next generation sequencing has revolutionized biomedical sciences. However, the vast amounts of short reads it generates are difficult to analyze, demanding new and sophisticated bioinformatics methods. Our lab is developing algorithms and associated software tools to analyze next generation sequencing data to answer a variety of questions in biology and medicine.

Relevant software:

Genome assembly, comparison and analysis

High throughput sequencing has made it possible to sequence and assemble the genome of virtually any organism. Most model species were sequenced and annotated as part of large genome sequencing projects, undertaken by broad international consortia. More recently, next generation sequencing has tremendously accelerated the pace of producing new genomes, but the short reads lead to fragmented assemblies that are difficult to analyze. We are developing algorithms and tools to aid in genome assembly and assembly curation, as well as to compare and annotate genomes.

Relevant software:

Alternative splicing and RNA editing

Alternative splicing and RNA editing are important post-transcriptional regulatory mechanisms that contribute to creating functional diversity. In humans and other species, dysregulation of splicing and editing has been associated with disease. We are developing methods to discover and catalogue alternative splicing and RNA editing variations in humans and animals, by analyzing the vast amounts of gene sequence data produced with conventional and next generation sequencing.

Relevant software:

Host-pathogen interaction

Genome analyses of bacterial and viral pathogens have provided unique insights into their virulence, host adaptation and evolution, and into their roles as determinants and markers of disease. In collaborative work, we have developed algorithms to identify sequence features in the genomes of viruses and bacteria, such as virus-encoded non-coding RNAs and selection signatures, as well as tools to compare and visualize microbial genomes for comparative and functional analyses.

Relevant software:


Our work has been supported in part by NSF awards ABI-1159078, ABI-1356078 and IOS-1339134, and by a Research Fellowship in Computational and Evolutionary Molecular Biology from the Alfred P. Sloan Foundation.