• September 1, 2021. The JEFworks lab receives a new grant from the NIGMS MIRA program to develop computational methods for delineating subcellular and cellular spatial transcriptional heterogeneity.
  • May 2021. Assistant Professor Jean Fan was awarded an NSF CAREER grant, a 5-year grant awarded to promising young faculty across all disciplines.
  • May 27, 2020. Congrats to Jeff Leek on being selected as the 2020 Spiegelman award winner, for the constellation of his high-impact research, educational contributions, and visionary efforts to advance society through Data Science.
  • November 2019. 2 CCB faculty members and 2 former CCB students make the 2019 list of world's most highly-cited researchers. Current faculty on the list are Mike Schatz and Steven Salzberg; former students are Adam Phillippy (Senior Investigator at NIH) and Cole Trapnell (Asst Professor at Univ of Washington). CCB faculty member Mihaela Pertea made the list in a previous year. See the 2019 list at
  • August 2019. HISAT2 and HISAT-genotype are published in Nature Biotechnology, led by former CCB postdoc Daehwan Kim, now an Asst. Professor at UT Southwestern Medical Center.
  • November 2018. A consortium led by Ph.D. student Rachel Sherman and Prof. Steven Salzberg publishes a study in Nature Genetics describing the African pan-genome, which contains nearly 300 Mbp missing from the human reference genome. See the press release at
  • April 18, 2018. Prof. Steven Salzberg elected a member of the American Academy of Arts & Sciences. Founded in 1780, the Academy honors exceptional scholars, leaders, artists, and innovators and engages them in sharing knowledge and addressing challenges facing the world. See the press release at the American Academy of Arts and Sciences website.
  • June 8, 2017. Daehwan Kim and Steven Salzberg release the first version of HISAT-genotype. HISAT-genotype is our next-generation platform that enables rapid and accurate analysis of human genomes using next-generation sequencing data on a desktop within a few hours. The platform currently supports HLA typing, discovery of novel HLA alleles, DNA fingerprinting analysis, and other functionalities. Please refer to the HISAT-genotype homepage for further details.
  • April 13, 2017. A group of CCB scientists publish recount2 in Nature Biotechnology. recount2 provides processed and summarized expression data for over 70,000 human RNA-seq samples from the Sequence Read Archive (SRA), The Cancer Genome Atlas (TCGA), and The Genotype-Tissue Expression (GTEx) project. The associated Bioconductor package provides a convenient API for querying, downloading, and analyzing the data.
  • Oct. 26-29, 2016. The 2016 Biological Data Science conference at Cold Spring Harbor Lab is co-chaired by CCB members Jeff Leek (Biostatistics) and Michael Schatz (Computer Science).
  • August 11, 2016. Mihaela Pertea and colleagues publish a new protocol describing how to use StringTie, HISAT, and Ballgown to analyze RNA sequencing experiments, including alignment of raw reads, assembly and quantification of transcripts, and measurement of differentially expressed genes. See the paper at Nature here or download the PDF from us here.
  • July 9, 2016. Steven Salzberg gives a keynote talk on Open Science at BOSC 2016, part of the larger ISMB 2016 conference. His talk can be viewed on YouTube here.
  • May 26, 2016. Michael Schatz joins Hopkins as our newest Bloomberg Distinguished Associate Professor, with appointments in Computer Science and Biology.
  • April 14, 2016. Alexis Battle is named as one of the 2016 Searle Scholars. Searle Scholars are selected for their potential to make significant contributions to chemical and biological research over the course of their careers; 15 Scholars were chosen in 2016. Each researcher is awarded $300,000 in flexible funding to support his or her work during the next three years. See the announcement here.
  • February 19, 2016. Daehwan Kim, Li Song, Florian Breitwieser and Steven Salzberg release a new, very rapid and memory-efficient system, Centrifuge, for the classification of DNA sequences from microbial samples, with better sensitivity than and comparable accuracy to other leading systems. The system uses a novel indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (e.g., 4.3 GB for ~4,100 bacterial genomes).
  • January 2016. Michael Schatz joins the Hopkins CS Department and CCB as an Associate Professor. Dr. Schatz, formerly at Cold Spring Harbor Laboratory, is widely known for his work on genome assembly algorithms and next-generation sequencing technology. See his CSHL website for more, and look for his new Hopkins website to appear in early 2016.
  • September 8, 2015. Daehwan Kim, Joe Paggi and Steven Salzberg release a new, rapid and accurate alignment program, HISAT2, that aligns NGS reads (both DNA and RNA) against a population of human genomes. In this program, we have extended the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index to incorporate genomic differences among individuals into the reference genome, while keeping memory requirements low enough to fit the entire index onto a desktop computer. HISAT2 is a successor to both HISAT and TopHat2.
  • March 10, 2015. Daehwan Kim, Ben Langmead and Steven Salzberg publish a study in the journal Nature Methods describing HISAT, a highly efficient software tool for aligning reads from RNA sequencing experiments. HISAT uses a novel hierarchical indexing scheme based on the Ferragina-Manzini (FM) index. Tests on real and simulated data sets showed that HISAT is the fastest software currently available, with equal or better accuracy than any other method.
  • March 6, 2015. Alyssa Frazee and colleagues publish the Bioconductor software package 'Ballgown', which provides a bridge between assembly tools like Cufflinks and downstream statistical modeling tools for RNA-Seq expression analysis. The paper was published online March 6 in the journal Nature Biotechnology, and the software is available at Bioconductor and on GitHub.
  • February 18, 2015. Mihaela Pertea and colleagues publish a new algorithm, StringTie, for assembling transcripts from RNA sequencing experiments. StringTie uses a novel network flow algorithm to achieve results significantly superior to all existing methods for the widely-used RNA-seq experimental protocol. The paper was published online Feb 18 in the journal Nature Biotechnology, and the software can be found at
  • December 19, 2014. Dan Arking and colleagues publish new results suggesting that the amount of mitochondrial DNA (mtDNA) found in people's blood directly relates to how frail they are medically. This DNA may prove to be a useful predictor of overall risk of frailty and death from any cause 10 to 15 years before symptoms appear. The paper was published online Dec 4 in the Journal of Molecular Medicine.
  • December 18, 2014. Alexis Battle and colleagues publish a study in the journal Science showing that expression quantitative trait loci (eQTLs) tend to have significantly reduced effect sizes on protein levels, suggesting that their potential impact on downstream phenotypes is attenuated or buffered. They also identify a class of cis QTLs that affect protein abundance with little or no effect on messenger RNA or ribosome levels.
  • November 2014. Li Song, Liliana Florea and Ben Langmead publish Lighter, a new method for correcting errors in DNA sequencing data. Lighter uses sampling and small data structures called Bloom filters to avoid building large count tables, and is substantially faster and more memory efficient than competing approaches. The paper is published in the journal Genome Biology. The software is available on GitHub.
  • June 24, 2014. Three CCB faculty - Mihaela Pertea, Art Delcher, and Steven Salzberg - are named 'Highly Cited' by Thomson Reuters. By analyzing the number of times scientists were cited in others' papers, Thomson Reuters and ScienceWatch have created a new list of the top 3,215 most highly cited researchers of the past decade. 25 current Hopkins faculty made the list. See the announcement from ScienceWatch for details and the full list.
  • March 20, 2014. An international team led by David Neale at UC Davis published the genome of the loblolly pine tree, the largest genome sequenced and assembled to date. At 23 billion bases, the pine is more than 7 times the size of the human genome. The assembly team included Daniela Puiu and Steven Salzberg from JHU, as well as Jim Yorke and Aleksey Zimin at the University of Maryland. See the article here and one of several news stories here.
  • March 2014. Kraken, a new tool developed by Derrick Wood and his advisor Steven Salzberg, is published in Genome Biology. Kraken is a very fast program for classifying sequences from metagenomics or microbiome experiments, with run times up to 900 times faster than competing tools. See the article here.
  • February 2014. Ben Langmead is awarded a Sloan Research Fellowship. Since 1955, these fellowships have been given out annually to early-career scientists "whose achievements and potential identify them as rising stars, the next generation of scientific leaders." See the announcement from the Sloan Foundation here.
  • February 2014. Steven Salzberg and Mihaela Pertea published a new method for fast, accurate detection of mutations in exome studies and in comparisons of normal versus diseased tissue. The new method, called DIAMUND, appeared in the journal Human Mutation.
  • July 2013. Ben Langmead and colleague Michael Schatz (Cold Spring Harbor Laboratory) published "The DNA Data Deluge" in IEEE Spectrum. The article calls attention to the current glut of sequencing data and reviews computational problems and solutions. See the JHU press release and Johns Hopkins Magazine for highlights.
    Article: The DNA Data Deluge
  • May 2013. ChIP-PED, a data mining tool developed by Hongkai Ji, his student George Wu, and colleagues, is published in Bioinformatics. Using this new tool, biologists can superimpose their ChIP-seq or ChIP-chip data on 20,000+ publicly available human and mouse gene expression samples to discover new cell types, tissues, and disease conditions associated with transcription factor functions. See the article here.
  • May 2013. A team of researchers led by Associate Professor Jonathan Pevsner and his student Matt Shirley used whole-genome sequencing of 6 individuals followed by targeted sequencing in many more to identify the genetic cause of Sturge-Weber Syndrome and port-wine stains. In their report, published May 8 in the New England Journal of Medicine, they traced the genetic cause to a gene called GNAQ on chromosome 9.
    Article: Sturge-Weber Syndrome and Port-Wine Stains Caused by Somatic Mutation in GNAQ
  • April 2013. Hongkai Ji, his student Yang Ning, and colleagues Xia Li and Qianfei Wang (Beijing Institute of Genomics) published "Differential Principal Component Analysis of ChIP-seq" in PNAS. The article introduces a new method, dPCA, for integrative analysis of many ChIP-seq datasets. dPCA allows one to perform unsupervised pattern discovery, statistical inference and dimension reduction within a single model framework. See the PNAS article here.
  • December 2012. The Center for Computational Biology is established. CCB is a new research center within the McKusick-Nathans Institute of Genetic Medicine, supported by the Johns Hopkins School of Medicine and the Bloomberg School of Public Health. CCB faculty come from a wide range of departments including Medicine, Biostatistics, Computer Science, Biomedical Engineering, and Oncology.
  • September 16, 2012. A team of researchers at JHU and Arizona State University publishes a paper in Nature Neuroscience analyzing the epigenetic basis of honeybee subcastes. Bees of different subcastes are genetically similar, and no epigenetic differences were found between the irreversible worker and queen subcastes. But substantial differences were found between nurse and forager subcastes. Most of these differences are shown to be reversible when bees switch subcastes. This is the first evidence in any organism of reversible epigenetic changes associated with behavior.
    Article: Reversible switching between epigenetic states in honeybee behavioral subcastes. Johns Hopkins news release.
  • May 31, 2012. An international team of JHU and Russian investigators publishes the GenometriCorr R package. This package enables a researcher to ask whether two sets of points or intervals on a genome are spatially correlated (either positive or negative correlation). The software produces p-values indicating the strength of correlation. Spatial relationships considered include overlaps, absolute distance, and relative distances.
    Article: Exploring Massive, Genome Scale Datasets with the GenometriCorr Package. PLoS Computational Biology (2012). The software can be used via a Galaxy interface, a Tcl/Tk graphical interface, or as a function call within R.
  • April 9, 2012. TopHat 2.0 is released! TopHat 2.0 is a major new release of the TopHat software, which aligns next-generation sequences from RNA-seq experiments to a genome with or without a reference annotation, allowing for the discovery of many novel splice sites. TopHat 2.0's improvements, developed primarily by Daehwan Kim and Geo Pertea, include compatibility with Bowtie 2 and parallelization of many steps to reduce runtime. See the TopHat website for details and downloads.
  • March 4, 2012. Ben Langmead and Steven Salzberg publish a paper in Nature Methods describing Bowtie 2, a new, more sensitive, very fast alignment program for next-generation sequences. Bowtie 2 handles gapped and ungapped alignments and is superior in speed and accuracy to virtually all previous methods. Article: B. Langmead and S.L. Salzberg. Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357-359 (2012). Publ. online 4 March.
  • November 29, 2011. JHU scientists Todd Treangen and Steven Salzberg publish a review in Nature Reviews Genetics discussing the computational problems surrounding repeats and describing strategies used by current bioinformatics systems to solve them.
    Article: Treangen TJ and Salzberg SL. Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nature Reviews Genetics 2011 Nov 29
  • October 27, 2011. Faculty members Carlo Colantuoni (from the Lieber Institute), Jeff Leek, and colleagues from JHU and NIH publish a paper in Nature about their discovery of a wave of gene expression changes occurring during fetal development.
    Article: Colantuoni C, Lipska BK, Ye T, Hyde TM, Tao, Leek JT, Colantuoni E, Elkahloun AG, Herman M, Weinberger DR & Kleinman J. Temporal dynamics and genetic control of transcription in the human brain. Nature 478, 519-523
  • October 16, 2011. Bowtie 2.0 beta is released! A major new version of the popular Bowtie short-read alignment program, developed by Hopkins/UMD researcher Ben Langmead, is now available from the Bowtie website and the new Bowtie2 page. Bowtie 2 supports gapped alignment with affine gap penalties, and allows any number of gaps and any gap length. For reads longer than about 50 bp, Bowtie 2 is generally faster, more sensitive, and uses less memory than Bowtie 1. Other differences are described on the Bowtie page.