Predicting enhancer activity [guest post]

Less than 2% of genomic DNA codes for protein. The remaining noncoding portions have been dismissively referred to as junk. Junk implies that because the DNA doesn’t code for proteins, it isn’t functional. In recent years, researchers have shown that so-called junk DNA contains regulatory regions, such as promoters and enhancers, that control gene expression. Identifying and cloning a gene is one thing, but knowing when and where it’s expressed is crucial to understanding how organisms develop and function. Identifying regulatory regions, however, has been a challenge. Promoters tend to be located adjacent to the genes they control, but enhancers are scattered throughout the genome, sometimes 1 million bases of DNA away from the gene they regulate.
Axel Visel and his colleagues found a way to identify when enhancers are regulating genes. A protein called p300, expressed throughout the body, is found in many enhancer-associated protein complexes. p300 is also required for embryonic development, a crucial time when activated genes are literally building the body. Visel dissected forebrain, midbrain and limb tissue from more than 150 mouse embryos, cross-linked the DNA and protein, and digested the DNA (pieces of DNA bound to protein are protected). Using antibodies, Visel purified only those pieces of DNA bound to p300 and then sequenced that DNA and identified it as a possible enhancer by alignment to the mouse genome. This technique, called chromatin immunoprecipitation coupled to massively parallel sequencing (ChIP-seq), is not new, but using p300 as the bait was a clever twist.
To confirm that these regions of DNA regulate gene expression in vivo, Visel identified orthologous regions from human DNA. Candidate enhancers, averaging 2.4 kb in size, were cloned upstream of a mouse minimal hsp68 promoter and lacZ, in a reporter vector used previously by this and other groups. Candidate enhancers were not cloned in any particular orientation. Vectors were injected into fertilized mouse eggs, the eggs were implanted, and at embryonic day 11.5 the embryos were harvested for whole-mount X-gal staining. Only similar staining patterns observed in three different embryos (representing three independent transgene integrations) were considered valid. If ChIP-seq identified an enhancer active in the limb but not in the brain, then the human orthologue of the enhancer should turn only the mouse’s limbs blue. In most cases, that’s what happened. (See photo.)
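For readers who think in code, the scoring rule can be sketched roughly as follows. This is only an illustration, not the authors' actual procedure: the function name, the tissue labels, and the choice to treat "similar" patterns as identical sets of stained tissues are all assumptions, as is treating three embryos as a minimum rather than an exact count.

```python
from collections import Counter

# From the post: a pattern counts only if seen in three different embryos.
# Treating this as a minimum ("three or more") is an assumption.
MIN_EMBRYOS = 3


def reproducible_pattern(embryo_patterns, min_embryos=MIN_EMBRYOS):
    """Return the staining pattern shared by enough embryos, or None.

    Each embryo is described by the set of tissues that stained blue.
    Simplification (assumption): two patterns are "similar" only if the
    sets of stained tissues are identical.
    """
    counts = Counter(frozenset(pattern) for pattern in embryo_patterns)
    if not counts:
        return None
    pattern, n_embryos = counts.most_common(1)[0]
    return set(pattern) if n_embryos >= min_embryos else None


# Hypothetical example: five transgenic embryos for one candidate enhancer.
embryos = [{"limb"}, {"limb"}, {"limb"}, {"limb", "forebrain"}, set()]
observed = reproducible_pattern(embryos)

# The ChIP-seq prediction for this made-up enhancer: limb only.
predicted = {"limb"}
agrees_with_chip_seq = observed == predicted
```

In this made-up case, three embryos share the limb-only pattern, so the enhancer would count as validated and in agreement with the ChIP-seq prediction.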

Overall, 87% (75 out of 86) of the enhancers produced expression patterns in mice that agreed with the ChIP-seq results, a tremendous improvement over the same researchers’ previous prediction method (47%, 246 out of 528), in which enhancers were identified based on evolutionary conservation and tested using the same reporter assay in transgenic mice. The p300 ChIP-seq method is especially good because it’s large-scale, enabling scientists to study thousands of enhancers throughout the genome from any tissue during any time in an animal’s life. Visel and colleagues identified thousands of enhancers active in the brain and limbs of mouse embryos and verified 75 using transgenic F0 embryos. Using this technique, future studies can identify in vivo enhancers from additional anatomic regions, embryonic (or adult) stages, and from mouse models of human disease.
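The two success rates quoted above are simple proportions, and it is easy to check them. The short sketch below just redoes that arithmetic from the counts given in the post; the function and variable names are mine, chosen for illustration.

```python
def validation_rate(confirmed: int, tested: int) -> float:
    """Percentage of tested candidate enhancers confirmed in transgenic mice."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return 100.0 * confirmed / tested


# Counts quoted in the post.
chip_seq_rate = validation_rate(75, 86)         # p300 ChIP-seq predictions
conservation_rate = validation_rate(246, 528)   # earlier conservation-based method

print(f"p300 ChIP-seq:          {chip_seq_rate:.0f}%")
print(f"Sequence conservation:  {conservation_rate:.0f}%")
print(f"Difference:             {chip_seq_rate - conservation_rate:.0f} percentage points")
```

Rounded to whole numbers, this reproduces the 87% and 47% figures, a gap of about 40 percentage points between the two methods.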
As my graduate adviser used to say, if you put junk in, you get junk out. Clearly junk DNA is anything but.

Visel, A., Blow, M., Li, Z., Zhang, T., Akiyama, J., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., Chen, F., Afzal, V., Ren, B., Rubin, E., & Pennacchio, L. (2009). ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature, 457(7231), 854-858. DOI: 10.1038/nature07730

Daniel Gorelick is a neuroscientist who is currently taking a year off from his postdoctoral fellowship to serve as a AAAS Science & Technology Policy fellow in the U.S. Department of State. He writes the Science Planet blog and covers science and technology for, a State Department Web site. E-mail: