I'm Rob, a postdoctoral researcher at the Max Delbrück Centre in Berlin. My main research interest is regulatory sequences in animal genomes: regions of DNA that can control the activity of genes. I am especially interested in the way that cells fold their DNA, and how this can influence the effects of these regulatory sequences. There is growing evidence that this folding may be dysregulated in some human diseases, and I think research in this area has the potential to open new therapeutic avenues.

I'm also a keen programmer, especially in Python. I've contributed to a number of open source projects, including doit and metaseq. I'm currently writing a genome browser in Python called EIYBrowse (Extend It Yourself Browser). I've found programming an invaluable skill during my research and I like to spread the knowledge around, so I contribute to and teach for Software Carpentry. I've written a series of Software Carpentry lessons on using Python's doit for automating scientific analysis pipelines.

I still play my guitar every now and then, snowboard as often as I can (i.e. not very often), occasionally sing "Blue Moon" and love to cycle.

  • 2015
  • Postdoctoral Researcher

    Berlin Institute for Medical Systems Biology, Berlin (Aug 2015-Current)

    • Working on chromatin folding and nuclear organization in Prof. Ana Pombo's laboratory
  • MRC Funded PhD Studentship

    MRC Clinical Sciences Centre, London (Oct 2011-Jul 2015)

    • Investigated regulation of gene expression by long-range chromatin interactions
    • Co-supervised by Prof. Ana Pombo (MDC) and Prof. Niall Dillon (CSC)
  • Technical Support & Research Assistant

    Charles Beagrie Ltd, Salisbury (Jan 2004-Current)

    • Assisted with data analysis and desktop research for UK Research Data Service Feasibility Study and Keeping Research Data Safe project
  • 2011
  • Research Assistant

    Department of Biochemistry, Cambridge University (Oct 2010-Aug 2011)

    • Studied C. crescentus polynucleotide phosphorylase in Prof. Ben Luisi’s group
    • Cloning, expression, protein affinity purification, enzymatic assays and crystallography
  • 2010
  • Cancer Research UK Summer Studentship

    CRUK London Research Institute, London (Holborn) (Jun 2010-Sep 2010)

    • Studied Polycomb group proteins and their interaction with the ncRNA HOTAIR with Dr. Gordon Peters
    • Chromatin Immunoprecipitation, qPCR, lentiviral shRNA knockdowns, qRT-PCR and western blotting
  • Research Assistant

    Department of Biochemistry, Cambridge University (Jan 2010-May 2010)

    • Development of the NMR analysis tool “DANGLE” with Dr. William Broadhurst
    • Programming, data analysis and statistical work to improve the accuracy of the DANGLE algorithm for predicting protein dihedral angles from chemical shift data
  • 2005
  • Work Experience Student

    Wellcome Trust Sanger Centre, Hinxton (Jun 2005-Jun 2005)

    • 1 week placement with an ENCODE project group
    • 1 week placement in the sequencing centre


Enhancer Journal Club: Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma

By: Rob Beagrie

I read a neat little paper in Nature just before Christmas about enhancer hijacking, which I think is a particularly interesting topic. The original paper can be found at: http://www.nature.com/nature/journal/v511/n7510/full/nature13379.html

The abstract is as follows:

Medulloblastoma is a highly malignant paediatric brain tumour currently treated with a combination of surgery, radiation and chemotherapy, posing a considerable burden of toxicity to the developing child. Genomics has illuminated the extensive intertumoral heterogeneity of medulloblastoma, identifying four distinct molecular subgroups. Group 3 and group 4 subgroup medulloblastomas account for most paediatric cases; yet, oncogenic drivers for these subtypes remain largely unidentified. Here we describe a series of prevalent, highly disparate genomic structural variants, restricted to groups 3 and 4, resulting in specific and mutually exclusive activation of the growth factor independent 1 family proto-oncogenes, GFI1 and GFI1B. Somatic structural variants juxtapose GFI1 or GFI1B coding sequences proximal to active enhancer elements, including super-enhancers, instigating oncogenic activity. Our results, supported by evidence from mouse models, identify GFI1 and GFI1B as prominent medulloblastoma oncogenes and implicate ‘enhancer hijacking’ as an efficient mechanism driving oncogene activation in a childhood cancer.

In this paper, Northcott, Lee, Zichner and colleagues explore potential molecular drivers of medulloblastoma. Medulloblastomas are thought to be divisible into four main subgroups. Groups 1 and 2 generally involve upregulation of the sonic hedgehog or wingless pathways, whereas groups 3 and 4 have no known cause. By sequencing tumour DNA from medulloblastoma patients, the authors find a particular region of chromosome 9 which is frequently involved in structural variation (i.e. deletions, duplications and inversions).
These structural variants appear to lead to a consistent upregulation of the GFI1B gene, occurring specifically in group 3 or group 4 tumours.

Fig 1. a,b, Positions of recurrent structural variants in medulloblastoma. c, Expression of genes in affected region. d, GFI1B expression by medulloblastoma class. e, Class 3 & 4 medulloblastomas ranked by GFI1B expression. Reprinted by permission from Macmillan Publishers Ltd: Nature 511, 428–434 (2014), copyright (2014).

For me, the most interesting part of this paper is the finding that these chromosomal rearrangements frequently juxtapose the GFI1B gene with a series of enhancers in or around the DDX31 gene. The authors suggest that this might be a case of "enhancer hijacking", where an enhancer that normally activates the expression of one gene changes its target and causes expression of an unrelated gene in the wrong tissue or at the wrong developmental stage.

Fig 3. a, Epigenetic enhancer marks over the GFI1B/DDX31 locus. Adapted by permission from Macmillan Publishers Ltd: Nature 511, 428–434 (2014), copyright (2014).

An interesting point to note in this figure: as the authors point out, the level of H3K27ac over the DDX31 enhancers actually seems to be higher in samples with a rearrangement in the region. Of course, this needs to be taken with a pinch of salt, as you can't really compare levels of enrichment between ChIP-seq samples unless you're using some sort of fancy spike-in ChIP-seq [1]. Assuming this quantitative difference is real, there may be an additional local factor which is increasing the activity of these enhancers before or after rearrangement. Interestingly, there doesn't seem to be much difference in the DNA methylation state of the enhancers in affected vs. non-affected samples. This could indicate that the enhancers are primed for further activation even in the absence of genomic rearrangements.
It's possible that the increased acetylation of the enhancers could be downstream of GFI1B activation, particularly if they are directly bound by GFI1B or one of its targets. Another alternative is that the increased activity of these enhancers is sufficient for GFI1B activation, in other words that the fully activated enhancers can activate GFI1B expression even in the absence of a genomic rearrangement. Before rearrangement, the enhancers are only separated from GFI1B by 370 kb, which is well within the range of activity seen for other enhancers.

In the light of a few recent papers, perhaps the most interesting explanation is that DDX31 and GFI1B are normally separated by a topological domain boundary that gets removed or repositioned when the region is rearranged. There is some Hi-C data from an in vitro differentiation of hESCs to neural precursors, but as far as I can tell nobody has called topological domain positions from those datasets. In hESC datasets, GFI1B and DDX31 are within the same topological domain, which doesn't really support this interpretation:

Hi-C data from human ES cells suggests that GFI1B and the DDX31 enhancers may occupy the same topological domain. Image modified from the 3D genome browser - http://www.3dgenome.org

On the other hand, the whole region is syntenic between human and mouse, and in mouse NPCs there is a TAD boundary separating Ddx31 and Gfi1b. Perhaps someone could use CRISPR to delete that domain boundary in a mouse or an mESC line and see if that causes overexpression of Gfi1b.

In the second half of the paper, the authors look at GFI1, which is a paralogue of GFI1B, and find that GFI1 is also upregulated in a subset of medulloblastomas, and that this can be accompanied by various different chromosomal rearrangements juxtaposing GFI1 with active enhancers. One of the interesting things about these two genes is that they are both marked by H3K27me3 in unaffected samples.
This means that their expression is likely to be repressed by Polycomb proteins. An interesting possibility, then, is that positioning of these genes near active enhancers is actually clearing Polycomb proteins from the GFI1/GFI1B promoters, which has been suggested as a mechanism of enhancer action in other systems [2].

Fig 6. Model for enhancer hijacking in medulloblastoma. Genomic rearrangements juxtapose the GFI1B or GFI1 genes with either local or distal enhancer clusters, repressive H3K27me3 marks are lost from the respective gene promoters and GFI1/GFI1B are ectopically activated in inappropriate tissues. Adapted by permission from Macmillan Publishers Ltd: Nature 511, 428–434 (2014), copyright (2014).

I think the idea of enhancer hijacking (or enhancer adoption, as it's also been called) is a very interesting one. The challenge remains to predict which enhancers might cause gene dysregulation when they are involved in genomic rearrangements, and crucially to determine which genes they are likely to target.

[1] Orlando, D. A. et al. Quantitative ChIP-Seq Normalization Reveals Global Modulation of the Epigenome. Cell Rep. 9, 1163–1170 (2014).
[2] Vernimmen, D. et al. Polycomb eviction as a new distant enhancer function. Genes Dev. 25, 1583–8 (2011).
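The "same topological domain" question discussed in this post boils down to an interval lookup: given a list of called domain boundaries, do a gene and an enhancer fall between the same pair of boundaries? Here is a minimal Python sketch of that check. The coordinates are toy values chosen for illustration (only the 370 kb separation comes from the text), and the function names are hypothetical, not taken from any Hi-C toolkit.

```python
from bisect import bisect_right


def domain_index(boundaries, position):
    """Return the index of the domain containing `position`.

    `boundaries` is a sorted list of boundary coordinates on a single
    chromosome. Positions left of the first boundary are in domain 0,
    positions between the first and second boundary are in domain 1, etc.
    """
    return bisect_right(boundaries, position)


def same_domain(boundaries, pos_a, pos_b):
    """True if both positions fall between the same pair of boundaries."""
    return domain_index(boundaries, pos_a) == domain_index(boundaries, pos_b)


# Toy coordinates (hypothetical): a gene and an enhancer 370 kb apart.
gene = 1_000_000
enhancer = gene + 370_000

# Two hypothetical boundary calls: one without and one with a boundary
# lying between the gene and the enhancer.
no_boundary_between = [500_000, 2_000_000]
boundary_between = [500_000, 1_200_000, 2_000_000]

print(same_domain(no_boundary_between, gene, enhancer))  # True
print(same_domain(boundary_between, gene, enhancer))     # False
```

With real data, the boundary list would come from whichever domain caller was run on the Hi-C dataset, and the answer can differ between cell types, as the hESC and mouse NPC comparison suggests.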

Enhancer Journal Club: Genome-wide identification and characterization of functional neuronal activity–dependent enhancers

By: Rob Beagrie

I noticed a nice paper this week in Nature Neuroscience from Michael Greenberg's lab at Harvard Medical School. The paper discusses the identification of enhancers in neurons which are activated (or deactivated) when those neurons are stimulated by membrane depolarization. The original paper can be found at: http://www.nature.com/neuro/journal/vaop/ncurrent/full/nn.3808.html

The abstract is as follows:

Experience-dependent gene transcription is required for nervous system development and function. However, the DNA regulatory elements that control this program of gene expression are not well defined. Here we characterize the enhancers that function across the genome to mediate activity-dependent transcription in mouse cortical neurons. We find that the subset of enhancers enriched for monomethylation of histone H3 Lys4 (H3K4me1) and binding of the transcriptional coactivator CREBBP (also called CBP) that shows increased acetylation of histone H3 Lys27 (H3K27ac) after membrane depolarization of cortical neurons functions to regulate activity-dependent transcription. A subset of these enhancers appears to require binding of FOS, which was previously thought to bind primarily to promoters. These findings suggest that FOS functions at enhancers to control activity-dependent gene programs that are critical for nervous system function and provide a resource of functional cis-regulatory elements that may give insight into the genetic variants that contribute to brain development and disease.

It has been known for a while that neurons will respond to stimulation by activating the transcription of certain genes - this is known as "activity–dependent transcription". Whilst the signalling pathway leading from changes in membrane potential to the transcription of early-response genes has been the focus of a few studies, the way in which these few early-response genes activate a broader transcriptional program is not well understood.
The authors postulated that the early-response genes might lead to activation of enhancers, which lie upstream of the broader transcriptional changes in stimulated cells.

They first set out to identify enhancers which are activated or deactivated after membrane depolarization by KCl treatment of cortical neurons. They performed ChIP-seq of H3K27ac, H3K27me3, H3K4me1, H3K4me3, CBP and RNA Pol II before and after KCl addition, plus RNA-seq before depolarization and at 1 and 6 hours afterwards. They identified putative enhancers as CBP/H3K4me1-enriched sites >1 kb from an annotated TSS, and found that 1468 of them (12%) showed at least a twofold increase in H3K27ac after depolarization; they call these "neuronal activity–regulated enhancers". In contrast, they only found 738 sites that showed at least a twofold decrease. They nicely validated the neuronal activity–regulated enhancers by showing that 14/14 tested regions showed greater activity in a luciferase assay after depolarization. In contrast, no putative enhancers with constant or decreased H3K27ac after KCl treatment showed higher activity in the luciferase assay following the same treatment.

So these putative enhancer regions can activate luciferase in response to membrane depolarization, but the question remains whether they activate endogenous genes in vivo. To answer this, the authors asked whether the nearest genes to neuronal activity–regulated enhancers also showed increases in transcription on neuronal activation. Whilst they do show a significant increase in the expression level of these genes, the increase is not that large. This could be because many of the enhancer regions don't regulate the nearest gene (either regulating a more distal gene or having no targets) or because the increase in expression in each target gene is small. Presenting the data as a bean or box plot of fold changes, rather than a bar chart of mean expression level, could have gone some way to answering this question.
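The twofold H3K27ac criterion described above is easy to express in code. The following is a minimal Python sketch, not the authors' pipeline: the signal values are invented toy numbers, the function name is hypothetical, and the pseudocount is my own assumption, added only to avoid dividing by zero at sites with no signal before treatment.

```python
from collections import Counter


def classify_enhancer(before, after, fold=2.0, pseudocount=1.0):
    """Classify a putative enhancer by its change in H3K27ac signal.

    `before` and `after` are normalised signal values at the same site.
    The pseudocount is an assumption of this sketch, not something
    described in the paper.
    """
    ratio = (after + pseudocount) / (before + pseudocount)
    if ratio >= fold:
        return "increased"
    if ratio <= 1.0 / fold:
        return "decreased"
    return "unchanged"


# Toy (hypothetical) signal values: site name -> (before KCl, after KCl)
sites = {
    "site_1": (10.0, 30.0),
    "site_2": (40.0, 10.0),
    "site_3": (20.0, 25.0),
}

calls = {name: classify_enhancer(b, a) for name, (b, a) in sites.items()}
print(calls)
print(Counter(calls.values()))
```

Sites called "increased" here would correspond to the "neuronal activity–regulated enhancers" of the paper, and sites called "decreased" to the smaller set that loses H3K27ac.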
The next obvious question is which factors might be responsible for activating the neuronal activity–regulated enhancers. The authors performed a motif enrichment analysis and found that the AP-1 motif, which is normally bound by FOS- and JUN-family proteins, is the most highly enriched. This is nice as several of these transcription factors are encoded by known early-response genes, but on the other hand the AP-1 motif is normally thought of as a promoter motif and not as a component of distal enhancers. ChIP-seq for FOS protein showed that 96% of the 12,594 identified FOS-bound sites were distal, not at promoters. Whilst this strongly implicates FOS in the activation of these enhancers, only 42% of neuronal activity–regulated enhancers were directly bound by FOS. That would seem to indicate that FOS is neither necessary nor sufficient for enhancer activation in a large number of cases.

The paper does a good job of showing that FOS is required for those enhancers where it binds, however. In a panel of eight such enhancers tested in luciferase assays, all eight showed reduced activity if the AP-1 binding site was mutated by a single base pair, or when cells were treated with an shRNA against FOS. The authors extend this approach genome-wide by looking for genes which are upregulated in response to membrane depolarization, but which show at least 33% lower induction in the presence of FOS shRNA. Interestingly, only 53 of the 187 genes identified by this approach had a FOS-bound enhancer within 100 kb. This could indicate that many of the genes sensitive to FOS shRNA are indirect targets. Alternatively, since only one shRNA is used, these genes could be direct off-targets of the shRNA or could lie downstream of off-target genes. One interesting final possibility is that many of the activity-regulated enhancers act over distances of >100 kb. Nevertheless, they test 14 of their 53 putative direct FOS targets in the visual cortex of dark-housed mice exposed to light.
Ten of the 14 tested genes showed induction under these conditions, validating that these genes can respond to neuronal stimulus in the intact brain.

Overall, I think it's very interesting that H3K27ac can change quite dramatically after only 2 hours of depolarization with KCl, and that many of the new peaks which appear do seem to mark functional enhancers. One aspect that could be interesting to explore is the peaks which are lost after depolarization. The authors do perform motif analysis for these peaks, which identified candidate TFs like Atoh1 and SRF, but a deeper investigation of how enhancer repression might be involved in the response to neuronal stimulation could be quite revealing.
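The 100 kb window analysis discussed in this post (53 of 187 FOS-sensitive genes, roughly 28%, had a FOS-bound enhancer nearby) can be sketched as a simple distance filter. This is a minimal Python illustration under stated assumptions: gene names, coordinates and the helper name are all hypothetical, and distance is measured from a single TSS coordinate to a single coordinate per FOS-bound site.

```python
def has_site_within(tss, sites, window=100_000):
    """True if any site lies within `window` bp of the TSS (either side)."""
    return any(abs(site - tss) <= window for site in sites)


# Toy (hypothetical) data: gene -> (chromosome, TSS position)
genes = {
    "geneA": ("chr1", 500_000),
    "geneB": ("chr1", 2_000_000),
    "geneC": ("chr2", 300_000),
}

# Toy (hypothetical) FOS-bound enhancer positions per chromosome
fos_sites = {
    "chr1": [560_000, 1_500_000],
    "chr2": [900_000],
}

# Genes with at least one FOS-bound enhancer inside the window
putative_direct = [
    name
    for name, (chrom, tss) in genes.items()
    if has_site_within(tss, fos_sites.get(chrom, []))
]
print(putative_direct)  # ['geneA']

# Fraction reported in the paper: 53 of 187 genes
print(f"{53 / 187:.0%}")  # 28%
```

Widening `window` is a quick way to explore the last possibility raised above, that many activity-regulated enhancers act over distances greater than 100 kb.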

Enhancer Journal Club: Global view of enhancer–promoter interactome in human cells

By: Rob Beagrie

One of the key hypotheses driving interest in enhancer biology right now is that mutations in enhancer sequences may cause medically relevant changes in gene expression (e.g. in cancer). There has been a great deal of progress in recent years towards high-throughput identification of enhancers, genome sequencing and measurement of gene expression (e.g. by RNA-seq). In order to show a correlation between sequence variation in enhancers and gene expression, the missing piece of the puzzle is a way to link predicted enhancers with their target promoters. The paper I'll be discussing this month contains a nice summary of the currently used methods, as well as some fairly rigorous comparisons between the possible approaches.

The article is called "Global view of enhancer–promoter interactome in human cells" and comes from the Tan lab at the University of Iowa. The original paper can be found at <http://www.pnas.org/content/111/21/E2191.long>.

The abstract for the paper is as follows:

Enhancer mapping has been greatly facilitated by various genomic marks associated with it. However, little is available in our toolbox to link enhancers with their target promoters, hampering mechanistic understanding of enhancer–promoter (EP) interaction. We develop and characterize multiple genomic features for distinguishing true EP pairs from noninteracting pairs. We integrate these features into a probabilistic predictor for EP interactions. Multiple validation experiments demonstrate a significant improvement over state-of-the-art approaches.
Systematic analyses of EP interactions across 12 cell types reveal several global features of EP interactions: (i) a larger fraction of EP interactions are cell type specific than enhancers; (ii) promoters controlled by multiple enhancers have higher tissue specificity, but the regulating enhancers are less conserved; (iii) cohesin plays a role in mediating tissue-specific EP interactions via chromatin looping in a CTCF-independent manner. Our approach presents a systematic and effective strategy to decipher the mechanisms underlying EP communication.

The first step in their analysis is to identify promoters and enhancers in their 12 cell types. For promoters, they use the GENCODE annotation of transcripts, and for enhancers they use their own previously published method (CSI-ANN) to detect enhancers based on H3K4me1, H3K4me3 and H3K27ac. Once they have their enhancer/promoter lists they identify "real" pairs using ChIA-PET data. The next step is to describe features of the enhancer/promoter (E/P) pairs that distinguish "real" pairs from random pairs. They use the following four characteristics:

  • In real E/P pairs, the ChIP signal of the enhancer correlates with transcription at the promoter across the 12 cell types.
  • In real E/P pairs, the expression of transcription factors that bind to the enhancer correlates with transcription at the promoter.
  • Real E/P pairs show higher co-evolution of DNA sequence, or higher conservation of their relative position (synteny).
  • Real E/P pairs tend to be closer to each other.

For each of the characteristics, they check for statistically robust differences between the real pairs and random non-pairs, then combine the four measures using a random-forest classifier. This forms the basis of their approach, which they call IM-PET. They compare IM-PET to the "nearest-promoter" approach (which is still what most people use in a pinch) and to three other published algorithms.
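To make the first of the four characteristics concrete, here is a minimal Python sketch that scores an enhancer/promoter pair by correlating enhancer ChIP signal with promoter expression across cell types. I am assuming a plain Pearson correlation for illustration; the paper's exact statistic may differ. The values are toy numbers for four cell types rather than the twelve used in the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy (hypothetical) values, one entry per cell type
enhancer_signal = [1.0, 2.0, 3.0, 4.0]
promoter_a_expression = [2.0, 4.0, 6.0, 8.0]   # tracks the enhancer
promoter_b_expression = [5.0, 1.0, 4.0, 2.0]   # does not track it

score_a = pearson(enhancer_signal, promoter_a_expression)
score_b = pearson(enhancer_signal, promoter_b_expression)
print(score_a, score_b)
```

A score like this would be one input feature for a pair; in IM-PET it is combined with the other three characteristics by the random-forest classifier rather than used on its own.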
The ROC curves are relatively impressive when they use the original ChIA-PET data from K562 and MCF7 cells for validation, but the approaches actually appear to perform more similarly when newer ChIA-PET or high-resolution Hi-C datasets are used.

Whilst their analyses do seem to support their method as the best compromise between precision and recall, I wonder whether their "nearest promoter" comparison is fair. It would have been better to use the nearest active promoter as a control, as it seems a reasonable assumption that an inactive gene cannot be the target of an active enhancer. Even using this comparison, their approach would likely compare favourably, as "nearest x" can only ever predict one target for each enhancer and will therefore have lots of false negatives. I think the most interesting comparison would have been against the assumption that enhancers contact all active promoters within a certain distance or, even better, all active promoters within the same topological domain. Another way of putting this question is: how often does IM-PET predict that an active promoter is not a target for a nearby enhancer? I can't see any examples of this in the UCSC tracks from the paper, at least.
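The three baselines compared in the paragraph above (nearest promoter, nearest active promoter, and all active promoters within a fixed distance) are simple enough to sketch in a few lines of Python. Everything here is illustrative: the gene names, coordinates, activity flags and the 100 kb cut-off are hypothetical, and none of this code comes from IM-PET.

```python
def nearest_promoter(enhancer, promoters):
    """Baseline 1: the single closest promoter, active or not."""
    return min(promoters, key=lambda p: abs(p["tss"] - enhancer))


def nearest_active_promoter(enhancer, promoters):
    """Baseline 2: the single closest promoter of an expressed gene."""
    active = [p for p in promoters if p["active"]]
    if not active:
        return None
    return min(active, key=lambda p: abs(p["tss"] - enhancer))


def active_promoters_within(enhancer, promoters, max_distance):
    """Baseline 3: every active promoter within `max_distance` bp."""
    return [
        p for p in promoters
        if p["active"] and abs(p["tss"] - enhancer) <= max_distance
    ]


# Toy (hypothetical) locus: one enhancer and three promoters
enhancer_pos = 100_000
promoters = [
    {"gene": "geneA", "tss": 90_000, "active": False},
    {"gene": "geneB", "tss": 150_000, "active": True},
    {"gene": "geneC", "tss": 40_000, "active": True},
]

print(nearest_promoter(enhancer_pos, promoters)["gene"])         # geneA
print(nearest_active_promoter(enhancer_pos, promoters)["gene"])  # geneB
print([p["gene"] for p in
       active_promoters_within(enhancer_pos, promoters, 100_000)])
# ['geneB', 'geneC']
```

The toy locus shows why the choice of baseline matters: the plain nearest-promoter rule picks an inactive gene, while the distance-based rule is the only one that can return more than one target per enhancer.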



Follow me on twitter.

Email me at rob{at}beagrie.com

Rob Beagrie

MRC Clinical Sciences Centre

Hammersmith Hospital


W12 0NN