Hughes Group: Genome Biology
A multidisciplinary team using genomics, genome engineering, synthetic biology, bioinformatics and machine learning to understand gene regulation in health and disease.
The biological sciences have undergone a technical revolution in the 21st century. We can now read an organism's genomic blueprint in its entirety and edit it with precision and speed. These abilities are driven by continual technological advances that generate huge amounts of data. Modern biology is therefore as dependent on advanced computational technologies as it is on molecular and cell biology techniques.
A fundamental question in biology is how specific parts of this blueprint are activated and used in particular cells or situations when the genomic sequence is the same in every cell in the body.
The core activity of the genome is the RNA it produces, or "expresses", from genes as messenger RNA (mRNA), which acts as a template for protein synthesis. It is now known that a class of elements in the genome, often called "regulatory elements" or "enhancers", controls gene activity in different cells. Studying these elements can be very challenging: they are active only in the cells where they are needed, they are distributed unpredictably, and they often lie far from the genes they control. A major bottleneck has therefore been linking a gene to its regulatory elements in a given cell type or circumstance.
Solving this problem is important not only for understanding the basic biology but also because it has profound implications for human health. Large-scale genetic studies (Genome-Wide Association Studies, or GWAS) have shown that the sequence variations in our genomes that predispose us to common human diseases more often affect regulatory elements than the genes themselves. Genetic changes in these elements are therefore likely to be a poorly diagnosed cause of heritable diseases.
The Hughes group is expert in a wide range of genomics methods and technologies that address different aspects of this question. These include RNA-seq methods, which show whether a gene is expressed and at what level; DNase-seq and ATAC-seq, which generate maps of all of the active elements in the genome; and ChIP-seq, which assesses which types of proteins or chemical modifications are found at these elements and so indicates their likely function. The group also employs genome editing, synthetic biology and single-cell epigenomics approaches to sample and test both fundamental and disease-specific aspects of genome biology.
Where suitable approaches are not available, the group develops novel methods such as the Capture-C approaches, which allow thousands of genes to be linked to their regulatory elements in a single assay. Capture-C is the only high-resolution Chromatin Conformation Capture (3C) method that multiplexes both viewpoints and cell samples in one experiment, and so maps the interactions between genes and regulatory elements with high throughput, sensitivity and statistical rigour.
Because these technologies generate huge amounts of data, the group has an equally large computational component, with many members expert in both experimental and computational work. The group has developed machine learning approaches such as deepC (https://github.com/rschwess/deepC) to understand which elements dictate the regulatory interactions in the human genome and to predict the effect of sequence variation at base-pair resolution. It also develops new approaches to extract meaningful data from these rich datasets (https://lanceotron.molbiol.ox.ac.uk/) and to allow users to interact with and integrate the multidimensional datasets they produce (https://mlv.molbiol.ox.ac.uk/).