Dr Patrick Pflughaupt completed his DPhil as part of the Sahakyan Group: Integrative Computational Biology and Machine Learning at the University of Oxford, focusing on the influence of genomic sequence on DNA fragility.

After graduating with a BSc in Chemistry from University College London (UCL), Dr Pflughaupt began exploring computational approaches to biological research. During his MRes in Molecular Modelling and Materials Science at UCL, he studied protein dynamics simulations under Professor Francesco Gervasio. There, he used unbiased molecular dynamics simulations and enhanced sampling techniques (funnel metadynamics) to investigate the local interaction dynamics and binding free energies between the HIV-1 gp41 peptide and the monoclonal antibody 2F5. This work identified potential mutation sites for improving their binding interactions. Having gained a taste for computational modelling of proteins, he turned his attention to the sequence that encodes them: DNA.

Dr Pflughaupt was then awarded DPhil scholarships (Clarendon Fund, Radcliffe Department of Medicine, Medical Research Council, and Hertford College) to pursue the DPhil programme under the joint supervision of Dr Aleksandr Sahakyan and Professor Peter McHugh in the Centre of Computational Biology, Weatherall Institute of Molecular Medicine (WIMM), University of Oxford. He began his DPhil by exploring a longstanding, unexplained phenomenon called Chargaff’s second parity rule (PR-2). PR-2 describes how the fraction of bases or DNA k-mers on a single DNA strand matches the fraction of the complementary bases or k-mers on the very same strand. He tackled this problem with a modern scientific toolkit that includes symbolic regression and machine learning techniques. With no prior assumptions, he performed a series of simulations starting from random, independent mutation rate constants to study genome evolution over timescales within the age of life on Earth. By comparing simulations that produced PR-2-compliant genomes against those that did not, he arrived at a set of more general relationships between mutation rates that govern PR-2 compliance. Dr Pflughaupt demonstrated that these newly found constraints are satisfied by all species and genomes previously studied in the literature, even those genomes that do not meet the more stringent “no-strand-bias” assumption of prior works. This work culminated in his first first-author publication, in Nucleic Acids Research.
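
As an illustration of what PR-2 compliance means in practice, the short Python sketch below measures how far a single strand deviates from the rule. It is not taken from Dr Pflughaupt's work: the function names are hypothetical, and it assumes that the "complementary k-mer" is the reverse complement read from the same strand, which is the usual convention for k greater than 1.

```python
from collections import Counter

# Watson-Crick pairing table; characters outside ACGT are left unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of a DNA k-mer."""
    return kmer.translate(COMPLEMENT)[::-1]


def kmer_fractions(seq: str, k: int) -> dict:
    """Fraction of each overlapping k-mer observed on a single strand."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def pr2_deviation(seq: str, k: int = 1) -> float:
    """Largest absolute gap between the fraction of a k-mer and the
    fraction of its reverse complement on the same strand.
    A value near 0 indicates PR-2 compliance (hypothetical metric)."""
    fractions = kmer_fractions(seq.upper(), k)
    return max(
        (abs(f - fractions.get(reverse_complement(kmer), 0.0))
         for kmer, f in fractions.items()),
        default=0.0,  # sequence shorter than k: nothing to compare
    )
```

For example, `pr2_deviation("ACGT", k=1)` gives 0.0 because A matches T and C matches G in frequency, whereas a strand made only of A gives 1.0. Real genomes would be read from a FASTA file and compared at several values of k.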

The major part of Dr Pflughaupt’s DPhil thesis investigated the sequence-based influences on DNA breakpoints under various DNA breakage conditions, encompassing physiological, pathological and spontaneous processes. This effort involved processing and summarising all such DNA fragility datasets to distil the underlying sequence dependence and, in turn, to understand the commonalities and differences between these processes. He identified that the DNA sequence context can be separated into three ranges influencing the formation of a breakpoint. He subsequently summarised all the results into a feature library (DNAfrAIlib) that is easily portable to any sequence-driven machine learning project, making such projects automatically aware of DNA fragility. The model offered novel insights into the effects of structural variants, chromothripsis, and viral integration into the human genome. As part of this project, Dr Pflughaupt also contributed to the development of quantum mechanical (QM) electronic and geometric parameters for DNA k-mers as features for machine learning, utilising them in the DNA fragility-aware modelling process. This QM-based feature library is published in Scientific Data. Some of his work was also used to understand the structure-driven effects on genomic DNA damage propensity at G-quadruplex sites. Dr Pflughaupt’s work on DNA fragility is now published, with him as first author, in Nucleic Acids Research. It was the basis for his selection as a finalist in the 2023 MRC Weatherall Institute of Molecular Medicine (WIMM) Student Presentation and for his subsequent presentation of the work at the MRC WIMM Day.

Dr Pflughaupt then applied his understanding of sequence-driven DNA breakage probabilities to improve de novo genome assembly algorithms. His work allows for improved de novo assembly of genomes from shorter-than-usual DNA fragments, which is relevant to research on cell-free DNA and on highly fragmented ancient or forensic DNA. This proof-of-concept work, currently at the revision stage with BMC Bioinformatics, shows promise for integration into future large-scale de novo genome assembly projects.

Dr Pflughaupt’s DPhil produced high-impact publications, which he also presented at conferences around the world (EMBL: AI and Biology; Cold Spring Harbor: Biology of Genomes; Cold Spring Harbor: Genome Informatics; Cold Spring Harbor: DNA Metabolism, Genomic Stability, and Human Disease; RECOMB/ISCB: Conference on Regulatory and Systems Genomics; 15th International Conference on Bioinformatics Models, Methods, and Algorithms). He also collaborated with Adib Abdullah from the Sahakyan Lab to develop a multi-purpose and flexible k-meric enrichment analysis software package, which is now available on CRAN, the official R package repository, and is under review as a stand-alone publication. Dr Pflughaupt has also served as a peer reviewer for Nucleic Acids Research. He hopes to continue contributing to the field of computational biology.