Sangam: A Confluence of Knowledge Streams

Combining DNA Methylation with Deep Learning Improves Sensitivity and Accuracy of Eukaryotic Genome Annotation


dc.contributor Dalkilic, Mehmet
dc.creator Zynda, Gregory J.
dc.date 2020-04-28T19:23:14Z
dc.date 2020-04
dc.date.accessioned 2023-02-24T18:22:05Z
dc.date.available 2023-02-24T18:22:05Z
dc.identifier http://hdl.handle.net/2022/25389
dc.identifier.uri http://localhost:8080/xmlui/handle/CUHPOERS/260033
dc.description Thesis (Ph.D.) - Indiana University, School of Informatics, Computing, and Engineering, 2020
dc.description The genome assembly process has significantly decreased in computational complexity since the advent of third-generation long-read technologies. However, genome annotations still require significant manual effort from scientists to produce the trustworthy annotations required for most bioinformatic analyses. Current methods for automatic eukaryotic annotation rely on sequence homology, structure, or repeat detection, and each method requires a separate tool, making the workflow for a final product a complex ensemble. Beyond the nucleotide sequence, one important component of genetic architecture is the presence of epigenetic marks, including DNA methylation. However, no automatic annotation tools currently use this valuable information. As methylation data becomes more widely available from nanopore sequencing technology, tools that take advantage of patterns in this data will be in demand. The goal of this dissertation was to improve the annotation process by developing and training a recurrent neural network (RNN) on trusted annotations to recognize multiple classes of elements from both the reference sequence and DNA methylation. We found that our proposed tool, RNNotate, detected fewer coding elements than GlimmerHMM and Augustus, but those predictions were more often correct. When predicting transposable elements, RNNotate was more accurate than both RepeatMasker and RepeatScout. Additionally, we found that RNNotate was significantly less sensitive when trained and run without DNA methylation, validating our hypothesis. To the best of our knowledge, we are not only the first group to use recurrent neural networks for eukaryotic genome annotation, but we also innovated in the data space by utilizing DNA methylation patterns for prediction.
dc.language en
dc.publisher [Bloomington, Ind.] : Indiana University
dc.subject genome annotation
dc.subject deep learning
dc.subject rnn
dc.subject dna methylation
dc.subject epigenetics
dc.title Combining DNA Methylation with Deep Learning Improves Sensitivity and Accuracy of Eukaryotic Genome Annotation
dc.type Doctoral Dissertation


Files in this item

Files Size Format View
zynda_dissertation_20-04-18.pdf 7.015Mb application/pdf View/Open
