Abstract: Techniques are provided for predicting DNA accessibility. DNase-seq data files and RNA-seq data files for a plurality of cell types are paired by assigning DNase-seq data files to RNA-seq data files that are at least within a same biotype. A neural network is configured to be trained using batches of the paired data files, where configuring the neural network comprises configuring convolutional layers to process a first input comprising DNA sequence data from a paired data file to generate a convolved output, and fully connected layers following the convolutional layers to concatenate the convolved output with a second input comprising gene expression levels derived from RNA-seq data from the paired data file and process the concatenation to generate a DNA accessibility prediction output. The trained neural network is used to predict DNA accessibility in a genomic sample input comprising RNA-seq data and whole genome sequencing for a new cell type.
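The two-input architecture described in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. The abstract specifies only convolutional layers over the DNA sequence, concatenation of the convolved output with gene expression levels, and fully connected layers producing the accessibility prediction. Everything else here is an assumption made for illustration: the one-hot encoding, ReLU activations, global max pooling, sigmoid output, layer sizes, the random untrained weights, and all function names (`one_hot`, `conv1d_relu`, `predict_accessibility`, and so on).

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); other symbols such as N become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_relu(x, filters, biases):
    """Valid 1D convolution along the sequence, then ReLU.
    x is L x 4, filters is F x K x 4, result is (L - K + 1) x F."""
    k = len(filters[0])
    out = []
    for start in range(len(x) - k + 1):
        row = []
        for f, b in zip(filters, biases):
            s = b
            for j in range(k):
                for c in range(4):
                    s += f[j][c] * x[start + j][c]
            row.append(max(0.0, s))
        out.append(row)
    return out

def global_max_pool(feature_map):
    """Keep each filter's maximum activation over all positions (assumed pooling choice)."""
    return [max(col) for col in zip(*feature_map)]

def dense(vec, weights, biases):
    """Fully connected layer; weights is out_dim x in_dim."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, biases)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_params(n_filters, kernel, n_genes, hidden, seed=0):
    """Random, untrained weights; sizes are hypothetical."""
    rng = random.Random(seed)
    r = lambda: rng.uniform(-0.1, 0.1)
    return {
        "conv_w": [[[r() for _ in range(4)] for _ in range(kernel)] for _ in range(n_filters)],
        "conv_b": [0.0] * n_filters,
        "fc1_w": [[r() for _ in range(n_filters + n_genes)] for _ in range(hidden)],
        "fc1_b": [0.0] * hidden,
        "fc2_w": [[r() for _ in range(hidden)]],
        "fc2_b": [0.0],
    }

def predict_accessibility(seq, expression, p):
    """First input: DNA sequence. Second input: gene expression levels."""
    convolved = global_max_pool(conv1d_relu(one_hot(seq), p["conv_w"], p["conv_b"]))
    joined = convolved + list(expression)  # concatenate convolved output with expression
    hidden = [max(0.0, h) for h in dense(joined, p["fc1_w"], p["fc1_b"])]
    return sigmoid(dense(hidden, p["fc2_w"], p["fc2_b"])[0])

params = init_params(n_filters=8, kernel=5, n_genes=3, hidden=4)
score = predict_accessibility("ACGTACGTACGTAC", [0.5, 1.2, 0.0], params)
```

Here `score` is a value between 0 and 1 that would be read as an accessibility probability once the weights are trained. In the setup the abstract describes, training would use batches of paired DNase-seq and RNA-seq data, which this sketch does not cover.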
Type: Grant
Filed: November 20, 2017
Date of Patent: November 5, 2019
Assignees: Nant Holdings IP, LLC, NantOmics, LLC
Inventors: Kamil Wnuk, Jeremi Sudol, Shahrooz Rabizadeh, Patrick Soon-Shiong, Christopher Szeto, Charles Vaske