
The Herald of the Siberian State University of Telecommunications and Information Science


GPU-based algorithm for context analysis of the core promoter region of mouse genes differently expressed in hypothalamic energy-sensing neurons in response to weight-loss

Abstract

De novo motif discovery in the regulatory regions of eukaryotic genes is a hard computational problem because the datasets are large and the motifs are highly diverse. This article proposes a new algorithm for measuring the occurrence of degenerate oligonucleotide motifs, written in the 15-letter IUPAC code, in a DNA dataset. It runs 10 times faster than the previous version. The method rests on three key ingredients. The first is prefix trees. The second is the relation between motif prefixes and hash ranges in the analyzed nucleotide sequences. The third is the use of the CUDA framework for massive parallelization, which makes it possible to run the search on affordable graphics accelerators.
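To make the terms in the abstract concrete, here is a minimal CPU-side sketch in Python. It is not the authors' implementation, and it does not use CUDA. It shows how a degenerate motif in the 15-letter IUPAC code can be tested against a plain k-mer with bit masks, and one possible reading of the prefix and hash-range relation: if each k-mer is hashed with 2 bits per base, all k-mers sharing a plain prefix fall into one contiguous hash range. The function names and the 2-bit hash are assumptions made for illustration only.

```python
# Illustrative sketch only; names and the 2-bit hashing scheme are assumptions,
# not taken from the article.

# 15-letter IUPAC nucleotide code as 4-bit masks (A=1, C=2, G=4, T=8).
IUPAC = {
    "A": 1, "C": 2, "G": 4, "T": 8,
    "R": 1 | 4, "Y": 2 | 8, "S": 2 | 4, "W": 1 | 8,
    "K": 4 | 8, "M": 1 | 2,
    "B": 2 | 4 | 8, "D": 1 | 4 | 8, "H": 1 | 2 | 8, "V": 1 | 2 | 4,
    "N": 15,
}
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}


def motif_matches(motif, kmer):
    """True if each base of kmer is allowed by the IUPAC letter at that position."""
    return len(motif) == len(kmer) and all(
        IUPAC[m] & IUPAC[b] for m, b in zip(motif, kmer)
    )


def kmer_hash(kmer):
    """Pack a plain A/C/G/T k-mer into an integer, 2 bits per base,
    with the first base in the most significant position."""
    h = 0
    for b in kmer:
        h = h * 4 + BASE_INDEX[b]
    return h


def prefix_hash_range(prefix, k):
    """Half-open range [lo, hi) holding the hashes of all k-mers
    that start with the given plain A/C/G/T prefix."""
    span = 4 ** (k - len(prefix))
    lo = kmer_hash(prefix) * span
    return lo, lo + span


def count_sequences_with_motif(motif, sequences):
    """Number of sequences with at least one window matching the motif."""
    k = len(motif)
    return sum(
        any(motif_matches(motif, s[i:i + k]) for i in range(len(s) - k + 1))
        for s in sequences
    )


if __name__ == "__main__":
    print(motif_matches("TATAWR", "TATAAA"))
    print(prefix_hash_range("AC", 4), kmer_hash("ACGT"))
    print(count_sequences_with_motif("WGATAR", ["CCAGATAACC", "TTTTTTTT", "TGATAG"]))
```

Under this reading, a prefix tree over motifs would let a search discard a whole subtree of motifs once the hash range of their shared prefix is known to be empty in the data, and the independent per-motif checks are the part that lends itself to GPU parallelization.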

The proposed method was used for context analysis of the promoter regions of mouse genes differentially expressed (DEGs) in hypothalamic AGRP neurons after food deprivation. When an animal is deprived of food, AGRP neurons produce molecules that increase appetite and stimulate weight gain. Understanding how AGRP neurons respond to weight loss is important for combating obesity. At present, this hereditary disease lacks treatments and intervention strategies that are both safe and effective in the long term. The analysis revealed oligonucleotide motifs associated with starvation.

About the Authors

A. Bocharnikov
A. P. Ershov Institute of Informatics Systems, Siberian Branch of the Russian Academy of Sciences
Russian Federation


E. Ignatieva
Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University
Russian Federation


O. Vishnevskiy
Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University
Russian Federation




For citations:


Bocharnikov A., Ignatieva E., Vishnevskiy O. GPU-based algorithm for context analysis of the core promoter region of mouse genes differently expressed in hypothalamic energy-sensing neurons in response to weight-loss. The Herald of the Siberian State University of Telecommunications and Information Science. 2019;(3):36-44. (In Russ.)



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1998-6920 (Print)