Inference genové exprese pomocí umělých neuronových sítí

Kunc, Vladimír

Gene expression inference using artificial neural networks

Authors

Kunc, Vladimír

Supervisors

Kléma, Jiří

Reviewers

Berka, Petr

Publisher

České vysoké učení technické v Praze
Czech Technical University in Prague

Files

Full Text (14.78 MB)

Abstract

Gene expression profiling is necessary for understanding cellular processes and states under different experimental conditions, which is needed in various fields of biomedical research. Despite significant progress in gene expression profiling, large-scale genome-wide profiling remains expensive and challenging. The introduction of the L1000 microarray platform made such studies significantly cheaper by measuring the expression of only a few landmark genes and using computational models to infer the expression levels of the remaining genes. Initially, linear regression models were used, but they were soon replaced by neural network (NN) models such as D–GEX, which are better suited to modeling the complex nonlinear relationships among the expressions of individual genes. This thesis introduces significant enhancements to the original D–GEX model, primarily transformative adaptive activation functions (TAAFs), a novel class of adaptive activation functions. The TAAFs introduce four adaptive parameters that allow arbitrary horizontal and vertical scaling and translation of any inner activation function. The TAAFs improve the performance of NNs for gene expression inference and also add some robustness to the choice of activation function. The performance of NNs with TAAFs is shown on the task of inferring gene expression from the expressions of the landmark genes of the L1000 microarray platform, and on several artificially generated datasets that demonstrate their applicability outside the omics domain. The improvements in gene expression inference also translate to improvements in subsequent analyses, which supports the practical value of the TAAFs. A second important enhancement to the original NNs used for gene expression inference is the introduction of tower and checkerboard architectures, which further improve the NNs with TAAFs and reach even better performance. This improvement also carries over to subsequent analyses of the inferred data, showing a statistically significant gain in inferred data quality.
Although these improvements were demonstrated mainly on the gene expression inference task, their scope is not confined to omics, as they are transferable to many other NN applications. Because the TAAFs generalize various activation functions already proposed in the literature and used for a range of tasks, they are suitable beyond gene expression inference as well. Additionally, this work provides an extensive list of activation functions, serving as a reference to streamline future research and prevent redundant proposals of activation functions already present in the literature.
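The abstract describes a TAAF only in words: four adaptive parameters that scale and translate an inner activation function horizontally and vertically. The minimal sketch below illustrates one way to read that description. It assumes the parametrization alpha * f(beta * x + gamma) + delta. The formula, the function name `taaf`, and the parameter names are assumptions made for illustration and are not taken from this record. In a network, the four parameters would be trained together with the weights; here they are plain arguments.

```python
import math
from typing import Callable


def taaf(
    x: float,
    inner: Callable[[float], float],
    alpha: float = 1.0,  # vertical scaling (assumed name)
    beta: float = 1.0,   # horizontal scaling (assumed name)
    gamma: float = 0.0,  # horizontal translation (assumed name)
    delta: float = 0.0,  # vertical translation (assumed name)
) -> float:
    """Apply an assumed TAAF form: alpha * inner(beta * x + gamma) + delta."""
    return alpha * inner(beta * x + gamma) + delta


def relu(z: float) -> float:
    """Plain ReLU, used here as one possible inner activation function."""
    return max(0.0, z)


# With the default parameters the wrapper reduces to the inner function.
plain_tanh = taaf(0.3, math.tanh)

# Non-default parameters rescale and shift the inner function.
shifted_relu = taaf(2.0, relu, alpha=0.5, beta=2.0, gamma=-1.0, delta=0.25)
```

Under this assumed form, the default parameter values leave the inner function unchanged, so a network with such units can start from a standard activation and adjust it during training.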

Keywords

adaptive activation functions, deep learning, neural networks, gene expression inference, tower architecture, checkerboard architecture, transformative adaptive activation functions, L1000

Permanent link

http://hdl.handle.net/10467/114140

Rights/License

A university thesis is a work protected by the Copyright Act of the Czech Republic. Extracts, copies, and transcripts of the thesis may be made for personal use only and at one's own expense. Use of the thesis must comply with the Copyright Act, as amended.

Collections

Doctoral Theses - 13000
