Theses and Dissertations

Issuing Body

Mississippi State University

Advisor

Kurum, Mehmet

Committee Member

Popescu, George

Committee Member

Du, Jenny

Committee Member

Ball, John

Committee Member

Tang, Bo

Date of Degree

12-10-2021

Document Type

Dissertation - Open Access

Major

Electrical and Computer Engineering

Degree Name

Doctor of Philosophy (Ph.D.)

College

James Worth Bagley College of Engineering

Department

Department of Electrical and Computer Engineering

Abstract

RNA (ribonucleic acid) sequencing is a powerful technology that gives researchers essential information about the functionality of genes. Transcriptomic study and downstream analysis highlight the functioning of the genes associated with a specific biological process or treatment. In practice, differentially expressed genes associated with a particular treatment or genotype are subjected to downstream analysis to find a critical set of genes. This critical set of genes or gene pathways indicates the effect of the treatment in a cell or tissue. This dissertation describes a multi-stage framework for finding these critical sets of genes using different analysis methodologies and inference algorithms.

RNA sequencing technology helps to find the differentially expressed genes associated with the treatments and genotypes. The preliminary step of RNA-seq analysis consists of extracting the mRNA (messenger RNA), followed by preparation of the mRNA libraries and sequencing on the Illumina HiSeq 2000 platform. The later-stage analysis starts with mapping the RNA sequencing data (obtained from the previous step) to the genome annotations and counting the reads for each annotated gene to produce the gene expression data. The second step uses statistical methods such as linear model fitting, clustering, and probabilistic graphical modeling to analyze the role of genes and gene networks in treatment responses.
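As a minimal sketch of the linear model fit mentioned above, the following Python example regresses log-transformed read counts on a treatment indicator for each gene. It is an illustration only, not the dissertation's R package: the function name `log2_fold_change`, the gene names, and the read counts are all hypothetical, and a real analysis would also need normalization and significance testing.

```python
import math
from statistics import mean


def log2_fold_change(counts, treated):
    """Fit y = b0 + b1 * x by ordinary least squares and return the slope b1.

    y is log2(count + 1) for each sample; x is 1 for treated samples and 0
    for controls. With a binary x, b1 equals the difference between the
    treated and control group means, i.e. a log2 fold change.
    """
    y = [math.log2(c + 1) for c in counts]
    x = [1.0 if t else 0.0 for t in treated]
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("need both treated and control samples")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical read counts: two control samples, then two treated samples.
treated = [False, False, True, True]
counts = {"geneA": [7, 7, 31, 31], "geneB": [15, 15, 15, 15]}
slopes = {gene: log2_fold_change(c, treated) for gene, c in counts.items()}
```

In this toy input, `geneA` rises from a log2 value of 3 to 5 under treatment, while `geneB` is unchanged, so only `geneA` would be a candidate for downstream analysis.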

In this dissertation, an R software package is developed that compiles all the RNA sequencing steps and the downstream analysis using the R software and Linux environment.

An inference methodology based on loopy belief propagation is then applied to the gene networks to infer the differential expression of genes. The loopy belief propagation algorithm uses a computational modeling framework that takes as input the gene expression data and the transcription factors interacting with the genes. The inference method starts by constructing a gene-transcription factor network using an undirected probabilistic graphical modeling approach. Belief messages are then propagated across all the nodes of the graph.
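The message-passing idea described above can be sketched in Python as sum-product loopy belief propagation on a small undirected pairwise model. This is a generic illustration under stated assumptions, not the dissertation's implementation: each node is assumed to be binary (state 1 read as "differentially expressed"), the pairwise potential is a single hypothetical `coupling` value that favors agreeing neighbors, and the node names and evidence values are invented.

```python
from math import prod


def loopy_bp(unary, edges, coupling=2.0, iterations=50):
    """Sum-product loopy belief propagation on a pairwise binary model.

    unary:    node -> [phi(state 0), phi(state 1)], the local evidence
    edges:    list of undirected (u, v) pairs
    coupling: pairwise potential when two neighbors agree (1.0 otherwise)
    Returns node -> normalized belief [P(state 0), P(state 1)].
    """
    neighbors = {n: [] for n in unary}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    psi = [[coupling, 1.0], [1.0, coupling]]
    # messages[(i, j)] is the message from node i to node j, started uniform.
    messages = {(i, j): [0.5, 0.5] for i in neighbors for j in neighbors[i]}
    for _ in range(iterations):
        updated = {}
        for (i, j) in messages:
            out = []
            for xj in (0, 1):
                total = 0.0
                for xi in (0, 1):
                    # Product of messages into i from every neighbor except j.
                    incoming = prod(
                        messages[(k, i)][xi] for k in neighbors[i] if k != j
                    )
                    total += unary[i][xi] * psi[xi][xj] * incoming
                out.append(total)
            z = sum(out)
            updated[(i, j)] = [m / z for m in out]
        messages = updated  # synchronous update of all messages
    beliefs = {}
    for i in neighbors:
        b = [
            unary[i][x] * prod(messages[(k, i)][x] for k in neighbors[i])
            for x in (0, 1)
        ]
        z = sum(b)
        beliefs[i] = [v / z for v in b]
    return beliefs


# Hypothetical network: two transcription factors regulating two genes,
# which forms a 4-node loop. Evidence values are invented.
evidence = {
    "TF1": [0.2, 0.8],
    "TF2": [0.5, 0.5],
    "geneA": [0.3, 0.7],
    "geneB": [0.5, 0.5],
}
network = [("TF1", "geneA"), ("TF1", "geneB"), ("TF2", "geneA"), ("TF2", "geneB")]
beliefs = loopy_bp(evidence, network)
```

Because the graph contains a loop, the resulting beliefs are approximations of the true marginals; on a tree-shaped network the same procedure gives exact marginals.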

The analysis and inference methods explained in the dissertation were applied to the Arabidopsis plant with two different genotypes subjected to two different stress treatments. The results for the analysis and inference methods are reported in the dissertation.

Included in

Biomedical Commons
