Plasmodium Falciparum Microarray Data Analysis Using Machine Learning Approaches

dc.contributor.advisor: Tilahun Melak (PhD.)
dc.contributor.author: Abdiwak, Tesema
dc.date.accessioned: 2025-12-17T10:54:18Z
dc.date.issued: 2022-02
dc.description.abstract: Malaria is one of the deadliest human diseases. Resistance to antimalarial drugs is developing in countries worldwide, and no vaccine is available despite decades of research. The steady development of drug resistance has driven researchers to generate massive volumes of data, as microarray datasets illustrate. The objective of this research is to identify gene-level drug targets from gene expression data obtained from microarrays. We first preprocess the raw microarray data before high-level analysis. After loading the raw data into the RStudio working environment, we explore the dataset to check that it contains the required components. We then apply microarray quality control, followed by background correction, normalization, and log2 transformation of the data. Finally, we check for missing values and screen for outliers. In the high-level analysis, we apply the empirical Bayes method to identify differentially expressed genes (DEGs), which yields 2500 DEGs out of 22769 genes. We then cluster the DEGs by their expression values using hierarchical clustering, on the assumption that genes within the same cluster share the same biological behavior. The clustering places 616 genes in the first cluster, 904 in the second, 283 in the third, and 226 in the fourth. We validated the clustering result using internal validation measures, which selected hierarchical clustering as the best clustering technique for this study. The final step is constructing a gene-gene interaction network for the up-regulated genes using Cytoscape. From this network we extracted the top 20 genes that can be used as drug targets.
dc.description.sponsorship: ASTU
dc.identifier.uri: http://10.240.1.28:4000/handle/123456789/1584
dc.language.iso: en_US
dc.publisher: ASTU
dc.subject: Gene, Gene Expression, Malaria, Plasmodium Falciparum, Microarray Data, Machine Learning, Clustering, Gene-gene Interaction, Differential Expression
dc.title: Plasmodium Falciparum Microarray Data Analysis Using Machine Learning Approaches
dc.type: Thesis
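
The abstract describes log2 transformation and normalization of the expression data, followed by hierarchical clustering of the differentially expressed genes. The following is a minimal sketch of those two steps, not the thesis's own code: it is written in Python rather than the R environment the thesis used, the choice of quantile normalization, Euclidean distance, and average linkage is an assumption (the abstract names none of them), and the intensity values are made-up toy data. It does not implement the empirical Bayes step.

```python
import math

def log2_transform(matrix):
    """Log2-transform a genes x samples matrix of positive intensities."""
    return [[math.log2(v) for v in row] for row in matrix]

def quantile_normalize(matrix):
    """Give every sample (column) the same distribution: the mean of the
    sorted columns. Ties are broken by order of appearance (a simplification)."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    # Gene indices of each column, ordered by increasing value.
    orders = [sorted(range(n_genes), key=col.__getitem__) for col in columns]
    # Reference distribution: mean across samples at each rank.
    reference = [
        sum(columns[j][orders[j][r]] for j in range(n_samples)) / n_samples
        for r in range(n_genes)
    ]
    out = [[0.0] * n_samples for _ in range(n_genes)]
    for j, order in enumerate(orders):
        for rank, gene in enumerate(order):
            out[gene][j] = reference[rank]
    return out

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hierarchical_clusters(profiles, k):
    """Naive average-linkage agglomerative clustering, stopped at k clusters.
    Returns a list of clusters, each a sorted list of row indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = sum(
                    euclidean(profiles[i], profiles[j])
                    for i in clusters[a] for j in clusters[b]
                ) / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters

# Hypothetical toy intensities: 4 genes (rows) x 3 samples (columns).
raw = [
    [100.0, 120.0, 110.0],
    [105.0, 118.0, 112.0],
    [900.0, 950.0, 1000.0],
    [880.0, 990.0, 940.0],
]
normalized = quantile_normalize(log2_transform(raw))
groups = hierarchical_clusters(normalized, k=2)
print(groups)
```

The naive clustering loop recomputes all pairwise distances at every merge, which is fine for a toy matrix but far too slow for thousands of genes; a dataset the size of the one in the abstract would call for an established clustering library instead.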

Files

Original bundle

Name: Abdiwak Tesema.pdf
Size: 3.6 MB
Format: Adobe Portable Document Format
