Chickpea Genotype Recommendation And Yield Prediction Using Machine Learning

Publisher

ASTU

Abstract

Chickpea (Cicer arietinum L.) is a multipurpose legume crop that ranks third in the world after soybean and bean. To maintain food security, farmers want to use improved crops and new technology-based farming systems. The International Center for Agricultural Research in the Dry Areas (ICARDA) established a centre for chickpea crop improvement through breeding, which provides materials to national agricultural centres, including those in Ethiopia. In Ethiopia, ICARDA and the Debre Zeit Agricultural Research Center (DZARC), with funding from the United States Agency for International Development (USAID), began experimental chickpea production. At present, DZARC improves chickpea crops through conventional breeding, which depends on phenotypic traits such as plant height, pods per plant, and seeds per pod. As a result, genotype improvement currently relies on phenotypic observation, and yield prediction relies on farmers' experience. To address these problems, many researchers have applied machine learning techniques to phenotypic trait data, but the techniques they used did not give promising results for this problem. In this research, a dataset of 11,990 rows and 14 features was used, from which four clusters were generated using the particle swarm optimization (PSO)-based K-means algorithm. Because the resulting clusters were imbalanced, the synthetic minority oversampling technique (SMOTE) was applied to the training dataset, producing 17,268 records for training a Random Forest (RF) classifier that assigns a single unseen genotype to a group. The RF algorithm predicted the class of new genotypes with an accuracy of 76%. For yield prediction, a deep neural network (DNN) with a (9-146-146-146-146-1) structure, with six hyperparameters selected by Bayesian optimization, gave the best result, with a mean absolute error (MAE) of 0.018.
Finally, the predicted yield can be compared with existing yields, and genotypes with a better predicted yield can be passed to the grouping algorithm. The four clusters found by PSO-based K-means are recommended for a breeding program on the basis of K-means inter-cluster distance mapping and descriptive statistical analysis.
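
To make the (9-146-146-146-146-1) notation and the MAE metric concrete, here is a minimal Python sketch. It reads the notation as 9 input features, four hidden layers of 146 units each, and 1 output. It assumes fully connected layers with bias terms, which the abstract does not state, and the example yield values are hypothetical rather than data from the study. The function names are illustrative only.

```python
# Sketch: unpack the layer layout quoted in the abstract and define MAE.
# Assumption (not stated in the abstract): fully connected layers with biases.

LAYER_SIZES = [9, 146, 146, 146, 146, 1]  # layout taken from the abstract


def count_dense_parameters(layer_sizes):
    """Return the number of weights plus biases in a stack of dense layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


def mean_absolute_error(y_true, y_pred):
    """Return the average of |y_true - y_pred| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


n_params = count_dense_parameters(LAYER_SIZES)

# Hypothetical yields for illustration only (not values from the study).
example_true = [0.50, 0.62, 0.48]
example_pred = [0.52, 0.60, 0.47]
example_mae = mean_absolute_error(example_true, example_pred)

print(f"trainable parameters under the dense-layer assumption: {n_params}")
print(f"MAE on the hypothetical example: {example_mae:.4f}")
```

Under these assumptions the layout has 65,993 trainable parameters. The reported MAE of 0.018 would be the same average absolute difference, computed on the study's own yield data.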
