Article type: Research article
Article title (English)
Authors (English)
In this study, nine ensemble and machine learning algorithms (XGBoost, CatBoost, Extra Trees, Random Forest, M5, MLP, K-NN, Decision Tree, and SVR) were employed to estimate missing daily streamflow data for the Karkheh River in southwestern Iran. To estimate the missing data at the Abdolkhan and Paye-Pol stations, daily flow data from the neighboring Hamidiyeh hydrometric station were analyzed over a 40-year period. Hyperparameters for these algorithms were optimized using Optuna. A thorough comparison of model performance showed that XGBoost, by learning complex nonlinear relationships, provided the highest estimation accuracy. At the Abdolkhan and Paye-Pol stations, XGBoost achieved the highest coefficient of determination (R²) values of 0.95 and 0.78, the lowest mean absolute error (MAE) values of 18.76 and 36.45, the lowest root mean square error (RMSE) values of 43.75 and 108.87, and the lowest relative root mean square error (RRMSE) values of 0.20 and 0.46, respectively. Furthermore, Taylor diagrams confirmed the superiority of the XGBoost model at both stations. These findings highlight the ability of XGBoost to overcome spatial challenges and handle limited data effectively. The CatBoost model ranked second among the models evaluated, with 11% and 5% lower accuracy than XGBoost at the Abdolkhan and Paye-Pol stations, respectively. The results of this study can be valuable for estimating missing flow data at other stations on this river and can play a significant role in the effective management of water resources.
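The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions here are assumptions: R² is computed as 1 − SS_res/SS_tot (some hydrology papers use the squared Pearson correlation instead), and RRMSE is taken as RMSE divided by the mean observed flow. The function name `evaluate_estimates` and the sample values are hypothetical and are not taken from the study.

```python
import math


def evaluate_estimates(observed, estimated):
    """Return R2, MAE, RMSE and RRMSE for paired observed/estimated flows.

    Assumed definitions (not stated in the abstract):
      R2    = 1 - SS_res / SS_tot
      RRMSE = RMSE / mean(observed)
    """
    n = len(observed)
    if n == 0 or n != len(estimated):
        raise ValueError("observed and estimated must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    errors = [o - e for o, e in zip(observed, estimated)]

    ss_res = sum(d * d for d in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares

    mae = sum(abs(d) for d in errors) / n
    rmse = math.sqrt(ss_res / n)

    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": mae,
        "RMSE": rmse,
        "RRMSE": rmse / mean_obs,
    }


# Hypothetical daily flows, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
estimated = [110.0, 190.0, 310.0, 390.0]
metrics = evaluate_estimates(observed, estimated)
print(metrics)
```

MAE and RMSE carry the units of the flow series, while R² and RRMSE are dimensionless, which is why the latter two are easier to compare across stations with different flow magnitudes.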
Keywords (English)