CatBoost and Random Forest for stage-to-discharge prediction in a monsoon-dominated river system
Khambenour R., Shaikh A.F., Bhirud Y.L., Patil U.S. and Shelar V.V.
Disaster Advances; Vol. 19(1); 63-73;
doi: https://doi.org/10.25303/191da063073; (2026)
Abstract
Accurate river flow forecasting is critical for effective water resource management,
flood mitigation and disaster preparedness, particularly in monsoon-driven river
systems. This study explores the use of two machine learning models, Random Forest and
CatBoost, for predicting daily river discharge based solely on historical water level
data. The Narmada River basin, a major monsoon-influenced system in central India,
serves as the case study. Lagged water level features were incorporated to capture
temporal dependencies, and model performance was evaluated using statistical metrics
including MAE, RMSE, R², NRMSE and RMSPE. Both models demonstrated strong predictive
capabilities across training, validation and testing phases, with CatBoost consistently
outperforming Random Forest in relative error metrics.
Time-series and scatter plot analyses further confirmed CatBoost’s superior ability
to capture dynamic flow variations, especially under peak conditions. The findings
highlight the robustness and reliability of data-driven approaches for stage-to-discharge
conversion, offering a viable alternative where traditional hydrological modeling is
limited by sparse or dynamic input data. This study reinforces the potential
of machine learning techniques to enhance operational forecasting and water management
strategies in regions with pronounced seasonal variability and limited auxiliary
data.
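
As an illustration of the two technical steps named in the abstract (lagged water level features and the error metrics), a minimal Python sketch follows. It is not the authors' implementation. The function names `make_lagged_features` and `error_metrics` are hypothetical, the inclusion of the same-day stage alongside the lagged values is an assumption, and the abstract does not state how NRMSE is normalized, so normalization by the observed range is assumed here (normalization by the mean is another common choice).

```python
import math


def make_lagged_features(stage, n_lags):
    """Build one feature row per day t: [stage[t], stage[t-1], ..., stage[t-n_lags]].

    Assumption: the same-day stage is included as the first feature.
    Returns (rows, target_days), where target_days[i] is the day index of rows[i].
    The first n_lags days are skipped because their lag history is incomplete.
    """
    if n_lags < 0 or n_lags >= len(stage):
        raise ValueError("n_lags must satisfy 0 <= n_lags < len(stage)")
    rows, target_days = [], []
    for t in range(n_lags, len(stage)):
        rows.append([float(stage[t - k]) for k in range(n_lags + 1)])
        target_days.append(t)
    return rows, target_days


def error_metrics(observed, predicted):
    """Compute MAE, RMSE, R^2, NRMSE and RMSPE for paired discharge values.

    Assumptions: NRMSE = RMSE / (max(observed) - min(observed));
    RMSPE is expressed in percent and requires nonzero observed values.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": rmse,
        "R2": 1.0 - ss_res / ss_tot,  # undefined if observed is constant
        "NRMSE": rmse / (max(observed) - min(observed)),
        "RMSPE": 100.0 * math.sqrt(
            sum((e / o) ** 2 for e, o in zip(errors, observed)) / n
        ),
    }
```

The feature rows could then be passed, together with the discharge values at `target_days`, to any regressor; the relative metrics (NRMSE, RMSPE) are the ones on which the abstract reports CatBoost outperforming Random Forest.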