Forecasting Maize Production in Mozambique: A Comparative Analysis of ARIMA and LSTM Models
Filipe Mahaluca, Faizal Carsane and Alfeu Vilanculos
Abstract
This study investigates the prediction of maize production in Mozambique, a crucial component of the country's food security, using two predictive models: ARIMA and LSTM. The research draws on historical data from 1961 to 2022, allowing a detailed analysis of trends and variations in production over the decades. The methodology combined ARIMA modeling, known for its effectiveness in capturing linear patterns in time series, with an LSTM model, which excels at forecasting nonlinear and complex patterns in temporal data. For the ARIMA model, the first step was an exploratory analysis of the time series, which identified the need for a transformation to achieve stationarity. The Dickey-Fuller test confirmed that differencing was needed to remove long-term trends. After this transformation, the ARIMA model was fitted and its parameters were estimated by maximum likelihood. Three candidate models were tested, ARIMA(1,1,0), ARIMA(0,1,1), and ARIMA(1,1,1), and their performance was compared using AIC, BIC, RMSE, and MAPE. The ARIMA(1,1,1) model emerged as the most robust, offering the best balance between simplicity and accuracy in capturing the dynamics of maize production.
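As an illustration of two steps named above, the following minimal sketch shows first-order differencing (the d = 1 in each candidate ARIMA model) and the RMSE and MAPE error metrics. The function names and the toy values are assumptions introduced here for illustration; they are not the study's code or data.

```python
import math


def difference(series):
    """First-order differencing: y'_t = y_t - y_(t-1)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the series."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical values for illustration only; not the study's data.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(difference(actual))
print(rmse(actual, predicted))
print(mape(actual, predicted))
```

Because RMSE is expressed in the units of the series while MAPE is scale-free, reporting both makes it easier to compare models fitted to the same data.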
In parallel, the LSTM model, a recurrent neural network, was trained on normalized data to improve training efficiency. The architecture consisted of two LSTM layers with 50 units each, followed by a dense layer that generates the predictions. The model was trained for 100 epochs using the Adam optimizer and a mean squared error (MSE) loss function. The LSTM was evaluated on data from 2014 to 2022, which were not used in training, and prediction accuracy was measured using RMSE and MAPE. The results indicate that while the ARIMA(1,1,1) model showed solid performance, with an RMSE of 390,016.3 and a MAPE of 16.39%, the LSTM model outperformed it in predictive accuracy, achieving a substantially lower MAPE of 2.64%. The LSTM proved more effective in capturing the complexities of the maize production time series, particularly in years of greater variability. These findings corroborate previous studies that highlight the advantage of LSTM neural networks when time series exhibit nonlinear patterns and complex external influences.
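The data preparation implied by this setup can be sketched as follows: a split that holds out 2014 to 2022, min-max normalization, and sliding windows that turn the series into supervised samples. The window length, the choice of min-max scaling fitted on the training years only, and the placeholder series are all assumptions made here; the abstract does not specify them.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; also return the bounds needed to invert the scaling."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def make_windows(values, window):
    """Build (input window, next value) pairs for one-step-ahead training."""
    inputs, targets = [], []
    for i in range(len(values) - window):
        inputs.append(values[i:i + window])
        targets.append(values[i + window])
    return inputs, targets


years = list(range(1961, 2023))
# Hypothetical placeholder series; not the study's production data.
production = [float(500 + 10 * i) for i in range(len(years))]

# Hold out 2014-2022 for evaluation, as described in the text.
train = [p for y, p in zip(years, production) if y <= 2013]
test = [p for y, p in zip(years, production) if y >= 2014]

# Assumption: scaling bounds come from the training years only.
scaled_train, lo, hi = min_max_scale(train)

WINDOW = 3  # assumed look-back length; not stated in the source
X_train, y_train = make_windows(scaled_train, WINDOW)

print(len(train), len(test), len(X_train))
```

Fitting the scaler on the training years alone keeps information from the held-out period out of the training inputs, which is one common way to keep the evaluation honest.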
Maize production forecasts for the period 2023 to 2030 were generated using the LSTM model combined with bootstrapping, which allowed the construction of 95% confidence intervals that quantify the uncertainty of the predictions. The forecasts indicate a stabilization of maize production, with small annual variations but no significant growth. While stability is positive in itself, it raises concerns for food security, particularly in light of Sustainable Development Goal 2 (SDG 2), which aims to eradicate hunger by 2030. The lack of substantial growth may hinder Mozambique's ability to meet the growing food needs of its population.
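One way such intervals can be obtained is a residual bootstrap: past forecast errors are resampled, added to each point forecast, and the 2.5th and 97.5th percentiles of the simulated values are taken as the bounds. The abstract does not say which bootstrap scheme was used, so the sketch below is an assumption, and the function names, residuals, and forecasts are hypothetical.

```python
import random


def percentile(sorted_values, q):
    """Simple rank-based percentile on a sorted list (q in [0, 1], index truncated)."""
    index = int(q * (len(sorted_values) - 1))
    return sorted_values[index]


def bootstrap_intervals(point_forecasts, residuals, n_draws=2000, level=0.95, seed=0):
    """Residual bootstrap: resample past errors around each point forecast."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    alpha = (1.0 - level) / 2.0
    intervals = []
    for point in point_forecasts:
        simulated = sorted(point + rng.choice(residuals) for _ in range(n_draws))
        lower = percentile(simulated, alpha)
        upper = percentile(simulated, 1.0 - alpha)
        intervals.append((lower, upper))
    return intervals


# Hypothetical forecasts and residuals; not the study's values.
forecasts = [1000.0, 1010.0]
residuals = [-30.0, -10.0, 0.0, 10.0, 30.0]

for point, (lower, upper) in zip(forecasts, bootstrap_intervals(forecasts, residuals)):
    print(point, lower, upper)
```

This simple variant treats errors as exchangeable across horizons; a fuller treatment would let uncertainty grow with the forecast horizon, for example by propagating resampled errors through recursive multi-step forecasts.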
This study concludes that the LSTM model is a more accurate tool than ARIMA for predicting maize production in Mozambique, and it underscores the need for proactive agricultural policies and continued investment in technologies that increase productivity and mitigate food insecurity risks. The uncertainty reflected in the confidence intervals of the forecasts highlights the importance of strategic planning and policy interventions to ensure the resilience of Mozambique's agricultural sector in the face of climate change and economic fluctuations. In addition to advancing scientific knowledge in agricultural production forecasting, the study provides insights for public policy aimed at food security and sustainable development in Mozambique.