<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article">
 <front>
  <journal-meta>
   <journal-id journal-id-type="publisher-id">
    jamp
   </journal-id>
   <journal-title-group>
    <journal-title>
     Journal of Applied Mathematics and Physics
    </journal-title>
   </journal-title-group>
   <issn pub-type="epub">
    2327-4352
   </issn>
   <issn publication-format="print">
    2327-4379
   </issn>
   <publisher>
    <publisher-name>
     Scientific Research Publishing
    </publisher-name>
   </publisher>
  </journal-meta>
  <article-meta>
   <article-id pub-id-type="doi">
    10.4236/jamp.2024.128173
   </article-id>
   <article-id pub-id-type="publisher-id">
    jamp-135419
   </article-id>
   <article-categories>
    <subj-group subj-group-type="heading">
     <subject>
      Articles
     </subject>
    </subj-group>
    <subj-group subj-group-type="Discipline-v2">
     <subject>
      Physics 
     </subject>
     <subject>
       Mathematics
     </subject>
    </subj-group>
   </article-categories>
   <title-group>
    <article-title>Predicting Malaria Dynamics in Burundi Using Deep Learning Models</article-title>
   </title-group>
   <contrib-group>
    <contrib contrib-type="author" xlink:type="simple">
     <name name-style="western">
      <surname>
       Daxelle
      </surname>
      <given-names>
       Sakubu
      </given-names>
     </name> 
     <xref ref-type="aff" rid="aff1"> 
      <sup>1</sup>
     </xref>
    </contrib>
    <contrib contrib-type="author" xlink:type="simple">
     <name name-style="western">
      <surname>
       Kelly Joelle Gatore
      </surname>
      <given-names>
       Sinigirira
      </given-names>
     </name> 
     <xref ref-type="aff" rid="aff1"> 
      <sup>1</sup>
     </xref> 
     <xref ref-type="aff" rid="aff2"> 
      <sup>2</sup>
     </xref>
    </contrib>
    <contrib contrib-type="author" xlink:type="simple">
     <name name-style="western">
      <surname>
       David
      </surname>
      <given-names>
       Niyukuri
      </given-names>
     </name> 
     <xref ref-type="aff" rid="aff1"> 
      <sup>1</sup>
     </xref> 
     <xref ref-type="aff" rid="aff2"> 
      <sup>2</sup>
     </xref> 
     <xref ref-type="aff" rid="aff3"> 
      <sup>3</sup>
     </xref>
    </contrib>
   </contrib-group> 
   <aff id="aff1">
    <addr-line>
     Doctoral School, University of Burundi, Bujumbura, Burundi
    </addr-line> 
   </aff> 
   <aff id="aff2">
    <addr-line>
     Department of Mathematics, University of Burundi, Bujumbura, Burundi
    </addr-line> 
   </aff> 
   <aff id="aff3">
    <addr-line>
     The South African Department of Science and Technology-National Research Foundation (DST-NRF) Centre of Excellence in Epidemiological Modelling and Analysis (SACEMA), Stellenbosch University, Cape Town, South Africa
    </addr-line> 
   </aff> 
   <pub-date pub-type="epub">
    <day>
     06
    </day> 
    <month>
     08
    </month>
    <year>
     2024
    </year>
   </pub-date> 
   <volume>
    12
   </volume> 
   <issue>
    08
   </issue>
   <fpage>
    2904
   </fpage>
   <lpage>
    2917
   </lpage>
   <history>
    <date date-type="received">
     <day>
      19
     </day>
     <month>
      June
     </month>
     <year>
      2024
     </year>
    </date>
    <date date-type="published">
     <day>
      19
     </day>
     <month>
      June
     </month>
     <year>
      2024
     </year> 
    </date> 
    <date date-type="accepted">
     <day>
      19
     </day>
     <month>
      August
     </month>
     <year>
      2024
     </year> 
    </date>
   </history>
   <permissions>
    <copyright-statement>
     © Copyright 2014 by authors and Scientific Research Publishing Inc. 
    </copyright-statement>
    <copyright-year>
     2014
    </copyright-year>
    <license>
     <license-p>
      This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/
     </license-p>
    </license>
   </permissions>
   <abstract>
    Malaria continues to be a major public health problem on the African continent, particularly in Sub-Saharan Africa, despite ongoing efforts and the significant progress that has been made. In Burundi, malaria remains a major public health concern in the general population. Few malaria prediction models for Burundi exist in the literature, although such tools are much needed for intervention design. In this study, deep learning models are built to estimate malaria cases in Burundi. Malaria cases were forecast at both the provincial and national levels. A long short-term memory (LSTM) model, a type of deep learning model, was used to achieve the best results, with climate-related factors such as temperature, rainfall and relative humidity, together with historical malaria data and human population, as inputs. With this model, the results showed that different parameter tuning can be used to determine the minimum and maximum expected malaria cases. The univariate version of the LSTM model, which learns only from the previous dynamics of malaria cases, gives more precise estimates, but the univariate and multivariate models show the same overall trends at the province and country levels.
   </abstract>
   <kwd-group> 
    <kwd>
     Malaria
    </kwd> 
    <kwd>
      Prediction
    </kwd> 
    <kwd>
      Deep Learning
    </kwd> 
    <kwd>
      Long-Short-Term Memory (LSTM)
    </kwd> 
    <kwd>
      Burundi
    </kwd>
   </kwd-group>
  </article-meta>
 </front>
 <body>
  <sec id="s1">
   <title>1. Introduction</title>
   <p>Malaria is an infectious disease caused by Plasmodium parasites, transmitted by the bite of the female Anopheles mosquito. There are four main types of malaria parasites: Plasmodium falciparum, Plasmodium vivax, Plasmodium malariae, and Plasmodium ovale. Plasmodium falciparum is the most dangerous, with an incubation period of 6 - 14 days <xref ref-type="bibr" rid="scirp.135419-1">
     [1]
    </xref>, and it is the primary cause of malaria in Burundi, accounting for nearly 90% of cases <xref ref-type="bibr" rid="scirp.135419-2">
     [2]
    </xref>.</p>
   <p>According to the World Health Organisation (WHO) report <xref ref-type="bibr" rid="scirp.135419-3">
     [3]
    </xref>, estimated malaria cases in the WHO African Region increased from 213 million to 228 million, and deaths increased from 534,000 to 602,000, between 2019 and 2020. This region accounted for approximately 95% of all cases and 96% of all deaths globally. Children under the age of five account for 80% of all deaths in this region.</p>
   <p>In Burundi, malaria infection is among the main public health concerns after diarrhoeal diseases, neonatal disorders, and tuberculosis <xref ref-type="bibr" rid="scirp.135419-3">
     [3]
    </xref>-<xref ref-type="bibr" rid="scirp.135419-6">
     [6]
    </xref>. During the recent malaria outbreak in 2017 <xref ref-type="bibr" rid="scirp.135419-7">
     [7]
    </xref>-<xref ref-type="bibr" rid="scirp.135419-9">
     [9]
    </xref>, World Vision International suggested that climate change, population density, changes in agricultural methods, food shortages, and a lack of information and action to prevent malaria were the main factors driving the prevalence of the disease in Burundi <xref ref-type="bibr" rid="scirp.135419-10">
     [10]
    </xref>.</p>
   <p>On a global scale, coordinated initiatives to reduce the malaria epidemic are being planned as part of the Millennium Development Goals. In Burundi, the package of interventions for malaria control during pregnancy includes the promotion and use of insecticide-treated nets (ITN), intermittent preventive treatment in pregnancy (IPTp) with sulfadoxine-pyrimethamine under directly observed treatment, and prompt and effective treatment. The National Malaria Control Programme (NMCP) has yet to follow the WHO recommendations from 2016, which increased the recommended number of prenatal care visits from four to eight. With Malaria Operational Plan (MOP) funds from Fiscal Year (FY) 2021 <xref ref-type="bibr" rid="scirp.135419-11">
     [11]
    </xref>, the team proposes to trial an evidence-based group prenatal care strategy to enhance IPTp uptake (according to DHIS2, 54% of women received the recommended three doses at U.S. Government-supported health centers in the first quarter of FY 2021). Although massive efforts have already been deployed, malaria still prevails. Thus, this project aims to understand the contribution of climate by using machine learning models to predict malaria cases.</p>
   <p>In recent years, there has been a great deal of interest in the development and application of machine learning (ML) in the field of infectious diseases <xref ref-type="bibr" rid="scirp.135419-12">
     [12]
    </xref>, not only as a catalyst for academic studies but also as a critical means of detecting pathogenic microorganisms, implementing public health surveillance, investigating host-pathogen interactions, and discovering drug and vaccine candidates. According to one survey, ML is used in 77% of the products we use today. ML is a subfield of artificial intelligence (AI) and an important tool in bioinformatics <xref ref-type="bibr" rid="scirp.135419-13">
     [13]
    </xref>. When large and complex data sets must be analyzed, ML can employ sophisticated algorithms and efficient models to extract meaningful information from them <xref ref-type="bibr" rid="scirp.135419-14">
     [14]
    </xref>-<xref ref-type="bibr" rid="scirp.135419-16">
     [16]
    </xref>, recognising patterns and learning in an iterative process. Instead of relying on a predetermined equation that serves as a model, ML algorithms use computational methods to learn directly from data <xref ref-type="bibr" rid="scirp.135419-17">
     [17]
    </xref>. The union of mathematics and computer science in ML has shown significant potential as a breakthrough in science and technology, and it has been applied to a wide range of scientific fields, including biology. Deep learning models, which are a part of machine learning, rely on multiple hidden layers to learn powerful representations of the input data. They have been used, for example, to predict the function of DNA sequences <xref ref-type="bibr" rid="scirp.135419-18">
     [18]
    </xref>: the authors combined a convolutional neural network with an LSTM and used random dropout to improve performance. However, that study was limited in its ability to determine the optimal range of dropout.</p>
   <p>A number of studies have already been conducted on the prediction of malaria in Burundi, and machine learning techniques, such as artificial neural networks, have been applied <xref ref-type="bibr" rid="scirp.135419-19">
     [19]
    </xref>. The authors investigated malaria in different age groups and the impact of meteorological factors on the high number of malaria cases during some seasons. Previous investigations reported different degrees of precision. Nevertheless, that paper did not show predictions at the province level, which could indicate, during interventions or emergencies, which part of the country needs more attention at a specific moment. The overarching goal of this research is to investigate malaria case predictions based on meteorological data, in order to show future intervention teams which part of the country may need more attention because of its high number of malaria cases, which may contribute to an endemic situation in the country. Hence, LSTM will be used to forecast the incidence of malaria in all five provinces. This study may help improve public health measures, especially at the district level.</p>
  </sec><sec id="s2">
   <title>2. Material and Methods</title>
   <p>In this section, we describe the data we used and how we built the deep learning models. All data used came from Burundi, and the data analysis and modelling were carried out in Python (version 3.6.5) <xref ref-type="bibr" rid="scirp.135419-20">
     [20]
    </xref>.</p>
   <p>It is worth noting that, before choosing the model used in this work, we explored several alternative machine learning models, such as decision trees and artificial neural networks. However, these alternative models were unable to achieve the same level of accuracy as the LSTM model.</p>
   <sec id="s2_1">
    <title>2.1. Data Description</title>
    <p>The study was carried out using monthly data collected from different sources, namely the Geographical Institute of Burundi (Institut Geographique du Burundi, IGEBU), the Institute of Statistics and Economic Studies of Burundi (Institut de Statistiques et d’Etudes Economiques du Burundi, ISTEEBU) and Burundi’s National Malaria Control Programme (NMCP). The data were collected for all eighteen provinces of the country. Since these are national data, they are not publicly available.</p>
   </sec>
   <sec id="s2_2">
    <title>2.2. Data Extraction</title>
    <p>The data collected from IGEBU were on a monthly scale from 2010 to 2022 and comprised relative humidity, rainfall, and temperature, each with its maximum and minimum values. The average of these values was calculated and added to the data set. Historical malaria data were obtained from the NMCP Burundi on a monthly scale from January 2010 to December 2022 for all 18 provinces. The human population figures were available online on the ISTEEBU website and were calculated annually.</p>
   </sec>
   <sec id="s2_3">
    <title>2.3. Data Processing</title>
    <p>The data were collected on two different time scales, monthly and annual; the human population was therefore considered constant throughout each year. The meteorological data contained some missing values, which were filled using the missForest algorithm <xref ref-type="bibr" rid="scirp.135419-21">
      [21]
     </xref>. This algorithm was judged to be suitable since it takes into account the possible relationships between variables. The data were normalized before being fed into the neural networks. The models were trained on 80% of the data set and tested on the remaining 20%. The mean squared error was used as the loss function, and the batch size was 12. Recently, the communes, zones, and hills/neighbourhoods of the Republic of Burundi have been the subject of administrative regrouping <xref ref-type="bibr" rid="scirp.135419-22">
      [22]
     </xref>: Bujumbura (west part), Buhumuza (east part), Gitega (central part), Burunga (southern part) and Butanyerera (northern part) are the five provinces of the country in the new delimitation of provinces. This grouping is mainly based on the so-called natural regions of Burundi, which differ in terms of climate conditions, agriculture, landscape, and social life.</p>
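    <p>The 80/20 split and the normalization step described in this subsection can be sketched as follows. This is only an illustrative sketch: the text does not say which normalization was applied or how the split was drawn, so min-max scaling fitted on the training part and a chronological split are assumptions made here, and the function names and the case counts are hypothetical.</p>
    <preformat>
```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered list into a training part and a test part."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def fit_min_max(train_values):
    """Return (minimum, range) computed on the training part only."""
    low = min(train_values)
    span = max(train_values) - low
    if span == 0:
        span = 1.0  # avoid dividing by zero for a constant series
    return low, span


def scale(values, low, span):
    """Apply min-max scaling with constants fitted on the training part."""
    return [(v - low) / span for v in values]


# Hypothetical monthly case counts, for illustration only (not the study data).
cases = [120, 150, 90, 200, 260, 310, 280, 190, 140, 110]

train, test = chronological_split(cases)
low, span = fit_min_max(train)
train_scaled = scale(train, low, span)
test_scaled = scale(test, low, span)
```
    </preformat>
    <p>Fitting the scaling constants on the training months only keeps information from the test months out of the training step; whether the original analysis did this is not stated.</p>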
    <p>In this reform, the provinces were regrouped as follows: Bujumbura included all the communes of Bujumbura Mairie, Bujumbura Rural, Bubanza and Cibitoke. Gitega assembled all the communes of Gitega, Mwaro, Karuzi and Muramvya. Buhumuza grouped together all the communes of Cankuzo, Muyinga and Ruyigi. Butanyerera aggregated all the communes of Kirundo, Ngozi and Kayanza. Finally, Burunga included all the communes of Bururi, Makamba, Rumonge and Rutana. Therefore, for each month, the data set of each newly delimited province was the mean of the regrouped old provinces for the meteorological data and their sum for the human population and malaria cases.</p>
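    <p>The regrouping rule (mean for meteorological variables, sum for population and malaria cases) can be illustrated with a short sketch. The province mapping is taken from the text; the record layout, the key names and all numeric values are hypothetical and serve only to show the arithmetic for a single month.</p>
    <preformat>
```python
# Mapping of the five new provinces to the eighteen old ones, as listed in the text.
NEW_PROVINCES = {
    "Bujumbura": ["Bujumbura Mairie", "Bujumbura Rural", "Bubanza", "Cibitoke"],
    "Gitega": ["Gitega", "Mwaro", "Karuzi", "Muramvya"],
    "Buhumuza": ["Cankuzo", "Muyinga", "Ruyigi"],
    "Butanyerera": ["Kirundo", "Ngozi", "Kayanza"],
    "Burunga": ["Bururi", "Makamba", "Rumonge", "Rutana"],
}

CLIMATE_KEYS = ("temperature", "rainfall", "humidity")  # averaged
COUNT_KEYS = ("population", "cases")                    # summed


def regroup_month(old_records):
    """Aggregate one month of old-province records into the new provinces."""
    new_records = {}
    for new_name, members in NEW_PROVINCES.items():
        rows = [old_records[name] for name in members]
        merged = {}
        for key in CLIMATE_KEYS:
            merged[key] = sum(row[key] for row in rows) / len(rows)
        for key in COUNT_KEYS:
            merged[key] = sum(row[key] for row in rows)
        new_records[new_name] = merged
    return new_records


# Hypothetical records for one month (not the study data).
all_old = [name for members in NEW_PROVINCES.values() for name in members]
old_month = {
    name: {
        "temperature": 20.0 + i,
        "rainfall": 100.0,
        "humidity": 70.0,
        "population": 1000,
        "cases": 10 * (i + 1),
    }
    for i, name in enumerate(all_old)
}
new_month = regroup_month(old_month)
```
    </preformat>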
    <p>After processing the data, we started building the neural network models. The first prediction model was built with four layers and several units. However, its performance was not good, so a different approach was taken in order to obtain better results.</p>
   </sec>
   <sec id="s2_4">
    <title>2.4. Building ML Models</title>
    <p>Since the data set was on a monthly scale, a model that takes previous information into account seemed appropriate for this study. The kind of algorithm that takes previous information into account in a chronological manner is a recurrent neural network. This kind of neural network, unlike others, has feedback connections. The network connection weights and biases change once per training episode, similar to how physiological changes in synaptic strengths store long-term memories; the network’s activation patterns change once per time step, similar to how the brain’s electrical firing patterns change moment-to-moment to store short-term memories. Recurrent neural networks are frequently used in multiple domains to predict future events based on previous experience. Long short-term memory (LSTM) is a type of recurrent neural network (RNN) architecture that is particularly well-suited for handling sequences of data. It was introduced by Hochreiter and Schmidhuber in 1997 <xref ref-type="bibr" rid="scirp.135419-23">
      [23]
     </xref> and has since become one of the most popular and widely used architectures for tasks involving sequential data. The model was implemented using TensorFlow and Keras. The LSTM model architecture consisted of one LSTM layer with 5 units, followed by a dense layer.</p>
    <p>For instance, LSTM models can be used for tasks such as connected, unsegmented handwriting recognition, video games, speech recognition, automated translation, healthcare, speech activity detection and robotics. In healthcare, they have been used for cardiovascular prediction analytics, where they yielded the best accuracy among the machine learning models compared <xref ref-type="bibr" rid="scirp.135419-24">
      [24]
     </xref>. Since LSTM has been shown to perform well in supporting rational decision making, it was used in this study to predict malaria cases.</p>
    <p>In <xref ref-type="fig" rid="fig1">
      Figure 1
     </xref>, a long short-term memory cell is depicted, where the squares represent the layers, the ellipses are component-wise operations, 
     <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
       <msub> 
        <mi>
          C 
        </mi> 
        <mi>
          i 
        </mi> 
       </msub> 
      </mrow> 
     </math> are the cell state vectors, 
     <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
       <msub> 
        <mi>
          X 
        </mi> 
        <mi>
          i 
        </mi> 
       </msub> 
      </mrow> 
     </math> are the input vectors to the LSTM unit, and h<sub>i</sub> is the hidden state vector, also known as the output vector 
     <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
       <msub> 
        <mi>
          y 
        </mi> 
        <mi>
          i 
        </mi> 
       </msub> 
      </mrow> 
     </math> of the LSTM unit.</p>
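    <p>For reference, the update performed by a standard LSTM cell can be written as below. This is the common textbook formulation, given here as an aid to reading the figure rather than as the exact variant used in this work. The gate names (f for forget, u for input or update, o for output), the weight matrices W and U, the biases b and the candidate state are notation assumed here; the symbols C, X and h follow the text, with C the cell state and h the hidden state. The display assumes the amsmath package.</p>
    <preformat>
```latex
\[
\begin{gathered}
f_i = \sigma\left(W_f X_i + U_f h_{i-1} + b_f\right) \\
u_i = \sigma\left(W_u X_i + U_u h_{i-1} + b_u\right) \\
o_i = \sigma\left(W_o X_i + U_o h_{i-1} + b_o\right) \\
\tilde{C}_i = \tanh\left(W_c X_i + U_c h_{i-1} + b_c\right) \\
C_i = f_i \odot C_{i-1} + u_i \odot \tilde{C}_i \\
h_i = o_i \odot \tanh\left(C_i\right)
\end{gathered}
\]
```
    </preformat>
    <p>Here the sigma symbol denotes the logistic sigmoid and the circled dot denotes the component-wise product, which corresponds to the component-wise operations drawn as ellipses.</p>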
    <fig id="fig1" position="float">
     <label>Figure 1</label>
     <caption>
      <title>Figure 1. Long short term memory (LSTM) cell.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId18.jpeg?20240822024639" />
    </fig>
    <p>
     In <xref ref-type="fig" rid="fig2">
      Figure 2
     </xref>, a univariate LSTM model is shown in a sequential manner with previous states alongside the input vectors 
     <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
       <msub> 
        <mi>
          x 
        </mi> 
        <mi>
          i 
        </mi> 
       </msub> 
      </mrow> 
     </math> and outputs 
     <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
       <msub> 
        <mi>
          y 
        </mi> 
        <mi>
          i 
        </mi> 
       </msub> 
      </mrow> 
     </math>. The previous malaria cases are the input vector and the output vector is the actual malaria cases. The Multivariate LSTM model is represented in <xref ref-type="fig" rid="fig3">
      Figure 3
     </xref> with multiple input vectors 
     <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
       <msub> 
        <mi>
          x 
        </mi> 
        <mi>
          i 
        </mi> 
       </msub> 
      </mrow> 
     </math> up to 
     <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
       <msub> 
        <mi>
          x 
        </mi> 
        <mi>
          n 
        </mi> 
       </msub> 
      </mrow> 
     </math> and the target 
     <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
       <msub> 
        <mi>
          y 
        </mi> 
        <mi>
          i 
        </mi> 
       </msub> 
      </mrow> 
     </math>. In the multivariate LSTM model the input vectors are the climate data, human population and previous malaria cases, and the output vectors are the current malaria cases.</p>
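    <p>The difference between the univariate and multivariate inputs can be made concrete with a small windowing sketch. It is a hedged illustration: the number of previous months used as input is not reported in the text, so the window length of 3 is an arbitrary assumption, the function name is hypothetical, and the numbers are made up. Only two extra features are shown, although the text also lists relative humidity and human population.</p>
    <preformat>
```python
def make_windows(rows, n_lags, target_index=0):
    """Turn a time-ordered list of monthly feature rows into supervised samples.

    Each input is the block of n_lags consecutive rows that precedes a month,
    and the matching target is the malaria case count of that month.
    """
    inputs, targets = [], []
    for end in range(n_lags, len(rows)):
        inputs.append(rows[end - n_lags:end])
        targets.append(rows[end][target_index])
    return inputs, targets


# Hypothetical values for illustration only (not the study data).
cases = [10, 20, 30, 40, 50]
rainfall = [5.0, 7.0, 6.0, 9.0, 8.0]
temperature = [21.0, 22.0, 23.0, 22.0, 21.0]

# Univariate: each month is described by its case count alone.
uni_rows = [[c] for c in cases]
uni_X, uni_y = make_windows(uni_rows, n_lags=3)

# Multivariate: case count first, then the additional features.
multi_rows = [[c, r, t] for c, r, t in zip(cases, rainfall, temperature)]
multi_X, multi_y = make_windows(multi_rows, n_lags=3)
```
    </preformat>
    <p>Both variants predict the same target; they differ only in how many values describe each past month.</p>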
    <fig id="fig2" position="float">
     <label>Figure 2</label>
     <caption>
      <title>Figure 2. Univariate long short term memory.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId29.jpeg?20240822024639" />
    </fig>
    <fig id="fig3" position="float">
     <label>Figure 3</label>
     <caption>
      <title>Figure 3. Multivariate long short term memory.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId30.jpeg?20240822024639" />
    </fig>
    <p>After tuning the models on the data set, we obtained the results presented in the following section.</p>
   </sec>
  </sec><sec id="s3">
   <title>3. Results</title>
   <p>After fitting the models to the data, the error between the actual and predicted cases was calculated using the root mean square error (RMSE). As shown in <xref ref-type="table" rid="table1">
     Table 1
    </xref>, the univariate LSTM model had the smaller RMSE in every province and at the country level. In Bujumbura province, the multivariate error is about 3.4 times the univariate error. In Gitega and Butanyerera provinces, the multivariate error is about three times the univariate error. In Burunga province, the multivariate error is about twice the univariate error. The gap is largest for Buhumuza province, where the multivariate error is about eight times the univariate one. At the country level, the multivariate error is nearly four times the univariate one.</p>
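   <p>As a check on the comparison above, the sketch below defines the RMSE and computes the ratio of the multivariate to the univariate error for each row of Table 1. The RMSE values are copied from the table; the function and variable names are hypothetical.</p>
   <preformat>
```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# RMSE values from Table 1: (univariate LSTM, multivariate LSTM).
TABLE_1 = {
    "Bujumbura": (4868.69, 16777.17),
    "Gitega": (10943.18, 31012.00),
    "Burunga": (6403.33, 12964.48),
    "Butanyerera": (8288.07, 25187.09),
    "Buhumuza": (5664.18, 44893.25),
    "Burundi": (31635.95, 119724.68),
}

# How many times larger the multivariate error is than the univariate one.
ratios = {name: multi / uni for name, (uni, multi) in TABLE_1.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```
   </preformat>
   <p>The ratios range from about 2.0 (Burunga) to about 7.9 (Buhumuza), with about 3.8 at the country level.</p>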
   <table-wrap id="table1">
    <label>Table 1</label>
    <caption>
     <title>Table 1. RMSE of malaria case predictions for the univariate and multivariate LSTM models.</title>
    </caption>
    <table class="MsoTableGrid custom-table" border="0" cellspacing="0" cellpadding="0"> 
     <tr> 
      <td rowspan="2" class="acenter" width="35.35%">Provinces</td> 
      <td class="custom-bottom-td acenter" width="64.65%" colspan="2">RMSE</td> 
     </tr> 
     <tr> 
      <td class="custom-bottom-td custom-top-td acenter" width="32.32%">Univariate LSTM</td> 
      <td class="custom-bottom-td custom-top-td acenter" width="32.32%">Multivariate LSTM</td> 
     </tr> 
     <tr> 
      <td class="custom-top-td acenter" width="35.35%">Bujumbura</td> 
      <td class="custom-top-td acenter" width="32.32%">4868.69</td> 
      <td class="custom-top-td acenter" width="32.32%">16777.17</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="35.35%">Gitega</td> 
      <td class="acenter" width="32.32%">10943.18</td> 
      <td class="acenter" width="32.32%">31012.00</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="35.35%">Burunga</td> 
      <td class="acenter" width="32.32%">6403.33</td> 
      <td class="acenter" width="32.32%">12964.48</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="35.35%">Butanyerera</td> 
      <td class="acenter" width="32.32%">8288.07</td> 
      <td class="acenter" width="32.32%">25187.09</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="35.35%">Buhumuza</td> 
      <td class="acenter" width="32.32%">5664.18</td> 
      <td class="acenter" width="32.32%">44893.25</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="35.35%">Country level: Burundi</td> 
      <td class="acenter" width="32.32%">31635.95</td> 
      <td class="acenter" width="32.32%">119724.68</td> 
     </tr> 
    </table>
   </table-wrap>
   <table-wrap id="table2">
    <label>Table 2</label>
    <caption>
     <title>Table 2. Number of malaria cases observed and predicted by the univariate and multivariate LSTM models.</title>
    </caption>
    <table class="MsoTableGrid custom-table" border="0" cellspacing="0" cellpadding="0"> 
     <tr> 
      <td rowspan="2" class="acenter" width="28.89%">Provinces</td> 
      <td class="custom-bottom-td acenter" width="71.11%" colspan="3">Malaria Cases</td> 
     </tr> 
     <tr> 
      <td class="custom-bottom-td custom-top-td acenter" width="23.70%">Observed</td> 
      <td class="custom-bottom-td custom-top-td acenter" width="23.70%">Univariate LSTM</td> 
      <td class="custom-bottom-td custom-top-td acenter" width="23.70%">Multivariate LSTM</td> 
     </tr> 
     <tr> 
      <td class="custom-top-td acenter" width="28.89%">Bujumbura</td> 
      <td class="custom-top-td acenter" width="23.70%">2,253,588</td> 
      <td class="custom-top-td acenter" width="23.70%">2,232,670</td> 
      <td class="custom-top-td acenter" width="23.70%">2,280,516</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="28.89%">Gitega</td> 
      <td class="acenter" width="23.70%">2,395,666</td> 
      <td class="acenter" width="23.70%">2,444,657</td> 
      <td class="acenter" width="23.70%">2,338,582</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="28.89%">Burunga</td> 
      <td class="acenter" width="23.70%">1,901,832</td> 
      <td class="acenter" width="23.70%">1,890,808</td> 
      <td class="acenter" width="23.70%">1,974,578</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="28.89%">Butanyerera</td> 
      <td class="acenter" width="23.70%">3,568,702</td> 
      <td class="acenter" width="23.70%">3,507,692</td> 
      <td class="acenter" width="23.70%">3,808,420</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="28.89%">Buhumuza</td> 
      <td class="acenter" width="23.70%">2,772,583</td> 
      <td class="acenter" width="23.70%">2,808,850</td> 
      <td class="acenter" width="23.70%">3,567,723</td> 
     </tr> 
     <tr> 
      <td class="acenter" width="28.89%">Country level: Burundi</td> 
      <td class="acenter" width="23.70%">12,892,371</td> 
      <td class="acenter" width="23.70%">12,841,653</td> 
      <td class="acenter" width="23.70%">15,215,766</td> 
     </tr> 
    </table>
   </table-wrap>
   <sec id="s3_1">
    <title>3.1. Province-Level Predictions</title>
    <p>The province-level predictions of malaria cases from October 2020 to September 2022 are shown in <xref ref-type="fig" rid="fig4">
      Figures 4-13
     </xref>. For the univariate LSTM, the predicted curves follow the observed trends in most cases, with the observed cases being slightly higher than the predicted ones. For the multivariate LSTM, the predicted curves do not follow the observed trends, except for the country-level prediction. The differences between the total observed and predicted cases are given in <xref ref-type="table" rid="table2">
      Table 2
     </xref>. In Gitega province, the total number of malaria cases during that period was higher than predicted by the multivariate model but lower than predicted by the univariate model. In Burunga province, the observed cases were lower than the multivariate prediction and slightly higher than the univariate prediction. In Buhumuza province, the multivariate model predicted many more cases than were observed, while the observed cases were slightly fewer than the number predicted by the univariate model. In Butanyerera province, the multivariate prediction was substantially higher than the observed cases, whereas the observed cases were only marginally higher than the univariate forecast. In Bujumbura province, the multivariate predictions were marginally higher than the observed cases, while the observed cases were slightly higher than the univariate predictions.</p>
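    <p>The comparison of totals can also be expressed as a percentage deviation of each prediction from the observed count. The sketch below does this with the figures of Table 2; the function and variable names are hypothetical, and the sign convention (positive when the model predicts more cases than were observed) is a choice made here.</p>
    <preformat>
```python
# Totals from Table 2: (observed, univariate LSTM, multivariate LSTM).
TABLE_2 = {
    "Bujumbura": (2253588, 2232670, 2280516),
    "Gitega": (2395666, 2444657, 2338582),
    "Burunga": (1901832, 1890808, 1974578),
    "Butanyerera": (3568702, 3507692, 3808420),
    "Buhumuza": (2772583, 2808850, 3567723),
    "Burundi": (12892371, 12841653, 15215766),
}


def percent_deviation(observed, predicted):
    """Signed deviation of the prediction from the observation, in percent."""
    return 100.0 * (predicted - observed) / observed


# For each row: (univariate deviation, multivariate deviation).
deviations = {
    name: (percent_deviation(obs, uni), percent_deviation(obs, multi))
    for name, (obs, uni, multi) in TABLE_2.items()
}

for name, (uni_dev, multi_dev) in deviations.items():
    print(f"{name}: univariate {uni_dev:+.2f}%, multivariate {multi_dev:+.2f}%")
```
    </preformat>
    <p>With these totals, every univariate deviation stays within roughly 2% of the observed count, whereas the multivariate deviation reaches about 28.7% for Buhumuza and about 18.0% at the country level.</p>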
    <fig id="fig4" position="float">
     <label>Figure 4</label>
     <caption>
      <title>Figure 4. Univariate LSTM Gitega Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId31.jpeg?20240822024640" />
    </fig>
    <fig id="fig5" position="float">
     <label>Figure 5</label>
     <caption>
      <title>Figure 5. Multivariate LSTM Gitega Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId32.jpeg?20240822024640" />
    </fig>
    <fig id="fig6" position="float">
     <label>Figure 6</label>
     <caption>
      <title>Figure 6. Univariate LSTM Bujumbura Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId33.jpeg?20240822024640" />
    </fig>
    <fig id="fig7" position="float">
     <label>Figure 7</label>
     <caption>
      <title>Figure 7. Multivariate LSTM Bujumbura Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId34.jpeg?20240822024640" />
    </fig>
    <fig id="fig8" position="float">
     <label>Figure 8</label>
     <caption>
      <title>Figure 8. Univariate LSTM Burunga Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId35.jpeg?20240822024640" />
    </fig>
    <fig id="fig9" position="float">
     <label>Figure 9</label>
     <caption>
      <title>Figure 9. Multivariate LSTM Burunga Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId36.jpeg?20240822024640" />
    </fig>
   </sec>
   <sec id="s3_2">
    <title>3.2. Country-Level Predictions</title>
    <p>After predicting malaria cases at the provincial level, the country-level prediction was made. The results are depicted in <xref ref-type="fig" rid="fig14">
      Figure 14
     </xref> and <xref ref-type="fig" rid="fig15">
      Figure 15
     </xref>. The country-level dataset used the mean across all provinces for the climate data and the sum for the human population and malaria cases. In all models, the predicted curves followed the observed trend. Nevertheless, the multivariate model predicted more malaria cases than were observed, while the univariate model predicted fewer. Approximately 100,000 more malaria cases were reported in the country during that time period than the univariate model predicted. The multivariate LSTM model, by contrast, projected a much higher number of cases, more than 2.5 million, than was actually reported, as seen in <xref ref-type="table" rid="table2">
      Table 2
     </xref>.</p>
    <fig id="fig10" position="float">
     <label>Figure 10</label>
     <caption>
      <title>Figure 10. Univariate LSTM Buhumuza Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId37.jpeg?20240822024640" />
    </fig>
    <fig id="fig11" position="float">
     <label>Figure 11</label>
     <caption>
      <title>Figure 11. Multivariate LSTM Buhumuza Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId38.jpeg?20240822024640" />
    </fig>
    <fig id="fig12" position="float">
     <label>Figure 12</label>
     <caption>
      <title>Figure 12. Univariate LSTM Butanyerera Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId39.jpeg?20240822024640" />
    </fig>
    <fig id="fig13" position="float">
     <label>Figure 13</label>
     <caption>
      <title>Figure 13. Multivariate LSTM Butanyerera Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId40.jpeg?20240822024640" />
    </fig>
    <fig id="fig14" position="float">
     <label>Figure 14</label>
     <caption>
      <title>Figure 14. Univariate LSTM Burundi Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId41.jpeg?20240822024640" />
    </fig>
    <fig id="fig15" position="float">
     <label>Figure 15</label>
     <caption>
      <title>Figure 15. Multivariate LSTM Burundi Prediction.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723809-rId42.jpeg?20240822024640" />
    </fig>
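    <p>The construction of the country-level dataset described in this section (mean across provinces for climate variables, sum for population and malaria cases) can be sketched as follows. This is a minimal illustration rather than the code used in the study: the field names (rainfall, temperature, population, cases), the helper aggregate_country, and all numeric values are hypothetical.</p>
    <preformat>
```python
from collections import defaultdict
from statistics import mean

# Hypothetical monthly records per province; all values are illustrative only.
records = [
    {"month": "2021-01", "province": "Gitega", "rainfall": 120.0,
     "temperature": 20.0, "population": 1000, "cases": 50},
    {"month": "2021-01", "province": "Burunga", "rainfall": 100.0,
     "temperature": 22.0, "population": 2000, "cases": 70},
    {"month": "2021-02", "province": "Gitega", "rainfall": 80.0,
     "temperature": 21.0, "population": 1000, "cases": 40},
    {"month": "2021-02", "province": "Burunga", "rainfall": 60.0,
     "temperature": 23.0, "population": 2000, "cases": 60},
]

CLIMATE_FIELDS = ("rainfall", "temperature")  # averaged across provinces
COUNT_FIELDS = ("population", "cases")        # summed across provinces


def aggregate_country(rows):
    """Build one country-level row per month from province-level rows."""
    by_month = defaultdict(list)
    for row in rows:
        by_month[row["month"]].append(row)

    country = {}
    for month in sorted(by_month):
        month_rows = by_month[month]
        entry = {}
        for field in CLIMATE_FIELDS:
            entry[field] = mean(r[field] for r in month_rows)
        for field in COUNT_FIELDS:
            entry[field] = sum(r[field] for r in month_rows)
        country[month] = entry
    return country


country = aggregate_country(records)
```
    </preformat>
    <p>Averaging the climate variables keeps them on the same scale as the provincial inputs, whereas summing population and cases yields national totals.</p>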
   </sec>
  </sec><sec id="s4">
   <title>4. Discussion</title>
   <p>In general, the multivariate LSTM model predicted more malaria cases at the country level as well as at the province level. At the province level, the multivariate LSTM tended to overpredict cases in the northeastern provinces of Butanyerera and Buhumuza. Conversely, in the southeastern provinces of Bujumbura and Burunga, the predictions aligned with the observed trend until October 2021 but started decreasing in 2022. In Gitega province, located in the central part of the country, the cases predicted by the multivariate LSTM model were significantly lower than the observed cases. By contrast, the univariate LSTM model demonstrated the best precision, with the predicted curve following the trend at the country and province levels. The model predictions were accompanied by a 95% confidence interval, indicating that, if the process were repeated, 95% of malaria cases would fall within the range defined by the lower and upper bounds, as shown in <xref ref-type="fig" rid="figFigures 4-15">
     Figures 4-15
    </xref>. This predictive model proved valuable in forecasting potential outbreaks. However, it is crucial to emphasize the importance of up-to-date information gathering and sharing, as the accuracy of the models relies on the most current data. <xref ref-type="fig" rid="fig14">
     Figure 14
    </xref> and <xref ref-type="fig" rid="fig15">
     Figure 15
    </xref> illustrate that the univariate LSTM results capture the overall trends of malaria cases and provide more precise estimates at the province level. The univariate model outperformed the multivariate model, which aligns with previous research findings <xref ref-type="bibr" rid="scirp.135419-25">
     [25]
    </xref> suggesting that the univariate model tends to excel in short-term predictions, while multivariate models perform better with longer prediction horizons. This could be because meteorological conditions may not directly influence the outbreak but, over time, can affect the environmental factors that contribute to it. The consistent trends observed across the models, particularly at the country level, confirm the significant influence of climate conditions on malaria outbreaks within the country over the long term. Combining the models could improve prediction accuracy at different time steps, as noted in <xref ref-type="bibr" rid="scirp.135419-26">
     [26]
    </xref>. The confidence interval for malaria cases used in this study enables researchers, policymakers, and healthcare professionals to assess plausible value ranges and make informed decisions based on the estimated range. It quantifies the uncertainty associated with the estimate and provides a measure of the precision of the findings. Furthermore, investigating the effects of climate change on disease outbreaks extends beyond the dynamics of malaria transmission. This study integrated meteorological factors with historical data such as human population and total monthly malaria cases. Future work could focus on daily predictions using multivariate models to explore the possible impacts of climatic conditions, as previous studies have demonstrated the superiority of multivariate models in very short-term predictions <xref ref-type="bibr" rid="scirp.135419-27">
     [27]
    </xref>. Although artificial neural networks are often perceived as black boxes, the results obtained highlight the importance of meteorological data in outbreak prediction, particularly at the country level, where the highest number of cases was predicted. The use of deep learning models such as recurrent neural networks shows promise in predicting malaria outbreaks, which continue to pose challenges in sub-Saharan Africa. Expanding the application of artificial intelligence will facilitate collaboration among different intervention teams in the healthcare system, especially in responding to predicted increases or decreases in malaria cases.</p>
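   <p>The difference between the univariate and multivariate inputs compared in this discussion can be illustrated with a small sliding-window sketch. It is an assumption-laden example, not the preprocessing used in the study: the helper make_windows, the window length of three months, the choice of rainfall as the extra covariate, and all numeric values are hypothetical.</p>
   <preformat>
```python
def make_windows(series, n_steps):
    """Split a sequence of monthly feature rows into (window, next target) pairs.

    Each element of `series` is a list of features for one month; the malaria
    case count is assumed to be the first feature and is used as the target.
    """
    inputs, targets = [], []
    for start in range(len(series) - n_steps):
        inputs.append(series[start:start + n_steps])
        targets.append(series[start + n_steps][0])
    return inputs, targets


# Illustrative values only.
cases = [10, 12, 15, 11, 9]
rainfall = [100.0, 80.0, 120.0, 90.0, 60.0]

# Univariate: each month carries only the past case count.
univariate = [[c] for c in cases]
# Multivariate: each month carries the case count plus a climate covariate.
multivariate = [[c, r] for c, r in zip(cases, rainfall)]

X_uni, y_uni = make_windows(univariate, n_steps=3)
X_multi, y_multi = make_windows(multivariate, n_steps=3)
```
   </preformat>
   <p>Both variants predict the same target (next-month cases); they differ only in how many features each time step in the input window carries.</p>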
  </sec><sec id="s5">
   <title>Declarations</title>
   <p>National data was used for this study.</p>
    <p>A preprint of this work has previously been published (arXiv: 2306.02685) <xref ref-type="bibr" rid="scirp.135419-28">
     [28]
    </xref>.</p>
  </sec>
 </body><back>
  <ref-list>
   <title>References</title>
   <ref id="scirp.135419-ref1">
    <label>1</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Sajana, T. and Narasingarao, M. (2017) Machine Learning Techniques for Malaria Disease Diagnosis—A Review. Journal of Advanced Research in Dynamical and Control Systems, 9, 349-369.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref2">
    <label>2</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Medecins Sans Frontieres (2017) More about Malaria in Burundi.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref3">
    <label>3</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     World Health Organization (WHO) (2022) World Malaria Report 2021.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref4">
    <label>4</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     World Health Organization (WHO) (2017) World Malaria Report 2016.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref5">
    <label>5</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     World Health Organization (WHO) (2018) World Malaria Report 2017.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref6">
    <label>6</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Institute for Health Metrics and Evaluation (2020) Health Metrics for Burundi. http://www.healthdata.org/burundi
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref7">
    <label>7</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     World Health Organization (WHO)/Regional Office for Africa (2017) Weekly Bulletin on Outbreaks and Other Emergencies: Week 27: 1-7 July 2017.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref8">
    <label>8</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     United Nations Children's Fund (UNICEF)/Burundi (2017) UNICEF Burundi Humanitarian Situation Report—31 March 2017.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref9">
    <label>9</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Lok, P. and Dijk, S. (2019) Malaria Outbreak in Burundi Reaches Epidemic Levels with 5.7 Million Infected This Year. BMJ, 366, l5104. https://doi.org/10.1136/bmj.l5104
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref10">
    <label>10</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     World Vision (2017) Eight Facts about Burundi’s Malaria Epidemic.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref11">
    <label>11</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     USAID (2022) Burundi Malaria Operational Plan Fiscal Year 2021.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref12">
    <label>12</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Wiemken, T.L. and Kelley, R.R. (2020) Machine Learning in Epidemiology and Health Outcomes Research. Annual Review of Public Health, 41, 21-36. https://doi.org/10.1146/annurev-publhealth-040119-094437
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref13">
    <label>13</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Larrañaga, P., Calvo, B., Santana, R., Bielza, C., Galdiano, J., Inza, I., et al. (2006) Machine Learning in Bioinformatics. Briefings in Bioinformatics, 7, 86-112. https://doi.org/10.1093/bib/bbk007
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref14">
    <label>14</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Chekol, B.E. and Hagras, H. (2018) Employing Machine Learning Techniques for the Malaria Epidemic Prediction in Ethiopia. 2018 10th Computer Science and Electronic Engineering (CEEC), Colchester, 19-21 September 2018, 89-94. https://doi.org/10.1109/ceec.2018.8674210
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref15">
    <label>15</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Masinde, M. (2020) Africa’s Malaria Epidemic Predictor: Application of Machine Learning on Malaria Incidence and Climate Data. Proceedings of the 2020 4th International Conference on Compute and Data Analysis, Silicon, 9-12 March 2020, 29-37. https://doi.org/10.1145/3388142.3388158
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref16">
    <label>16</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Nkiruka, O., Prasad, R. and Clement, O. (2021) Prediction of Malaria Incidence Using Climate Variability and Machine Learning. Informatics in Medicine Unlocked, 22, Article ID: 100508. https://doi.org/10.1016/j.imu.2020.100508
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref17">
    <label>17</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Sajana, T. and Narasingarao, M. (2018) An Ensemble Framework for Classification of Malaria Disease. ARPN Journal of Engineering and Applied Sciences, 13, 3299-3307.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref18">
    <label>18</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Li, D. and Huang, X. (2020) An Improved Deep Learning Model for Predicting DNA Sequence Function. Intelligent Information Management, 12, 36-42. https://doi.org/10.4236/iim.2020.121003
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref19">
    <label>19</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Mfisimana, L.D., Nibayisabe, E., Badu, K. and Niyukuri, D. (2022) Exploring Predictive Frameworks for Malaria in Burundi. Infectious Disease Modelling, 7, 33-44. https://doi.org/10.1016/j.idm.2022.03.003
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref20">
    <label>20</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Python (2018) Python Version 3.6.5.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref21">
    <label>21</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Stekhoven, D.J. (2015) Missforest: Nonparametric Missing Value Imputation Using Random Forest. Astrophysics Source Code Library.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref22">
    <label>22</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Gouvernement du Burundi (2023) Loi organique N° 1/05 du 16 mars 2023 portant détermination et délimitation des provinces, des communes, des zones, des collines ou quartiers de la République du Burundi, Gitega.
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref23">
    <label>23</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Hochreiter, S. and Schmidhuber, J. (1997) Long Short-Term Memory. Neural Computation, 9, 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref24">
    <label>24</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Pathan, S.M.K. and Imran, S.B. (2024) Integrated Machine Learning and Deep Learning Models for Cardiovascular Disease Risk Prediction: A Comprehensive Comparative Study. Journal of Intelligent Learning Systems and Applications, 16, 12-22. https://doi.org/10.4236/jilsa.2024.161002
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref25">
    <label>25</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Chayama, M. and Hirata, Y. (2016) When Univariate Model-Free Time Series Prediction Is Better than Multivariate. Physics Letters A, 380, 2359-2365. https://doi.org/10.1016/j.physleta.2016.05.027
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref26">
    <label>26</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Salehi, S., Kavgic, M., Bonakdari, H. and Begnoche, L. (2024) Comparative Study of Univariate and Multivariate Strategy for Short-Term Forecasting of Heat Demand Density: Exploring Single and Hybrid Deep Learning Models. Energy and AI, 16, Article ID: 100343. https://doi.org/10.1016/j.egyai.2024.100343
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref27">
    <label>27</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Mandal, A.K., Sen, R., Goswami, S. and Chakraborty, B. (2021) Comparative Study of Univariate and Multivariate Long Short-Term Memory for Very Short-Term Forecasting of Global Horizontal Irradiance. Symmetry, 13, Article 1544. https://doi.org/10.3390/sym13081544
    </mixed-citation>
   </ref>
   <ref id="scirp.135419-ref28">
    <label>28</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Sakubu, D., Sinigirira, K.J.G. and Niyukuri, D. (2023) Predicting Malaria Dynamics in Burundi Using Deep Learning Models. arXiv: 2306.02685
    </mixed-citation>
   </ref>
  </ref-list>
 </back>
</article>