<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research-article">
 <front>
  <journal-meta>
   <journal-id journal-id-type="publisher-id">
    jdaip
   </journal-id>
   <journal-title-group>
    <journal-title>
     Journal of Data Analysis and Information Processing
    </journal-title>
   </journal-title-group>
   <issn pub-type="epub">
    2327-7211
   </issn>
   <issn pub-type="ppub">
    2327-7203
   </issn>
   <publisher>
    <publisher-name>
     Scientific Research Publishing
    </publisher-name>
   </publisher>
  </journal-meta>
  <article-meta>
   <article-id pub-id-type="doi">
    10.4236/jdaip.2024.123019
   </article-id>
   <article-id pub-id-type="publisher-id">
    jdaip-134301
   </article-id>
   <article-categories>
    <subj-group subj-group-type="heading">
     <subject>
      Articles
     </subject>
    </subj-group>
    <subj-group subj-group-type="Discipline-v2">
     <subject>
      Computer Science 
     </subject>
     <subject>
       Communications, Physics 
     </subject>
     <subject>
       Mathematics
     </subject>
    </subj-group>
   </article-categories>
   <title-group>
    <article-title>A Novel Approach for Developing a Linear Regression Model within Logistic Cluster Using Scikit-Learn</article-title>
   </title-group>
   <contrib-group>
    <contrib contrib-type="author" xlink:type="simple">
     <name name-style="western">
      <surname>
       Nwosu
      </surname>
      <given-names>
       Ambrose
      </given-names>
     </name> 
     <xref ref-type="aff" rid="aff1"> 
      <sup>1</sup>
     </xref>
    </contrib>
    <contrib contrib-type="author" xlink:type="simple">
     <name name-style="western">
      <surname>
       Aimufua
      </surname>
      <given-names>
       Gilbert I. O.
      </given-names>
     </name> 
     <xref ref-type="aff" rid="aff1"> 
      <sup>1</sup>
     </xref>
    </contrib>
    <contrib contrib-type="author" xlink:type="simple">
     <name name-style="western">
      <surname>
       Choji Davou
      </surname>
      <given-names>
       Nyap
      </given-names>
     </name> 
     <xref ref-type="aff" rid="aff2"> 
      <sup>2</sup>
     </xref>
    </contrib>
   </contrib-group> 
   <aff id="aff1">
    <addr-line>
     Department of Computer Science, Nasarawa State University, Keffi, Nigeria
    </addr-line> 
   </aff> 
   <aff id="aff2">
    <addr-line>
     Department of Computer Science, University of Jos, Jos, Nigeria
    </addr-line> 
   </aff> 
   <pub-date pub-type="epub">
    <day>
     13
    </day> 
    <month>
     06
    </month>
    <year>
     2024
    </year>
   </pub-date> 
   <volume>
    12
   </volume> 
   <issue>
    03
   </issue>
   <fpage>
    348
   </fpage>
   <lpage>
    369
   </lpage>
   <history>
    <date date-type="received">
     <day>
      1
     </day>
     <month>
      January
     </month>
     <year>
      2024
     </year>
    </date>
    <date date-type="published">
     <day>
      25
     </day>
     <month>
      January
     </month>
     <year>
      2024
     </year> 
    </date> 
    <date date-type="accepted">
     <day>
      25
     </day>
     <month>
      June
     </month>
     <year>
      2024
     </year> 
    </date>
   </history>
   <permissions>
    <copyright-statement>
     © Copyright 2024 by authors and Scientific Research Publishing Inc. 
    </copyright-statement>
    <copyright-year>
     2024
    </copyright-year>
    <license>
     <license-p>
      This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/
     </license-p>
    </license>
   </permissions>
   <abstract>
    <p>The rapid development of the logistics industry has driven transportation costs upward, and identifying trends in transportation activity can positively inform investment in transportation infrastructure. However, the literature and data-driven analysis on trends in transportation modes remain limited. This study examines the operational challenges of vehicle performance management within logistics clusters, a critical aspect of efficient supply chain operations, and addresses the difficulties logistics organizations face in optimizing the performance of their vehicle fleets. Its core is a regression-based predictive logistics model for forecasting and evaluating vehicle performance in logistics clusters. The work covers a literature review, the research methodology, data sources, variables, feature engineering, and model training and evaluation; an F-test analysis was carried out to identify and verify the relationships between the attributes and the target variable. The model achieved a low mean squared error (MSE) of 3.42, indicating accurate prediction of the performance metrics, and a high R-squared (R<sup>2</sup>) score of 0.921, indicating that it captures the relationships between the input characteristics and the performance metrics. The training and testing accuracy further attest to its reliability and generalization capability. The regression-based model offers the logistics industry a practical basis for informed decisions on resource allocation, maintenance planning, and delivery route optimization, which contributes to better overall logistics performance and customer service. By addressing performance gaps and embracing modern logistics technologies, the study supports the ongoing evolution of vehicle performance management in logistics clusters and fosters greater competitiveness and sustainability in the logistics sector.</p>
   </abstract>
   <kwd-group> 
    <kwd>
     Mean Squared Error
    </kwd> 
    <kwd>
      R
     <sup>2</sup> Score
    </kwd> 
    <kwd>
      F-Test
    </kwd> 
    <kwd>
      MSE
    </kwd>
   </kwd-group>
  </article-meta>
 </front>
 <body>
  <sec id="s1">
   <title>1. Introduction</title>
   <p>The logistics industry, which comprises transportation, warehousing, and distribution, is a crucial part of the global economy. The smooth flow of goods from manufacturers to consumers is essential in this industry, and transportation plays a key role in it. Efficient vehicle performance is crucial for logistics clusters, which are strategically located geographic areas where logistics activity is concentrated. These clusters provide advantages such as increased connectivity, economies of scale, and access to resources and infrastructure. Creating precise predictive models of vehicle performance in logistics clusters requires extensive historical data. Accurate vehicle performance predictions support the optimisation of delivery routes, resource allocation, proactive maintenance, customer service, sustainability, cost reduction, real-time adaptation, and competitiveness <xref ref-type="bibr" rid="scirp.134301-1">
     [1]
    </xref>. By leveraging predictive models, logistics companies can optimise routes, allocate resources effectively, enhance customer service, and reduce operational costs, securing a competitive edge in the ever-evolving logistics industry <xref ref-type="bibr" rid="scirp.134301-2">
     [2]
    </xref>. For example, a data-driven approach to route optimisation that incorporates real-time traffic data to enhance delivery performance was proposed in <xref ref-type="bibr" rid="scirp.134301-3">
     [3]
    </xref>. Researchers have thus highlighted the significance of data-driven approaches in improving logistics operations. However, previous studies did not apply L2 (ridge) regularization, nor did they carry out an F-test to examine the relationships among the variables in the data.</p>
   <sec id="s1_1">
    <title>The Aim of This Work</title>
    <p>The primary aim of this study is to develop a predictive logistics model, based on linear regression, that accurately forecasts vehicle performance within logistics clusters. The model is intended to improve resource allocation, maintenance planning, delivery routing, and overall logistics efficiency.</p>
     <sec id="s2">
     <title>2. Scikit-Learn Regression</title>
      <p>Scikit-learn, a widely used Python library for machine learning, provides a quick and efficient way to implement regression <xref ref-type="bibr" rid="scirp.134301-4">
        [4]
       </xref>. It is trusted for its robustness and its comprehensive set of tools for model development, offering efficient implementations of classification, regression, clustering, and dimensionality reduction through a Python interface. This makes it straightforward to apply machine learning (ML) algorithms, such as linear regression, to predictive data analysis.</p>
     <p>Linear regression is defined as the process of determining the straight line that best fits a set of dispersed data points <xref ref-type="bibr" rid="scirp.134301-5">
       [5]
      </xref>.</p>
     <p>Key Steps in Scikit-learn Regression:</p>
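      <p>As an illustration only, one common sequence of steps in a scikit-learn regression workflow (create the estimator, fit it, inspect the learned parameters, and predict) might be sketched as follows. The synthetic data and the true coefficients used to generate it are assumptions made for this sketch and are not taken from the study's dataset.</p>
      <preformat>
```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data (an assumption for illustration):
# y = 3 + 2*x1 - 1*x2
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(50, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]

# Step 1: create the estimator. Step 2: fit it to the data.
model = LinearRegression()
model.fit(X, y)

# Step 3: inspect the learned intercept and coefficients.
intercept = model.intercept_
coefficients = model.coef_

# Step 4: predict the target for a new, unseen input.
prediction = model.predict(np.array([[1.0, 1.0]]))[0]
print(intercept, coefficients, prediction)
```
      </preformat>
      <p>Because the synthetic data contain no noise, the fitted intercept and coefficients should match the generating values up to floating-point error; on real data they would only approximate the underlying relationship.</p>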
    </sec>
    <sec id="s2_2">
     <title>2.1. Custom Mathematical Models</title>
      <p>In addition to scikit-learn’s regression tools, custom mathematical models were used to gain deeper insight into the mathematical principles underlying regression and to explore Ridge Regression, a technique that adds regularization to the regression model.</p>
      <p>The mathematical model of linear regression was used to predict the city-mpg and highway-mpg of a car.</p>
      <p>The model is given by:</p>
     <p>
      <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
        <mi>
          h 
        </mi> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mi>
            θ 
          </mi> 
          <mo>
            , 
          </mo> 
          <mi>
            x 
          </mi> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
        <mo>
          = 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           0 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           1 
         </mn> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mn>
           1 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           2 
         </mn> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mn>
           2 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <mo>
          ⋯ 
        </mo> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mi>
           n 
         </mi> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mi>
           n 
         </mi> 
        </msub> 
       </mrow> 
      </math> (1)</p>
     <p>- h (θ, x): The predicted value for input x using the regression model.</p>
     <p>- θ<sub>0</sub>: Intercept term.</p>
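      <p>- θ<sub>1</sub>x<sub>1</sub>: Contribution of the first independent variable x<sub>1</sub>, weighted by its regression coefficient θ<sub>1</sub>.</p>
      <p>- θ<sub>n</sub>x<sub>n</sub>: Contribution of the last independent variable x<sub>n</sub>, weighted by its regression coefficient θ<sub>n</sub>.</p>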
     <p>- θ<sub>1</sub>, θ<sub>2</sub>, …, θ<sub>n</sub>: Coefficients for the features x<sub>1</sub>, x<sub>2</sub>, …, x<sub>n</sub>.</p>
     <p>- x<sub>1</sub>, x<sub>2</sub>, …, x<sub>n</sub>: Features of the input data.</p>
    </sec>
    <sec id="s2_3">
     <title>2.2. Key Steps in Custom Mathematical Models</title>
    </sec>
    <sec id="s2_4">
     <title>2.3. Hypothesis Function: Multiple Regression</title>
     <p>
      <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
        <mi>
          h 
        </mi> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mi>
            θ 
          </mi> 
          <mo>
            , 
          </mo> 
          <mi>
            x 
          </mi> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
        <mo>
          = 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           0 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           1 
         </mn> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mn>
           1 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           2 
         </mn> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mn>
           2 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <mo>
          ⋯ 
        </mo> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mi>
           n 
         </mi> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mi>
           n 
         </mi> 
        </msub> 
       </mrow> 
      </math> (2)</p>
     <p>- h (θ, x): The predicted value for input x using the regression model.</p>
     <p>- θ<sub>0</sub>: Intercept term.</p>
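      <p>- θ<sub>1</sub>x<sub>1</sub>: Contribution of the first independent variable x<sub>1</sub>, weighted by its regression coefficient θ<sub>1</sub>.</p>
      <p>- θ<sub>n</sub>x<sub>n</sub>: Contribution of the last independent variable x<sub>n</sub>, weighted by its regression coefficient θ<sub>n</sub>.</p>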
     <p>- θ<sub>1</sub>, θ<sub>2</sub>, …, θ<sub>n</sub>: Coefficients for the features x<sub>1</sub>, x<sub>2</sub>, …, x<sub>n</sub>.</p>
     <p>- x<sub>1</sub>, x<sub>2</sub>, …, x<sub>n</sub>: Features of the input data.</p>
    </sec>
    <sec id="s2_5">
     <title>2.4. Cost Function (Mean Squared Error)</title>
     <p>
      <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
        <mi>
          J 
        </mi> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mi>
           θ 
         </mi> 
         <mo>
           ) 
         </mo> 
        </mrow> 
        <mo>
          = 
        </mo> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mrow> 
           <mn>
             1 
           </mn> 
           <mo>
             / 
           </mo> 
           <mrow> 
            <mn>
              2 
            </mn> 
            <mi>
              m 
            </mi> 
           </mrow> 
          </mrow> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
        <mstyle displaystyle="true"> 
         <msub> 
          <mo>
            ∑ 
          </mo> 
          <mi>
            i 
          </mi> 
         </msub> 
         <mrow> 
          <msup> 
           <mrow> 
            <mrow> 
             <mo>
               ( 
             </mo> 
             <mrow> 
              <mi>
                h 
              </mi> 
              <mrow> 
               <mo>
                 ( 
               </mo> 
               <mrow> 
                <mi>
                  θ 
                </mi> 
                <mo>
                  , 
                </mo> 
                <msup> 
                 <mi>
                   x 
                 </mi> 
                 <mi>
                   i 
                 </mi> 
                </msup> 
               </mrow> 
               <mo>
                 ) 
               </mo> 
              </mrow> 
              <mo>
                − 
              </mo> 
              <msup> 
               <mi>
                 y 
               </mi> 
               <mi>
                 i 
               </mi> 
              </msup> 
             </mrow> 
             <mo>
               ) 
             </mo> 
            </mrow> 
           </mrow> 
           <mn>
             2 
           </mn> 
          </msup> 
         </mrow> 
        </mstyle> 
       </mrow> 
      </math> (3)</p>
     <p>- J(θ): The cost function that measures the error of the model’s predictions.</p>
     <p>- m: The number of training examples.</p>
     <p>- x<sup>i</sup>: The feature vector of the i-th training example.</p>
     <p>- y<sup>i</sup>: The actual target value of the i-th training example.</p>
     <p>- h (θ, x<sup>i</sup>): The predicted value for the i-th training example using the hypothesis function.</p>
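      <p>The hypothesis function and the cost function of Equation (3) can be sketched in NumPy as follows. The function names and the three-example dataset are hypothetical and serve only to show how the formula is evaluated.</p>
      <preformat>
```python
import numpy as np

def hypothesis(theta, X):
    """h(theta, x) = theta_0 + theta_1*x_1 + ... + theta_n*x_n for each row of X."""
    return theta[0] + X @ theta[1:]

def cost(theta, X, y):
    """J(theta) = (1 / (2m)) * sum_i (h(theta, x^i) - y^i)^2."""
    m = len(y)
    residuals = hypothesis(theta, X) - y
    return float(residuals @ residuals) / (2 * m)

# Tiny made-up dataset with m = 3 examples and n = 1 feature.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])

perfect = cost(np.array([0.0, 2.0]), X, y)  # predictions equal y exactly
offset = cost(np.array([1.0, 2.0]), X, y)   # every prediction is too high by 1
print(perfect, offset)
```
      </preformat>
      <p>With θ = (0, 2) the predictions reproduce y and the cost is zero; shifting the intercept to 1 makes every residual equal to 1, so the cost becomes 3/(2·3) = 0.5.</p>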
    </sec>
    <sec id="s2_6">
     <title>2.5. Parameter Estimation</title>
     <p>- The parameters (θ<sub>0</sub>, θ<sub>1</sub>, …, θ<sub>n</sub>) of the regression model are estimated by minimizing the cost function J(θ).</p>
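      <p>The text does not state which minimization procedure was used. One standard option, shown here purely as an assumption, is to solve the least-squares problem directly; the helper name and the small dataset below are hypothetical.</p>
      <preformat>
```python
import numpy as np

def fit_least_squares(X, y):
    """Return theta = (theta_0, ..., theta_n) that minimizes the squared-error cost."""
    m = X.shape[0]
    # Prepend a column of ones so that theta_0 acts as the intercept.
    design = np.hstack([np.ones((m, 1)), X])
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return theta

# Made-up, noise-free data generated from y = 1 + 2*x1 + 0.5*x2.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0], [4.0, 3.0]])
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1]

theta = fit_least_squares(X, y)
print(theta)
```
      </preformat>
      <p>An iterative method such as gradient descent would minimize the same J(θ); the direct solve is used here only because it is compact.</p>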
    </sec>
    <sec id="s2_7">
     <title>2.6. Ridge Regression Model</title>
     <p>1) Hypothesis Function for Ridge Regression:</p>
     <p>
      <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
        <mi>
          h 
        </mi> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mi>
            θ 
          </mi> 
          <mo>
            , 
          </mo> 
          <mi>
            x 
          </mi> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
        <mo>
          = 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           0 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           1 
         </mn> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mn>
           1 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mn>
           2 
         </mn> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mn>
           2 
         </mn> 
        </msub> 
        <mo>
          + 
        </mo> 
        <mo>
          ⋯ 
        </mo> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           θ 
         </mi> 
         <mi>
           n 
         </mi> 
        </msub> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mi>
           n 
         </mi> 
        </msub> 
       </mrow> 
      </math> (4)</p>
     <p>- h (θ, x): The predicted value for input x using the Ridge regression model.</p>
     <p>- θ<sub>0</sub>: Intercept term.</p>
     <p>- θ<sub>1</sub>, θ<sub>2</sub>, …, θ<sub>n</sub>: Coefficients for the features x<sub>1</sub>, x<sub>2</sub>, …, x<sub>n</sub>.</p>
     <p>- x<sub>1</sub>, x<sub>2</sub>, …, x<sub>n</sub>: Features of the input data.</p>
     <p>2) Cost Function (Ridge Regression):</p>
     <p>
      <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
        <mi>
          J 
        </mi> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mi>
           θ 
         </mi> 
         <mo>
           ) 
         </mo> 
        </mrow> 
        <mo>
          = 
        </mo> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mrow> 
           <mn>
             1 
           </mn> 
           <mo>
             / 
           </mo> 
           <mrow> 
            <mn>
              2 
            </mn> 
            <mi>
              m 
            </mi> 
           </mrow> 
          </mrow> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
        <mstyle displaystyle="true"> 
         <msub> 
          <mo>
            ∑ 
          </mo> 
          <mi>
            i 
          </mi> 
         </msub> 
         <mrow> 
          <msup> 
           <mrow> 
            <mrow> 
             <mo>
               ( 
             </mo> 
             <mrow> 
              <mi>
                h 
              </mi> 
              <mrow> 
               <mo>
                 ( 
               </mo> 
               <mrow> 
                <mi>
                  θ 
                </mi> 
                <mo>
                  , 
                </mo> 
                <msup> 
                 <mi>
                   x 
                 </mi> 
                 <mi>
                   i 
                 </mi> 
                </msup> 
               </mrow> 
               <mo>
                 ) 
               </mo> 
              </mrow> 
              <mo>
                − 
              </mo> 
              <msup> 
               <mi>
                 y 
               </mi> 
               <mi>
                 i 
               </mi> 
              </msup> 
             </mrow> 
             <mo>
               ) 
             </mo> 
            </mrow> 
           </mrow> 
           <mn>
             2 
           </mn> 
          </msup> 
         </mrow> 
        </mstyle> 
        <mo>
          + 
        </mo> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mrow> 
           <mi>
             λ 
           </mi> 
           <mo>
             / 
           </mo> 
           <mrow> 
            <mn>
              2 
            </mn> 
            <mi>
              m 
            </mi> 
           </mrow> 
          </mrow> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
        <mstyle displaystyle="true"> 
         <msub> 
          <mo>
            ∑ 
          </mo> 
          <mi>
            j 
          </mi> 
         </msub> 
         <mrow> 
          <msubsup> 
           <mi>
             θ 
           </mi> 
           <mi>
             j 
           </mi> 
           <mn>
             2 
           </mn> 
          </msubsup> 
         </mrow> 
        </mstyle> 
       </mrow> 
      </math> (5)</p>
     <p>- J(θ): The cost function that measures the error of the model’s predictions.</p>
     <p>- m: The number of training examples.</p>
     <p>- x<sup>i</sup>: The feature vector of the i-th training example.</p>
     <p>- y<sup>i</sup>: The actual target value of the i-th training example.</p>
     <p>- h (θ, x<sup>i</sup>): The predicted value for the i-th training example using the hypothesis function.</p>
      <p>- λ: The regularization parameter (called alpha in scikit-learn); larger values shrink the coefficients more strongly.</p>
      <p>- θ<sub>j</sub>: The coefficient of the feature x<sub>j</sub>.</p>
     <p>3) Parameter Estimation for Ridge Regression:</p>
     <p>- The parameters (θ<sub>0</sub>, θ<sub>1</sub>, …, θ<sub>n</sub>) of the Ridge regression model are estimated by minimizing the cost function J(θ) while also considering the regularization term.</p>
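      <p>A minimal sketch of ridge parameter estimation consistent with Equation (5) is given below. It assumes that the penalty sum runs over j = 1, …, n, so that the intercept θ<sub>0</sub> is not penalized; the text does not specify this, and the helper name and data are hypothetical. Setting the gradient of J(θ) to zero gives a linear system that can be solved directly.</p>
      <preformat>
```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize (1/(2m)) * sum of squared residuals + (lam/(2m)) * sum_j theta_j^2."""
    m, n = X.shape
    design = np.hstack([np.ones((m, 1)), X])
    penalty = np.eye(n + 1)
    penalty[0, 0] = 0.0  # assumption: the intercept theta_0 is not penalized
    # Zero gradient of J(theta): (D^T D + lam * P) theta = D^T y
    return np.linalg.solve(design.T @ design + lam * penalty, design.T @ y)

# Made-up data lying exactly on the line y = 1 + 2x.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

theta_ols = fit_ridge(X, y, lam=0.0)    # no regularization
theta_ridge = fit_ridge(X, y, lam=5.0)  # the slope is shrunk toward zero
print(theta_ols, theta_ridge)
```
      </preformat>
      <p>With λ = 0 the ordinary least-squares solution (1, 2) is recovered, while λ = 5 shrinks the slope from 2 to 1. Because both terms of Equation (5) share the factor 1/(2m), the minimizer should coincide with that of scikit-learn's Ridge estimator with alpha set to λ, under the same unpenalized-intercept assumption.</p>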
    </sec>
   </sec>
   <sec id="s3">
    <title>3. The Approach</title>
    <p>The development of the regression-based predictive model for vehicle performance in logistics clusters, implemented with scikit-learn, begins with feature engineering, as shown in <xref ref-type="fig" rid="fig1">
      Figure 1
     </xref>. Relevant features or variables are selected and transformed to enhance the model’s performance and predictive ability. Feature engineering aims to extract the most informative and distinctive features from the gathered data to increase the predictive model’s accuracy and efficacy <xref ref-type="bibr" rid="scirp.134301-7">
      [7]
     </xref>.</p>
    <fig id="fig1" position="float">
     <label>Figure 1</label>
     <caption>
      <title>Figure 1. Development of the regression model.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId22.jpeg?20240703094240" />
    </fig>
    <sec id="s3_1">
     <title>3.1. Data Splitting</title>
     <p>Data splitting and regression model training are two crucial steps in the development of the logistics clusters’ predictive model for vehicle performance as shown in <xref ref-type="fig" rid="fig2">
       Figure 2
      </xref>. While the training set is used to create the model, the testing set is used to evaluate the performance and generalizability of the predictive model to new, untested data.</p>
     <fig id="fig2" position="float">
      <label>Figure 2</label>
      <caption>
       <title>Figure 2. Training data/validation/test.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId23.jpeg?20240703094241" />
     </fig>
     <p>After data splitting, the regression model is fitted to the training data. During fitting, the model learns the coefficient (weight) of each feature so that it can forecast vehicle performance from the input variables. The model is then evaluated on the testing set using appropriate performance metrics, such as the mean absolute error (MAE), the root mean square error (RMSE), and the coefficient of determination (R-squared).</p>
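     <p>The split, fit, and evaluate sequence described above might look like the following sketch. The synthetic data, the 80/20 split ratio, and the random seeds are assumptions chosen for illustration; the study's actual settings are not stated here.</p>
     <preformat>
```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the vehicle data (an assumption for illustration).
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = 30.0 - 10.0 * X[:, 0] + 5.0 * X[:, 1] + rng.normal(0.0, 0.5, size=200)

# Hold out 20% of the rows for testing (the ratio is an assumed value).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit on the training set only, then predict on the unseen test set.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
r2 = r2_score(y_test, y_pred)
print(mae, rmse, r2)
```
     </preformat>
     <p>Evaluating only on the held-out rows is what makes the reported metrics an estimate of performance on new data rather than a measure of fit to the training set.</p>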
    </sec>
    <sec id="s3_2">
     <title>3.2. Data Preparation</title>
     <p>Data preparation is a meticulous process aimed at ensuring that the dataset is ready for modelling and analysis as shown in <xref ref-type="fig" rid="fig3">
       Figure 3
      </xref>. It encompasses various tasks, such as data cleaning, feature engineering, and encoding, all of which contribute to the dataset’s quality and suitability for predictive modelling.</p>
     <fig id="fig3" position="float">
      <label>Figure 3</label>
      <caption>
       <title>Figure 3. Data preview.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId24.jpeg?20240703094242" />
     </fig>
     <p>The data preparation phase begins with the handling of missing or erroneous values in the dataset, as shown in <xref ref-type="fig" rid="fig4">
       Figure 4
      </xref>. Notably, entries recorded as a question mark (“?”) are treated as missing data and are replaced with appropriate values. This step preserves the integrity of the dataset and sets the stage for accurate modeling.</p>
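     <p>A minimal pandas sketch of this cleaning step is shown below. The column names, the sample values, and the choice of mean and mode as replacement values are assumptions; the text does not specify which replacement values were used.</p>
     <preformat>
```python
import numpy as np
import pandas as pd

# Hypothetical sample in which "?" marks a missing entry.
df = pd.DataFrame({
    "horsepower": ["111", "?", "154", "102"],
    "num-of-doors": ["two", "four", "?", "four"],
})

# Turn the "?" placeholders into proper missing values.
df = df.replace("?", np.nan)

# Numeric column: convert from text, then fill gaps with the column mean.
df["horsepower"] = pd.to_numeric(df["horsepower"])
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].mean())

# Categorical column: fill gaps with the most frequent value.
df["num-of-doors"] = df["num-of-doors"].fillna(df["num-of-doors"].mode()[0])
print(df)
```
     </preformat>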
     <fig id="fig4" position="float">
      <label>Figure 4</label>
      <caption>
       <title>Figure 4. Data cleaning.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId25.jpeg?20240703094242" />
     </fig>
     <p>Additionally, categorical data is encoded to convert non-numeric attributes into numerical representations as shown in <xref ref-type="fig" rid="fig5">
       Figure 5
      </xref>. The Ordinal Encoder is utilized for this purpose, facilitating the transformation of categorical variables into a format compatible with machine learning algorithms.</p>
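     <p>The encoding step can be illustrated with scikit-learn's OrdinalEncoder as follows. The column name and category values are hypothetical. By default the encoder assigns integer codes in the sorted order of the categories, so the codes carry no meaning beyond identifying the category.</p>
     <preformat>
```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical categorical column.
df = pd.DataFrame({"body-style": ["sedan", "hatchback", "wagon", "sedan"]})

encoder = OrdinalEncoder()
encoded = encoder.fit_transform(df[["body-style"]])

# Categories learned during fitting, in the order used for the codes.
categories = list(encoder.categories_[0])
codes = encoded[:, 0].tolist()
print(categories, codes)
```
     </preformat>
     <p>Since a linear model treats these codes as ordered numbers, this encoding implicitly imposes an ordering on the categories, which is worth keeping in mind when interpreting the coefficients.</p>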
     <fig id="fig5" position="float">
      <label>Figure 5</label>
      <caption>
       <title>Figure 5. Data imputation.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId26.jpeg?20240703094242" />
     </fig>
     <p>The implementation of data preparation ensures that the dataset is cleaned, transformed, and ready for feature selection and modelling, setting the stage for the subsequent stages of the study. This foundational process contributes to the accuracy and reliability of the predictive model for vehicle performance.</p>
    </sec>
    <sec id="s3_3">
     <title>3.3. Splitting the Data into Train and Test Sets</title>
     <p>To develop an effective predictive model for vehicle performance, the study proceeds to split the dataset into distinct training and testing subsets as shown in <xref ref-type="fig" rid="fig6">
       Figure 6
      </xref>. This division of the data is instrumental in evaluating the model’s performance and generalizability.</p>
     <fig id="fig6" position="float">
      <label>Figure 6</label>
      <caption>
       <title>Figure 6. Data splitting.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId27.jpeg?20240703094243" />
     </fig>
     <p>A pivotal decision in this phase is the selection of the target feature. In this study, our focus is on predicting the vehicle’s performance in terms of miles per gallon (mpg), specifically in two different contexts:</p>
     <p>City-MPG: This metric quantifies a car’s fuel efficiency in urban or city conditions. It reflects how many miles a vehicle can travel on a gallon of fuel within city limits.</p>
     <p>Highway-MPG: This metric measures a car’s fuel efficiency on highways or open roads. It provides insights into a vehicle’s performance during extended highway drives.</p>
     <p>Both “city-mpg” and “highway-mpg” are numerical variables, making them ideal candidates as target features for the study. These metrics offer a comprehensive assessment of a car’s fuel efficiency in different scenarios, contributing to a holistic understanding of its performance characteristics.</p>
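     <p>Separating the two target columns from the features before splitting might be sketched as below. All values are made up, the feature names other than the two targets are hypothetical, and the 70/30 ratio is an assumed setting.</p>
     <preformat>
```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up rows; only "city-mpg" and "highway-mpg" are named in the text.
df = pd.DataFrame({
    "engine-size": [97, 109, 130, 152, 181, 209, 92, 122, 141, 164],
    "curb-weight": [1900, 2300, 2500, 2800, 3100, 3500, 1850, 2400, 2650, 2950],
    "city-mpg": [37, 30, 24, 21, 19, 16, 38, 27, 23, 20],
    "highway-mpg": [41, 37, 30, 27, 25, 22, 43, 33, 29, 26],
})

targets = ["city-mpg", "highway-mpg"]
X = df.drop(columns=targets)  # everything except the targets is a feature
y = df[targets]               # both targets are kept side by side

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)
print(X_train.shape, X_test.shape)
```
     </preformat>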
    </sec>
    <sec id="s3_4">
     <title>3.4. Outlier Removal and Skewness Mitigation</title>
     <p>Data preparation goes beyond the initial splitting of the dataset and feature selection. It also includes removing potential outliers and mitigating skewness in the data, as shown in <xref ref-type="fig" rid="fig7">
       Figure 7
      </xref> and <xref ref-type="fig" rid="fig8">
       Figure 8
      </xref>, both of which are crucial for accurate modeling and analysis.</p>
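     <p>The text does not name the outlier rule or the skewness correction that were applied. As one possible approach, assumed here for illustration, the interquartile-range (IQR) rule can be used to drop outliers and a logarithmic transform to compress large values; the series name and values are made up.</p>
     <preformat>
```python
import numpy as np
import pandas as pd

# Made-up values with one obvious outlier.
values = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 14.0, 95.0], name="price")

# IQR rule: keep points within 1.5 * IQR of the quartiles.
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = values[values.between(lower, upper)]

# Sample skewness before and after removing the outlier.
skew_before = values.skew()
skew_after = cleaned.skew()

# Alternative for skewed data: log(1 + x) compresses large values.
log_values = np.log1p(values)
print(len(cleaned), skew_before, skew_after)
```
     </preformat>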
    </sec>
    <sec id="s3_5">
     <title>3.5. Correlations and Relationships</title>
     <p>By visualizing the relationships between features such as engine size, body style, and city-mpg, the study uncovers patterns and trends that can influence the predictive model, as shown in <xref ref-type="fig" rid="fig9">
       Figure 9
      </xref> and <xref ref-type="fig" rid="fig10">
       Figure 10
      </xref>. These visualizations offer insight into how certain attributes relate to the target features and influence vehicle performance.</p>
     <fig id="fig7" position="float">
      <label>Figure 7</label>
      <caption>
       <title>Figure 7. Before removing outliers.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId28.jpeg?20240703094245" />
     </fig>
     <fig id="fig8" position="float">
      <label>Figure 8</label>
      <caption>
       <title>Figure 8. After removing outliers.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId29.jpeg?20240703094245" />
     </fig>
     <fig id="fig9" position="float">
      <label>Figure 9</label>
      <caption>
       <title>Figure 9. Relation comparing with attributes.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId30.jpeg?20240703094245" />
     </fig>
     <fig id="fig10" position="float">
      <label>Figure 10</label>
      <caption>
       <title>Figure 10. Comparison relation.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId31.jpeg?20240703094245" />
     </fig>
    </sec>
    <sec id="s3_6">
     <title>3.6. Correlation Matrix</title>
     <p>A correlation matrix, shown in <xref ref-type="fig" rid="fig11">
       Figure 11
      </xref>, is constructed to visualize the relationships between the attributes in the dataset. This matrix helps identify which attributes are strongly correlated and which are less influential, making it a valuable tool for understanding the dataset’s interdependencies.</p>
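     <p>As a minimal sketch of how such a matrix can be computed, the example below applies NumPy’s Pearson correlation to three small, hypothetical columns. The values and the choice of columns are illustrative assumptions and are not taken from the study’s dataset.</p>
     <preformat preformat-type="code">
```python
import numpy as np

# Hypothetical toy columns standing in for numeric attributes of the dataset.
engine_size = np.array([97.0, 109.0, 122.0, 136.0, 152.0, 181.0, 209.0])
horsepower = np.array([69.0, 85.0, 92.0, 110.0, 121.0, 152.0, 182.0])
city_mpg = np.array([37.0, 31.0, 29.0, 24.0, 21.0, 19.0, 16.0])

names = ["engine-size", "horsepower", "city-mpg"]
data = np.vstack([engine_size, horsepower, city_mpg])  # one row per attribute

# np.corrcoef treats each row as a variable and returns Pearson correlations.
corr = np.corrcoef(data)

for name, row in zip(names, corr):
    print(name, np.round(row, 2))
```
     </preformat>
     <p>In this toy data the entry for engine size against city-mpg is negative, which is the kind of relationship a correlation heatmap makes visible at a glance.</p>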
     <fig id="fig11" position="float">
      <label>Figure 11</label>
      <caption>
       <title>Figure 11. Correlation matrix.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId32.jpeg?20240703094247" />
     </fig>
    </sec>
   </sec>
   <sec id="s4">
    <title>4. Modelling and Evaluation</title>
    <p>To prepare the dataset for modelling, this paper undertakes the essential step of encoding categorical data as shown in <xref ref-type="fig" rid="fig12">
      Figure 12
     </xref>. Categorical variables, which are non-numeric in nature, need to be transformed into a numerical format to be compatible with machine learning algorithms <xref ref-type="bibr" rid="scirp.134301-8">
      [8]
     </xref>. In this study, the Ordinal Encoder is the chosen method for this task. It ensures that categorical variables are numerically represented while preserving the integrity of the data.</p>
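    <p>The encoding step can be sketched with Scikit-Learn’s <italic>OrdinalEncoder</italic> as follows. The category values are hypothetical placeholders; the columns actually encoded in the study may differ.</p>
    <preformat preformat-type="code">
```python
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical categorical values; the study's actual columns may differ.
rows = [
    ["sedan", "gas"],
    ["hatchback", "diesel"],
    ["wagon", "gas"],
    ["sedan", "diesel"],
]

encoder = OrdinalEncoder()
encoded = encoder.fit_transform(rows)

# Categories are sorted per column and then mapped to 0, 1, 2, ...
print(encoder.categories_)
print(encoded)
```
    </preformat>
    <p>Note that ordinal codes imply an ordering among categories. For purely nominal attributes this ordering is arbitrary, which is worth keeping in mind when the codes feed a linear model.</p>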
    <fig id="fig12" position="float">
     <label>Figure 12</label>
     <caption>
      <title>Figure 12. Encoding data.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId33.jpeg?20240703094248" />
    </fig>
    <sec id="s4_1">
     <title>4.1. Scikit-Learn Implementation</title>
     <p>The study uses Scikit-Learn, a widely used machine learning library, to develop and train a Linear Regression model. This model is applied to the dataset to predict “city-mpg”, as shown in <xref ref-type="fig" rid="fig13">
       Figure 13
      </xref>. The accuracy and performance of this model are evaluated using established metrics, including Mean Squared Error (MSE) and R-squared (R<sup>2</sup>) scores.</p>
     <p>The Scikit-learn model achieved an impressive R<sup>2</sup> score of 0.887. This indicates that the model can explain approximately 88.7% of the variance in vehicle performance based on make characteristics. A higher R<sup>2</sup> score signifies a stronger ability to capture relationships between input features and the target variable.</p>
     <p>The Scikit-learn model exhibited a low Mean Squared Error (MSE) of 2.441 as shown in <xref ref-type="fig" rid="fig14">
       Figure 14
      </xref>. A lower MSE indicates that the model’s predictions are consistently close to the actual performance values. This result reaffirms the accuracy of the Scikit-learn model in forecasting vehicle performance.</p>
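     <p>The workflow described in this subsection can be sketched as follows. The data are synthetic stand-ins generated from an assumed linear relationship, and the split ratio and random seeds are illustrative choices, so the printed scores will not match the values reported here.</p>
     <preformat preformat-type="code">
```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: two numeric features and a noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 25.0 - 4.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Hold out 20 percent of the rows for testing (an assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE: {mse:.3f}, R2: {r2:.3f}")
```
     </preformat>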
     <fig id="fig13" position="float">
      <label>Figure 13</label>
      <caption>
       <title>Figure 13. Regression plot of Sk-Learn regression.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId34.jpeg?20240703094249" />
     </fig>
     <fig id="fig14" position="float">
      <label>Figure 14</label>
      <caption>
       <title>Figure 14. Sk-learn model.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId35.jpeg?20240703094249" />
     </fig>
    </sec>
    <sec id="s4_2">
     <title>4.2. Mathematical Model</title>
     <p>In parallel with the Scikit-Learn implementation, the paper incorporates a mathematical model of Linear Regression to predict “city-mpg” based on the data’s attributes. This mathematical model offers a deeper understanding of the linear relationships between the features and the target variable.</p>
    </sec>
    <sec id="s4_3">
     <title>4.3. Data Normalization</title>
     <p>The first step in this implementation is data normalization. This process scales the data to ensure that each feature contributes equally to the predictions. It is vital for considering the diverse attributes of the dataset in a uniform manner.</p>
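     <p>The text does not state which scaling scheme is used. The sketch below assumes z-score standardization (zero mean, unit variance per feature), with statistics computed on the training data only and reused for the test data. The function name and sample values are hypothetical.</p>
     <preformat preformat-type="code">
```python
import numpy as np

def standardize(train, test):
    """Scale columns to zero mean and unit variance using training statistics."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # constant columns map to zero instead of dividing by zero
    return (train - mean) / std, (test - mean) / std

# Hypothetical feature columns on very different scales.
X_train = np.array([[97.0, 2000.0], [130.0, 2500.0], [181.0, 3100.0], [209.0, 3400.0]])
X_test = np.array([[122.0, 2300.0]])

X_train_s, X_test_s = standardize(X_train, X_test)
print(X_train_s.mean(axis=0))  # approximately 0 in each column
print(X_train_s.std(axis=0))   # approximately 1 in each column
```
     </preformat>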
    </sec>
    <sec id="s4_4">
     <title>4.4. Incorporating All Attributes</title>
     <p>The mathematical model takes into account all attributes present in the dataset. It allows the model to leverage the various features to provide an accurate prediction of “city-mpg.” This inclusive approach is essential for capturing the interdependencies between the attributes <xref ref-type="bibr" rid="scirp.134301-9">
       [9]
      </xref>.</p>
    </sec>
    <sec id="s4_5">
     <title>4.5. Adding an Intercept Term</title>
     <p>An intercept term is added, as shown in <xref ref-type="fig" rid="fig15">
       Figure 15
      </xref>, to account for the baseline prediction when all feature values are zero. This enables the model to provide predictions that are not solely reliant on the attributes.</p>
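     <p>One common way to add an intercept, sketched here under the assumption that the model is solved by ordinary least squares, is to prepend a column of ones to the feature matrix so that the first fitted weight plays the role of the intercept. The data below are hypothetical and noise-free.</p>
     <preformat preformat-type="code">
```python
import numpy as np

# Hypothetical data generated exactly from y = 30 - 2*x1 + 0.5*x2.
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 4.0], [4.0, 1.0], [5.0, 3.0]])
y = 30.0 - 2.0 * X[:, 0] + 0.5 * X[:, 1]

# Prepend a column of ones so the first weight acts as the intercept.
X_b = np.hstack([np.ones((X.shape[0], 1)), X])

# Least-squares solution of X_b @ w = y.
w, *_ = np.linalg.lstsq(X_b, y, rcond=None)
intercept, coefficients = w[0], w[1:]
print(intercept, coefficients)
```
     </preformat>
     <p>Because the toy target is an exact linear function of the features, the fit recovers an intercept of 30 and coefficients of -2 and 0.5 up to floating-point error.</p>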
     <fig id="fig15" position="float">
      <label>Figure 15</label>
      <caption>
       <title>Figure 15. Adding an intercept term.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId36.jpeg?20240703094254" />
     </fig>
    </sec>
    <sec id="s4_6">
     <title>4.6. Ridge Regularization</title>
     <p>Ridge Regression, a regularization technique, is introduced to enhance the model’s robustness and mitigate overfitting <xref ref-type="bibr" rid="scirp.134301-9">
       [9]
      </xref>. The hyperparameter alpha sets the strength of the penalty on the squared magnitude of the coefficients: larger values shrink the coefficients more strongly, so that no single attribute dominates the prediction and all attributes receive balanced consideration.</p>
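     <p>A minimal sketch of the closed-form ridge solution is given below, assuming the intercept is carried as a column of ones and left unpenalized. The function name <italic>ridge_fit</italic>, the alpha values, and the synthetic data are illustrative assumptions rather than the study’s implementation.</p>
     <preformat preformat-type="code">
```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution; the intercept column is left unpenalized."""
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])
    penalty = alpha * np.eye(X_b.shape[1])
    penalty[0, 0] = 0.0  # do not shrink the intercept
    return np.linalg.solve(X_b.T @ X_b + penalty, X_b.T @ y)

# Synthetic data: the second feature nearly duplicates the first.
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
X = np.column_stack([x1, x1 + rng.normal(scale=0.01, size=50)])
y = 3.0 * x1 + rng.normal(scale=0.1, size=50)

w_plain = ridge_fit(X, y, alpha=0.0)  # ordinary least squares
w_ridge = ridge_fit(X, y, alpha=1.0)  # ridge with an assumed alpha of 1
print(np.round(w_plain, 2), np.round(w_ridge, 2))
```
     </preformat>
     <p>With two nearly identical features, the unregularized weights can take large offsetting values, while the ridge weights are pulled toward a smaller and more evenly shared solution.</p>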
    </sec>
    <sec id="s4_7">
     <title>4.7. Coefficient and Intercept Calculation</title>
     <p>The model calculates the coefficients representing the relationships between the attributes and “city-mpg”. Additionally, it computes the intercept, which accounts for the constant component of the prediction. These mathematical computations provide insights into how each attribute contributes to the prediction.</p>
    </sec>
   </sec>
   <sec id="s5">
    <title>5. Prediction and Evaluation</title>
    <p>The mathematical model is used to predict “city-mpg” values for the test set. Its performance is then assessed using two key metrics, Mean Squared Error (MSE) and R<sup>2</sup> score. These metrics, shown in <xref ref-type="fig" rid="fig16">
      Figure 16
     </xref>, are instrumental in evaluating the model’s ability to incorporate all attributes for accurate predictions.</p>
    <fig id="fig16" position="float">
     <label>Figure 16</label>
     <caption>
      <title>Figure 16. Prediction and evaluation of mathematical model.</title>
     </caption>
     <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId37.jpeg?20240703094257" />
    </fig>
    <sec id="s5_1">
     <title>5.1. Visual Representation</title>
     <p>The results of this mathematical model implementation, shown in <xref ref-type="fig" rid="fig17">
       Figure 17
      </xref> and <xref ref-type="fig" rid="fig18">
       Figure 18
      </xref>, are represented visually to aid comprehension. Regression plots are used to visualize the relationship between predicted and actual values for both the training and testing datasets, with and without Ridge regularization. These visualizations facilitate a comparative analysis of the model’s performance.</p>
     <fig id="fig17" position="float">
      <label>Figure 17</label>
      <caption>
       <title>Figure 17. No Ridge mathematical model.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId38.jpeg?20240703094258" />
     </fig>
     <fig id="fig18" position="float">
      <label>Figure 18</label>
      <caption>
       <title>Figure 18. Ridge mathematical model.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId39.jpeg?20240703094258" />
     </fig>
    </sec>
    <sec id="s5_2">
     <title>5.2. Model Evaluation Metrics</title>
     <p>Mean squared error (MSE) and the R-squared (R<sup>2</sup>) score, shown in <xref ref-type="fig" rid="fig19">
       Figure 19
      </xref>, were the key measures used to evaluate the model’s effectiveness. The MSE quantifies the average squared difference between the predicted and actual values, giving a measure of how well the model fits the data; a lower MSE indicates better accuracy. The R-squared score measures the proportion of the variation in the dependent variable (performance) that the independent variables (make features) can account for. It is at most 1, with values closer to 1 indicating a better fit; on held-out data it can fall below 0 when the model predicts worse than the mean of the target.</p>
     <fig id="fig19" position="float">
      <label>Figure 19</label>
      <caption>
       <title>Figure 19. Mean square error.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId40.jpeg?20240703094300" />
     </fig>
     <p>After evaluating the model, the following results were obtained:</p>
     <p>- Mean Squared Error: 3.421099021686216</p>
     <p>- R-squared Score: 0.9213540454784778.</p>
     <p>Because the MSE is expressed in squared units, the mean squared error of 3.42 corresponds to a root mean squared error of about 1.85, meaning that predicted values typically differ from the actual values by roughly 1.85 units of the target. The high R-squared value of 0.921 indicates that the make features included in the model account for about 92.1 percent of the variation in vehicle performance. These findings show that the regression model forecasts vehicle performance accurately from the supplied variables.</p>
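     <p>The metrics discussed in this subsection can be written out in plain Python as a minimal sketch. The sample values are hypothetical; only the final line uses a number from the text, namely the reported MSE, to show the corresponding root mean squared error.</p>
     <preformat preformat-type="code">
```python
import math

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r2_score(actual, predicted):
    """One minus the ratio of residual to total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical city-mpg values, for illustration only.
actual = [21.0, 24.0, 30.0, 19.0, 27.0]
predicted = [22.0, 23.0, 28.0, 20.0, 27.0]

print(mean_squared_error(actual, predicted))
print(r2_score(actual, predicted))

# RMSE is in the same units as the target; applied to the MSE reported above:
print(round(math.sqrt(3.421099021686216), 2))
```
     </preformat>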
    </sec>
    <sec id="s5_3">
     <title>5.3. Graph of Data Prediction</title>
     <p>A scatter plot with the regression line was produced for the training and test data to show how well the regression model performed <xref ref-type="bibr" rid="scirp.134301-10">
       [10]
      </xref>. The plot shows how closely the model’s predictions match the measured data. The regression line is the best-fit line that minimizes the squared discrepancies between the predicted and actual performance values.</p>
     <p>In conclusion, the regression model based on make features has shown high accuracy in forecasting car performance, as shown in <xref ref-type="fig" rid="fig20 fig21 fig22">
       Figures 20-22
      </xref>. The MSE and R-squared score support the model’s predictive ability, and the training and testing accuracies indicate that it generalizes to new data. The plotted predictions are consistent with both the model’s accuracy and the efficacy of the chosen strategy. The successful development of this model contributes to automotive research and can help with the creation of more efficient and higher-performing automobiles.</p>
    </sec>
   </sec>
   <sec id="s6">
    <title>6. Comparing Results of Different Models</title>
    <p>This section presents a comprehensive analysis of the results obtained from the different models used in our study. The metrics used for comparison are the R-squared (R<sup>2</sup>) score and the Mean Squared Error (MSE). The models under consideration are the Scikit-learn model, the mathematical model without regularization (No Regularization), and the mathematical model with Ridge regularization (Ridge Regularization).</p>
    <sec id="s6_1">
     <title>6.1. R<sup>2</sup> Score Comparison</title>
     <p>Scikit-learn (0.887): The Scikit-learn model achieved an impressive R<sup>2</sup> score of 0.887. This indicates that the model can explain approximately 88.7% of the variance in vehicle performance based on make characteristics. A higher R<sup>2</sup> score signifies a stronger ability to capture relationships between input features and the target variable.</p>
     <fig id="fig20" position="float">
      <label>Figure 20</label>
      <caption>
       <title>Figure 20. Regression plot of Sk-Learn regression.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId41.jpeg?20240703094302" />
     </fig>
     <fig id="fig21" position="float">
      <label>Figure 21</label>
      <caption>
       <title>Figure 21. Regression plot of custom regression (w/Ridge).</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId42.jpeg?20240703094302" />
     </fig>
     <fig id="fig22" position="float">
      <label>Figure 22</label>
      <caption>
       <title>Figure 22. Regression plot of custom regression.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId43.jpeg?20240703094302" />
     </fig>
     <p>No Regularization (0.646): In contrast, the model without regularization yielded an R<sup>2</sup> score of 0.646. While this score is indicative of some level of predictability, it falls short of the performance achieved by Scikit-learn. The lower R<sup>2</sup> score suggests that this model may not capture as much of the variance in vehicle performance as the Scikit-learn model.</p>
     <p>Ridge Regularization (0.793): The Ridge Regularization model achieved an R<sup>2</sup> score of 0.793, positioning it between Scikit-learn and the model without regularization. This score suggests that Ridge regularization effectively balances model complexity and performance, resulting in a reasonably good fit to the data.</p>
    </sec>
    <sec id="s6_2">
     <title>6.2. MSE Comparison</title>
     <p>Scikit-learn (2.441): The Scikit-learn model exhibited a low Mean Squared Error (MSE) of 2.441. A lower MSE indicates that the model’s predictions are consistently close to the actual performance values. This result reaffirms the accuracy of the Scikit-learn model in forecasting vehicle performance.</p>
     <p>No Regularization (7.617): In contrast, the model without regularization yielded a higher MSE of 7.617. The elevated MSE suggests that this model’s predictions exhibit more variability and are farther from the actual performance values compared to the Scikit-learn model.</p>
     <p>Ridge Regularization (4.450): The Ridge Regularization model recorded an MSE of 4.450, which falls between the values obtained by Scikit-learn and the model without regularization. This suggests that Ridge regularization strikes a balance between fitting the data well and preventing overfitting.</p>
     <p>In summary, the comparison of the three models, shown in <xref ref-type="fig" rid="fig23">
       Figure 23
      </xref>, reveals that Scikit-learn outperforms both the No Regularization and Ridge Regularization models in terms of R<sup>2</sup> score and MSE. The Scikit-learn model demonstrates a strong ability to explain variance and provides accurate predictions of vehicle performance based on make characteristics. While Ridge Regularization improves performance compared to the model without regularization, it does not surpass the performance of Scikit-learn in this context. These findings emphasize the potential of the Scikit-learn model to support the optimization of logistics operations through accurate vehicle performance forecasting.</p>
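     <p>The comparison can be reproduced from the scores reported in this section with a short sketch. The dictionary layout and variable names are illustrative; the numbers are those stated in the text.</p>
     <preformat preformat-type="code">
```python
# R2 and MSE values as reported in this section.
results = {
    "Scikit-learn": {"r2": 0.887, "mse": 2.441},
    "No Regularization": {"r2": 0.646, "mse": 7.617},
    "Ridge Regularization": {"r2": 0.793, "mse": 4.450},
}

# Rank models from best to worst: higher R2 is better, lower MSE is better.
by_r2 = sorted(results, key=lambda name: results[name]["r2"], reverse=True)
by_mse = sorted(results, key=lambda name: results[name]["mse"])

print("Ranked by R2: ", by_r2)
print("Ranked by MSE:", by_mse)

# Relative MSE reduction of the Ridge model over the unregularized model.
reduction = 1.0 - results["Ridge Regularization"]["mse"] / results["No Regularization"]["mse"]
print(f"Ridge reduces MSE by {reduction:.1%} relative to no regularization")
```
     </preformat>
     <p>Both rankings agree: Scikit-learn first, Ridge Regularization second, and No Regularization last, with Ridge lowering the MSE of the unregularized model by roughly 41.6 percent.</p>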
     <fig id="fig23" position="float">
      <label>Figure 23</label>
      <caption>
       <title>Figure 23. Comparison graph of Models.</title>
      </caption>
      <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/2870663-rId44.jpeg?20240703094303" />
     </fig>
    </sec>
   </sec>
   <sec id="s7">
    <title>7. Conclusion and Recommendations</title>
    <p>This paper addressed its research goals by shedding light on the complex relationship between vehicle performance and the effectiveness of logistics operations within clusters. Through an in-depth investigation of the subject matter and the implementation of regression-based predictive models, this research has produced useful insights that support improving logistics clusters’ overall performance, sustainability, and competitiveness.</p>
    <p>However, some data discrepancies should be considered, even though the results mostly support the research hypotheses and advance the discipline. The model’s strong performance in predicting vehicle performance can be attributed to its careful feature engineering approach, which allowed it to identify and include pertinent attributes. Integrating real-time data, including traffic patterns and weather information, further improved the accuracy and responsiveness of the model. Because operating conditions and characteristics vary, the model’s effectiveness may differ between logistics clusters.</p>
    <p>In addition to academic discussion, the ramifications of this research can be applied to real-world logistics operations. By accurately predicting vehicles’ performance, the model can be used to allocate resources, schedule routes, and optimize maintenance. In addition to improving customer service, effective operations can help keep customers returning. Furthermore, the model’s insight into sustainability programs aligns with the sector’s increasing emphasis on environmental responsibility, encouraging eco-friendly behaviour and reducing carbon footprints.</p>
    <p>Based on the findings and their implications, several recommendations are made for further research and practical implementation. In the future, the model should be applied to logistics clusters that differ in geographical location, infrastructure, and operational conditions. A model incorporating emissions and other environmental metrics may also enhance sustainability efforts within the logistics industry. Research into long-term predictive models for logistics clusters could also address strategic planning and capacity management.</p>
   </sec>
  </sec>
 </body><back>
  <ref-list>
   <title>References</title>
   <ref id="scirp.134301-ref1">
    <label>1</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Ivanova. (2022) An Insider Data Leakage Detection Using One-Hot Encoding, Synthetic Minority Oversampling and Machine Learning Techniques. Entropy, 23, 1258. https://doi.org/10.3390/e23101258
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref2">
    <label>2</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Friedrich, S., Ayadi, O., Adeeb, J. and Louzazni, M. (2019) Assessment of Artificial Neural Networks Learning Algorithms and Training Datasets for Solar Photovoltaic Power Production Prediction. Frontiers in Energy Research, 7, 130. https://doi.org/10.3389/fenrg.2019.00130
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref3">
    <label>3</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Jiang, S.A. and Bhaya, W.S. (2017) Review of Data Preprocessing Techniques in Data Mining. Journal of Engineering and Applied Sciences, 12, 4102-4107.
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref4">
    <label>4</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Chicco, D., Warrens, M.J. and Jurman, G. (2021) The Coefficient of Determination R-Squared Is More Informative than SMAPE, MAE, MAPE, MSE and RMSE in Regression Analysis Evaluation. PeerJ Computer Science, 7, e623. https://doi.org/10.7717/peerj-cs.623
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref5">
    <label>5</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Daily, J. and Peterson, J. (2017) Predictive Maintenance: How Big Data Analysis Can Improve Maintenance. Supply Chain Integration Challenges in Commercial Aerospace: A Comprehensive Perspective on the Aviation Value Chain, 267-278. https://doi.org/10.1007/978-3-319-46155-7_18
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref6">
    <label>6</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Davis, R., Vochozka, M., Vrbka, J. and Neguriţă, O. (2020) Industrial Artificial Intelligence, Smart Connected Sensors, and Big Data-Driven Decision-Making Processes in Internet of Things-Based Real-Time Production Logistics. Economics, Management and Financial Markets, 15, 9-15. https://doi.org/10.22381/EMFM15320201
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref7">
    <label>7</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Anderson, M.R. and Cafarella, M. (2016) Input Selection for Fast Feature Engineering. 2016 IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, 16-20 May 2016, 577-588. https://doi.org/10.1109/ICDE.2016.7498272
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref8">
    <label>8</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Comi, A. and Savchenko, L. (2021) Last-Mile Delivering: Analysis of Environment-Friendly Transport. Sustainable Cities and Society, 74, Article ID: 103213. https://doi.org/10.1016/j.scs.2021.103213
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref9">
    <label>9</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Biessmann, F., Salinas, D., Schelter, S., Schmidt, P. and Lange, D. (2018) “Deep” Learning for Missing Value Imputation in Tables with Non-Numerical Data. Proceedings of the 27th ACM International Conference on Information and Knowledge Management, October 2018, 2017-2025. https://doi.org/10.1145/3269206.3272005
    </mixed-citation>
   </ref>
   <ref id="scirp.134301-ref10">
    <label>10</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Bai, R., Chen, X., Chen, Z. L., Cui, T., Gong, S., He, W., et al. (2023) Analytics and Machine Learning in Vehicle Routing Research. International Journal of Production Research, 61, 4-30. https://doi.org/10.1080/00207543.2021.2013566
    </mixed-citation>
   </ref>
  </ref-list>
 </back>
</article>