<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article">
 <front>
  <journal-meta>
   <journal-id journal-id-type="publisher-id">
    jamp
   </journal-id>
   <journal-title-group>
    <journal-title>
     Journal of Applied Mathematics and Physics
    </journal-title>
   </journal-title-group>
   <issn pub-type="epub">
    2327-4352
   </issn>
   <issn publication-format="print">
    2327-4379
   </issn>
   <publisher>
    <publisher-name>
     Scientific Research Publishing
    </publisher-name>
   </publisher>
  </journal-meta>
  <article-meta>
   <article-id pub-id-type="doi">
    10.4236/jamp.2024.1211237
   </article-id>
   <article-id pub-id-type="publisher-id">
    jamp-137696
   </article-id>
   <article-categories>
    <subj-group subj-group-type="heading">
     <subject>
      Articles
     </subject>
    </subj-group>
    <subj-group subj-group-type="Discipline-v2">
     <subject>
      Physics 
     </subject>
     <subject>
       Mathematics
     </subject>
    </subj-group>
   </article-categories>
   <title-group>
    <article-title>Denoising Data with Random Matrix Theory</article-title>
   </title-group>
   <contrib-group>
    <contrib contrib-type="author" xlink:type="simple">
     <name name-style="western">
      <surname>
       Nathan
      </surname>
      <given-names>
       Jiang
      </given-names>
     </name>
    </contrib>
   </contrib-group> 
   <aff id="affnull">
    <addr-line>
     Department of Mathematics, Columbia University, New York, NY, USA
    </addr-line> 
   </aff> 
   <pub-date pub-type="epub">
    <day>
     06
    </day> 
    <month>
     11
    </month>
    <year>
     2024
    </year>
   </pub-date> 
   <volume>
    12
   </volume> 
   <issue>
    11
   </issue>
   <fpage>
    3902
   </fpage>
   <lpage>
    3911
   </lpage>
   <history>
    <date date-type="received">
     <day>
      9
     </day>
     <month>
      October
     </month>
     <year>
      2024
     </year>
    </date>
    <date date-type="published">
     <day>
      24
     </day>
     <month>
      October
     </month>
     <year>
      2024
     </year> 
    </date> 
    <date date-type="accepted">
     <day>
      24
     </day>
     <month>
      November
     </month>
     <year>
      2024
     </year> 
    </date>
   </history>
   <permissions>
    <copyright-statement>
     © Copyright 2014 by authors and Scientific Research Publishing Inc. 
    </copyright-statement>
    <copyright-year>
     2014
    </copyright-year>
    <license>
     <license-p>
      This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/
     </license-p>
    </license>
   </permissions>
   <abstract>
    Properties from random matrix theory allow us to uncover naturally embedded signals in different data sets. While many parameters can be changed, including the probability distribution of the entries, the introduction of noise, and the size of the matrix, the resulting eigenvalue and eigenvector distributions remain relatively unchanged. However, when certain anomalous eigenvalues and their corresponding eigenvectors do not follow the predicted distributions, this can indicate an underlying non-random signal in the data. As data and matrices become more important in the sciences and computing, so too will the importance of processing them with the principles of random matrix theory.
   </abstract>
   <kwd-group> 
    <kwd>
     Random Matrix Theory
    </kwd> 
    <kwd>
      Universality
    </kwd> 
    <kwd>
      Wishart Matrices
    </kwd> 
    <kwd>
      Marchenko-Pastur (M-P) Distribution
    </kwd> 
    <kwd>
      Noise
    </kwd> 
    <kwd>
      Sparsity
    </kwd> 
    <kwd>
      Signaling
    </kwd> 
    <kwd>
      Linear Sketching
    </kwd>
   </kwd-group>
  </article-meta>
 </front>
 <body>
  <sec id="s1">
   <title>1. Introduction</title>
   <p>While many random systems exist in the universe, predictions and conclusions can still be drawn from them. Whether it be energy states of Hamiltonian nuclei or biological signals among unicellular processes, the principles of random matrix theory can be used to denoise systems and parse out the most important information by studying the eigenvalues and eigenvectors of a random matrix.</p>
   <p>It turns out that random matrix theory imposes more structure, not less. The patterns of universality hold independently of numerous parameters in the sampling of a random matrix, which allows the process described in this article to separate randomly generated noise from important signals in natural data sets. Data collection methods are often imperfect: readings are sometimes false, entries are missing, and data sets are too large to process on a reasonably powered computer. The beauty of random matrix theory is that it can use this well-defined structure, along with some algorithms, to recover the important signals despite such flaws in the data.</p>
  </sec><sec id="s2">
   <title>2. Random Matrix Theory Fundamentals</title>
   <p>A random matrix is defined as a matrix whose entries are randomly sampled. An ensemble of random matrices is a group of matrices whose entries are sampled in the same manner. Examples of ensembles include the Gaussian Orthogonal Ensemble (GOE), which is sampled as a symmetric, 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        n 
      </mi> 
      <mo>
        × 
      </mo> 
      <mi>
        n 
      </mi> 
     </mrow> 
    </math> matrix A where entries above the diagonal are sampled from N(0, 1) and entries on the diagonal are sampled from N(0, 2) with all entries being divided by a factor 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msqrt> 
       <mrow> 
        <mn>
          2 
        </mn> 
        <mi>
          n 
        </mi> 
       </mrow> 
      </msqrt> 
     </mrow> 
    </math> <xref ref-type="bibr" rid="scirp.137696-1">
     [1]
    </xref>. The probability density of the eigenvalues of a GOE matrix is given by the Wigner Semicircle distribution 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        f 
      </mi> 
      <mrow> 
       <mo>
         ( 
       </mo> 
       <mi>
         x 
       </mi> 
       <mo>
         ) 
       </mo> 
      </mrow> 
      <mo>
        = 
      </mo> 
      <mo> 
      </mo> 
      <mfrac> 
       <mn>
         1 
       </mn> 
       <mrow> 
        <mn>
          2 
        </mn> 
        <mtext>
          π 
        </mtext> 
       </mrow> 
      </mfrac> 
      <msqrt> 
       <mrow> 
        <mn>
          4 
        </mn> 
        <mo>
          − 
        </mo> 
        <msup> 
         <mi>
           x 
         </mi> 
         <mn>
           2 
         </mn> 
        </msup> 
       </mrow> 
      </msqrt> 
     </mrow> 
    </math>, and the components of the eigenvectors follow a normal distribution. As 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       n 
     </mi> 
    </math> increases, the density of the resulting eigenvalues will converge to the Wigner Semicircle Distribution (<xref ref-type="fig" rid="fig1">
     Figure 1
    </xref>).</p>
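   <p>As a minimal sketch of this convergence, the following Python code samples one symmetric Gaussian matrix and compares its eigenvalue histogram with the semicircle density. It assumes the construction A = (G + Gᵀ)/√(2n), where G is an n × n matrix of independent N(0, 1) entries; this gives off-diagonal variance 1/n and diagonal variance 2/n, so the spectrum fills the interval [−2, 2] of the density above. The helper names sample_goe and semicircle_density, the seed, and the bin count are illustrative choices, not part of the original work.</p>
   <preformat preformat-type="code">
```python
import numpy as np


def sample_goe(n, rng):
    """Symmetric matrix with off-diagonal variance 1/n and diagonal variance 2/n."""
    g = rng.standard_normal((n, n))
    return (g + g.T) / np.sqrt(2 * n)


def semicircle_density(x):
    """Wigner semicircle density on [-2, 2]; zero outside that interval."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.clip(4.0 - x**2, 0.0, None)) / (2.0 * np.pi)


rng = np.random.default_rng(0)
n = 1000
eigenvalues = np.linalg.eigvalsh(sample_goe(n, rng))

# Compare the empirical histogram with the semicircle at the bin centers.
hist, edges = np.histogram(eigenvalues, bins=40, range=(-2.2, 2.2), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_gap = np.max(np.abs(hist - semicircle_density(centers)))

print(f"largest |eigenvalue|: {np.max(np.abs(eigenvalues)):.3f}")
print(f"max histogram gap:    {max_gap:.3f}")
```
   </preformat>
   <p>Repeating the run with smaller n (for example 10 or 100) should show a visibly rougher histogram, in line with the trend described above.</p>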
   <fig id="fig1" position="float">
    <label>Figure 1</label>
    <caption>
     <title>Eigenvalue and eigenvector distribution for GOE matrix of n = 10, 100, 500, 1000, panels (a)-(h) <xref ref-type="bibr" rid="scirp.137696-2">[2]</xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId22.jpeg?20241127032312" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId23.jpeg?20241127032313" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId24.jpeg?20241127032312" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId25.jpeg?20241127032313" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId26.jpeg?20241127032313" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId27.jpeg?20241127032312" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId28.jpeg?20241127032312" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId29.jpeg?20241127032312" />
   </fig>
   <p>The Wishart ensemble is defined by an 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        m 
      </mi> 
      <mo> 
      </mo> 
      <mo>
        × 
      </mo> 
      <mo> 
      </mo> 
      <mi>
        n 
      </mi> 
     </mrow> 
    </math> matrix A containing entries sampled independently from N(0, 1). The Gram matrix 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        W 
      </mi> 
      <mo>
        = 
      </mo> 
      <mfrac> 
       <mrow> 
        <mi>
          A 
        </mi> 
        <msup> 
         <mi>
           A 
         </mi> 
         <mtext>
           T 
         </mtext> 
        </msup> 
       </mrow> 
       <mi>
         n 
       </mi> 
      </mfrac> 
     </mrow> 
    </math> is then constructed to form the Wishart matrix <xref ref-type="bibr" rid="scirp.137696-3">
     [3]
    </xref>. The resulting eigenvalue probability density, known as the Marchenko-Pastur (M-P) Distribution, which depends on the parameter r, is defined by:</p>
   <p>
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        μ 
      </mi> 
      <mrow> 
       <mo>
         ( 
       </mo> 
       <mi>
         x 
       </mi> 
       <mo>
         ) 
       </mo> 
      </mrow> 
      <mo>
        = 
      </mo> 
      <mrow> 
       <mo>
         { 
       </mo> 
       <mrow> 
        <mtable columnalign="left"> 
         <mtr columnalign="left"> 
          <mtd columnalign="left"> 
           <mrow> 
            <mfrac> 
             <mn>
               1 
             </mn> 
             <mrow> 
              <mn>
                2 
              </mn> 
              <mtext>
                π 
              </mtext> 
              <mi>
                r 
              </mi> 
              <mi>
                x 
              </mi> 
              <msup> 
               <mi>
                 σ 
               </mi> 
               <mn>
                 2 
               </mn> 
              </msup> 
             </mrow> 
            </mfrac> 
            <msqrt> 
             <mrow> 
              <mrow> 
               <mo>
                 ( 
               </mo> 
               <mrow> 
                <msub> 
                 <mi>
                   λ 
                 </mi> 
                 <mo>
                   + 
                 </mo> 
                </msub> 
                <mo>
                  − 
                </mo> 
                <mi>
                  x 
                </mi> 
               </mrow> 
               <mo>
                 ) 
               </mo> 
              </mrow> 
              <mrow> 
               <mo>
                 ( 
               </mo> 
               <mrow> 
                <mi>
                  x 
                </mi> 
                <mo>
                  − 
                </mo> 
                <msub> 
                 <mi>
                   λ 
                 </mi> 
                 <mo>
                   − 
                 </mo> 
                </msub> 
               </mrow> 
               <mo>
                 ) 
               </mo> 
              </mrow> 
             </mrow> 
            </msqrt> 
            <mo> 
            </mo> 
            <mo>
              , 
            </mo> 
            <mn>
              0 
            </mn> 
            <mo>
              &lt; 
            </mo> 
            <mi>
              r 
            </mi> 
            <mo>
              ≤ 
            </mo> 
            <mn>
              1 
            </mn> 
           </mrow> 
          </mtd> 
         </mtr> 
         <mtr columnalign="left"> 
          <mtd columnalign="left"> 
           <mrow> 
            <mrow> 
             <mo>
               ( 
             </mo> 
             <mrow> 
              <mn>
                1 
              </mn> 
              <mo>
                − 
              </mo> 
              <mfrac> 
               <mn>
                 1 
               </mn> 
               <mi>
                 r 
               </mi> 
              </mfrac> 
             </mrow> 
             <mo>
               ) 
             </mo> 
            </mrow> 
            <mo>
              &#x2062; 
            </mo> 
            <mi>
              δ 
            </mi> 
            <mrow> 
             <mo>
               ( 
             </mo> 
             <mi>
               x 
             </mi> 
             <mo>
               ) 
             </mo> 
            </mrow> 
            <mo>
              + 
            </mo> 
            <mfrac> 
             <mn>
               1 
             </mn> 
             <mrow> 
              <mn>
                2 
              </mn> 
              <mtext>
                π 
              </mtext> 
              <mi>
                r 
              </mi> 
              <mi>
                x 
              </mi> 
              <msup> 
               <mi>
                 σ 
               </mi> 
               <mn>
                 2 
               </mn> 
              </msup> 
             </mrow> 
            </mfrac> 
            <msqrt> 
             <mrow> 
              <mrow> 
               <mo>
                 ( 
               </mo> 
               <mrow> 
                <msub> 
                 <mi>
                   λ 
                 </mi> 
                 <mo>
                   + 
                 </mo> 
                </msub> 
                <mo>
                  − 
                </mo> 
                <mi>
                  x 
                </mi> 
               </mrow> 
               <mo>
                 ) 
               </mo> 
              </mrow> 
              <mrow> 
               <mo>
                 ( 
               </mo> 
               <mrow> 
                <mi>
                  x 
                </mi> 
                <mo>
                  − 
                </mo> 
                <msub> 
                 <mi>
                   λ 
                 </mi> 
                 <mo>
                   − 
                 </mo> 
                </msub> 
               </mrow> 
               <mo>
                 ) 
               </mo> 
              </mrow> 
             </mrow> 
            </msqrt> 
            <mo>
              , 
            </mo> 
            <mi>
              r 
            </mi> 
            <mo>
              &gt; 
            </mo> 
            <mn>
              1 
            </mn> 
           </mrow> 
          </mtd> 
         </mtr> 
        </mtable> 
       </mrow> 
      </mrow> 
     </mrow> 
    </math>,</p>
   <p>where 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        r 
      </mi> 
      <mo> 
      </mo> 
      <mo>
        = 
      </mo> 
      <mfrac> 
       <mi>
         m 
       </mi> 
       <mi>
         n 
       </mi> 
      </mfrac> 
     </mrow> 
    </math>, 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msub> 
       <mi>
         λ 
       </mi> 
       <mo>
         + 
       </mo> 
      </msub> 
      <mo>
        = 
      </mo> 
      <msup> 
       <mi>
         σ 
       </mi> 
       <mn>
         2 
       </mn> 
      </msup> 
      <msup> 
       <mrow> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mn>
            1 
          </mn> 
          <mo>
            + 
          </mo> 
          <msqrt> 
           <mi>
             r 
           </mi> 
          </msqrt> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
       </mrow> 
       <mn>
         2 
       </mn> 
      </msup> 
     </mrow> 
    </math>, and 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msub> 
       <mi>
         λ 
       </mi> 
       <mo>
         − 
       </mo> 
      </msub> 
      <mo>
        = 
      </mo> 
      <msup> 
       <mi>
         σ 
       </mi> 
       <mn>
         2 
       </mn> 
      </msup> 
      <msup> 
       <mrow> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mn>
            1 
          </mn> 
          <mo>
            − 
          </mo> 
          <msqrt> 
           <mi>
             r 
           </mi> 
          </msqrt> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
       </mrow> 
       <mn>
         2 
       </mn> 
      </msup> 
     </mrow> 
    </math> (<xref ref-type="fig" rid="fig2">
     Figure 2
    </xref>) <xref ref-type="bibr" rid="scirp.137696-1">
     [1]
    </xref>.</p>
   <p>
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msub> 
       <mi>
         λ 
       </mi> 
       <mo>
         + 
       </mo> 
      </msub> 
     </mrow> 
    </math> is also known as the Tracy-Widom Critical Eigenvalue since, as the matrix dimensions grow, the largest eigenvalue of a purely random Wishart matrix converges to this edge and will almost surely not be appreciably greater than 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msub> 
       <mi>
         λ 
       </mi> 
       <mo>
         + 
       </mo> 
      </msub> 
     </mrow> 
    </math>.</p>
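   <p>The following Python sketch illustrates the definitions above under stated assumptions: it evaluates the bulk edges λ₋ and λ₊, checks numerically that the continuous part of the density integrates to about 1 when r ≤ 1, and compares the edges with the observed eigenvalue range of one sampled Wishart matrix. The function names mp_edges and mp_density, the matrix sizes (m = 500, n = 2000, so r = 0.25), and the seed are illustrative and do not come from the original work.</p>
   <preformat preformat-type="code">
```python
import numpy as np


def mp_edges(r, sigma2=1.0):
    """Lower and upper edges of the Marchenko-Pastur bulk."""
    return sigma2 * (1 - np.sqrt(r)) ** 2, sigma2 * (1 + np.sqrt(r)) ** 2


def mp_density(x, r, sigma2=1.0):
    """Continuous part of the M-P density; zero outside the bulk edges."""
    lam_minus, lam_plus = mp_edges(r, sigma2)
    x = np.asarray(x, dtype=float)
    inside = np.clip((lam_plus - x) * (x - lam_minus), 0.0, None)
    safe_x = np.where(x > 0, x, 1.0)  # avoid dividing by zero at x = 0
    return np.sqrt(inside) / (2 * np.pi * r * safe_x * sigma2)


rng = np.random.default_rng(1)
m, n = 500, 2000                       # r = m / n = 0.25
a = rng.standard_normal((m, n))
w = a @ a.T / n                        # m-by-m Wishart matrix
eigenvalues = np.linalg.eigvalsh(w)

lam_minus, lam_plus = mp_edges(m / n)
grid = np.linspace(lam_minus, lam_plus, 20001)
dx = grid[1] - grid[0]
mass = float(np.sum(mp_density(grid, m / n)) * dx)  # Riemann sum over the bulk

print(f"bulk edges:      [{lam_minus:.3f}, {lam_plus:.3f}]")
print(f"observed range:  [{eigenvalues.min():.3f}, {eigenvalues.max():.3f}]")
print(f"continuous mass: {mass:.4f}")
```
   </preformat>
   <p>For finite matrices the extreme eigenvalues fluctuate slightly around the edges, so the observed range should be read as approximately, not exactly, bounded by λ₋ and λ₊.</p>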
   <fig id="fig2" position="float">
    <label>Figure 2</label>
    <caption>
     <title>M-P distribution eigenvalues and eigenvectors for matrices with entries sampled from N(0, 1) for different r values and n = 1000, panels (a)-(f) <xref ref-type="bibr" rid="scirp.137696-2">[2]</xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId46.jpeg?20241127032313" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId47.jpeg?20241127032313" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId48.jpeg?20241127032313" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId49.jpeg?20241127032312" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId50.jpeg?20241127032313" />
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId51.jpeg?20241127032313" />
   </fig>
   <p>The eigenvector components follow a normal distribution. It is also worth noting that constructing a Wishart matrix is closely related to the Singular Value Decomposition of A, where 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        A 
      </mi> 
      <mo>
        = 
      </mo> 
      <mi>
        U 
      </mi> 
      <mi>
        Σ 
      </mi> 
      <msup> 
       <mi>
         V 
       </mi> 
       <mtext>
         T 
       </mtext> 
      </msup> 
      <mo>
        , 
      </mo> 
     </mrow> 
    </math> and the squares of the singular values in 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       Σ 
     </mi> 
    </math>, divided by n, are the eigenvalues of W described by the Marchenko-Pastur distribution.</p>
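   <p>The link between the decomposition of A and the spectrum of W can be checked numerically. Since AAᵀ = UΣ²Uᵀ, the eigenvalues of W = AAᵀ/n are the squared singular values of A divided by n. The short Python sketch below verifies this for one sampled matrix; the sizes and seed are arbitrary illustrative choices.</p>
   <preformat preformat-type="code">
```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 50, 200
a = rng.standard_normal((m, n))

singular_values = np.linalg.svd(a, compute_uv=False)  # sorted in descending order
eigenvalues = np.linalg.eigvalsh(a @ a.T / n)         # sorted in ascending order

# Squared singular values divided by n should reproduce the Wishart eigenvalues.
from_svd = np.sort(singular_values**2 / n)
match = bool(np.allclose(from_svd, eigenvalues))
print("spectra agree:", match)
```
   </preformat>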
   <p>It turns out, however, that many sampling choices do not affect the overall spectral structure of random matrices, a property known as universality. As long as the mean 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       μ 
     </mi> 
    </math> and variance 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msup> 
       <mi>
         σ 
       </mi> 
       <mn>
         2 
       </mn> 
      </msup> 
     </mrow> 
    </math> are held fixed, the distribution of eigenvalues and eigenvectors for a given ensemble does not depend on the probability distribution from which the entries are sampled. For example, the Rademacher 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        P 
      </mi> 
      <mrow> 
       <mo>
         ( 
       </mo> 
       <mrow> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mrow> 
          <mi>
            i 
          </mi> 
          <mo>
            , 
          </mo> 
          <mi>
            j 
          </mi> 
         </mrow> 
        </msub> 
        <mo>
          = 
        </mo> 
        <mo>
          − 
        </mo> 
        <mn>
          1 
        </mn> 
       </mrow> 
       <mo>
         ) 
       </mo> 
      </mrow> 
      <mo>
        = 
      </mo> 
      <mi>
        P 
      </mi> 
      <mrow> 
       <mo>
         ( 
       </mo> 
       <mrow> 
        <msub> 
         <mi>
           x 
         </mi> 
         <mrow> 
          <mi>
            i 
          </mi> 
          <mo>
            , 
          </mo> 
          <mi>
            j 
          </mi> 
         </mrow> 
        </msub> 
        <mo>
          = 
        </mo> 
        <mn>
          1 
        </mn> 
       </mrow> 
       <mo>
         ) 
       </mo> 
      </mrow> 
      <mo>
        = 
      </mo> 
      <mfrac> 
       <mn>
         1 
       </mn> 
       <mn>
         2 
       </mn> 
      </mfrac> 
     </mrow> 
    </math> distribution, in which −1 and 1 are each sampled with probability 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mfrac> 
       <mn>
         1 
       </mn> 
       <mn>
         2 
       </mn> 
      </mfrac> 
     </mrow> 
    </math>, can be substituted for a normal distribution, and the eigenvalue distribution of the resulting matrix, whose entries consist strictly of 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mo>
        ± 
      </mo> 
      <mn>
        1 
      </mn> 
     </mrow> 
    </math>, remains the same (<xref ref-type="fig" rid="fig3">
     Figures 3(a)-(d)
    </xref>). Likewise, adjusting the mean 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       μ 
     </mi> 
    </math> does not change the bulk of the eigenvalue distribution, but it introduces distracting outlier eigenvalues outside the bounds of the Marchenko-Pastur distribution; it is therefore best for data sets to be z-score normalized. What about 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       σ 
     </mi> 
    </math> then? It turns out that adjusting the standard deviation 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       σ 
     </mi> 
    </math> scales the width of the eigenvalue distribution by a factor 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msup> 
       <mi>
         σ 
       </mi> 
       <mn>
         2 
       </mn> 
      </msup> 
     </mrow> 
    </math> and the height by a factor 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mfrac> 
       <mn>
         1 
       </mn> 
       <mrow> 
        <msup> 
         <mi>
           σ 
         </mi> 
         <mn>
           2 
         </mn> 
        </msup> 
       </mrow> 
      </mfrac> 
     </mrow> 
    </math>, maintaining the total area under the probability density curve (<xref ref-type="fig" rid="fig3">
     Figures 3(a)-(d)
    </xref>). What does affect these patterns is dependence among the entries: the entries must be sampled independently, and matrices with heavily dependent columns behave very differently from the predicted distributions (<xref ref-type="fig" rid="fig3">
     Figure 3(e)
    </xref>). Thus, random matrix theory shows that the shape of the eigenvalue distribution for a given ensemble does not depend on the distribution from which the entries are sampled, while changes in the mean and standard deviation have only the predictable effects described above.</p>
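   <p>These effects can be reproduced with a small experiment. The Python sketch below, which is only an illustration under assumed sizes (m = n = 400, so r = 1 and the bulk is [0, 4σ²]), compares the largest eigenvalue for Gaussian entries, Rademacher entries, entries with σ = 2, and entries with mean 0.5, and then counts the nonzero eigenvalues of a matrix built from 10 identical column blocks. The helper wishart_eigenvalues, the sizes, the shift, and the seed are hypothetical choices, not taken from the original work.</p>
   <preformat preformat-type="code">
```python
import numpy as np


def wishart_eigenvalues(a):
    """Eigenvalues of A A^T / n for an m-by-n data matrix A."""
    return np.linalg.eigvalsh(a @ a.T / a.shape[1])


rng = np.random.default_rng(3)
m = n = 400                                    # r = 1, bulk is [0, 4 * sigma^2]

samples = [
    ("gaussian", rng.standard_normal((m, n))),
    ("rademacher", rng.choice([-1.0, 1.0], size=(m, n))),
    ("scaled", 2.0 * rng.standard_normal((m, n))),     # sigma = 2
    ("shifted", rng.standard_normal((m, n)) + 0.5),    # mean 0.5
]
top = {name: float(wishart_eigenvalues(a).max()) for name, a in samples}
for name, value in top.items():
    print(f"{name:>10}: largest eigenvalue = {value:.2f}")

# Dependent columns: tile one random block 10 times across the columns.
block = rng.standard_normal((m, n // 10))
dependent = np.tile(block, (1, 10))
nonzero = int(np.sum(wishart_eigenvalues(dependent) > 1e-8))
print(f"dependent columns: {nonzero} nonzero eigenvalues out of {m}")
```
   </preformat>
   <p>Under these assumptions the Gaussian and Rademacher cases should give a largest eigenvalue near 4, the σ = 2 case one near 16, and the shifted case a single outlier far above the bulk, while the tiled matrix has rank at most n/10 and so leaves most eigenvalues at zero.</p>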
   <fig id="fig3" position="float">
    <label>Figure 3</label>
    <caption>
     <title>(a) (b)<p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId78.jpeg?20241127032313" /></p><p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId79.jpeg?20241127032313" /></p>(c) (d)<p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId80.jpeg?20241127032313" /></p>(e)Figure 3. (a)-(d) adjusting the distribution, mean, and standard deviation with respect to a standard normal distribution and r = 1 <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>. (e) eigenvalues for a 1000 × 1000 dependent matrix composed of 10 identical but randomly generated 1000 × 100 blocks <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="" />
   </fig>
   <fig id="fig3" position="float">
    <label>Figure 3</label>
    <caption>
     <title>(a) (b)<p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId78.jpeg?20241127032313" /></p><p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId79.jpeg?20241127032313" /></p>(c) (d)<p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId80.jpeg?20241127032313" /></p>(e)Figure 3. (a)-(d) adjusting the distribution, mean, and standard deviation with respect to a standard normal distribution and r = 1 <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>. (e) eigenvalues for a 1000 × 1000 dependent matrix composed of 10 identical but randomly generated 1000 × 100 blocks <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId76.jpeg?20241127032313" />
   </fig>
   <fig id="fig3" position="float">
    <label>Figure 3</label>
    <caption>
     <title>(a) (b)<p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId78.jpeg?20241127032313" /></p><p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId79.jpeg?20241127032313" /></p>(c) (d)<p class="imgGroupCss_v"><img class=" imgMarkCss lazy" data-original="https://html.scirp.org/file/1723907-rId80.jpeg?20241127032313" /></p>(e)Figure 3. (a)-(d) adjusting the distribution, mean, and standard deviation with respect to a standard normal distribution and r = 1 <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>. (e) eigenvalues for a 1000 × 1000 dependent matrix composed of 10 identical but randomly generated 1000 × 100 blocks <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId77.jpeg?20241127032313" />
   </fig>
  </sec><sec id="s3">
   <title>
    3. Natural Signaling</title>
   <p>A common application of random matrix theory is the retrieval of natural signals. We embed a simulated natural signal by adding the Gram matrix (SS<sup>T</sup>) of a randomly generated low-rank signal S of dimensions 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        m 
      </mi> 
      <mo>
        × 
      </mo> 
      <mi>
        k 
      </mi> 
     </mrow> 
    </math>, where 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        k 
      </mi> 
      <mo>
        ≪ 
      </mo> 
      <mi>
        m 
      </mi> 
     </mrow> 
    </math>, to a random matrix A. Biological signals often come embedded in this form. Doing so generates 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        n 
      </mi> 
      <mo>
        − 
      </mo> 
      <mi>
        k 
      </mi> 
     </mrow> 
    </math> eigenvalues within the M-P distribution and 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       k 
     </mi> 
    </math> eigenvalues much larger than the Tracy-Widom critical eigenvalue. Unlike the other eigenvectors, the eigenvectors corresponding to these large eigenvalues have components that are not normally distributed <xref ref-type="bibr" rid="scirp.137696-2">
     [2]
    </xref>. Interestingly, the rank of the signal can be recovered from the eigenvalues alone, by counting those that exceed the critical eigenvalue. The eigenvectors of the 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       k 
     </mi> 
    </math> largest eigenvalues should span the natural signal. For the 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        k 
      </mi> 
      <mo>
        = 
      </mo> 
      <mn>
        1 
      </mn> 
     </mrow> 
    </math> case, the error can be easily visualized and measured as the angle between the eigenvector and the signal. The typical error is around 0.02 to 0.06 for 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        n 
      </mi> 
      <mo>
        = 
      </mo> 
      <mn>
        1000 
      </mn> 
     </mrow> 
    </math> (<xref ref-type="fig" rid="fig4">
     Figure 4
    </xref>).</p>
   <fig id="fig4" position="float">
    <label>Figure 4</label>
    <caption>
     <title>Figure 4. A 2-D projection of the eigenvector (yellow) and signal (red) plotted along with a guess of the signal rank and error margin <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId95.jpeg?20241127032315" />
   </fig>
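   <p>The embedding and rank recovery described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the signal Gram matrix SS<sup>T</sup> is added to the noise Gram matrix AA<sup>T</sup>/n, the columns of S are orthogonal with a hypothetical squared norm <monospace>strength</monospace>, and the Tracy-Widom critical eigenvalue is replaced by a crude stand-in, 1.1 times the upper edge of the M-P distribution. All function names and parameter values are illustrative.</p>
   <preformat preformat-type="code" xml:space="preserve">
```python
import numpy as np


def embed_and_recover(m=400, n=400, k=1, strength=20.0, seed=0):
    """Embed a rank-k signal in a noise Gram matrix and estimate its rank."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))              # random noise matrix
    # Assumption: orthogonal signal columns, each with squared norm `strength`.
    Q, _ = np.linalg.qr(rng.standard_normal((m, k)))
    S = np.sqrt(strength) * Q                    # m x k signal, k much smaller than m
    W = A @ A.T / n + S @ S.T                    # noise Gram matrix plus signal Gram matrix
    evals, evecs = np.linalg.eigh(W)             # eigenvalues in ascending order
    # Upper edge of the M-P distribution for unit-variance entries.
    mp_edge = (1.0 + np.sqrt(m / n)) ** 2
    threshold = 1.1 * mp_edge                    # crude stand-in for the T-W critical value
    rank_estimate = int(np.sum(evals > threshold))
    return S, evals, evecs, rank_estimate


def angle_degrees(v, s):
    """Angle between v and s in degrees, ignoring the arbitrary eigenvector sign."""
    c = abs(v @ s) / (np.linalg.norm(v) * np.linalg.norm(s))
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))
```
   </preformat>
   <p>Calling <monospace>embed_and_recover(k=1)</monospace> returns a rank estimate that should equal k, and <monospace>angle_degrees(evecs[:, -1], S[:, 0])</monospace> measures the error between the leading eigenvector and the signal. The absolute value is taken because the sign of a computed eigenvector is arbitrary.</p>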
  </sec><sec id="s4">
   <title>
    4. Noise</title>
   <p>A common noise model is sparsity, in which entries of a matrix are randomly replaced by 0, since experimental data often have large gaps due to imperfect measurement methods. The procedure involved pre-multiplying, by a diagonal matrix whose diagonal is 95% 0s and 5% 1s, the random 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        m 
      </mi> 
      <mo>
        × 
      </mo> 
      <mi>
        n 
      </mi> 
     </mrow> 
    </math> matrix A with the signal already added, which zeroes out 95% of its rows. The following denoising algorithm was then applied to the entries:</p>
   <p>
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msub> 
       <mi>
         A 
       </mi> 
       <mrow> 
        <mi>
          i 
        </mi> 
        <mi>
          j 
        </mi> 
       </mrow> 
      </msub> 
      <mo>
        = 
      </mo> 
      <mfrac> 
       <mrow> 
        <mn>
          1 
        </mn> 
        <mo>
          × 
        </mo> 
        <msup> 
         <mrow> 
          <mn>
            10 
          </mn> 
         </mrow> 
         <mn>
           6 
         </mn> 
        </msup> 
       </mrow> 
       <mrow> 
        <mi>
          m 
        </mi> 
        <mi>
          n 
        </mi> 
       </mrow> 
      </mfrac> 
      <msub> 
       <mi>
         A 
       </mi> 
       <mrow> 
        <mi>
          i 
        </mi> 
        <mi>
          j 
        </mi> 
       </mrow> 
      </msub> 
     </mrow> 
    </math> (1)</p>
   <p>
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msub> 
       <mi>
         A 
       </mi> 
       <mrow> 
        <mi>
          i 
        </mi> 
        <mi>
          j 
        </mi> 
       </mrow> 
      </msub> 
      <mo>
        = 
      </mo> 
      <msub> 
       <mrow> 
        <mi>
          log 
        </mi> 
       </mrow> 
       <mn>
         2 
       </mn> 
      </msub> 
      <mrow> 
       <mo>
         ( 
       </mo> 
       <mrow> 
        <mn>
          1 
        </mn> 
        <mo>
          + 
        </mo> 
        <msub> 
         <mi>
           A 
         </mi> 
         <mrow> 
          <mi>
            i 
          </mi> 
          <mi>
            j 
          </mi> 
         </mrow> 
        </msub> 
       </mrow> 
       <mo>
         ) 
       </mo> 
      </mrow> 
     </mrow> 
    </math> (2)</p>
   <p>
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msub> 
       <mi>
         A 
       </mi> 
       <mrow> 
        <mi>
          i 
        </mi> 
        <mi>
          j 
        </mi> 
       </mrow> 
      </msub> 
      <mo>
        = 
      </mo> 
      <mfrac> 
       <mrow> 
        <msub> 
         <mi>
           A 
         </mi> 
         <mrow> 
          <mi>
            i 
          </mi> 
          <mi>
            j 
          </mi> 
         </mrow> 
        </msub> 
        <mo>
          − 
        </mo> 
        <mi>
          μ 
        </mi> 
       </mrow> 
       <mi>
         σ 
       </mi> 
      </mfrac> 
     </mrow> 
    </math> (3)</p>
   <p>where 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        μ 
      </mi> 
      <mo>
        , 
      </mo> 
      <msup> 
       <mi>
         σ 
       </mi> 
       <mn>
         2 
       </mn> 
      </msup> 
     </mrow> 
    </math> are the mean and variance of all the entries produced by the previous step <xref ref-type="bibr" rid="scirp.137696-4">
     [4]
    </xref>.</p>
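   <p>Steps (1)-(3) translate directly into a few lines of NumPy. The sketch below is illustrative only: the function name is hypothetical, the input is assumed to be nonnegative (for example, count data) so that the logarithm is defined, and σ is taken to be the population standard deviation over all entries.</p>
   <preformat preformat-type="code" xml:space="preserve">
```python
import numpy as np


def normalize_entries(A):
    """Apply steps (1)-(3) to a nonnegative m x n matrix A."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    scaled = (1e6 / (m * n)) * A        # step (1): rescale every entry
    logged = np.log2(1.0 + scaled)      # step (2): log transform
    mu = logged.mean()                  # mean over all entries of step (2)
    sigma = logged.std()                # standard deviation over all entries of step (2)
    return (logged - mu) / sigma        # step (3): standardize
```
   </preformat>
   <p>By construction, the returned matrix has mean 0 and unit variance over all of its entries, provided the entries of the input are not all equal (otherwise σ is 0 and step (3) is undefined).</p>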
   <p>The Gram matrix 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        W 
      </mi> 
      <mo>
        = 
      </mo> 
      <mfrac> 
       <mrow> 
        <mi>
          A 
        </mi> 
        <msup> 
         <mi>
           A 
         </mi> 
         <mtext>
           T 
         </mtext> 
        </msup> 
       </mrow> 
       <mi>
         n 
       </mi> 
      </mfrac> 
     </mrow> 
    </math> is then calculated, and its eigenvalues and eigenvectors are computed. The resulting k signal eigenvectors almost span the signal; more precisely, they are parallel to 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        D 
      </mi> 
      <mi>
        S 
      </mi> 
     </mrow> 
    </math>, where D is the diagonal matrix with a 1 for every retained row and a 0 for every zeroed row, and S is the full signal (<xref ref-type="fig" rid="fig5">
     Figure 5
    </xref>).</p>
   <p>Real-world data are often sparse, and the theorems of random matrix theory allow signals to be approximated even when as many as 95% of the entries are 0. While more data would yield higher accuracy, a matrix in which the vast majority of entries are 0 still carries enough information to characterize a potential signal <xref ref-type="bibr" rid="scirp.137696-2">
     [2]
    </xref>.</p>
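   <p>The claim that the signal eigenvector aligns with DS rather than with S can be checked numerically for k = 1. The following sketch rests on assumptions that are not taken from the text: the signal enters the data matrix as a rank-one term built from hypothetical Gaussian loadings g, the entries are Gaussian, and the denoising steps (1)-(3) are skipped because Gaussian entries can be negative. Pre-multiplication by the diagonal matrix D zeroes entire rows.</p>
   <preformat preformat-type="code" xml:space="preserve">
```python
import numpy as np


def angle_degrees(v, s):
    """Angle between v and s in degrees, ignoring the arbitrary eigenvector sign."""
    c = abs(v @ s) / (np.linalg.norm(v) * np.linalg.norm(s))
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))


def sparsified_signal_demo(m=600, n=600, keep=0.05, seed=0):
    """Return (angle to D s, angle to s) for the leading eigenvector."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))        # noise
    s = rng.standard_normal(m)             # full signal (k = 1)
    g = rng.standard_normal(n)             # hypothetical loadings (assumption)
    X = A + np.outer(s, g)                 # data with the signal already added
    # Diagonal of D: 5% ones and 95% zeros, placed at random.
    d = np.zeros(m)
    d[rng.choice(m, size=int(round(keep * m)), replace=False)] = 1.0
    D = np.diag(d)
    X_sparse = D @ X                       # pre-multiplication zeroes 95% of the rows
    W = X_sparse @ X_sparse.T / n          # Gram matrix of the sparsified data
    evals, evecs = np.linalg.eigh(W)       # eigenvalues in ascending order
    x = evecs[:, -1]                       # leading (signal) eigenvector
    return angle_degrees(x, D @ s), angle_degrees(x, s)
```
   </preformat>
   <p>Under these assumptions, the first returned angle (to Ds) is expected to be small, while the second (to the full signal s) is large, because the eigenvector is zero wherever the data were zeroed out.</p>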
  </sec><sec id="s5">
   <title>
    5. Linear Sketching</title>
   <fig id="fig5" position="float">
    <label>Figure 5</label>
    <caption>
     <title>Figure 5. Error values in the signal for a 95% sparse matrix (left) compared to its non-sparsified counterpart (right).</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId110.jpeg?20241127032318" />
   </fig>
   <fig id="fig5" position="float">
    <label>Figure 5</label>
    <caption>
     <title>Figure 5. Error values in the signal for a 95% sparse matrix (left) compared to its non-sparsified counterpart (right).</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId111.jpeg?20241127032318" />
   </fig>
   <p>Some matrices are too large for a typical computer to calculate all of their eigenvalues and eigenvectors. A linear sketch compresses a square matrix through a linear projection 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msup> 
       <mi>
         ℝ 
       </mi> 
       <mi>
         n 
       </mi> 
      </msup> 
      <mo>
        → 
      </mo> 
      <msup> 
       <mi>
         ℝ 
       </mi> 
       <mi>
         p 
       </mi> 
      </msup> 
     </mrow> 
    </math> with 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        p 
      </mi> 
      <mo>
        ≪ 
      </mo> 
      <mi>
        n 
      </mi> 
     </mrow> 
    </math> <xref ref-type="bibr" rid="scirp.137696-5">
     [5]
    </xref>. The projection matrix P is set to dimensions 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        p 
      </mi> 
      <mo>
        × 
      </mo> 
      <mi>
        n 
      </mi> 
     </mrow> 
    </math> with 95% of its entries set to 0 and 5% set to 1, meaning that the columns of the resulting sketch are linear combinations of other vectors <xref ref-type="bibr" rid="scirp.137696-5">
     [5]
    </xref>. The sketch is then denoised using the same algorithm, and its eigenvalues and eigenvectors are calculated. While the eigendecomposition of a matrix as large as 10,000 × 10,000 is too expensive to compute in a timely manner, it is possible to make multiple 100 × 100 sketches of the large matrix and obtain an accurate approximation of the signal (<xref ref-type="fig" rid="fig6">
     Figure 6
    </xref>). Although higher-dimensional sketches are more accurate, taking many low-dimensional ones dramatically reduces computation time with little loss of accuracy, since eigenvalues are orders of magnitude cheaper to calculate for a smaller matrix.</p>
   <fig id="fig6" position="float">
    <label>Figure 6</label>
    <caption>
     <title>Figure 6. Signal from two 100 × 100 sketches compared to a simulated 10,000 × 1 signal added to a 10,000 × 10,000 matrix <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId118.jpeg?20241127032318" />
   </fig>
   <fig id="fig6" position="float">
    <label>Figure 6</label>
    <caption>
     <title>Figure 6. Signal from two 100 × 100 sketches compared to a simulated 10,000 × 1 signal added to a 10,000 × 10,000 matrix <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId119.jpeg?20241127032318" />
   </fig>
   <p>As can be seen, the linear sketch approximates the signal direction of the larger matrix well while taking far less time to compute all the eigenvalues and eigenvectors. The error 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       θ 
     </mi> 
    </math>, in degrees, is given by</p>
   <p>
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        θ 
      </mi> 
      <mo>
        = 
      </mo> 
      <mfrac> 
       <mrow> 
        <mn>
          180 
        </mn> 
       </mrow> 
       <mtext>
         π 
       </mtext> 
      </mfrac> 
      <msup> 
       <mrow> 
        <mi>
          cos 
        </mi> 
       </mrow> 
       <mrow> 
        <mo>
          − 
        </mo> 
        <mn>
          1 
        </mn> 
       </mrow> 
      </msup> 
      <mrow> 
       <mo>
         ( 
       </mo> 
       <mrow> 
        <mfrac> 
         <mrow> 
          <mi>
            v 
          </mi> 
          <mo>
            ⋅ 
          </mo> 
          <mi>
            s 
          </mi> 
         </mrow> 
         <mrow> 
          <mrow> 
           <mo>
             ‖ 
           </mo> 
           <mi>
             v 
           </mi> 
           <mo>
             ‖ 
           </mo> 
          </mrow> 
          <mrow> 
           <mo>
             ‖ 
           </mo> 
           <mi>
             s 
           </mi> 
           <mo>
             ‖ 
           </mo> 
          </mrow> 
         </mrow> 
        </mfrac> 
       </mrow> 
       <mo>
         ) 
       </mo> 
      </mrow> 
     </mrow> 
    </math>,</p>
   <p>where 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <mi>
        v 
      </mi> 
      <mo>
        = 
      </mo> 
      <msup> 
       <mi>
         P 
       </mi> 
       <mtext>
         T 
       </mtext> 
      </msup> 
      <msup> 
       <mrow> 
        <mrow> 
         <mo>
           ( 
         </mo> 
         <mrow> 
          <mi>
            P 
          </mi> 
          <msup> 
           <mi>
             P 
           </mi> 
           <mtext>
             T 
           </mtext> 
          </msup> 
         </mrow> 
         <mo>
           ) 
         </mo> 
        </mrow> 
       </mrow> 
       <mrow> 
        <mo>
          − 
        </mo> 
        <mn>
          1 
        </mn> 
       </mrow> 
      </msup> 
      <mi>
        x 
      </mi> 
     </mrow> 
    </math>, 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       x 
     </mi> 
    </math> is the signal eigenvector, and 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mi>
       s 
     </mi> 
    </math> is the simulated signal.</p>
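   <p>The lifting step and the error formula can be written compactly in NumPy. When P has full row rank, v = P<sup>T</sup>(PP<sup>T</sup>)<sup>−1</sup>x is the minimum-norm vector satisfying Pv = x, that is, the pseudoinverse of P applied to x. The sketch below is a minimal illustration: the function names are hypothetical, the dimensions are small for speed, and a random vector stands in for an actual signal eigenvector of a sketch.</p>
   <preformat preformat-type="code" xml:space="preserve">
```python
import numpy as np


def lift(P, x):
    """Return v = P^T (P P^T)^(-1) x, assuming P has full row rank."""
    return P.T @ np.linalg.solve(P @ P.T, x)


def error_degrees(v, s):
    """theta = (180 / pi) * arccos(v . s / (||v|| ||s||)), as in the formula above."""
    c = (v @ s) / (np.linalg.norm(v) * np.linalg.norm(s))
    # Clip to guard against round-off pushing the cosine outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(c, -1.0, 1.0))))


def lift_demo(p=20, n=400, density=0.05, seed=0):
    """Draw a 0/1 projection matrix and lift a placeholder eigenvector."""
    rng = np.random.default_rng(seed)
    # Projection matrix with roughly 95% zeros and 5% ones.
    P = rng.binomial(1, density, size=(p, n)).astype(float)
    x = rng.standard_normal(p)      # placeholder for a signal eigenvector (assumption)
    v = lift(P, x)                  # n-dimensional vector with P v = x
    return P, x, v
```
   </preformat>
   <p>Because the sign of a computed eigenvector is arbitrary, an error close to 180 degrees describes the same direction as an error close to 0; flipping the sign of x before lifting resolves this.</p>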
   <p>Taking multiple sketches and then applying the algorithms to a new matrix 
    <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> 
      <msub> 
       <mi>
         T 
       </mi> 
       <mi>
         m 
       </mi> 
      </msub> 
     </mrow> 
    </math>, whose (i, j)th entry is the mean of the (i, j)th entries across the sketches, decreases the error dramatically (<xref ref-type="fig" rid="fig7">
     Figure 7
    </xref>).</p>
   <fig id="fig7" position="float">
    <label>Figure 7</label>
    <caption>
     <title>Figure 7. Graph of 10 sketches T1-T10, along with their mean, and the error margin for a matrix Tm <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId132.jpeg?20241127032318" />
   </fig>
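   <p>The entrywise averaging of sketches can be illustrated as follows. This is a sketch under assumptions not stated in the text: each sketch is formed as PMP<sup>T</sup> from a square matrix M, and a new 0/1 projection matrix P is drawn independently for every sketch. The function name and default values are hypothetical.</p>
   <preformat preformat-type="code" xml:space="preserve">
```python
import numpy as np


def mean_sketch(M, p, num_sketches=10, density=0.05, seed=0):
    """Form several p x p sketches of the square matrix M and average them entrywise."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    sketches = []
    for _ in range(num_sketches):
        # Independent projection matrix with roughly 95% zeros and 5% ones.
        P = rng.binomial(1, density, size=(p, n)).astype(float)
        sketches.append(P @ M @ P.T)             # assumed form of a single sketch
    # Entry (i, j) of T_m is the mean of entry (i, j) across all sketches.
    T_m = np.mean(np.stack(sketches), axis=0)
    return sketches, T_m
```
   </preformat>
   <p>The denoising steps and the eigendecomposition would then be applied to the returned mean matrix in the same way as to a single sketch.</p>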
   <p>When the above sketching process is applied to a public 32,738 × 2700 PBMC data set, which represents single-cell mRNA expression vectors from blood cells, by making 10 sketches of the matrix with rank 500, 7 eigenvalues stand out at about 1000 times the predicted T-W critical eigenvalue (<xref ref-type="fig" rid="fig8">
     Figure 8
    </xref>). This, in turn, indicates a rank-7 signal. Higher-dimensional sketches of rank 1000 and rank 2000 were also made and corroborated the result; however, using the entire 32,738-dimensional matrix would have required a much more powerful computer and taken much longer.</p>
   <fig id="fig8" position="float">
    <label>Figure 8</label>
    <caption>
     <title>Figure 8. Eigenvalues of a 500 × 500 sketch of the described PBMC data set. While most of the eigenvalues fall within the range of the M-P curve, there is a small but notable spike far out from the rest of the eigenvalues <xref ref-type="bibr" rid="scirp.137696-2">
       [2]
      </xref>.</title>
    </caption>
    <graphic mimetype="image" position="float" xlink:type="simple" xlink:href="https://html.scirp.org/file/1723907-rId133.jpeg?20241127032318" />
   </fig>
  </sec><sec id="s6">
   <title>
    6. Conclusions</title>
   <p>Despite the seemingly random nature of its subject, random matrix theory exhibits many mathematical patterns. These patterns make it possible to analyze random processes such as single-cell data or nuclear Hamiltonians and, even when the measurements are affected by human error, still reach convincing conclusions about the behavior of these systems. In addition, universality with respect to the mean, standard deviation, and distribution of the entries of a random matrix further highlights the predictable properties of random matrices.</p>
   <p>This approach has been used in many instances to separate noise and improve calculation efficiency while maintaining an accurate depiction of the properties of a matrix from which it would otherwise be difficult to glean any information. With the growing importance of matrices in emerging fields like biotechnology and artificial intelligence, random matrix methods will be a very important tool for solving future problems.</p>
  </sec>
 </body><back>
  <ref-list>
   <title>References</title>
   <ref id="scirp.137696-ref1">
    <label>1</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Pedregosa, F., Paquette, C., Trogdon, T. and Pennington, J. (n.d.) (2024) Random Matrix Theory and Machine Learning Tutorial [PowerPoint Slides]. https://random-matrix-learning.github.io/#presentation1
    </mixed-citation>
   </ref>
   <ref id="scirp.137696-ref2">
    <label>2</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Jiang, N. (2024) Jiang Random Matrix Research. https://github.com/nathanjiang100/Jiang-Random-Matrix-Research
    </mixed-citation>
   </ref>
   <ref id="scirp.137696-ref3">
    <label>3</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Edelman, A. and Rao, N.R. (2005) Random Matrix Theory. Acta Numerica, 14, 233-297. https://doi.org/10.1017/s0962492904000236
    </mixed-citation>
   </ref>
   <ref id="scirp.137696-ref4">
    <label>4</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     Aparicio, L., Bordyuh, M., Blumberg, A.J. and Rabadan, R. (2020) A Random Matrix Theory Approach to Denoise Single-Cell Data. Patterns, 1, Article 100035. https://doi.org/10.1016/j.patter.2020.100035
    </mixed-citation>
   </ref>
   <ref id="scirp.137696-ref5">
    <label>5</label>
    <mixed-citation publication-type="other" xlink:type="simple">
     McGregor, A. (n.d.) (2024) Linear Sketches with Applications to Data Streams [PowerPoint Slides]. University of Massachusetts Amherst. https://people.cs.umass.edu/~mcgregor/stocworkshop/mcgregor.pdf
    </mixed-citation>
   </ref>
  </ref-list>
 </back>
</article>