<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">JCC</journal-id><journal-title-group><journal-title>Journal of Computer and Communications</journal-title></journal-title-group><issn pub-type="epub">2327-5219</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/jcc.2024.124002</article-id><article-id pub-id-type="publisher-id">JCC-132267</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject></subj-group></article-categories><title-group><article-title>
 
 
  Low-Rank Multi-View Subspace Clustering Based on Sparse Regularization
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Yan</surname><given-names>Sun</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Fanlong</surname><given-names>Zhang</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>School of Computer Science, Nanjing Audit University, Nanjing, China</addr-line></aff><pub-date pub-type="epub"><day>01</day><month>04</month><year>2024</year></pub-date><volume>12</volume><issue>04</issue><fpage>14</fpage><lpage>30</lpage><history><date date-type="received"><day>2,</day>	<month>March</month>	<year>2024</year></date><date date-type="rev-recd"><day>30,</day>	<month>March</month>	<year>2024</year>	</date><date date-type="accepted"><day>2,</day>	<month>April</month>	<year>2024</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Multi-view Subspace Clustering (MVSC) is an advanced clustering method that integrates diverse views to uncover a common subspace, enhancing the accuracy and robustness of clustering results. The low-rank prior plays a significant role in MVSC because it captures the global data structure across views. However, existing low-rank MVSC methods are sensitive to outliers because they rely on the Frobenius norm to measure error. To address this, our paper proposes a Low-Rank Multi-view Subspace Clustering Based on Sparse Regularization (LMVSC-Sparse) approach. Sparse regularization helps in selecting the most relevant features or views for clustering while ignoring irrelevant or noisy ones. This leads to a more efficient and effective representation of the data, improving clustering accuracy and robustness, especially in the presence of outliers or noisy data. By incorporating sparse regularization, LMVSC-Sparse can effectively handle the outlier sensitivity that is a common challenge in traditional MVSC methods relying solely on low-rank priors. The Alternating Direction Method of Multipliers (ADMM) algorithm is then employed to solve the proposed optimization problem. Our comprehensive experiments demonstrate the efficiency and effectiveness of LMVSC-Sparse, offering a robust alternative to traditional MVSC methods.
 
</p></abstract><kwd-group><kwd>Clustering</kwd><kwd>Multi-View Subspace Clustering</kwd><kwd>Low-Rank Prior</kwd><kwd>Sparse Regularization</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Clustering plays a significant role in machine learning and artificial intelligence (AI), acting as a foundational technique that underpins many of the processes and applications within these fields [<xref ref-type="bibr" rid="scirp.132267-ref1">1</xref>] [<xref ref-type="bibr" rid="scirp.132267-ref2">2</xref>].</p><p>Multi-view Subspace Clustering (MVSC) is an advanced clustering technique that is particularly suited for handling data that naturally comes from multiple sources or “views.” This approach is based on the principle that different views of the data can provide complementary information that should be integrated when performing clustering. The goal of MVSC is to find a common subspace that best represents the underlying structure of the data across all views, thereby improving the quality and accuracy of the clustering results. As shown in [<xref ref-type="bibr" rid="scirp.132267-ref3">3</xref>] and [<xref ref-type="bibr" rid="scirp.132267-ref4">4</xref>], by leveraging multiple views of the data, MVSC can achieve higher clustering accuracy than single-view clustering methods, especially when the views are complementary. 
The works [<xref ref-type="bibr" rid="scirp.132267-ref5">5</xref>] and [<xref ref-type="bibr" rid="scirp.132267-ref6">6</xref>] claim that the integration of multiple views can make the clustering process more robust to noise and redundancy within individual views, as the method can exploit the clean and informative parts of each view.</p><p>As a result, MVSC has been applied in various areas, such as image and video analysis [<xref ref-type="bibr" rid="scirp.132267-ref7">7</xref>], bioinformatics [<xref ref-type="bibr" rid="scirp.132267-ref8">8</xref>] [<xref ref-type="bibr" rid="scirp.132267-ref9">9</xref>], and social network analysis [<xref ref-type="bibr" rid="scirp.132267-ref10">10</xref>]. Within MVSC, the low-rank prior is a critical technique used to capture the global structure of data across multiple views while ensuring that the representation is compact and meaningful. The low-rank prior is based on the assumption that the data from all views lie on or near a low-dimensional subspace, and this inherent structure can be exploited to improve clustering performance. Key contributions among low-rank subspace-based methods include Latent Multi-view Subspace Clustering (LMSC) [<xref ref-type="bibr" rid="scirp.132267-ref11">11</xref>], Multimodal Sparse and Low-rank Subspace Clustering (MLRSSC) [<xref ref-type="bibr" rid="scirp.132267-ref12">12</xref>], Flexible Multi-view Representation Learning for Subspace Clustering (FMR) [<xref ref-type="bibr" rid="scirp.132267-ref13">13</xref>] and Dual Shared-Specific Multi-view Subspace Clustering (DSS-MSC) [<xref ref-type="bibr" rid="scirp.132267-ref14">14</xref>]. 
These methods, grounded in the well-established framework of low-rank representation, have demonstrated competitive clustering performance in empirical studies.</p><p>In particular, the recent work [<xref ref-type="bibr" rid="scirp.132267-ref15">15</xref>] introduced an efficient and effective approach termed Facilitated Low-rank Multi-view Subspace Clustering (FLMSC). It factorizes the view-specific representation matrix into two small factor matrices, i.e., an orthogonal dictionary and a latent representation, which can fully explore the underlying subspace structure of multiple views. However, this approach still suffers from one issue: it uses the F-norm to measure the error between the observed data and the reconstructed data. As is well known, the F-norm is sensitive to outliers, because it squares the values before summing them, which disproportionately increases the impact of larger differences.</p><p>To address the aforementioned drawback, this paper develops a Low-Rank Multi-view Subspace Clustering Based on Sparse Regularization approach. The main contributions and novelty of this work are summarized below. 1) By employing sparse regularization, this paper presents a robust low-rank multi-view subspace clustering approach termed LMVSC-Sparse. 2) This paper develops an Alternating Direction Method of Multipliers (ADMM) algorithm to solve the proposed optimization problem. 3) Comprehensive experiments are conducted on benchmark data sets, which show the advantage of our approach in both efficiency and effectiveness.</p><p>The rest of the paper is organized as follows. Section 2 reviews the related model. In Section 3, we propose a novel model, low-rank multi-view subspace clustering based on sparse regularization (LMVSC-Sparse), and its corresponding optimization algorithm. Section 4 reports the experiments on benchmark data, and Section 5 concludes the paper.</p></sec><sec id="s2"><title>2. 
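Foundational Model</title><p>Let <inline-formula><tex-math>X = \{x_{:1}, \cdots, x_{:n}\} \in \mathbb{R}^{d \times n}</tex-math></inline-formula> denote a collection of data samples, where <inline-formula><tex-math>d</tex-math></inline-formula> is the feature dimension and <inline-formula><tex-math>n</tex-math></inline-formula> is the number of data samples. Traditional subspace clustering based on self-expression can then be modeled by (1):</p><p><disp-formula><label>(1)</label><tex-math>X = XZ + E,</tex-math></disp-formula></p><p>where <inline-formula><tex-math>Z \in \mathbb{R}^{n \times n}</tex-math></inline-formula> denotes the subspace representation with respect to the dictionary <inline-formula><tex-math>X</tex-math></inline-formula> and <inline-formula><tex-math>E</tex-math></inline-formula> denotes the error matrix. The corresponding optimization problem based on the low-rank prior is given by (2):</p><p><disp-formula><label>(2)</label><tex-math>\min_{Z} \|XZ - X\|_{F}^{2} + \lambda \|Z\|_{*},</tex-math></disp-formula></p><p>in which <inline-formula><tex-math>\|Z\|_{*}</tex-math></inline-formula> is the nuclear norm, computed by summing the singular values of <inline-formula><tex-math>Z</tex-math></inline-formula>. The nuclear norm is widely used as a measure of the low-rank property. Problem (2) is also called low-rank representation (LRR) [<xref ref-type="bibr" rid="scirp.132267-ref16">16</xref>]. LRR has been successfully applied in dimensionality reduction [<xref ref-type="bibr" rid="scirp.132267-ref17">17</xref>], noise reduction [<xref ref-type="bibr" rid="scirp.132267-ref18">18</xref>], recommendation systems [<xref ref-type="bibr" rid="scirp.132267-ref19">19</xref>], image processing [<xref ref-type="bibr" rid="scirp.132267-ref20">20</xref>], and also clustering and classification [<xref ref-type="bibr" rid="scirp.132267-ref21">21</xref>] [<xref ref-type="bibr" rid="scirp.132267-ref22">22</xref>].</p><p>After solving (2), the affinity matrix <inline-formula><tex-math>W</tex-math></inline-formula> can be obtained by</p><p><disp-formula><label>(3)</label><tex-math>W = \frac{1}{2}\left(|Z| + |Z^{T}|\right).</tex-math></disp-formula></p><p>The final clustering results are then obtained by applying a spectral clustering algorithm to <inline-formula><tex-math>W</tex-math></inline-formula>.</p><p>Problem (2) can be extended to the general problem (4):</p><p><disp-formula><label>(4)</label><tex-math>\min_{Z} \|XZ - X\|_{F}^{2} + \lambda f\left(\|Z\|_{*}\right),</tex-math></disp-formula></p><p>where the function <inline-formula><tex-math>f</tex-math></inline-formula> denotes a general regularizer.</p><p>Now consider multi-view data samples, denoted by <inline-formula><tex-math>X = \{X^{(1)T}, \cdots, X^{(V)T}\}^{T}</tex-math></inline-formula>, where <inline-formula><tex-math>X^{(v)} = [x_{:1}^{(v)}, \cdots, x_{:n}^{(v)}] \in \mathbb{R}^{d_{v} \times n}</tex-math></inline-formula> is the <inline-formula><tex-math>v</tex-math></inline-formula>-th view. 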
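</p><p>As a minimal illustrative sketch (not the authors' implementation), the quantities appearing in (2) and (3) and the stacking of views described above can be written in Python with NumPy. The function names, the toy dimensions and the random matrices are assumptions introduced here for illustration only, and no optimization of (2) is performed.</p><preformat preformat-type="code">
```python
import numpy as np


def nuclear_norm(Z):
    """Nuclear norm: the sum of the singular values of Z."""
    return float(np.linalg.svd(Z, compute_uv=False).sum())


def lrr_objective(X, Z, lam):
    """Value of the LRR objective in Eq. (2) for a candidate representation Z."""
    residual = X @ Z - X
    return float(np.sum(residual ** 2)) + lam * nuclear_norm(Z)


def affinity_from_representation(Z):
    """Symmetric, non-negative affinity W = (|Z| + |Z^T|) / 2, as in Eq. (3)."""
    A = np.abs(Z)
    return 0.5 * (A + A.T)


def stack_views(views):
    """Stack view matrices X^(v) of shape (d_v, n) into one (sum of d_v, n) matrix."""
    return np.vstack(views)


# Toy data (hypothetical): two views of the same n = 6 samples.
rng = np.random.default_rng(0)
views = [rng.standard_normal((3, 6)), rng.standard_normal((5, 6))]
X = stack_views(views)

# Z = I reproduces X exactly, so only the nuclear-norm term remains (lam * n).
Z_identity = np.eye(6)
obj_identity = lrr_objective(X, Z_identity, lam=0.5)

# An arbitrary, non-optimized representation, used only to illustrate Eq. (3).
Z_random = rng.standard_normal((6, 6))
W = affinity_from_representation(Z_random)
```
</preformat><p>In this sketch the identity representation fits the data perfectly but has full rank, which is why the nuclear-norm term is needed to favor low-rank solutions. A spectral clustering routine would then be applied to the affinity matrix; that step is omitted here.</p><p>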
The LRR (2) can be extended to (5):</p><p><disp-formula><label>(5)</label><tex-math>\min_{\{Z^{(v)}\}_{v=1}^{V}} \sum_{v=1}^{V} \|X^{(v)} - X^{(v)}Z^{(v)}\|_{F}^{2} + \lambda \sum_{v=1}^{V} \|Z^{(v)}\|_{*}.</tex-math></disp-formula></p><p>Recently, [<xref ref-type="bibr" rid="scirp.132267-ref15">15</xref>] further extended (5) to the following problem (6):</p><p><disp-formula><label>(6)</label><tex-math>\min_{\{Z^{(v)}, L^{(v)}, C^{(v)}\}_{v=1}^{V}} \sum_{v=1}^{V} \|X^{(v)} - X^{(v)}Z^{(v)}\|_{F}^{2} + \lambda_{1} \sum_{v=1}^{V} \|C^{(v)}\|_{*} + \lambda_{2} \sum_{v=1, v \neq w}^{V} \|C^{(v)} - C^{(w)}\|_{F}^{2} \quad \text{s.t.} \quad Z^{(v)} = L^{(v)}C^{(v)}, \quad L^{(v)T}L^{(v)} = I, \quad \forall v,</tex-math></disp-formula></p><p>where <inline-formula><tex-math>\lambda_{1}</tex-math></inline-formula> and <inline-formula><tex-math>\lambda_{2}</tex-math></inline-formula> are two positive balancing parameters.</p><p>Compared to (5), problem (6) factorizes the view-specific representation <inline-formula><tex-math>Z^{(v)}</tex-math></inline-formula> into two small matrices <inline-formula><tex-math>L^{(v)}</tex-math></inline-formula> and <inline-formula><tex-math>C^{(v)}</tex-math></inline-formula>, and employs the property <inline-formula><tex-math>\|C^{(v)}\|_{*} = \|Z^{(v)}\|_{*}</tex-math></inline-formula>, which holds when <inline-formula><tex-math>L^{(v)}</tex-math></inline-formula> has orthonormal columns, i.e. <inline-formula><tex-math>L^{(v)T}L^{(v)} = I</tex-math></inline-formula>.</p></sec><sec id="s3"><title>3. Proposed Approach</title><p>In this section, we use sparse regularization in place of the F-norm in (6). Sparse regularization is often preferred over F-norm regularization in various machine learning and signal processing applications due to its unique properties, especially when dealing with high-dimensional data or models with many parameters. Sparse regularization has two benefits. 1) Sparse regularization promotes sparsity of the solution. It encourages the model to use fewer features or parameters by driving the coefficients of less important features to zero. This is particularly useful in feature selection and for models where interpretability is important, as it highlights which features are most relevant to the prediction. 2) Sparse regularization is effective at preventing overfitting, especially in high-dimensional settings where the number of features greatly exceeds the number of observations. 
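By encouraging a model to concentrate on fewer variables, it reduces the model’s complexity and limits its capacity to fit noise.</p><p>Thus, we replace the F-norm in (6) by the L1-norm and obtain the Low-Rank Multi-view Subspace Clustering Based on Sparse Regularization (LMVSC-Sparse) model:</p><p><disp-formula><label>(7)</label><tex-math>\min_{\{Z^{(v)}, L^{(v)}, C^{(v)}\}_{v=1}^{V}} \sum_{v=1}^{V} \|X^{(v)} - X^{(v)}Z^{(v)}\|_{1} + \lambda_{1} \sum_{v=1}^{V} \|C^{(v)}\|_{*} + \lambda_{2} \sum_{v=1, v \neq w}^{V} \|C^{(v)} - C^{(w)}\|_{F}^{2} \quad \text{s.t.} \quad Z^{(v)} = L^{(v)}C^{(v)}, \quad L^{(v)T}L^{(v)} = I, \quad \forall v.</tex-math></disp-formula></p><p>Based on the framework of ADMM [<xref ref-type="bibr" rid="scirp.132267-ref23">23</xref>], we propose an efficient optimization algorithm to solve the above minimization problem. First, the corresponding augmented Lagrangian function is formulated as follows:</p><p><disp-formula><label>(8)</label><tex-math>\begin{aligned} \mathcal{L}\left(\{Z^{(v)}, L^{(v)}, C^{(v)}\}_{v=1}^{V}\right) = {} &amp; \sum_{v=1}^{V} \|X^{(v)} - X^{(v)}Z^{(v)}\|_{1} + \lambda_{1} \sum_{v=1}^{V} \|C^{(v)}\|_{*} + \frac{\lambda_{2}}{V-1} \sum_{v=1, v \neq w}^{V} \|C^{(v)} - C^{(w)}\|_{F}^{2} \\ &amp; + \sum_{v=1}^{V} \langle Y^{(v)}, Z^{(v)} - L^{(v)}C^{(v)} \rangle + \frac{\mu}{2} \sum_{v=1}^{V} \|Z^{(v)} - L^{(v)}C^{(v)}\|_{F}^{2} \\ &amp; \text{s.t.} \quad L^{(v)T}L^{(v)} = I, \quad \forall v, \end{aligned}</tex-math></disp-formula></p><p>where <inline-formula><tex-math>\{Y^{(v)}\}_{v=1}^{V}</tex-math></inline-formula> are the Lagrange multipliers and <inline-formula><tex-math>\mu</tex-math></inline-formula> is the penalty parameter. It is not easy to optimize all the variables at the same time. Therefore, we adopt an iterative optimization scheme that updates the variables one by one. The update steps are as follows.</p><p>1) Update the variables <inline-formula><tex-math>\{Z^{(v)}\}_{v=1}^{V}</tex-math></inline-formula></p><p>When fixing the other variables, we solve the following minimization sub-problem w.r.t. the variable <inline-formula><tex-math>Z^{(v)}</tex-math></inline-formula>:</p><p><disp-formula><label>(9)</label><tex-math>\min_{Z^{(v)}} \|X^{(v)} - X^{(v)}Z^{(v)}\|_{1} + \langle Y^{(v)}, Z^{(v)} - L^{(v)}C^{(v)} \rangle + \frac{\mu}{2} \|Z^{(v)} - L^{(v)}C^{(v)}\|_{F}^{2}.</tex-math></disp-formula></p><p>This can be rewritten equivalently as:</p><p><disp-formula><label>(10)</label><tex-math>\min_{Z^{(v)}} \|X^{(v)} - X^{(v)}Z^{(v)}\|_{1} + \frac{\mu}{2} \left\|Z^{(v)} - L^{(v)}C^{(v)} + \frac{1}{\mu}Y^{(v)}\right\|_{F}^{2}.</tex-math></disp-formula></p><p>By introducing the auxiliary variable <inline-formula><tex-math>H</tex-math></inline-formula>, and omitting the superscript for simplicity, we have:</p><p><disp-formula><label>(11)</label><tex-math>\min_{Z, H} \|H\|_{1} + \frac{\mu}{2} \left\|Z - LC + \frac{1}{\mu}Y\right\|_{F}^{2} \quad \text{s.t.} \quad X - XZ = H.</tex-math></disp-formula></p><p>This is a constrained optimization problem. We adopt the half-quadratic splitting (HQS) [<xref ref-type="bibr" rid="scirp.132267-ref24">24</xref>] algorithm for its simplicity and fast convergence. The problem is then solved through the following minimization:</p><p><disp-formula><label>(12)</label><tex-math>\min_{Z, H} \|H\|_{1} + \frac{\mu}{2} \left\|Z - LC + \frac{1}{\mu}Y\right\|_{F}^{2} + \frac{\eta}{2} \|X - XZ - H\|_{F}^{2},</tex-math></disp-formula></p><p>where <inline-formula><tex-math>\eta</tex-math></inline-formula> is a penalty parameter that forces <inline-formula><tex-math>X - XZ</tex-math></inline-formula> and <inline-formula><tex-math>H</tex-math></inline-formula> to approach the same fixed point. Subsequently, <inline-formula><tex-math>H</tex-math></inline-formula> and <inline-formula><tex-math>Z</tex-math></inline-formula> can be updated through the following two sub-problems.</p><p>Sub-problem one:</p><p><disp-formula><label>(13)</label><tex-math>\min_{H} f_{1} = \|H\|_{1} + \frac{\eta}{2} \|X - XZ - H\|_{F}^{2}.</tex-math></disp-formula></p><p>Sub-problem two:</p><p><disp-formula><label>(14)</label><tex-math>\min_{Z} f_{2} = \frac{\mu}{2} \left\|Z - LC + \frac{1}{\mu}Y\right\|_{F}^{2} + \frac{\eta}{2} \|X - XZ - H\|_{F}^{2}.</tex-math></disp-formula></p><p>For the first sub-problem, let <inline-formula><tex-math>S_{\tau} : \mathbb{R} \to \mathbb{R}</tex-math></inline-formula> denote the shrinkage operator <inline-formula><tex-math>S_{\tau}(x) = \operatorname{sgn}(x) \max(|x| - \tau, 0)</tex-math></inline-formula>, extended to matrices by applying it to each element. Because the quadratic term in (13) is weighted by <inline-formula><tex-math>\eta/2</tex-math></inline-formula>, the threshold is <inline-formula><tex-math>1/\eta</tex-math></inline-formula>, and the solution of the sub-problem is given by</p><p><disp-formula><label>(15)</label><tex-math>H = S_{1/\eta}(X - XZ).</tex-math></disp-formula></p><p>The second sub-problem is equivalent, after multiplying <inline-formula><tex-math>f_{2}</tex-math></inline-formula> by <inline-formula><tex-math>2/\eta</tex-math></inline-formula>, to the following problem:</p><p><disp-formula><label>(16)</label><tex-math>\min_{Z} f_{3} = \frac{\mu}{\eta} \left\|Z - LC + \frac{1}{\mu}Y\right\|_{F}^{2} + \|X - XZ - H\|_{F}^{2}.</tex-math></disp-formula></p><p>Its solution is obtained by setting the derivative to zero. Expanding <inline-formula><tex-math>f_{3}</tex-math></inline-formula> and, from the second line onward, omitting the terms that do not depend on <inline-formula><tex-math>Z</tex-math></inline-formula>, we obtain:</p><p><disp-formula><label>(17)</label><tex-math>\begin{aligned} f_{3} &amp;= \frac{\mu}{\eta} \operatorname{trace}\left[\left(Z - LC + \frac{1}{\mu}Y\right)^{T}\left(Z - LC + \frac{1}{\mu}Y\right)\right] + \operatorname{trace}\left[(X - XZ - H)^{T}(X - XZ - H)\right] \\ &amp;= \frac{\mu}{\eta} \operatorname{trace}\left[Z^{T}Z + 2Z^{T}\left(\frac{1}{\mu}Y - LC\right)\right] + \operatorname{trace}\left[Z^{T}X^{T}XZ + 2Z^{T}X^{T}(H - X)\right] \\ &amp;= \operatorname{trace}\left[\frac{\mu}{\eta}Z^{T}Z + 2Z^{T}\left(\frac{1}{\eta}Y - \frac{\mu}{\eta}LC\right)\right] + \operatorname{trace}\left[Z^{T}X^{T}XZ + 2Z^{T}X^{T}(H - X)\right] \\ &amp;= \operatorname{trace}\left[Z^{T}\left(\frac{\mu}{\eta}I + X^{T}X\right)Z + 2Z^{T}\left(\frac{1}{\eta}Y - \frac{\mu}{\eta}LC + X^{T}(H - X)\right)\right]. \end{aligned}</tex-math></disp-formula></p><p>Then</p><p><disp-formula><label>(18)</label><tex-math>\frac{\partial f_{3}}{\partial Z} = 2\left(\frac{\mu}{\eta}I + X^{T}X\right)Z + 2\left(\frac{1}{\eta}Y - \frac{\mu}{\eta}LC + X^{T}(H - X)\right).</tex-math></disp-formula></p><p>Setting <inline-formula><tex-math>\frac{\partial f_{3}}{\partial Z} = 0</tex-math></inline-formula>, we have</p><p><disp-formula><label>(19)</label><tex-math>Z = \left(\frac{\mu}{\eta}I + X^{T}X\right)^{-1}\left(-\frac{1}{\eta}Y + \frac{\mu}{\eta}LC + X^{T}(X - H)\right).</tex-math></disp-formula></p><p>Depending on the size of <inline-formula><tex-math>X</tex-math></inline-formula>, the inverse in (19) can also be rewritten with the Sherman-Morrison-Woodbury identity, which requires inverting a <inline-formula><tex-math>d \times d</tex-math></inline-formula> matrix instead of an <inline-formula><tex-math>n \times n</tex-math></inline-formula> one:</p><p><disp-formula><label>(20)</label><tex-math>\left(\frac{\mu}{\eta}I + X^{T}X\right)^{-1} = \frac{\eta}{\mu}I - \frac{\eta}{\mu}X^{T}\left(I + X\frac{\eta}{\mu}X^{T}\right)^{-1}X\frac{\eta}{\mu} = \frac{\eta}{\mu}\left(I - \frac{\eta}{\mu}X^{T}\left(I + \frac{\eta}{\mu}XX^{T}\right)^{-1}X\right).</tex-math></disp-formula></p><p>2) Update rule for the variables <inline-formula><tex-math>\{L^{(v)}\}_{v=1}^{V}</tex-math></inline-formula></p><p>When fixing the other variables, we solve the following minimization sub-problem w.r.t. the variable <inline-formula><tex-math>L^{(v)}</tex-math></inline-formula>:</p><p><disp-formula><label>(21)</label><tex-math>\min_{L^{(v)}} \langle Y^{(v)}, Z^{(v)} - L^{(v)}C^{(v)} \rangle + \frac{\mu}{2} \|Z^{(v)} - L^{(v)}C^{(v)}\|_{F}^{2} \quad \text{s.t.} \quad L^{(v)T}L^{(v)} = I.</tex-math></disp-formula></p><p>This constrained problem can be further reduced to the following form:</p><p><disp-formula><label>(22)</label><tex-math>\max_{L^{(v)}} \operatorname{Tr}\left(L^{(v)T}R^{(v)}\right) \quad \text{s.t.} \quad L^{(v)T}L^{(v)} = I,</tex-math></disp-formula></p><p>where <inline-formula><tex-math>R^{(v)} = \left(Z^{(v)} + \frac{1}{\mu}Y^{(v)}\right)C^{(v)T}</tex-math></inline-formula>. Before solving (22), we need the following lemma:</p><p>Lemma 1. For any matrix <inline-formula><tex-math>A \in \mathbb{R}^{m \times n}</tex-math></inline-formula> with singular value decomposition (SVD) <inline-formula><tex-math>A = U \Lambda V^{T}</tex-math></inline-formula>, the constrained problem</p><p><disp-formula><label>(23)</label><tex-math>\max_{Y} \operatorname{Tr}\left(Y^{T}A\right) \quad \text{s.t.} \quad Y^{T}Y = I</tex-math></disp-formula></p><p>has the following closed-form solution:</p><p><disp-formula><label>(24)</label><tex-math>Y = UV^{T}.</tex-math></disp-formula></p><p>Based on Lemma 1, by computing the SVD of the matrix <inline-formula><tex-math>R^{(v)}</tex-math></inline-formula> as <inline-formula><tex-math>R^{(v)} = U_{R^{(v)}} \Lambda_{R^{(v)}} W_{R^{(v)}}^{T}</tex-math></inline-formula>, the solution of (22) is given by:</p><p><disp-formula><label>(25)</label><tex-math>L^{(v)} = U_{R^{(v)}} W_{R^{(v)}}^{T}.</tex-math></disp-formula></p><p>3) Update rule for the variables <inline-formula><tex-math>\{C^{(v)}\}_{v=1}^{V}</tex-math></inline-formula></p><p>When fixing the other variables, we obtain the problem (26):</p><p><disp-formula><label>(26)</label><tex-math>\min_{C^{(v)}} \lambda_{1} \|C^{(v)}\|_{*} + \frac{\lambda_{2}}{V-1} \sum_{v \neq w} \|C^{(v)} - C^{(w)}\|_{F}^{2} + \langle Y^{(v)}, Z^{(v)} - L^{(v)}C^{(v)} \rangle + \frac{\mu}{2} \|Z^{(v)} - L^{(v)}C^{(v)}\|_{F}^{2}.</tex-math></disp-formula></p><p>Before solving (26), we need the following lemma [<xref ref-type="bibr" rid="scirp.132267-ref25">25</xref>]:</p><p>Lemma 2. For a given matrix <inline-formula><tex-math>F</tex-math></inline-formula> and a positive parameter <inline-formula><tex-math>\tau &gt; 0</tex-math></inline-formula>, the optimal solution to the following problem</p><p><disp-formula><label>(27)</label><tex-math>\min_{D} \tau \|D\|_{*} + \frac{1}{2} \|D - F\|_{F}^{2}</tex-math></disp-formula></p><p>is given by</p><p><disp-formula><label>(28)</label><tex-math>D = U_{F} \Theta_{\tau}(\Sigma_{F}) W_{F}^{T},</tex-math></disp-formula></p><p>where <inline-formula><tex-math>U_{F} \Sigma_{F} W_{F}^{T}</tex-math></inline-formula> is the SVD of the matrix <inline-formula><tex-math>F</tex-math></inline-formula>, and <inline-formula><tex-math>\Theta_{\tau}(\cdot)</tex-math></inline-formula> is defined as follows:</p><p><disp-formula><label>(29)</label><tex-math>\Theta_{\tau}(\Sigma_{F}) = \max(0, \Sigma_{F} - \tau) + \min(0, \Sigma_{F} + \tau).</tex-math></disp-formula></p><p>Based on Lemma 2, by setting <inline-formula><tex-math>\gamma = \frac{\lambda_{1}}{2(\mu + \lambda_{2})}</tex-math></inline-formula>, the closed-form solution for the variable <inline-formula><tex-math>C^{(v)}</tex-math></inline-formula> is as follows:</p><p><disp-formula><label>(30)</label><tex-math>C^{(v)} = U_{H^{(v)}} \Theta_{\gamma}(\Sigma_{H^{(v)}}) W_{H^{(v)}}^{T},</tex-math></disp-formula></p><p>where <inline-formula><tex-math>U_{H^{(v)}} \Sigma_{H^{(v)}} W_{H^{(v)}}^{T}</tex-math></inline-formula> is the SVD of <inline-formula><tex-math>\frac{1}{\mu + \lambda_{2}} H^{(v)}</tex-math></inline-formula>, and</p><p><disp-formula><label>(31)</label><tex-math>H^{(v)} = \mu L^{(v)T}\left(Z^{(v)} + \frac{1}{\mu}Y^{(v)}\right) + \frac{\lambda_{2}}{V-1} \sum_{w \neq v} C^{(w)}.</tex-math></disp-formula></p><p>The final affinity matrix <inline-formula><tex-math>S</tex-math></inline-formula> is obtained as follows:</p><p><disp-formula><label>(32)</label><tex-math>\tilde{S} = \frac{1}{V} \sum_{v=1}^{V} L^{(v)}C^{(v)},</tex-math></disp-formula></p><p>and <inline-formula><tex-math>S = \frac{|\tilde{S}| + |\tilde{S}|^{T}}{2}</tex-math></inline-formula>.</p><p>In a nutshell, the detailed optimization process for LMVSC-Sparse is summarized in Algorithm 1.</p></sec><sec id="s4"><title>4. Experiments</title><p>The proposed algorithm is compared with four state-of-the-art clustering algorithms, namely, Facilitated Low-rank Multi-view Subspace Clustering (FLMSC) [<xref ref-type="bibr" rid="scirp.132267-ref15">15</xref>], Scalable Multi-view Subspace Clustering with Unified Anchors (SMVSC) [<xref ref-type="bibr" rid="scirp.132267-ref26">26</xref>], Large-Scale Multi-View Subspace Clustering (LMVSC) [<xref ref-type="bibr" rid="scirp.132267-ref3">3</xref>], and Graph-based Multi-view Clustering (GMC) [<xref ref-type="bibr" rid="scirp.132267-ref27">27</xref>].</p><sec id="s4_1"><title>4.1. Data and Metrics</title><p>To verify performance, the BBC data set [<xref ref-type="bibr" rid="scirp.132267-ref28">28</xref>] is used for clustering. There are 2225 documents over 5 annotated topics in this data set. 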
In the experiments, we use a sampled subset of the original BBC data set consisting of 685 documents and four different views, with 4659, 4633, 4665 and 4684 features in each view, respectively.</p><p>For the evaluation metrics, F-score, Normalized Mutual Information (NMI), Accuracy (ACC), and Adjusted Rand index (AR) are employed. The F-score, also known as the F1-score or F-measure, combines precision and recall into a single score. NMI evaluates the similarity between two clusterings of a data set. It measures the mutual dependence between two variables, in this case the two sets of cluster assignments. ACC is a common evaluation metric for clustering when ground-truth labels are available. It measures the proportion of data points that are correctly assigned to their true clusters.</p><p>ACC is popular due to its simplicity and ability to handle varying cluster sizes and shapes. AR assesses the similarity between two clusterings by considering all pairs of samples and counting the pairs that are assigned to the same or different clusters in the predicted and true clusterings. It then adjusts the raw Rand index to account for the similarity expected between clusterings by chance.</p></sec><sec id="s4_2"><title>4.2. Comparison with State-of-the-Art Methods</title><p>In the first experiment, noise is added to 1% of the elements of the BBC data set, with the noise level varying from 0 to 0.5. We then compare the F-score, NMI, ACC, and AR of the algorithms. The results are shown in Figures 1-4.</p><p>From <xref ref-type="fig" rid="fig1">Figure 1</xref>, one might conclude that LMVSC-Sparse is the most robust method in the presence of noise, maintaining a high F-score across all tested noise levels. In contrast, GMC is the least effective method in terms of F-score, regardless of the noise level. 
The other methods show varying degrees of decline in their F-scores as the noise level increases, suggesting that they are more sensitive to noise than LMVSC-Sparse. These observations could be useful for selecting a method for applications where data is expected to have a certain level of noise. LMVSC-Sparse might be preferable in environments where noise is unavoidable or difficult to control.</p><p>In <xref ref-type="fig" rid="fig2">Figure 2</xref>, the overall trend indicates that all methods suffer a decline in clustering performance as noise increases, but to varying degrees. LMVSC-Sparse appears to be the most robust against noise, maintaining a high NMI throughout. GMC is markedly affected by noise, with a significant decrease in NMI as the noise level rises. For applications where maintaining clustering quality in the presence of noise is important, LMVSC-Sparse would likely be the preferred choice based on this data. The other methods may still be considered, but their suitability will depend on the acceptable threshold for NMI in the specific application. <xref ref-type="fig" rid="fig3">Figure 3</xref> and <xref ref-type="fig" rid="fig4">Figure 4</xref> show trends similar to those in <xref ref-type="fig" rid="fig1">Figure 1</xref> and <xref ref-type="fig" rid="fig2">Figure 2</xref>.</p><p>The mean running times of the algorithms are summarized in <xref ref-type="table" rid="table1">Table 1</xref>. From this table, we can conclude that LMVSC is the fastest algorithm across all noise levels, making it suitable for applications where running time is critical. LMVSC-Sparse is the slowest, which might be a trade-off for its robust performance in terms of F-score, NMI, ACC and AR, as indicated in the previous figures. 
Choosing the right algorithm would depend on the balance between accuracy (as measured by F-score, NMI, ACC and AR) and efficiency (as measured by running time), alongside the specific requirements of the application or task at hand.</p><table-wrap id="table1"><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title>Comparison of running times (time unit: s)</title></caption><table><thead><tr><th align="center" valign="middle">Noise Level</th><th align="center" valign="middle">LMVSC-Sparse</th><th align="center" valign="middle">FLMSC</th><th align="center" valign="middle">SMVSC</th><th align="center" valign="middle">LMVSC</th><th align="center" valign="middle">GMC</th></tr></thead><tbody><tr><td align="center" valign="middle">0</td><td align="center" valign="middle">44.47</td><td align="center" valign="middle">19.93</td><td align="center" valign="middle">21.84</td><td align="center" valign="middle">2.26</td><td align="center" valign="middle">4.02</td></tr><tr><td align="center" valign="middle">0.1</td><td align="center" valign="middle">44.39</td><td align="center" valign="middle">19.66</td><td align="center" valign="middle">21.72</td><td align="center" valign="middle">2.24</td><td align="center" valign="middle">3.89</td></tr><tr><td align="center" valign="middle">0.3</td><td align="center" valign="middle">44.76</td><td align="center" valign="middle">19.85</td><td align="center" valign="middle">21.22</td><td align="center" valign="middle">2.18</td><td align="center" valign="middle">3.70</td></tr><tr><td align="center" valign="middle">0.5</td><td align="center" valign="middle">45.55</td><td align="center" valign="middle">20.55</td><td align="center" valign="middle">20.97</td><td align="center" valign="middle">2.08</td><td align="center" valign="middle">3.21</td></tr></tbody></table></table-wrap><p>In the second experiment, noise with a level of 0.1 is added to 1%, 5%, 10%, 20% and 50% of the elements of the BBC data set. We then compare the F-score, NMI, ACC, and AR of the algorithms. The results are shown in Figures 5-8. From these figures, we can conclude that:</p><p>1) All algorithms show a decline in F-score with increasing sparsity levels, with LMVSC-Sparse and FLMSC being the least affected. GMC’s performance drops significantly and remains low across sparsity levels.</p><p>2) For NMI, all algorithms again show a decline as sparsity increases, with LMVSC-Sparse showing the least impact. GMC performs poorly at higher sparsity levels.</p><p>3) For ACC, we see a sharp decline for all methods as sparsity increases, with LMVSC-Sparse being the most robust but still affected.</p><p>4) For AR, all algorithms experience a drop as sparsity increases. LMVSC-Sparse and FLMSC tend to be more robust than the others.</p><p>In summary, the performance of all algorithms degrades with increasing sparsity, which is expected as sparser data tends to have less information for the algorithms to leverage in the clustering process. LMVSC-Sparse seems to be the most robust across all evaluated metrics, maintaining higher values than the others as sparsity increases.</p></sec><sec id="s4_3"><title>4.3. Parameter Sensitivity Analysis</title><p><xref ref-type="fig" rid="fig9">Figure 9</xref> shows a series of heatmaps that represent a parameter analysis for different evaluation metrics and rank bounds. Each heatmap corresponds to a combination of the two parameters <inline-formula><tex-math>\lambda_{1}</tex-math></inline-formula> and <inline-formula><tex-math>\lambda_{2}</tex-math></inline-formula>. The colors in each heatmap represent different values of the metric being evaluated, with darker or brighter colors typically indicating better performance. The main observations are as follows:</p><p>1) There is a consistent trend where the F-score, NMI, and AR seem to be more sensitive to the second parameter, <inline-formula><tex-math>\lambda_{2}</tex-math></inline-formula>, across rank bounds.</p><p>2) For all metrics, there are specific parameter combinations that yield high values, indicating optimal regions for each rank bound setting.</p><p>3) The first parameter, <inline-formula><tex-math>\lambda_{1}</tex-math></inline-formula>, appears to have a less significant impact on the F-score and NMI than on ACC and AR.</p><p>4) The optimal regions for high values seem to shift and become less extensive as the rank bound increases, which may indicate that models with higher complexity (larger rank bounds) do not necessarily perform better and can be harder to tune.</p><p>In conclusion, these heatmaps can be used to identify the optimal parameter settings for each rank bound and metric. It is important to balance model complexity with the ability to tune the parameters effectively, as overly complex models may not yield better performance and can be more challenging to optimize.</p></sec></sec><sec id="s5"><title>5. Conclusion</title><p>This paper introduced an innovative Low-Rank Multi-view Subspace Clustering based on Sparse Regularization (LMVSC-Sparse) method. LMVSC-Sparse incorporated sparse regularization to mitigate the impact of outliers, thus enhancing the robustness of the clustering process. The developed ADMM algorithm efficiently solved the optimization problem, ensuring both effectiveness and efficiency, as evidenced by the experimental results on benchmark datasets. The performance of LMVSC-Sparse, particularly in noisy and sparse conditions, demonstrated its superiority over other state-of-the-art MVSC methods. This robustness is critical for practical applications in fields such as image analysis, bioinformatics, and social network analysis, where data often contain noise and come from diverse sources. 
The results of this work not only further the understanding of multi-view clustering dynamics but also open avenues for future research in optimizing clustering methods for complex, real-world datasets.</p></sec><sec id="s6"><title>Acknowledgements</title><p>This work was supported partly by the National Natural Science Foundation of China under Grant No. 62276137.</p></sec><sec id="s7"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s8"><title>Cite this paper</title><p>Sun, Y. and Zhang, F.L. (2024) Low-Rank Multi-View Subspace Clustering Based on Sparse Regularization. Journal of Computer and Communications, 12, 14-30. https://doi.org/10.4236/jcc.2024.124002</p></sec></body><back><ref-list><title>References</title><ref id="scirp.132267-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Ezugwu, A.E., Ikotun, A.M., Oyelade, O.O., &lt;i&gt;et al.&lt;/i&gt; (2022) A Comprehensive Survey of Clustering Algorithms: State-of-the-Art Machine Learning Applications, Taxonomy, Challenges, and Future Research Prospects. &lt;i&gt;Engineering Applications of Artificial Intelligence&lt;/i&gt;, 110, Article 104743. &lt;br&gt;https://doi.org/10.1016/j.engappai.2022.104743</mixed-citation></ref><ref id="scirp.132267-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Tieghi, L., Becker, S., Corsini, A., &lt;i&gt;et al.&lt;/i&gt; (2023) Machine-Learning Clustering Methods Applied to Detection of Noise Sources in Low-Speed Axial Fan. &lt;i&gt;Journal of Engineering for Gas Turbines and Power&lt;/i&gt;, 145, Article 031020. &lt;br&gt;https://doi.org/10.1115/1.4055417</mixed-citation></ref><ref id="scirp.132267-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Kang, Z., Zhou, W., Zhao, Z., &lt;i&gt;et al.&lt;/i&gt; (2020) Large-Scale Multi-View Subspace Clustering in Linear Time. 
&lt;i&gt;Proceedings of the AAAI Conference on Artificial Intell&lt;/i&gt;&lt;i&gt;i&lt;/i&gt;&lt;i&gt;gence&lt;/i&gt;, 34, 4412-4419. &lt;br&gt;https://doi.org/10.1609/aaai.v34i04.5867</mixed-citation></ref><ref id="scirp.132267-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Wu, H., Huang, S., Tang, C., &lt;i&gt;et al.&lt;/i&gt; (2023) Pure Graph-Guided Multi-View Subspace Clustering.&lt;i&gt; Pattern Recognition&lt;/i&gt;, 136, Article 109187. &lt;br&gt;https://doi.org/10.1016/j.patcog.2022.109187</mixed-citation></ref><ref id="scirp.132267-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Zhang, X., Ren, Z., Sun, H., &lt;i&gt;et al.&lt;/i&gt; (2021) Multiple Kernel Low-Rank Representation-Based Robust Multi-View Subspace Clustering. &lt;i&gt;Information Sciences&lt;/i&gt;, 551, 324-340. &lt;br&gt;https://doi.org/10.1016/j.ins.2020.10.059</mixed-citation></ref><ref id="scirp.132267-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Zhao, N. and Bu, J. (2022) Robust Multi-View Subspace Clustering Based on Consensus Representation and Orthogonal Diversity. &lt;i&gt;Neural Networks&lt;/i&gt;, 150, 102-111. &lt;br&gt;https://doi.org/10.1016/j.neunet.2022.03.009</mixed-citation></ref><ref id="scirp.132267-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Zhu, W., Lu, J. and Zhou, J. (2019) Structured General and Specific Multi-View Subspace Clustering. &lt;i&gt;Pattern Recognition&lt;/i&gt;, 93, 392-403. &lt;br&gt;https://doi.org/10.1016/j.patcog.2019.05.005</mixed-citation></ref><ref id="scirp.132267-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Shi, Q., Hu, B., Zeng, T., &lt;i&gt;et al.&lt;/i&gt; (2019) Multi-View Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data. &lt;i&gt;Frontiers in Genetics&lt;/i&gt;, 10, Article 744. 
&lt;br&gt;https://doi.org/10.3389/fgene.2019.00744</mixed-citation></ref><ref id="scirp.132267-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Liu, H., Shang, M., Zhang, H., &lt;i&gt;et al.&lt;/i&gt; (2021) Cancer Subtype Identification Based on Multi-View Subspace Clustering with Adaptive Local Structure Learning. &lt;i&gt;IEEE International Conference on Bioinformatics and Biomedicine (BIBM)&lt;/i&gt;, Houston, TX, 9-12 December 2021, 484-490. &lt;br&gt;https://doi.org/10.1109/BIBM52615.2021.9669659</mixed-citation></ref><ref id="scirp.132267-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Zhang, G.Y., Chen, X.W., Zhou, Y.R., &lt;i&gt;et al.&lt;/i&gt; (2022) Kernelized Multi-View Subspace Clustering via Auto-Weighted Graph Learning. &lt;i&gt;Applied Intelligence&lt;/i&gt;, 52, 716-731. &lt;br&gt;https://doi.org/10.1007/s10489-021-02365-8</mixed-citation></ref><ref id="scirp.132267-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Tao, H., Hou, C., Qian, Y., &lt;i&gt;et al.&lt;/i&gt; (2020) Latent Complete Row Space Recovery for Multi-View Subspace Clustering. &lt;i&gt;IEEE Transactions on Image Processing&lt;/i&gt;, 29, 8083-8096. &lt;br&gt;https://doi.org/10.1109/TIP.2020.3010631</mixed-citation></ref><ref id="scirp.132267-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Abavisani, M. and Patel, V.M. (2018) Multimodal Sparse and Low-Rank Subspace Clustering. &lt;i&gt;Information Fusion&lt;/i&gt;, 39, 168-177. &lt;br&gt;https://doi.org/10.1016/j.inffus.2017.05.002</mixed-citation></ref><ref id="scirp.132267-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Li, R., Zhang, C., Hu, Q., &lt;i&gt;et al.&lt;/i&gt; (2019) Flexible Multi-View Representation Learning for Subspace Clustering. 
&lt;i&gt;Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence&lt;/i&gt;, Macao, China, 10-16 August 2019, 2916-2922. &lt;br&gt;https://doi.org/10.24963/ijcai.2019/404</mixed-citation></ref><ref id="scirp.132267-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Zhou, T., Zhang, C., Peng, X., &lt;i&gt;et al.&lt;/i&gt; (2019) Dual Shared-Specific Multi-View Subspace Clustering. &lt;i&gt;IEEE Transactions on Cybernetics&lt;/i&gt;, 50, 3517-3530. &lt;br&gt;https://doi.org/10.1109/TCYB.2019.2918495</mixed-citation></ref><ref id="scirp.132267-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Zhang, G.-Y., Huang, D. and Wang, C.-D. (2023) Facilitated Low-Rank Multi-View Subspace Clustering. &lt;i&gt;Knowledge-Based Systems&lt;/i&gt;, 260, Article 110141. &lt;br&gt;https://doi.org/10.1016/j.knosys.2022.110141</mixed-citation></ref><ref id="scirp.132267-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y. and Ma, Y. (2012) Robust Recovery of Subspace Structures by Low-Rank Representation. &lt;i&gt;IEEE Transactions on Pattern Analysis and Machine Intelligence&lt;/i&gt;, 35, 171-184. &lt;br&gt;https://doi.org/10.1109/TPAMI.2012.88</mixed-citation></ref><ref id="scirp.132267-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Liu, Z., Lu, Y., Lai, Z., &lt;i&gt;et al.&lt;/i&gt; (2021) Robust Sparse Low-Rank Embedding for Image Dimension Reduction. &lt;i&gt;Applied Soft Computing&lt;/i&gt;, 113, Article 107907. 
&lt;br&gt;https://doi.org/10.1016/j.asoc.2021.107907</mixed-citation></ref><ref id="scirp.132267-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Du, S., Liu, B., Shan, G., &lt;i&gt;et al.&lt;/i&gt; (2022) Enhanced Tensor Low-Rank Representation for Clustering and Denoising. &lt;i&gt;Knowledge-Based Systems&lt;/i&gt;, 243, Article 108468. &lt;br&gt;https://doi.org/10.1016/j.knosys.2022.108468</mixed-citation></ref><ref id="scirp.132267-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Wang, J., Zhu, L., Dai, T., &lt;i&gt;et al.&lt;/i&gt; (2021) Low-Rank and Sparse Matrix Factorization with Prior Relations for Recommender Systems. &lt;i&gt;Applied Intelligence&lt;/i&gt;, 51, 3435-3449. &lt;br&gt;https://doi.org/10.1007/s10489-020-02023-5</mixed-citation></ref><ref id="scirp.132267-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Peng, J., Sun, W., Li, H.C., &lt;i&gt;et al.&lt;/i&gt; (2021) Low-Rank and Sparse Representation for Hyperspectral Image Processing: A Review. &lt;i&gt;IEEE Geoscience and Remote Sensing Magazine&lt;/i&gt;, 10, 10-43. &lt;br&gt;https://doi.org/10.1109/MGRS.2021.3075491</mixed-citation></ref><ref id="scirp.132267-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Chen, J., Yang, S., Mao, H., &lt;i&gt;et al.&lt;/i&gt; (2021) Multiview Subspace Clustering Using Low-Rank Representation. &lt;i&gt;IEEE Transactions on Cybernetics&lt;/i&gt;, 52, 12364-12378. &lt;br&gt;https://doi.org/10.1109/TCYB.2021.3087114</mixed-citation></ref><ref id="scirp.132267-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Hui, K., Shen, X., Abhadiomhen, S.E., &lt;i&gt;et al.&lt;/i&gt; (2022) Robust Low-Rank Representation via Residual Projection for Image Classification. &lt;i&gt;Knowledge-Based Systems&lt;/i&gt;, 241, Article 108230. 
&lt;br&gt;https://doi.org/10.1016/j.knosys.2022.108230</mixed-citation></ref><ref id="scirp.132267-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Falsone, A., Notarnicola, I., Notarstefano, G., &lt;i&gt;et al.&lt;/i&gt; (2020) Tracking-ADMM for Distributed Constraint-Coupled Optimization. &lt;i&gt;Automatica&lt;/i&gt;, 117, Article 108962. &lt;br&gt;https://doi.org/10.1016/j.automatica.2020.108962</mixed-citation></ref><ref id="scirp.132267-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Sun, Y., Yang, Y., Liu, Q., &lt;i&gt;et al.&lt;/i&gt; (2020) Learning Non-Locally Regularized Compressed Sensing Network with Half-Quadratic Splitting. &lt;i&gt;IEEE Transactions on Multimedia&lt;/i&gt;, 22, 3236-3248. &lt;br&gt;https://doi.org/10.1109/TMM.2020.2973862</mixed-citation></ref><ref id="scirp.132267-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Cai, J.F., Cand&amp;#232;s, E.J. and Shen, Z. (2010) A Singular Value Thresholding Algorithm for Matrix Completion. &lt;i&gt;SIAM Journal on Optimization&lt;/i&gt;, 20, 1956-1982. &lt;br&gt;https://doi.org/10.1137/080738970</mixed-citation></ref><ref id="scirp.132267-ref26"><label>26</label><mixed-citation publication-type="other" xlink:type="simple">Sun, M., Zhang, P., Wang, S., Zhou, S., Tu, W. and Liu, X. (2021) Scalable Multi-View Subspace Clustering with Unified Anchors. &lt;i&gt;Proceedings of the 29th ACM International Conference on Multimedia&lt;/i&gt;, China, 20-24 October 2021, 3528-3536. &lt;br&gt;https://doi.org/10.1145/3474085.3475516</mixed-citation></ref><ref id="scirp.132267-ref27"><label>27</label><mixed-citation publication-type="other" xlink:type="simple">Wang, H., Yang, Y. and Liu, B. (2020) GMC: Graph-Based Multi-View Clustering. &lt;i&gt;IEEE Transactions on Knowledge and Data Engineering&lt;/i&gt;, 32, 1116-1129. 
&lt;br&gt;https://doi.org/10.1109/TKDE.2019.2903810</mixed-citation></ref><ref id="scirp.132267-ref28"><label>28</label><mixed-citation publication-type="other" xlink:type="simple">Huang, L., Chao, H.Y. and Wang, C.D. (2019) Multi-View Intact Space Clustering. &lt;i&gt;Pattern Recognition&lt;/i&gt;, 86, 344-353. &lt;br&gt;https://doi.org/10.1016/j.patcog.2018.09.016</mixed-citation></ref></ref-list></back></article>