<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">ABB</journal-id><journal-title-group><journal-title>Advances in Bioscience and Biotechnology</journal-title></journal-title-group><issn pub-type="epub">2156-8456</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/abb.2020.115011</article-id><article-id pub-id-type="publisher-id">ABB-100071</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Biomedical&amp;Life Sciences</subject></subj-group></article-categories><title-group><article-title>
 
 
  pLoc_Deep-mGneg: Predict Subcellular Localization of Gram Negative Bacterial Proteins by Deep Learning
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Liu</surname><given-names>Xin-Xin</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Chou</surname><given-names>Kuo-Chen</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Computer Science, Jingdezhen Ceramic Institute, Jingdezhen, China</addr-line></aff><aff id="aff2"><addr-line>Gordon Life Science Institute, Boston, MA 02478, USA</addr-line></aff><pub-date pub-type="epub"><day>09</day><month>05</month><year>2020</year></pub-date><volume>11</volume><issue>05</issue><fpage>141</fpage><lpage>152</lpage><history><date date-type="received"><day>7</day><month>April</month><year>2020</year></date><date date-type="rev-recd"><day>8</day><month>May</month><year>2020</year></date><date date-type="accepted"><day>11</day><month>May</month><year>2020</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  The recent worldwide spread of pneumonia-causing viruses, such as the coronavirus behind COVID-19 and H1N1, has been endangering human lives all around the world. To understand biological processes at the cellular level and to provide useful clues for developing antiviral drugs, information on the subcellular localization of Gram negative bacterial proteins is vitally important. In view of this, a deep-learning-based protein subcellular localization predictor called “pLoc_Deep-mGneg” was developed. The predictor is particularly useful for multi-site systems, in which some proteins may simultaneously occur in two or more different organelles; such proteins are a current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 98% and its local accuracy is around 94% - 100%. Both significantly exceed those of other existing state-of-the-art predictors. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at 
  http://www.jci-bioinfo.cn/pLoc_Deep-mGneg/
  , which is anticipated to become a very useful tool for fighting the pandemic coronavirus and saving mankind.
 
</p></abstract><kwd-group><kwd>Pandemic Coronavirus</kwd><kwd>Multi-Label System</kwd><kwd>Gram Negative Bacterial Proteins</kwd><kwd>Learning at Deeper Level</kwd><kwd>Five-Steps Rule</kwd><kwd>PseAAC</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Knowledge of the subcellular localization of proteins is crucial for two important goals: 1) revealing the intricate pathways that regulate biological processes at the cellular level [<xref ref-type="bibr" rid="scirp.100071-ref1">1</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref2">2</xref>]; 2) selecting the right targets [<xref ref-type="bibr" rid="scirp.100071-ref3">3</xref>] for developing new drugs.</p><p>With the avalanche of protein sequences in the post-genomic age, we are challenged to develop computational tools for effectively identifying their subcellular localization based purely on sequence information.</p><p>In 2018, a very powerful predictor, called “pLoc_bal-mGneg” [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>], was developed for predicting the subcellular localization of Gram negative bacterial proteins based on their sequence information alone. It has the following remarkable advantages. 1) Most existing protein subcellular location prediction methods were developed based on the single-label system, in which it was assumed that each constituent protein had one, and only one, subcellular location (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref5">5</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref6">6</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref7">7</xref>] and a long list of references cited in a review paper [<xref ref-type="bibr" rid="scirp.100071-ref8">8</xref>]). As more experimental data have been uncovered, however, it has become clear that the localization of proteins in a cell is actually a multi-label system, where some proteins may simultaneously occur in two or more different location sites. 
Multiplex proteins of this kind often bear exceptional functions worthy of special notice [<xref ref-type="bibr" rid="scirp.100071-ref2">2</xref>]. The pLoc_bal-mGneg predictor [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>] can cover this kind of important information, which is missed by most other methods, since it was established based on a multi-label benchmark dataset and theory. 2) Although there are a few methods (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref9">9</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref10">10</xref>]) that can be used to deal with multi-label subcellular localization for proteins, the prediction quality achieved by pLoc_bal-mGneg [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>] is overwhelmingly higher, particularly in the absolute true rate. Despite these merits, however, the pLoc_bal-mGneg predictor [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>] has not yet been trained at a deeper level [<xref ref-type="bibr" rid="scirp.100071-ref11">11</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref12">12</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref13">13</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref14">14</xref>].</p><p>The present study was initiated in an attempt to address this problem. As done in pLoc_bal-mGneg [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>] as well as in many other recent publications developing new prediction methods (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref15">15</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref16">16</xref>]), the guidelines of the 5-step rule [<xref ref-type="bibr" rid="scirp.100071-ref17">17</xref>] are followed. These cover the detailed procedures for 1) benchmark dataset, 2) sample formulation, 3) operation engine or algorithm, 4) cross-validation, and 5) web-server. 
Here, however, our attention is focused on the procedures that significantly differ from those used in developing the predictor pLoc_bal-mGneg [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>].</p></sec><sec id="s2"><title>2. Materials and Methods</title><sec id="s2_1"><title>2.1. Benchmark Dataset</title><p>The benchmark dataset used in this study is exactly the same as that in pLoc_bal-mGneg [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>]; i.e.,</p><p><disp-formula><label>(1)</label><tex-math>S = S_1 \cup S_2 \cup \cdots \cup S_u \cup \cdots \cup S_7 \cup S_8</tex-math></disp-formula></p><p>where S<sub>1</sub> only contains the Gram-negative bacterial protein samples from the “Cell inner membrane” organelle (cf. <xref ref-type="table" rid="table2">Table 2</xref>), S<sub>2</sub> only contains those from the “Cell outer membrane”, and so forth; ∪ denotes the symbol for “union” in set theory. For readers’ convenience, their detailed sequences and accession numbers (or ID codes) are given in Supporting Information S1, which is also available at http://www.jci-bioinfo.cn/pLoc_bal-mGneg/Supp1.pdf; none of the proteins included has ≥25% sequence identity to any other in the same subset (subcellular location).</p></sec><sec id="s2_2"><title>2.2. Protein Sample Formulation</title><p>Now let us consider the 2<sup>nd</sup> step of the 5-step rule [<xref ref-type="bibr" rid="scirp.100071-ref17">17</xref>]; i.e., how to formulate the biological sequence samples with an effective mathematical expression that can truly reflect their essential correlation with the target concerned. Given a protein sequence P, its most straightforward expression is</p><p><disp-formula><label>(2)</label><tex-math>\mathbf{P} = R_1 R_2 R_3 R_4 R_5 R_6 R_7 \cdots R_L</tex-math></disp-formula></p><p>where L denotes the protein’s length or the number of its constituent amino acid residues, R<sub>1</sub> is the 1<sup>st</sup> residue, R<sub>2</sub> the 2<sup>nd</sup> residue, R<sub>3</sub> the 3<sup>rd</sup> residue, and so forth. 
Since all the existing machine-learning algorithms can only handle vectors, as elaborated in [<xref ref-type="bibr" rid="scirp.100071-ref3">3</xref>], one has to convert a protein sample from its sequential expression (Equation (2)) to a vector. But a vector defined in a discrete model might completely miss all the sequence-order or pattern information. To deal with this problem, the Pseudo Amino Acid Composition [<xref ref-type="bibr" rid="scirp.100071-ref18">18</xref>] or PseAAC [<xref ref-type="bibr" rid="scirp.100071-ref19">19</xref>] has been proposed. Ever since then, the concept of “Pseudo Amino Acid Composition” has been widely used in nearly all areas of computational proteomics with the aim of capturing the various sequence patterns that are essential to the targets investigated (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref20">20</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref21">21</xref>] as well as a long list of references cited in [<xref ref-type="bibr" rid="scirp.100071-ref22">22</xref>]). Meanwhile, because it has been widely and increasingly used, four powerful open-access software packages, called “PseAAC” [<xref ref-type="bibr" rid="scirp.100071-ref23">23</xref>], “PseAAC-Builder” [<xref ref-type="bibr" rid="scirp.100071-ref24">24</xref>], “propy” [<xref ref-type="bibr" rid="scirp.100071-ref25">25</xref>], and “PseAAC-General” [<xref ref-type="bibr" rid="scirp.100071-ref26">26</xref>], were established: the first three are for generating various modes of special PseAAC [<xref ref-type="bibr" rid="scirp.100071-ref27">27</xref>], while the 4<sup>th</sup> is for those of general PseAAC [<xref ref-type="bibr" rid="scirp.100071-ref17">17</xref>], including not only all the special modes of feature vectors for proteins but also higher-level feature vectors such as the “Functional Domain” mode, “Gene Ontology” mode, and “Sequential Evolution” or “PSSM” mode. 
Encouraged by the successes of using PseAAC to deal with protein/peptide sequences, its idea and approach were extended to PseKNC (Pseudo K-tuple Nucleotide Composition) to generate various feature vectors for DNA/RNA sequences [<xref ref-type="bibr" rid="scirp.100071-ref28">28</xref>], which have proved very successful as well (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref29">29</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref30">30</xref>]).</p><p>According to the concept of general PseAAC [<xref ref-type="bibr" rid="scirp.100071-ref17">17</xref>], any protein sequence can be formulated as a PseAAC vector given by</p><p><disp-formula><label>(3)</label><tex-math>\mathbf{P} = \left[ \Psi_1 \; \Psi_2 \; \cdots \; \Psi_u \; \cdots \; \Psi_\Omega \right]^{\mathbf{T}}</tex-math></disp-formula></p><p>where T is the transpose operator, while the integer Ω is a parameter; its value as well as the components Ψ<sub>u</sub> (u = 1, 2, ⋯, Ω) depend on how the desired information is extracted from the amino acid sequence of P, as elaborated in [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>]. Thus, by following exactly the same procedures as described in Section 2.2 of [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>], each of the protein samples in the benchmark dataset can be uniquely defined as an 8-D numerical vector as given in Supporting Information S2, which can also be directly downloaded at http://www.jci-bioinfo.cn/pLoc_bal-mGneg/Supp2.pdf.</p></sec><sec id="s2_3"><title>2.3. Installing Deep-Learning for Three Deeper Levels</title><p>In this study, we use a multilayer perceptron neural network model, which consists of 3 fully connected layers, to predict the subcellular localization of multi-label Gram negative bacterial proteins, as illustrated in <xref ref-type="fig" rid="fig1">Figure 1</xref>. The input layer has 14 neural units, which correspond to 14 features. Too many hidden layers would increase the network complexity and expose the model to the vanishing gradient problem during training. Here, only two hidden layers are included. 
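</p><p>As a rough illustration of the network described in this subsection (two hidden “relu” layers of 200 and 100 units, a sigmoid output layer, and a 0.5 decision threshold), the forward pass can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the code behind the web-server: the weights are random and untrained, training with the binary_crossentropy loss and the adam optimizer is not reproduced, all function names are hypothetical, and the input and output widths are left as parameters that the example call sets to the 14 units quoted in the text.</p>
<preformat><![CDATA[
```python
import math
import random


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def sigmoid(values):
    """Logistic function applied element-wise."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative, untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_mlp(n_features, n_labels, seed=0):
    """Three dense layers: n_features -> 200 -> 100 -> n_labels."""
    rng = random.Random(seed)
    sizes = [n_features, 200, 100, n_labels]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict(model, features, theta=0.5):
    """Return per-label probabilities and the thresholded label decisions."""
    hidden = list(features)
    last = len(model) - 1
    for index, (weights, biases) in enumerate(model):
        z = dense(hidden, weights, biases)
        # Hidden layers use relu; the output layer uses sigmoid.
        hidden = sigmoid(z) if index == last else relu(z)
    return hidden, [p > theta for p in hidden]


model = build_mlp(n_features=14, n_labels=14)
probabilities, labels = predict(model, [0.5] * 14)
print(len(probabilities), sum(labels))
```
]]></preformat>
<p>Because each output unit has its own sigmoid and its own threshold comparison, any number of labels can be switched on for one protein, which is what a multi-label predictor requires. Other widths, such as the 8-D vectors of Section 2.2, can be passed to build_mlp in the same way.</p><p>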
The first hidden layer has 200 neural units, and its activation function is set as “relu”. The second hidden layer has 100 neural units and uses the same activation function as the first. The model ends with an output layer of 14 neural units and sigmoid activation. Accordingly, we use the binary_crossentropy loss and the adam (adaptive moment estimation) optimizer to train the model. The metric is set as “accuracy”. The batch size is 28, and the number of epochs is 100. The predicted results are decided by comparing each output with the threshold θ: if the output is greater than 0.5, the outcome is true; otherwise, it is false. For more information, see [<xref ref-type="bibr" rid="scirp.100071-ref11">11</xref>], where the details have been clearly elaborated and hence need not be repeated here.</p><p>The new predictor developed via the above procedures is called “pLoc_Deep-mGneg”, where “pLoc_Deep” stands for “predict subcellular localization by deep learning”, and “mGneg” for “multi-label Gram negative proteins”.</p></sec></sec><sec id="s3"><title>3. Results and Discussion</title><p>According to the 5-step rule [<xref ref-type="bibr" rid="scirp.100071-ref17">17</xref>], one of the important procedures in developing a new predictor is how to properly evaluate its anticipated accuracy. To deal with that, two issues need to be considered. 1) What metrics should be used to quantitatively reflect the predictor’s quality? 2) What test method should be applied to score the metrics?</p><sec id="s3_1"><title>3.1. A Set of Five Metrics for Multi-Label Systems</title><p>Unlike the metrics used to measure the prediction quality of single-label systems, the metrics for multi-label systems are much more complicated [<xref ref-type="bibr" rid="scirp.100071-ref31">31</xref>]. 
To make them more intuitive and easier to understand for most experimental scientists, here we use Chou’s five intuitive metrics [<xref ref-type="bibr" rid="scirp.100071-ref32">32</xref>], also known as the “global metrics”, which have recently been widely used for studying various multi-label systems (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref33">33</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref34">34</xref>]). For the current study, the set of global metrics can be formulated as:</p><p><disp-formula><label>(4)</label><tex-math>
\begin{cases}
\text{Aiming}\uparrow \; = \dfrac{1}{N^{q}} \sum\limits_{k=1}^{N^{q}} \left( \dfrac{\| L_k \cap L_k^{*} \|}{\| L_k^{*} \|} \right), \quad [0, 1] \\[2ex]
\text{Coverage}\uparrow \; = \dfrac{1}{N^{q}} \sum\limits_{k=1}^{N^{q}} \left( \dfrac{\| L_k \cap L_k^{*} \|}{\| L_k \|} \right), \quad [0, 1] \\[2ex]
\text{Accuracy}\uparrow \; = \dfrac{1}{N^{q}} \sum\limits_{k=1}^{N^{q}} \left( \dfrac{\| L_k \cap L_k^{*} \|}{\| L_k \cup L_k^{*} \|} \right), \quad [0, 1] \\[2ex]
\text{Absolute true}\uparrow \; = \dfrac{1}{N^{q}} \sum\limits_{k=1}^{N^{q}} \Delta \left( L_k, L_k^{*} \right), \quad [0, 1] \\[2ex]
\text{Absolute false}\downarrow \; = \dfrac{1}{N^{q}} \sum\limits_{k=1}^{N^{q}} \left( \dfrac{\| L_k \cup L_k^{*} \| - \| L_k \cap L_k^{*} \|}{M} \right), \quad [1, 0]
\end{cases}
</tex-math></disp-formula></p><p>where N<sup>q</sup> is the total number of query proteins or tested proteins, M is the total number of different labels for the investigated system (for the current study it is L<sub>cell</sub> = 8), ‖ ‖ is the operator that counts the number of elements in the set therein, ∪ denotes the “union” in set theory, ∩ denotes the “intersection”, L<sub>k</sub> denotes the subset that contains all the labels observed by experiments for the k-th tested sample, L<sub>k</sub><sup>*</sup> represents the subset that contains all the labels predicted for the k-th sample, and</p><p><disp-formula><label>(5)</label><tex-math>
\Delta \left( L_k, L_k^{*} \right) =
\begin{cases}
1, \quad \text{if all the labels in } L_k^{*} \text{ are identical to those in } L_k \\
0, \quad \text{otherwise}
\end{cases}
</tex-math></disp-formula></p><p>In Equation (4), the first four metrics, marked with an upward arrow ↑, are called positive metrics, meaning that the larger the rate, the better the prediction quality; the 5<sup>th</sup> metric, marked with a downward arrow ↓, is called a negative metric, implying just the opposite.</p><p>From Equation (4) we can see the following: 1) the “Aiming” defined 
by the 1<sup>st</sup> sub-equation checks the rate or percentage of the correctly predicted labels over the practically predicted labels; 2) the “Coverage” defined in the 2<sup>nd</sup> sub-equation checks the rate of the correctly predicted labels over the actual labels in the system concerned; 3) the “Accuracy” in the 3<sup>rd</sup> sub-equation checks the average ratio of correctly predicted labels over the total labels, including correctly and incorrectly predicted labels as well as the real labels that are missed in the prediction; 4) the “Absolute true” in the 4<sup>th</sup> sub-equation checks the ratio of the perfectly or completely correct prediction events over the total prediction events; 5) the “Absolute false” in the 5<sup>th</sup> sub-equation checks the ratio of the completely wrong predictions over the total prediction events.</p></sec><sec id="s3_2"><title>3.2. Comparison with the State-of-the-Art Predictor</title><p>Listed in <xref ref-type="table" rid="table1">Table 1</xref> are the rates achieved by the current pLoc_Deep-mGneg predictor via cross-validation on the same experiment-confirmed dataset as used in [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>]. To facilitate comparison, the corresponding results obtained by pLoc_bal-mGneg [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>], the most powerful existing predictor for identifying the subcellular localization of Gram negative proteins with both single and multiple location sites, are also listed there. As shown in <xref ref-type="table" rid="table1">Table 1</xref>, the newly proposed predictor pLoc_Deep-mGneg is remarkably superior to the existing state-of-the-art predictor pLoc_bal-mGneg in all five metrics. 
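</p><p>To make the five global metrics of Equation (4) concrete, they can be computed from observed and predicted label sets as in the following Python sketch. It is offered as a minimal illustration under stated assumptions rather than as the evaluation code used in this study: the function and variable names are hypothetical, and the handling of an empty label set (a zero contribution to “Aiming” or “Coverage”, and full “Accuracy” when both sets are empty) is an assumption that Equation (4) does not specify.</p>
<preformat><![CDATA[
```python
def global_metrics(observed, predicted, n_labels):
    """Five global metrics of Equation (4).

    observed, predicted: sequences of label collections, one per query protein.
    n_labels: total number of different labels M in the system.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")

    totals = {"aiming": 0.0, "coverage": 0.0, "accuracy": 0.0,
              "absolute_true": 0.0, "absolute_false": 0.0}
    for obs, pred in zip(observed, predicted):
        obs, pred = set(obs), set(pred)
        inter = len(obs & pred)   # correctly predicted labels
        union = len(obs | pred)   # observed or predicted labels
        # Assumed conventions for empty sets (not given in the text).
        totals["aiming"] += inter / len(pred) if pred else 0.0
        totals["coverage"] += inter / len(obs) if obs else 0.0
        totals["accuracy"] += inter / union if union else 1.0
        totals["absolute_true"] += 1.0 if obs == pred else 0.0
        totals["absolute_false"] += (union - inter) / n_labels

    n_query = len(observed)
    return {name: value / n_query for name, value in totals.items()}


# Three hypothetical query proteins in an 8-label system.
observed = [{1}, {1, 3}, {2}]
predicted = [{1}, {1}, {2, 4}]
for name, value in global_metrics(observed, predicted, n_labels=8).items():
    print(name, round(value, 4))
```
]]></preformat>
<p>In this toy example only the first protein is predicted perfectly, so the absolute true rate is 1/3 even though every prediction shares a label with the observation; this is why the absolute true rate is the most demanding of the five metrics.</p><p>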
In particular, it can be seen from the table that the absolute true rate achieved by the new predictor is over 98%, which is far beyond the reach of any other existing method (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref35">35</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref36">36</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref37">37</xref>]). This is remarkable because it is extremely difficult to enhance the absolute true rate of a prediction method for a multi-label system, as clearly elucidated in [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>]. Actually, to avoid embarrassment, many investigators have even chosen not to mention the absolute true rate when dealing with multi-label systems (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref38">38</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref39">39</xref>]).</p><p>Moreover, to examine in depth the prediction quality of the new predictor for the proteins in each of the subcellular locations concerned (cf. <xref ref-type="table" rid="table2">Table 2</xref>), we used a set of four intuitive metrics that were derived in [<xref ref-type="bibr" rid="scirp.100071-ref40">40</xref>] based on Chou’s symbols introduced for studying protein signal peptides [<xref ref-type="bibr" rid="scirp.100071-ref41">41</xref>] and that have since been widely adopted (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref42">42</xref>] [<xref ref-type="bibr" rid="scirp.100071-ref43">43</xref>]). 
For the current study, this set of metrics can be formulated as given in Equation (6).</p><table-wrap id="table1"><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title>Comparison with the state-of-the-art method in predicting Gram negative protein subcellular localization<sup>a</sup></title></caption><table><thead><tr><th align="center" valign="middle">Predictor</th><th align="center" valign="middle">Aiming (↑)<sup>a</sup></th><th align="center" valign="middle">Coverage (↑)<sup>a</sup></th><th align="center" valign="middle">Accuracy (↑)<sup>a</sup></th><th align="center" valign="middle">Absolute true (↑)<sup>a</sup></th><th align="center" valign="middle">Absolute false (↓)<sup>a</sup></th></tr></thead><tbody><tr><td align="center" valign="middle">pLoc_bal-mGneg<sup>b</sup></td><td align="center" valign="middle">97.07%</td><td align="center" valign="middle">97.8%</td><td align="center" valign="middle">97.27%</td><td align="center" valign="middle">96.55%</td><td align="center" valign="middle">0.19%</td></tr><tr><td align="center" valign="middle">pLoc_Deep-mGneg<sup>c</sup></td><td align="center" valign="middle">98.52%</td><td align="center" valign="middle">98.64%</td><td align="center" valign="middle">98.40%</td><td align="center" valign="middle">98.02%</td><td align="center" valign="middle">0.00%</td></tr></tbody></table></table-wrap><p><sup>a</sup>See Equation (4) for the definition of the metrics. <sup>b</sup>See [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>], where the reported metric rates were obtained by the jackknife test on the benchmark dataset of Supporting Information S1, which contains experiment-confirmed proteins only. 
<sup>c</sup>The proposed predictor; the test was performed on exactly the same experimental data as reported in [<xref ref-type="bibr" rid="scirp.100071-ref4">4</xref>] for pLoc_bal-mGneg.</p><table-wrap id="table2"><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title>Performance of pLoc_Deep-mGneg for each of the 8 subcellular locations.</title></caption><table><thead><tr><th align="center" valign="middle">i</th><th align="center" valign="middle">Location<sup>a</sup></th><th align="center" valign="middle">Sn(i)<sup>b</sup></th><th align="center" valign="middle">Sp(i)<sup>b</sup></th><th align="center" valign="middle">Acc(i)<sup>b</sup></th><th align="center" valign="middle">MCC(i)<sup>b</sup></th></tr></thead><tbody><tr><td align="center" valign="middle">1</td><td align="center" valign="middle">Cell inner membrane</td><td align="center" valign="middle">0.9805</td><td align="center" valign="middle">0.9910</td><td align="center" valign="middle">0.9882</td><td align="center" valign="middle">0.9709</td></tr><tr><td align="center" valign="middle">2</td><td align="center" valign="middle">Cell outer membrane</td><td align="center" valign="middle">0.9465</td><td align="center" valign="middle">0.9989</td><td align="center" valign="middle">0.9921</td><td align="center" valign="middle">0.9648</td></tr><tr><td align="center" valign="middle">3</td><td align="center" valign="middle">Cytoplasm</td><td align="center" valign="middle">0.9707</td><td align="center" valign="middle">0.9896</td><td align="center" valign="middle">0.9842</td><td align="center" valign="middle">0.9616</td></tr><tr><td align="center" valign="middle">4</td><td align="center" valign="middle">Extracellular</td><td align="center" valign="middle">0.9786</td><td align="center" valign="middle">0.9976</td><td align="center" valign="middle">0.9951</td><td align="center" valign="middle">0.9807</td></tr><tr><td align="center" valign="middle">5</td><td align="center" valign="middle">Fimbrium</td><td align="center" valign="middle">0.9333</td><td align="center" valign="middle">0.9985</td><td align="center" valign="middle">0.9961</td><td align="center" valign="middle">0.9358</td></tr><tr><td align="center" valign="middle">6</td><td align="center" valign="middle">Flagellum</td><td align="center" valign="middle">1.0000</td><td align="center" valign="middle">1.0000</td><td align="center" valign="middle">1.0000</td><td align="center" valign="middle">1.0000</td></tr><tr><td align="center" valign="middle">7</td><td align="center" valign="middle">Nucleoid</td><td align="center" valign="middle">1.0000</td><td align="center" valign="middle">1.0000</td><td align="center" valign="middle">1.0000</td><td align="center" valign="middle">1.0000</td></tr><tr><td align="center" valign="middle">8</td><td align="center" valign="middle">Periplasm</td><td align="center" valign="middle">0.9814</td><td align="center" valign="middle">0.9988</td><td align="center" valign="middle">0.9956</td><td align="center" valign="middle">0.9852</td></tr></tbody></table></table-wrap><p><sup>a</sup>See Equation (1) and relevant context as well as the Supporting Information S1 for further explanation. 
<sup>b</sup>See Equation (6) for the definition of the metrics.</p><p><disp-formula><label>(6)</label><tex-math>
\begin{cases}
\mathrm{Sn}(i) = 1 - \dfrac{N_{-}^{+}(i)}{N^{+}(i)}, \quad 0 \le \mathrm{Sn} \le 1 \\[2ex]
\mathrm{Sp}(i) = 1 - \dfrac{N_{+}^{-}(i)}{N^{-}(i)}, \quad 0 \le \mathrm{Sp} \le 1 \\[2ex]
\mathrm{Acc}(i) = 1 - \dfrac{N_{-}^{+}(i) + N_{+}^{-}(i)}{N^{+}(i) + N^{-}(i)}, \quad 0 \le \mathrm{Acc} \le 1 \\[2ex]
\mathrm{MCC}(i) = \dfrac{1 - \left( \dfrac{N_{-}^{+}(i)}{N^{+}(i)} + \dfrac{N_{+}^{-}(i)}{N^{-}(i)} \right)}{\sqrt{\left( 1 + \dfrac{N_{+}^{-}(i) - N_{-}^{+}(i)}{N^{+}(i)} \right) \left( 1 + \dfrac{N_{-}^{+}(i) - N_{+}^{-}(i)}{N^{-}(i)} \right)}}, \quad -1 \le \mathrm{MCC} \le 1
\end{cases}
\qquad (i = 1, 2, \cdots, 8)
</tex-math></disp-formula></p><p>where Sn, Sp, Acc, and MCC represent the sensitivity, specificity, accuracy, and Matthews correlation coefficient, respectively, and i denotes the i-th subcellular location (or subset) in the benchmark dataset. N<sup>+</sup>(i) is the total number of the samples investigated in the i-th subset, whereas N<sup>+</sup><sub>−</sub>(i) is the number of the samples in N<sup>+</sup>(i) that are incorrectly predicted to be of other locations; N<sup>−</sup>(i) is the total number of samples in any location other than the i-th location, whereas N<sup>−</sup><sub>+</sub>(i) is the number of the samples in N<sup>−</sup>(i) that are incorrectly predicted to be of the i-th location.</p><p>Listed in <xref ref-type="table" rid="table2">Table 2</xref> are the results achieved by pLoc_Deep-mGneg for the Gram negative proteins in each of the 8 subcellular locations. 
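</p><p>For readers who wish to apply Equation (6) to their own prediction counts, the four per-location metrics can be sketched in Python as follows. This is a minimal illustration, not the code used in this study: the function and argument names are hypothetical, the counts in the example call are invented for demonstration, and returning 0 for MCC when its denominator vanishes is an assumed convention.</p>
<preformat><![CDATA[
```python
import math


def location_metrics(n_pos, n_neg, n_pos_missed, n_neg_wrong):
    """Sn, Sp, Acc and MCC of Equation (6) for one subcellular location.

    n_pos        -- samples belonging to the location, N+(i)
    n_neg        -- samples belonging to any other location, N-(i)
    n_pos_missed -- samples of n_pos predicted as other locations
    n_neg_wrong  -- samples of n_neg predicted as this location
    """
    if n_pos <= 0 or n_neg <= 0:
        raise ValueError("both subsets must be non-empty")

    sn = 1 - n_pos_missed / n_pos
    sp = 1 - n_neg_wrong / n_neg
    acc = 1 - (n_pos_missed + n_neg_wrong) / (n_pos + n_neg)

    numerator = 1 - (n_pos_missed / n_pos + n_neg_wrong / n_neg)
    denominator = math.sqrt(
        (1 + (n_neg_wrong - n_pos_missed) / n_pos)
        * (1 + (n_pos_missed - n_neg_wrong) / n_neg)
    )
    # Assumed convention: MCC is reported as 0 when the denominator is 0.
    mcc = numerator / denominator if denominator else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Invented counts: 100 proteins in the location, 900 elsewhere.
result = location_metrics(n_pos=100, n_neg=900, n_pos_missed=5, n_neg_wrong=9)
for name, value in result.items():
    print(name, round(value, 4))
```
]]></preformat>
<p>With no misclassified samples in either subset, all four values equal 1, which corresponds to the rows of Table 2 that report 1.0000 throughout.</p><p>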
As we can see from the table, nearly all the success rates achieved by the new predictor for the Gram negative proteins in each of the 8 subcellular locations are within the range of 90% - 100%, which is once again far beyond the reach of any of its counterparts.</p><p>Meanwhile, as a byproduct, the present paper has also stimulated a series of somewhat provocative but quite interesting papers (see, e.g., [<xref ref-type="bibr" rid="scirp.100071-ref44">44</xref>] - [<xref ref-type="bibr" rid="scirp.100071-ref49">49</xref>]).</p></sec><sec id="s3_3"><title>3.3. Web Server and User Guide</title><p>As pointed out in [<xref ref-type="bibr" rid="scirp.100071-ref50">50</xref>], user-friendly and publicly accessible web-servers represent the future direction for developing practically more useful predictors. Indeed, user-friendly web-servers significantly enhance the impact of theoretical work because they can attract a broad range of experimental scientists [<xref ref-type="bibr" rid="scirp.100071-ref22">22</xref>]. In view of this, the web-server of the current pLoc_Deep-mGneg predictor has also been established at http://www.jci-bioinfo.cn/pLoc_Deep-mGneg/, by which users can easily get their desired data without the need to go through the mathematical details.</p></sec></sec><sec id="s4"><title>4. Conclusion</title><p>It is anticipated that the pLoc_Deep-mGneg predictor holds very high potential to become a useful high-throughput tool for identifying the subcellular localization of Gram negative bacterial proteins, particularly for finding multi-target drugs, which is currently a very active trend in drug development. Most importantly, the predictor is expected to become a very useful tool for fighting against the coronavirus to save mankind on this planet.</p></sec><sec id="s5"><title>Acknowledgements</title><p>This work was supported by grants from the National Natural Science Foundation of China (Nos. 31560316, 61261027, 61262038, 61202313 and 31260273), the Province National Natural Science Foundation of JiangXi (No. 20132BAB201053), the Jiangxi Provincial Foreign Scientific and Technological Cooperation Project (No. 20120BDH80023), and the Department of Education of JiangXi Province (GJJ160866).</p></sec><sec id="s6"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s7"><title>Cite this paper</title><p>Liu, X.-X. and Chou, K.-C. (2020) pLoc_Deep-mGneg: Predict Subcellular Localization of Gram Negative Bacterial Proteins by Deep Learning. Advances in Bioscience and Biotechnology, 11, 141-152. https://doi.org/10.4236/abb.2020.115011</p></sec></body><back><ref-list><title>References</title><ref id="scirp.100071-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Ehrlich, J.S., Hansen, M.D. and Nelson, W.J. (2002) Spatio-Temporal Regulation of Rac1 Localization and Lamellipodia Dynamics during Epithelial Cell-Cell Adhesion. Developmental Cell, 3, 259-270. https://doi.org/10.1016/S1534-5807(02)00216-2</mixed-citation></ref><ref id="scirp.100071-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Glory, E. and Murphy, R.F. (2007) Automated Subcellular Location Determination and High-Throughput Microscopy. Developmental Cell, 12, 7-16.  
https://doi.org/10.1016/j.devcel.2006.12.007</mixed-citation></ref><ref id="scirp.100071-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2015) Impacts of Bioinformatics to Medicinal Chemistry. Medicinal Chemistry, 11, 218-234. https://doi.org/10.2174/1573406411666141229162834</mixed-citation></ref><ref id="scirp.100071-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Cheng, X., Xiao, X. and Chou, K.C. (2018) pLoc_bal-mGneg: Predict Subcellular Localization of Gram-Negative Bacterial Proteins by Quasi-Balancing Training Dataset and General PseAAC. Journal of Theoretical Biology, 458, 92-102.  
https://doi.org/10.1016/j.jtbi.2018.09.005</mixed-citation></ref><ref id="scirp.100071-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Nakai, K. and Kanehisa, M. (1992) A Knowledge Base for Predicting Protein Localization Sites in Eukaryotic Cells. Genomics, 14, 897-911.  
https://doi.org/10.1016/S0888-7543(05)80111-9</mixed-citation></ref><ref id="scirp.100071-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Cedano, J., Aloy, P., Perez-Pons, J.A. and Querol, E. (1997) Relation between Amino Acid Composition and Cellular Location of Proteins. Journal of Molecular Biology, 266, 594-600. https://doi.org/10.1006/jmbi.1996.0804</mixed-citation></ref><ref id="scirp.100071-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Reinhardt, A. and Hubbard, T. (1998) Using Neural Networks for Prediction of the Subcellular Location of Proteins. Nucleic Acids Research, 26, 2230-2236.  
https://doi.org/10.1093/nar/26.9.2230</mixed-citation></ref><ref id="scirp.100071-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. and Shen, H.B. (2007) Recent Progresses in Protein Subcellular Location Prediction. Analytical Biochemistry, 370, 1-16.  
https://doi.org/10.1016/j.ab.2007.07.006</mixed-citation></ref><ref id="scirp.100071-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C., Wu, Z.C. and Xiao, X. (2011) iLoc-Euk: A Multi-Label Classifier for Predicting the Subcellular Localization of Singleplex and Multiplex Eukaryotic Proteins. PLoS ONE, 6, e18258. https://doi.org/10.1371/journal.pone.0018258</mixed-citation></ref><ref id="scirp.100071-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Mandal, M., Mukhopadhyay, A. and Maulik, U. (2015) Prediction of Protein Subcellular Localization by Incorporating Multiobjective PSO-Based Feature Subset Selection into the General form of Chou’s PseAAC. Medical &amp; Biological Engineering &amp; Computing, 53, 331-344. https://doi.org/10.1007/s11517-014-1238-7</mixed-citation></ref><ref id="scirp.100071-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Maxwell, A., Li, R., Yang, B., Weng, H., Ou, A., Hong, H., Zhou, Z., Gong, P. and Zhang, C. (2017) Deep Learning Architectures for Multi-Label Classification of Intelligent Health Risk Prediction. BMC Bioinformatics, 18, 523.  
https://doi.org/10.1186/s12859-017-1898-z</mixed-citation></ref><ref id="scirp.100071-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Khan, S., Khan, M., Iqbal, N., Hussain, T., Khan, S.A. and Chou, K.C. (2019) A Two-Level Computation Model Based on Deep Learning Algorithm for Identification of piRNA and Their Functions via Chou’s 5-Steps Rule. International Journal of Peptide Research and Therapeutics. https://doi.org/10.1007/s10989-019-09887-3</mixed-citation></ref><ref id="scirp.100071-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Khan, Z.U., Ali, F., Khan, I.A., Hussain, Y. and Pi, D. (2019) iRSpot-SPI: Deep Learning-Based Recombination Spots Prediction by Incorporating Secondary Sequence Information Coupled with Physio-Chemical Properties via Chou’s 5-Step Rule and Pseudo Components. Chemometrics and Intelligent Laboratory Systems (CHEMOLAB), 189, 169-180. https://doi.org/10.1016/j.chemolab.2019.05.003</mixed-citation></ref><ref id="scirp.100071-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Nazari, I., Tahir, M., Tayari, H. and Chong, K.T. (2019) iN6-Methyl (5-step): Identifying RNA N6-methyladenosine Sites Using Deep Learning Mode via Chou’s 5-Step Rules and Chou’s General PseKNC. Chemometrics and Intelligent Laboratory Systems (CHEMOLAB), 193, Article ID: 103811.  
https://doi.org/10.1016/j.chemolab.2019.103811</mixed-citation></ref><ref id="scirp.100071-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Dutta, A., Dalmia, A., R, A., Singh, K.K. and Anand, A. (2019) Using the Chou’s 5-Steps Rule to Predict Splice Junctions with Interpretable Bidirectional Long Short-Term Memory Networks. Computers in Biology and Medicine, 116, Article ID: 103558. https://doi.org/10.1016/j.compbiomed.2019.103558</mixed-citation></ref><ref id="scirp.100071-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Ehsan, A., Mahmood, M.K., Khan, Y.D., Barukab, O.M., Khan, S.A. and Chou, K.C. (2019) iHyd-PseAAC (EPSV): Identify Hydroxylation Sites in Proteins by Extracting Enhanced Position and Sequence Variant Feature via Chou’s 5-Step Rule and General Pseudo Amino Acid Composition. Current Genomics, 20, 124-133.  
https://doi.org/10.2174/1389202920666190325162307</mixed-citation></ref><ref id="scirp.100071-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2011) Some Remarks on Protein Attribute Prediction and Pseudo Amino Acid Composition (50th Anniversary Year Review, 5-Steps Rule). Journal of Theoretical Biology, 273, 236-247. https://doi.org/10.1016/j.jtbi.2010.12.024</mixed-citation></ref><ref id="scirp.100071-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2001) Prediction of Protein Cellular Attributes Using Pseudo Amino Acid Composition. Proteins: Structure, Function, and Genetics, 43, 246-255. (Erratum: ibid., 2001, Vol. 44, 60) https://doi.org/10.1002/prot.1035</mixed-citation></ref><ref id="scirp.100071-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2005) Using Amphiphilic Pseudo Amino Acid Composition to Predict Enzyme Subfamily Classes. Bioinformatics, 21, 10-19.  
https://doi.org/10.1093/bioinformatics/bth466</mixed-citation></ref><ref id="scirp.100071-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Zhou, X.B., Chen, C., Li, Z.C. and Zou, X.Y. (2007) Using Chou’s Amphiphilic Pseudo Amino Acid Composition and Support Vector Machine for Prediction of Enzyme Subfamily Classes. Journal of Theoretical Biology, 248, 546-551.  
https://doi.org/10.1016/j.jtbi.2007.06.001</mixed-citation></ref><ref id="scirp.100071-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Zhang, S.W., Chen, W., Yang, F. and Pan, Q. (2008) Using Chou’s Pseudo Amino Acid Composition to Predict Protein Quaternary Structure: A Sequence-Segmented PseAAC Approach. Amino Acids, 35, 591-598.  
https://doi.org/10.1007/s00726-008-0086-x</mixed-citation></ref><ref id="scirp.100071-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2017) An Unprecedented Revolution in Medicinal Chemistry Driven by the Progress of Biological Science. Current Topics in Medicinal Chemistry, 17, 2337-2358. https://doi.org/10.2174/1568026617666170414145508</mixed-citation></ref><ref id="scirp.100071-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Shen, H.B. and Chou, K.C. (2008) PseAAC: A Flexible Web-Server for Generating Various Kinds of Protein Pseudo Amino Acid Composition. Analytical Biochemistry, 373, 386-388. https://doi.org/10.1016/j.ab.2007.10.012</mixed-citation></ref><ref id="scirp.100071-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Du, P., Wang, X., Xu, C. and Gao, Y. (2012) PseAAC-Builder: A Cross-Platform Stand-Alone Program for Generating Various Special Chou’s Pseudo Amino Acid Compositions. Analytical Biochemistry, 425, 117-119.  
https://doi.org/10.1016/j.ab.2012.03.015</mixed-citation></ref><ref id="scirp.100071-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Cao, D.S., Xu, Q.S. and Liang, Y.Z. (2013) Propy: A Tool to Generate Various Modes of Chou’s PseAAC. Bioinformatics, 29, 960-962.  
https://doi.org/10.1093/bioinformatics/btt072</mixed-citation></ref><ref id="scirp.100071-ref26"><label>26</label><mixed-citation publication-type="other" xlink:type="simple">Du, P., Gu, S. and Jiao, Y. (2014) PseAAC-General: Fast Building Various Modes of General Form of Chou’s Pseudo Amino Acid Composition for Large-Scale Protein Datasets. International Journal of Molecular Sciences, 15, 3495-3506.  
https://doi.org/10.3390/ijms15033495</mixed-citation></ref><ref id="scirp.100071-ref27"><label>27</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2009) Pseudo Amino Acid Composition and Its Applications in Bioinformatics, Proteomics and System Biology. Current Proteomics, 6, 262-274.  
https://doi.org/10.2174/157016409789973707</mixed-citation></ref><ref id="scirp.100071-ref28"><label>28</label><mixed-citation publication-type="other" xlink:type="simple">Chen, W., Lei, T.Y., Jin, D.C., Lin, H. and Chou, K.C. (2014) PseKNC: A Flexible Web-Server for Generating Pseudo K-Tuple Nucleotide Composition. Analytical Biochemistry, 456, 53-60. https://doi.org/10.1016/j.ab.2014.04.001</mixed-citation></ref><ref id="scirp.100071-ref29"><label>29</label><mixed-citation publication-type="other" xlink:type="simple">Chen, W., Lin, H. and Chou, K.C. (2015) Pseudo Nucleotide Composition or PseKNC: An Effective Formulation for Analyzing Genomic Sequences. Molecular BioSystems, 11, 2620-2634. https://doi.org/10.1039/C5MB00155B</mixed-citation></ref><ref id="scirp.100071-ref30"><label>30</label><mixed-citation publication-type="other" xlink:type="simple">Liu, B., Yang, F. and Chou, K.C. (2017) 2L-piRNA: A Two-Layer Ensemble Classifier for Identifying Piwi-Interacting RNAs and Their Function. Molecular Therapy—Nucleic Acids, 7, 267-277. https://doi.org/10.1016/j.omtn.2017.04.008</mixed-citation></ref><ref id="scirp.100071-ref31"><label>31</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2019) Two Kinds of Metrics for Computational Biology. Genomics.  
https://www.sciencedirect.com/science/article/pii/S0888754319304604?via%3Dihu</mixed-citation></ref><ref id="scirp.100071-ref32"><label>32</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2013) Some Remarks on Predicting Multi-Label Attributes in Molecular Biosystems. Molecular BioSystems, 9, 1092-1100.
https://doi.org/10.1039/c3mb25555g</mixed-citation></ref><ref id="scirp.100071-ref33"><label>33</label><mixed-citation publication-type="other" xlink:type="simple">Song, J., Wang, Y., Li, F., Akutsu, T., Rawlings, N.D., Webb, G.I. and Chou, K.C. (2018) iProt-Sub: A Comprehensive Package for Accurately Mapping and Predicting Protease-Specific Substrates and Cleavage Sites. Briefings in Bioinformatics, 20, 638-658.
https://doi.org/10.1093/bib/bby028</mixed-citation></ref><ref id="scirp.100071-ref34"><label>34</label><mixed-citation publication-type="other" xlink:type="simple">Zhang, M., Li, F., Marquez-Lago, T.T., Leier, A., Fan, C., Kwoh, C.K., Chou, K.C., Song, J. and Jia, C. (2019) MULTiPly: A Novel Multi-Layer Predictor for Discovering General and Specific Types of Promoters. Bioinformatics, 35, 2957-2965.  
https://doi.org/10.1093/bioinformatics/btz016</mixed-citation></ref><ref id="scirp.100071-ref35"><label>35</label><mixed-citation publication-type="other" xlink:type="simple">Shen, H.B. and Chou, K.C. (2009) A Top-Down Approach to Enhance the Power of Predicting Human Protein Subcellular Localization: Hum-mPLoc 2.0. Analytical Biochemistry, 394, 269-274. https://doi.org/10.1016/j.ab.2009.07.046</mixed-citation></ref><ref id="scirp.100071-ref36"><label>36</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. and Shen, H.B. (2010) Cell-PLoc 2.0: An Improved Package of Web-Servers for Predicting Subcellular Localization of Proteins in Various Organisms. Natural Science, 2, 1090-1103. https://doi.org/10.4236/ns.2010.210136</mixed-citation></ref><ref id="scirp.100071-ref37"><label>37</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C., Wu, Z.C. and Xiao, X. (2012) iLoc-Hum: Using Accumulation-Label Scale to Predict Subcellular Locations of Human Proteins with Both Single and Multiple Sites. Molecular BioSystems, 8, 629-641.
https://doi.org/10.1039/C1MB05420A</mixed-citation></ref><ref id="scirp.100071-ref38"><label>38</label><mixed-citation publication-type="other" xlink:type="simple">Wang, X. and Li, G.Z. (2012) A Multi-Label Predictor for Identifying the Subcellular Locations of Singleplex and Multiplex Eukaryotic Proteins. PLoS ONE, 7, e36317. https://doi.org/10.1371/journal.pone.0036317</mixed-citation></ref><ref id="scirp.100071-ref39"><label>39</label><mixed-citation publication-type="other" xlink:type="simple">Pacharawongsakda, E. and Theeramunkong, T. (2013) Predict Subcellular Locations of Singleplex and Multiplex Proteins by Semi-Supervised Learning and Dimension-Reducing General Mode of Chou’s PseAAC. IEEE Transactions on Nanobioscience, 12, 311-320. https://doi.org/10.1109/TNB.2013.2272014</mixed-citation></ref><ref id="scirp.100071-ref40"><label>40</label><mixed-citation publication-type="other" xlink:type="simple">Chen, W., Feng, P.M., Lin, H. and Chou, K.C. (2013) iRSpot-PseDNC: Identify Recombination Spots with Pseudo Dinucleotide Composition. Nucleic Acids Research, 41, e68. https://doi.org/10.1093/nar/gks1450</mixed-citation></ref><ref id="scirp.100071-ref41"><label>41</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2001) Using Subsite Coupling to Predict Signal Peptides. Protein Engineering, 14, 75-79. https://doi.org/10.1093/protein/14.2.75</mixed-citation></ref><ref id="scirp.100071-ref42"><label>42</label><mixed-citation publication-type="other" xlink:type="simple">Lin, H., Deng, E.Z., Ding, H., Chen, W. and Chou, K.C. (2014) iPro54-PseKNC: A Sequence-Based Predictor for Identifying Sigma-54 Promoters in Prokaryote with Pseudo k-Tuple Nucleotide Composition. Nucleic Acids Research, 42, 12961-12972.  
https://doi.org/10.1093/nar/gku1019</mixed-citation></ref><ref id="scirp.100071-ref43"><label>43</label><mixed-citation publication-type="other" xlink:type="simple">Feng, P., Yang, H., Ding, H., Lin, H., Chen, W. and Chou, K.C. (2019) iDNA6mA-PseKNC: Identifying DNA N(6)-methyladenosine Sites by Incorporating Nucleotide Physicochemical Properties into PseKNC. Genomics, 111, 96-102.  
https://doi.org/10.1016/j.ygeno.2018.01.005</mixed-citation></ref><ref id="scirp.100071-ref44"><label>44</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2020) The Development of Gordon Life Science Institute: Its Driving Force and Accomplishments. Natural Science, 12, 202-217.  
https://doi.org/10.4236/ns.2020.124018</mixed-citation></ref><ref id="scirp.100071-ref45"><label>45</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2020) The Most Important Ethical Concerns in Science. Natural Science, 12, 35-36. https://doi.org/10.4236/ns.2020.122005</mixed-citation></ref><ref id="scirp.100071-ref46"><label>46</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2020) Other Mountain Stones Can Attack Jade: The 5-Steps Rule. Natural Science, 12, 59-64. https://doi.org/10.4236/ns.2020.123011</mixed-citation></ref><ref id="scirp.100071-ref47"><label>47</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2020) The Problem of Elsevier Series Journals Online Submission by Using Artificial Intelligence. Natural Science, 12, 37-38.  
https://doi.org/10.4236/ns.2020.122006</mixed-citation></ref><ref id="scirp.100071-ref48"><label>48</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2020) Proposing 5-Steps Rule Is a Notable Milestone for Studying Molecular Biology. Natural Science, 12, 74-79. https://doi.org/10.4236/ns.2020.123011</mixed-citation></ref><ref id="scirp.100071-ref49"><label>49</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. (2020) Using Similarity Software to Evaluate Scientific Paper Quality Is a Big Mistake. Natural Science, 12, 42-58. https://doi.org/10.4236/ns.2020.123008</mixed-citation></ref><ref id="scirp.100071-ref50"><label>50</label><mixed-citation publication-type="other" xlink:type="simple">Chou, K.C. and Shen, H.B. (2009) Recent Advances in Developing Web-Servers for Predicting Protein Attributes. Natural Science, 1, 63-92.  
https://doi.org/10.4236/ns.2009.12011</mixed-citation></ref></ref-list></back></article>