mirror of
https://github.com/msberends/AMR.git
synced 2025-09-07 02:09:37 +02:00
(v1.4.0.9008) like variations
@@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.4.0</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.4.0.9008</span>
</span>
</div>
@@ -187,7 +187,8 @@
</header><script src="PCA_files/header-attrs-2.3/header-attrs.js"></script><script src="PCA_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
</header><script src="PCA_files/accessible-code-block-0.0.1/empty-anchor.js"></script><link href="PCA_files/anchor-sections-1.0/anchor-sections.css" rel="stylesheet">
<script src="PCA_files/anchor-sections-1.0/anchor-sections.js"></script><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1 data-toc-skip>How to conduct principal component analysis (PCA) for AMR</h1>
@@ -210,9 +211,9 @@
<a href="#transforming" class="anchor"></a>Transforming</h1>
<p>For PCA, we need to transform our AMR data first. This is what the <code>example_isolates</code> data set in this package looks like:</p>
<div class="sourceCode" id="cb1"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://msberends.github.io/AMR">AMR</a></span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org">dplyr</a></span>)
<span class="fu"><a href="https://tibble.tidyverse.org/reference/glimpse.html">glimpse</a></span>(<span class="kw">example_isolates</span>)
<span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://msberends.github.io/AMR/">AMR</a></span><span class="op">)</span>
<span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://dplyr.tidyverse.org">dplyr</a></span><span class="op">)</span>
<span class="fu"><a href="https://tibble.tidyverse.org/reference/glimpse.html">glimpse</a></span><span class="op">(</span><span class="va">example_isolates</span><span class="op">)</span>
<span class="co"># Rows: 2,000</span>
<span class="co"># Columns: 49</span>
<span class="co"># $ date <date> 2002-01-02, 2002-01-03, 2002-01-07, 2002-01-07, 2002…</span>
@@ -263,18 +264,17 @@
<span class="co"># $ CHL <rsi> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…</span>
<span class="co"># $ COL <rsi> NA, NA, R, R, R, R, R, R, R, R, R, R, NA, NA, NA, R, …</span>
<span class="co"># $ MUP <rsi> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…</span>
<span class="co"># $ RIF <rsi> R, R, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, R, R…</span>
</pre></div>
<span class="co"># $ RIF <rsi> R, R, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, R, R…</span></pre></div>
<p>Now to transform this to a data set with only resistance percentages per taxonomic order and genus:</p>
<div class="sourceCode" id="cb2"><pre class="downlit">
<span class="kw">resistance_data</span> <span class="op"><-</span> <span class="kw">example_isolates</span> <span class="op">%>%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(order = <span class="fu"><a href="../reference/mo_property.html">mo_order</a></span>(<span class="kw">mo</span>), <span class="co"># group on anything, like order</span>
genus = <span class="fu"><a href="../reference/mo_property.html">mo_genus</a></span>(<span class="kw">mo</span>)) <span class="op">%>%</span> <span class="co"># and genus as we do here</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/summarise_all.html">summarise_if</a></span>(<span class="kw">is.rsi</span>, <span class="kw">resistance</span>) <span class="op">%>%</span> <span class="co"># then get resistance of all drugs</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="kw">order</span>, <span class="kw">genus</span>, <span class="kw">AMC</span>, <span class="kw">CXM</span>, <span class="kw">CTX</span>,
<span class="kw">CAZ</span>, <span class="kw">GEN</span>, <span class="kw">TOB</span>, <span class="kw">TMP</span>, <span class="kw">SXT</span>) <span class="co"># and select only relevant columns</span>
<span class="va">resistance_data</span> <span class="op"><-</span> <span class="va">example_isolates</span> <span class="op">%>%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span><span class="op">(</span>order <span class="op">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_order</a></span><span class="op">(</span><span class="va">mo</span><span class="op">)</span>, <span class="co"># group on anything, like order</span>
genus <span class="op">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_genus</a></span><span class="op">(</span><span class="va">mo</span><span class="op">)</span><span class="op">)</span> <span class="op">%>%</span> <span class="co"># and genus as we do here</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/summarise_all.html">summarise_if</a></span><span class="op">(</span><span class="va">is.rsi</span>, <span class="va">resistance</span><span class="op">)</span> <span class="op">%>%</span> <span class="co"># then get resistance of all drugs</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span><span class="op">(</span><span class="va">order</span>, <span class="va">genus</span>, <span class="va">AMC</span>, <span class="va">CXM</span>, <span class="va">CTX</span>,
<span class="va">CAZ</span>, <span class="va">GEN</span>, <span class="va">TOB</span>, <span class="va">TMP</span>, <span class="va">SXT</span><span class="op">)</span> <span class="co"># and select only relevant columns</span>
<span class="fu"><a href="https://rdrr.io/r/utils/head.html">head</a></span>(<span class="kw">resistance_data</span>)
<span class="fu"><a href="https://rdrr.io/r/utils/head.html">head</a></span><span class="op">(</span><span class="va">resistance_data</span><span class="op">)</span>
<span class="co"># # A tibble: 6 x 10</span>
<span class="co"># # Groups: order [2]</span>
<span class="co"># order genus AMC CXM CTX CAZ GEN TOB TMP SXT</span>
@@ -284,46 +284,40 @@
<span class="co"># 3 Actinomycetales Cutibacterium NA NA NA NA NA NA NA NA</span>
<span class="co"># 4 Actinomycetales Dermabacter NA NA NA NA NA NA NA NA</span>
<span class="co"># 5 Actinomycetales Micrococcus NA NA NA NA NA NA NA NA</span>
<span class="co"># 6 Actinomycetales Rothia NA NA NA NA NA NA NA NA</span>
</pre></div>
<span class="co"># 6 Actinomycetales Rothia NA NA NA NA NA NA NA NA</span></pre></div>
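<p>As a rough illustration of what the summarising step computes per group and per drug, here is a minimal base R sketch. The helper <code>resistant_fraction()</code> and the toy vectors are hypothetical and not part of the AMR package. The cut-off of 30 tested isolates is an assumption about the default behaviour of <code>resistance()</code>, which would explain the <code>NA</code> values for sparsely tested genera in the output above:</p>
<div class="sourceCode"><pre class="downlit">
# Hypothetical helper, NOT part of the AMR package: the fraction of tested
# isolates that are resistant ("R"), or NA when too few isolates were tested.
# The default of 30 is an assumption mirroring resistance()'s minimum.
resistant_fraction <- function(x, minimum = 30) {
  tested <- x[!is.na(x)]               # ignore isolates that were not tested
  if (length(tested) < minimum) {
    return(NA_real_)                   # too few results to report a fraction
  }
  mean(tested == "R")
}

# Toy data: 40 tested isolates (12 resistant) plus 10 untested ones
many <- c(rep("R", 12), rep("S", 25), rep("I", 3), rep(NA, 10))
# Toy data: only 2 tested isolates
few <- c("R", "S", NA)

resistant_fraction(many)  # 12 of 40 tested isolates: 0.3
resistant_fraction(few)   # NA, fewer than 30 tested isolates</pre></div>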
</div>
<div id="perform-principal-component-analysis" class="section level1">
<h1 class="hasAnchor">
<a href="#perform-principal-component-analysis" class="anchor"></a>Perform principal component analysis</h1>
<p>The new <code><a href="../reference/pca.html">pca()</a></code> function will automatically filter on rows that contain numeric values in all selected variables, so we now only need to do:</p>
<div class="sourceCode" id="cb3"><pre class="downlit">
<span class="kw">pca_result</span> <span class="op"><-</span> <span class="fu"><a href="../reference/pca.html">pca</a></span>(<span class="kw">resistance_data</span>)
<span class="va">pca_result</span> <span class="op"><-</span> <span class="fu"><a href="../reference/pca.html">pca</a></span><span class="op">(</span><span class="va">resistance_data</span><span class="op">)</span>
<span class="co"># NOTE: Columns selected for PCA: AMC CXM CTX CAZ GEN TOB TMP SXT.</span>
<span class="co"># Total observations available: 7.</span>
</pre></div>
<span class="co"># Total observations available: 7.</span></pre></div>
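<p>To make the automatic filtering more concrete, the following base R sketch does something comparable by hand on a small, made-up data frame. It is only an approximation under the assumption that <code>pca()</code> behaves like <code>prcomp()</code> on centred and scaled complete rows; it is not the package's actual implementation, and the <code>toy</code> data are hypothetical:</p>
<div class="sourceCode"><pre class="downlit">
# Hypothetical resistance fractions per genus, with some missing values
toy <- data.frame(
  genus = c("A", "B", "C", "D", "E"),
  AMC   = c(0.10, 0.25,   NA, 0.40, 0.80),
  GEN   = c(0.05, 0.10, 0.20,   NA, 0.60),
  TMP   = c(0.30, 0.35, 0.50, 0.45, 0.90)
)

# Select the numeric columns, then keep rows without NA in any of them
numeric_cols  <- vapply(toy, is.numeric, logical(1))
complete_rows <- complete.cases(toy[, numeric_cols])
toy_complete  <- toy[complete_rows, numeric_cols]

nrow(toy_complete)  # 3 rows remain (genera A, B and E)

# Base R PCA on centred and scaled variables
toy_pca <- prcomp(toy_complete, center = TRUE, scale. = TRUE)
summary(toy_pca)</pre></div>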
<p>The result can be reviewed with the good old <code><a href="https://rdrr.io/r/base/summary.html">summary()</a></code> function:</p>
<div class="sourceCode" id="cb4"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/base/summary.html">summary</a></span>(<span class="kw">pca_result</span>)
<span class="fu"><a href="https://rdrr.io/r/base/summary.html">summary</a></span><span class="op">(</span><span class="va">pca_result</span><span class="op">)</span>
<span class="co"># Importance of components:</span>
<span class="co"># PC1 PC2 PC3 PC4 PC5 PC6 PC7</span>
<span class="co"># Standard deviation 2.154 1.6807 0.61365 0.33902 0.20757 0.03136 1.733e-16</span>
<span class="co"># Proportion of Variance 0.580 0.3531 0.04707 0.01437 0.00539 0.00012 0.000e+00</span>
<span class="co"># Cumulative Proportion 0.580 0.9331 0.98012 0.99449 0.99988 1.00000 1.000e+00</span>
</pre></div>
<span class="co"># Cumulative Proportion 0.580 0.9331 0.98012 0.99449 0.99988 1.00000 1.000e+00</span></pre></div>
<p>Good news. The first two components explain a total of 93.3% of the variance (see the PC1 and PC2 values of the <em>Proportion of Variance</em>). We can create a so-called biplot with the base R <code><a href="https://rdrr.io/r/stats/biplot.html">biplot()</a></code> function to see which drug-specific resistances explain the differences between microorganisms.</p>
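<p>The proportions in that table follow directly from the standard deviations: each component's variance is its squared standard deviation, divided by the sum of all variances. A small sketch that recomputes this from the rounded values printed by <code>summary()</code> (the variable names are only illustrative):</p>
<div class="sourceCode"><pre class="downlit">
# Standard deviations copied from the summary() output (rounded as printed)
sdev <- c(2.154, 1.6807, 0.61365, 0.33902, 0.20757, 0.03136, 1.733e-16)

variance   <- sdev^2                    # variance per component
proportion <- variance / sum(variance)  # share of the total variance
cumulative <- cumsum(proportion)        # running total over the components

round(proportion[1:2], 3)  # about 0.580 and 0.353
round(cumulative[2], 3)    # about 0.933, i.e. the 93.3% mentioned in the text</pre></div>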
</div>
<div id="plotting-the-results" class="section level1">
<h1 class="hasAnchor">
<a href="#plotting-the-results" class="anchor"></a>Plotting the results</h1>
<div class="sourceCode" id="cb5"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/stats/biplot.html">biplot</a></span>(<span class="kw">pca_result</span>)
</pre></div>
<span class="fu"><a href="https://rdrr.io/r/stats/biplot.html">biplot</a></span><span class="op">(</span><span class="va">pca_result</span><span class="op">)</span></pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-5-1.png" width="750"></p>
<p>But we can’t see what the points represent. Perhaps this works better with our new <code><a href="../reference/ggplot_pca.html">ggplot_pca()</a></code> function, which automatically adds the right labels and even groups:</p>
<div class="sourceCode" id="cb6"><pre class="downlit">
<span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span>(<span class="kw">pca_result</span>)
</pre></div>
<span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span><span class="op">(</span><span class="va">pca_result</span><span class="op">)</span></pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-6-1.png" width="750"></p>
<p>You can also print an ellipse per group, and edit the appearance:</p>
<div class="sourceCode" id="cb7"><pre class="downlit">
<span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span>(<span class="kw">pca_result</span>, ellipse = <span class="fl">TRUE</span>) <span class="op">+</span>
<span class="kw">ggplot2</span>::<span class="fu"><a href="https://ggplot2.tidyverse.org/reference/labs.html">labs</a></span>(title = <span class="st">"An AMR/PCA biplot!"</span>)
</pre></div>
<span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span><span class="op">(</span><span class="va">pca_result</span>, ellipse <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span> <span class="op">+</span>
<span class="fu">ggplot2</span><span class="fu">::</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/labs.html">labs</a></span><span class="op">(</span>title <span class="op">=</span> <span class="st">"An AMR/PCA biplot!"</span><span class="op">)</span></pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-7-1.png" width="750"></p>
</div>
</div>
@@ -343,7 +337,7 @@
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.6.1.</p>
</div>
</footer>