mirror of https://github.com/msberends/AMR.git synced 2025-07-09 01:22:25 +02:00

(v0.7.1.9030) eucast_rules() fix

2019-08-08 15:52:07 +02:00
parent f67c739892
commit 22a206ffd8
70 changed files with 470 additions and 420 deletions


@ -40,7 +40,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">0.7.1.9015</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">0.7.1.9029</span>
</span>
</div>
@ -185,7 +185,7 @@
<h1>How to import data from SPSS / SAS / Stata</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">29 July 2019</h4>
<h4 class="date">08 August 2019</h4>
<div class="hidden name"><code>SPSS.Rmd</code></div>
@ -211,29 +211,33 @@
</li>
<li>
<p><strong>R is extremely flexible.</strong></p>
<p>Because you write the syntax yourself, you can do anything you want. The flexibility in transforming, gathering, grouping, summarising and drawing plots is endless - with SPSS, SAS or Stata you are bound to their algorithms and styles. They may be a bit flexible, but you can probably never create that very specific publication-ready plot without using other (paid) software.</p>
<p>Because you write the syntax yourself, you can do anything you want. The flexibility in transforming, gathering, grouping and summarising data, or drawing plots, is endless - with SPSS, SAS or Stata you are bound to their algorithms and format styles. They may be a bit flexible, but you can probably never create that very specific publication-ready plot without using other (paid) software. If you sometimes write syntaxes in SPSS to run a complete analysis or to automate some of your work, you could do this in a lot less time in R. You will notice that writing syntaxes in R is a lot more nifty and clever than in SPSS. Still, as with any statistical package, you will need to know what you are doing (statistically) and what you want to accomplish.</p>
</li>
<li>
<p><strong>R can be easily automated.</strong></p>
<p>Over the last years, <a href="https://rmarkdown.rstudio.com/">R Markdown</a> has really made an interesting development. With R Markdown, you can very easily produce reports, whether format has to be Word, PowerPoint, a website, a PDF document or just the raw data to Excel. It even allows the use of a reference file containing the layout style (e.g. fonts and colours) of your organisation. I use this a lot to generate weekly and monthly reports automatically. Just write the code once and enjoy the automatically updated reports at any interval you like.</p>
<p>Over the last years, <a href="https://rmarkdown.rstudio.com/">R Markdown</a> has really made an interesting development. With R Markdown, you can very easily produce reports, whether the format has to be Word, PowerPoint, a website, a PDF document or just the raw data to Excel. It even allows the use of a reference file containing the layout style (e.g. fonts and colours) of your organisation. I use this a lot to generate weekly and monthly reports automatically. Just write the code once and enjoy the automatically updated reports at any interval you like.</p>
<p>For an even more professional environment, you could create <a href="https://shiny.rstudio.com/">Shiny apps</a>: live manipulation of data using a custom made website. The webdesign knowledge needed (JavaScript, CSS, HTML) is almost <em>zero</em>.</p>
</li>
<li>
<p><strong>R has a huge community.</strong></p>
<p>Many R users just ask questions on websites like <a href="https://stackoverflow.com">StackOverflow.com</a>, the largest online community for programmers. At the time of writing, almost <a href="https://stackoverflow.com/questions/tagged/r?sort=votes">300,000 R-related questions</a> have already been asked on this platform (which covers questions and answers for any programming language). In my own experience, most questions are answered within a couple of minutes.</p>
<p>Many R users just ask questions on websites like <a href="https://stackoverflow.com">StackOverflow.com</a>, the largest online community for programmers. At the time of writing, more than <a href="https://stackoverflow.com/questions/tagged/r?sort=votes">300,000 R-related questions</a> have already been asked on this platform (which covers questions and answers for any programming language). In my own experience, most questions are answered within a couple of minutes.</p>
</li>
<li>
<p><strong>R understands any data type, including SPSS/SAS/Stata.</strong></p>
<p>And that's not vice versa, I'm afraid. You can import data from any source into R. From SPSS, SAS and Stata (<a href="https://haven.tidyverse.org/">link</a>), from Minitab, Epi Info and EpiData (<a href="https://cran.r-project.org/package=foreign">link</a>), from Excel (<a href="https://readxl.tidyverse.org/">link</a>), from flat files like CSV, TXT or TSV (<a href="https://readr.tidyverse.org/">link</a>), or directly from databases and data warehouses anywhere in the world (<a href="https://dbplyr.tidyverse.org/">link</a>). You can even scrape websites to download tables that are live on the internet (<a href="https://github.com/hadley/rvest">link</a>) or get the results of an API call (<a href="https://github.com/Rdatatable/data.table/wiki/Convenience-features-of-fread">link</a>).</p>
<p>And that's not vice versa, I'm afraid. You can import data from any source into R. For example from SPSS, SAS and Stata (<a href="https://haven.tidyverse.org/">link</a>), from Minitab, Epi Info and EpiData (<a href="https://cran.r-project.org/package=foreign">link</a>), from Excel (<a href="https://readxl.tidyverse.org/">link</a>), from flat files like CSV, TXT or TSV (<a href="https://readr.tidyverse.org/">link</a>), or directly from databases and data warehouses anywhere in the world (<a href="https://dbplyr.tidyverse.org/">link</a>). You can even scrape websites to download tables that are live on the internet (<a href="https://github.com/hadley/rvest">link</a>) or get the results of an API call and transform it into data in only one command (<a href="https://github.com/Rdatatable/data.table/wiki/Convenience-features-of-fread">link</a>).</p>
<p>And the best part - you can export from R to most data formats as well. So you can import an SPSS file, do your analysis neatly in R and export the resulting tables to Excel files for sharing.</p>
</li>
<li>
<p><strong>R is completely free and open-source.</strong></p>
<p>No strings attached. It was created and is being maintained by volunteers who believe that (data) science should be open and publicly available to everybody. SPSS, SAS and Stata are quite expensive. IBM SPSS Statistics only comes with subscriptions nowadays, varying <a href="https://www.ibm.com/products/spss-statistics/pricing">between USD 1,300 and USD 8,500</a> per computer <em>per year</em>. SAS Analytics Pro costs <a href="https://www.sas.com/store/products-solutions/sas-analytics-pro/prodPERSANL.html">around USD 10,000</a> per computer. Stata also has a business model with subscription fees, varying <a href="https://www.stata.com/order/new/bus/single-user-licenses/dl/">between USD 600 and USD 1,200</a> per computer per year, but lower prices come with a limitation of the number of variables you can work with.</p>
<p>No strings attached. It was created and is being maintained by volunteers who believe that (data) science should be open and publicly available to everybody. SPSS, SAS and Stata are quite expensive. IBM SPSS Statistics only comes with subscriptions nowadays, varying <a href="https://www.ibm.com/products/spss-statistics/pricing">between USD 1,300 and USD 8,500</a> per user <em>per year</em>. SAS Analytics Pro costs <a href="https://www.sas.com/store/products-solutions/sas-analytics-pro/prodPERSANL.html">around USD 10,000</a> per computer. Stata also has a business model with subscription fees, varying <a href="https://www.stata.com/order/new/bus/single-user-licenses/dl/">between USD 600 and USD 2,800</a> per computer per year, but lower prices come with a limitation of the number of variables you can work with. And still they do not offer the above benefits of R.</p>
<p>If you are working at a midsized or small company, you can save it tens of thousands of dollars by using R instead of e.g. SPSS - gaining even more functions and flexibility. And all R enthusiasts can do as much PR as they want (like I do here), because nobody is officially associated or affiliated with R. It is really free.</p>
</li>
<li>
<p><strong>R is (nowadays) the preferred analysis software in academic papers.</strong></p>
<p>At present, R is among the world's most powerful statistical languages, and it is generally very popular in science (Bollmann <em>et al.</em>, 2017). For all the above reasons, the number of references to R as an analysis method in academic papers <a href="https://r4stats.com/2014/08/20/r-passes-spss-in-scholarly-use-stata-growing-rapidly/">is rising continuously</a> and has even surpassed SPSS for academic use (Muenchen, 2014).</p>
<p>I believe that the thing with SPSS is that it has always had a great user interface which is very easy to learn and use. Back when they developed it, they had very little competition, let alone from R. R didn't even have a professional user interface until the last decade (called RStudio, see below). How people used R between the nineties and 2010 is almost completely incomparable to how R is being used now. The language itself <a href="https://www.tidyverse.org/packages/">has been restyled completely</a> by volunteers who are dedicated professionals in the field of data science. SPSS was great when there was nothing else that could compete. But now in 2019, I don't see any reason why SPSS would be of any better use than R.</p>
</li>
</ul>
<p>If you sometimes write syntaxes in SPSS to run a complete analysis or to automate some of your work, you should perhaps do this in R. You will notice that writing syntaxes in R is a lot more nifty and clever than in SPSS. Still, as with any statistical package, you will need to know what you are doing (statistically) and what you want to accomplish.</p>
<p>To demonstrate the first point:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" data-line-number="1"><span class="co"># not all values are valid MIC values:</span></a>
<a class="sourceLine" id="cb1-2" data-line-number="2"><span class="kw"><a href="../reference/as.mic.html">as.mic</a></span>(<span class="fl">0.125</span>)</a>
@ -251,10 +255,10 @@
<a class="sourceLine" id="cb1-14" data-line-number="14">klebsiella_test &lt;-<span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/data.frame">data.frame</a></span>(<span class="dt">mo =</span> <span class="st">"klebsiella"</span>, </a>
<a class="sourceLine" id="cb1-15" data-line-number="15"> <span class="dt">amox =</span> <span class="st">"S"</span>,</a>
<a class="sourceLine" id="cb1-16" data-line-number="16"> <span class="dt">stringsAsFactors =</span> <span class="ot">FALSE</span>)</a>
<a class="sourceLine" id="cb1-17" data-line-number="17">klebsiella_test</a>
<a class="sourceLine" id="cb1-17" data-line-number="17">klebsiella_test <span class="co"># (our original data)</span></a>
<a class="sourceLine" id="cb1-18" data-line-number="18"><span class="co"># mo amox</span></a>
<a class="sourceLine" id="cb1-19" data-line-number="19"><span class="co"># 1 klebsiella S</span></a>
<a class="sourceLine" id="cb1-20" data-line-number="20"><span class="kw"><a href="../reference/eucast_rules.html">eucast_rules</a></span>(klebsiella_test, <span class="dt">info =</span> <span class="ot">FALSE</span>)</a>
<a class="sourceLine" id="cb1-20" data-line-number="20"><span class="kw"><a href="../reference/eucast_rules.html">eucast_rules</a></span>(klebsiella_test, <span class="dt">info =</span> <span class="ot">FALSE</span>) <span class="co"># (the edited data by EUCAST rules)</span></a>
<a class="sourceLine" id="cb1-21" data-line-number="21"><span class="co"># mo amox</span></a>
<a class="sourceLine" id="cb1-22" data-line-number="22"><span class="co"># 1 klebsiella R</span></a>
<a class="sourceLine" id="cb1-23" data-line-number="23"></a>