<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>How to import data from SPSS / SAS / Stata • AMR (for R)</title>
<!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png">
<link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png">
<link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png">
<link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png">
<link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png">
<!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js" integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo=" crossorigin="anonymous"></script><!-- Bootstrap --><link href="https://cdnjs.cloudflare.com/ajax/libs/bootswatch/3.4.0/flatly/bootstrap.min.css" rel="stylesheet" crossorigin="anonymous">
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/js/bootstrap.min.js" integrity="sha256-nuL8/2cJ5NDSSwnKD8VqreErSWHtnEP9E7AySL+1ev4=" crossorigin="anonymous"></script><!-- bootstrap-toc --><link rel="stylesheet" href="../bootstrap-toc.css">
<script src="../bootstrap-toc.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script><link href="../extra.css" rel="stylesheet">
<script src="../extra.js"></script><meta property="og:title" content="How to import data from SPSS / SAS / Stata">
<meta property="og:description" content="AMR">
<meta property="og:image" content="https://msberends.github.io/AMR/logo.svg">
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:creator" content="@msberends">
<meta name="twitter:site" content="@univgroningen">
<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body data-spy="scroll" data-target="#toc">
<div class="container template-article">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">1.8.1.9008</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">
<span class="fa fa-home"></span>
Home
</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" data-bs-toggle="dropdown" aria-expanded="false">
<span class="fa fa-question-circle"></span>
How to
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="../articles/AMR.html">
<span class="fa fa-directions"></span>
Conduct AMR analysis
</a>
</li>
<li>
<a href="../articles/resistance_predict.html">
<span class="fa fa-dice"></span>
Predict antimicrobial resistance
</a>
</li>
<li>
<a href="../articles/datasets.html">
<span class="fa fa-database"></span>
Data sets for download / own use
</a>
</li>
<li>
<a href="../articles/PCA.html">
<span class="fa fa-compress"></span>
Conduct principal component analysis for AMR
</a>
</li>
<li>
<a href="../articles/MDR.html">
<span class="fa fa-skull-crossbones"></span>
Determine multi-drug resistance (MDR)
</a>
</li>
<li>
<a href="../articles/WHONET.html">
<span class="fa fa-globe-americas"></span>
Work with WHONET data
</a>
</li>
<li>
<a href="../articles/SPSS.html">
<span class="fa fa-file-upload"></span>
Import data from SPSS/SAS/Stata
</a>
</li>
<li>
<a href="../articles/EUCAST.html">
<span class="fa fa-exchange-alt"></span>
Apply EUCAST rules
</a>
</li>
<li>
<a href="../reference/mo_property.html">
<span class="fa fa-bug"></span>
Get properties of a microorganism
</a>
</li>
<li>
<a href="../reference/ab_property.html">
<span class="fa fa-capsules"></span>
Get properties of an antibiotic
</a>
</li>
<li>
<a href="../articles/benchmarks.html">
<span class="fa fa-shipping-fast"></span>
Other: benchmarks
</a>
</li>
</ul>
</li>
<li>
<a href="../reference/index.html">
<span class="fa fa-book-open"></span>
Manual
</a>
</li>
<li>
<a href="../authors.html">
<span class="fa fa-users"></span>
Authors
</a>
</li>
<li>
<a href="../news/index.html">
<span class="far fa-newspaper"></span>
Changelog
</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/msberends/AMR" class="external-link">
<span class="fab fa-github"></span>
Source Code
</a>
</li>
</ul>
</div>
<!--/.nav-collapse -->
</div>
<!--/.container -->
</div>
<!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1 data-toc-skip>How to import data from SPSS / SAS / Stata</h1>
<h4 data-toc-skip class="author">Dr. Matthijs
Berends</h4>
<h4 data-toc-skip class="date">11 May 2022</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/HEAD/vignettes/SPSS.Rmd" class="external-link"><code>vignettes/SPSS.Rmd</code></a></small>
<div class="hidden name"><code>SPSS.Rmd</code></div>
</div>
<div class="section level2">
<h2 id="spss-sas-stata">SPSS / SAS / Stata<a class="anchor" aria-label="anchor" href="#spss-sas-stata"></a>
</h2>
<p>SPSS (Statistical Package for the Social Sciences) is probably the
best-known software package for statistical analysis. SPSS is
easier to learn than R, because in SPSS you only have to click a menu to
run parts of your analysis. Because of its user-friendliness, it is
taught at universities and is particularly useful for students who are new
to statistics. From my experience, I would guess that pretty much all
(bio)medical students know it by the time they graduate. SAS and Stata
are comparable statistical packages that are popular in large industries.</p>
</div>
<div class="section level2">
<h2 id="compared-to-r">Compared to R<a class="anchor" aria-label="anchor" href="#compared-to-r"></a>
</h2>
<p>As said, SPSS is easier to learn than R. But SPSS, SAS and Stata come
with major downsides when compared with R:</p>
<ul>
<li>
<p><strong>R is highly modular.</strong></p>
<p>The <a href="https://cran.r-project.org/" class="external-link">official R network
(CRAN)</a> features more than 16,000 packages at the time of writing,
our <code>AMR</code> package being one of them. All these packages were
checked by CRAN before publication. Aside from this official channel,
there are also developers who choose not to submit to CRAN, but rather
keep their packages in their own public repository, like GitHub. So there may even
be a lot more than 16,000 packages out there.</p>
<p>Bottom line is, you can really extend it yourself or ask somebody to
do this for you. Take for example our <code>AMR</code> package. Among
other things, it adds reliable reference data to R to help you with the
data cleaning and analysis. SPSS, SAS and Stata will never know what a
valid MIC value is or what the Gram stain of <em>E. coli</em> is. Or
that all species of <em>Klebsiella</em> are resistant to amoxicillin and
that Floxapen<sup>®</sup> is a trade name of flucloxacillin. These facts
and properties are often needed to clean existing data, which would be
very inconvenient in a software package without reliable reference data.
See below for a demonstration.</p>
</li>
<li>
<p><strong>R is extremely flexible.</strong></p>
<p>Because you write the syntax yourself, you can do anything you want.
The flexibility in transforming, arranging, grouping and summarising
data, or drawing plots, is endless - with SPSS, SAS or Stata you are
bound to their algorithms and format styles. They may be a bit flexible,
but you can probably never create that very specific publication-ready
plot without using other (paid) software. If you sometimes write
syntaxes in SPSS to run a complete analysis or to automate some of
your work, you could do this in a lot less time in R. You will notice that
writing syntaxes in R is a lot more nifty and clever than in SPSS.
Still, as with any statistical package, you will need to know
what you are doing (statistically) and what you want to
accomplish.</p>
</li>
<li>
<p><strong>R can be easily automated.</strong></p>
<p>Over the last few years, <a href="https://rmarkdown.rstudio.com/" class="external-link">R
Markdown</a> has developed in really interesting ways. With R
Markdown, you can very easily produce reports, whether the format has to
be Word, PowerPoint, a website, a PDF document or just the raw data in
Excel. It even allows the use of a reference file containing the layout
style (e.g. fonts and colours) of your organisation. I use this a lot to
generate weekly and monthly reports automatically. Just write the code
once and enjoy the automatically updated reports at any interval you
like.</p>
<p>For an even more professional environment, you could create <a href="https://shiny.rstudio.com/" class="external-link">Shiny apps</a>: live manipulation of
data using a custom-made website. The web design knowledge needed
(JavaScript, CSS, HTML) is almost <em>zero</em>.</p>
</li>
<li>
<p><strong>R has a huge community.</strong></p>
<p>Many R users just ask questions on websites like <a href="https://stackoverflow.com" class="external-link">StackOverflow.com</a>, the largest
online community for programmers. At the time of writing, <a href="https://stackoverflow.com/questions/tagged/r?sort=votes" class="external-link">447,735
R-related questions</a> have already been asked on this platform (that
covers questions and answers for any programming language). In my own
experience, most questions are answered within a couple of
minutes.</p>
</li>
<li>
<p><strong>R understands any data type, including
SPSS/SAS/Stata.</strong></p>
<p>And that does not work the other way around, I'm afraid. You can import data from any
source into R. For example from SPSS, SAS and Stata (<a href="https://haven.tidyverse.org/" class="external-link">link</a>), from Minitab, Epi Info
and EpiData (<a href="https://cran.r-project.org/package=foreign" class="external-link">link</a>), from Excel
(<a href="https://readxl.tidyverse.org/" class="external-link">link</a>), from flat files like
CSV, TXT or TSV (<a href="https://readr.tidyverse.org/" class="external-link">link</a>), or
directly from databases and data warehouses anywhere in the world
(<a href="https://dbplyr.tidyverse.org/" class="external-link">link</a>). You can even scrape
websites to download tables that are live on the internet (<a href="https://github.com/hadley/rvest" class="external-link">link</a>) or get the results of
an API call and transform it into data in only one command (<a href="https://github.com/Rdatatable/data.table/wiki/Convenience-features-of-fread" class="external-link">link</a>).</p>
<p>And the best part - you can export from R to most data formats as
well. So you can import an SPSS file, do your analysis neatly in R and
export the resulting tables to Excel files for sharing.</p>
</li>
<li>
<p><strong>R is completely free and open-source.</strong></p>
<p>No strings attached. It was created and is being maintained by
volunteers who believe that (data) science should be open and publicly
available to everybody. SPSS, SAS and Stata are quite expensive. IBM
SPSS Statistics only comes with subscriptions nowadays, varying <a href="https://www.ibm.com/products/spss-statistics/pricing" class="external-link">between USD
1,300 and USD 8,500</a> per user <em>per year</em>. SAS Analytics Pro
costs <a href="https://www.sas.com/store/products-solutions/sas-analytics-pro/prodPERSANL.html" class="external-link">around
USD 10,000</a> per computer. Stata also has a business model with
subscription fees, varying <a href="https://www.stata.com/order/new/bus/single-user-licenses/dl/" class="external-link">between
USD 600 and USD 2,800</a> per computer per year, but lower prices come
with a limitation of the number of variables you can work with. And
still they do not offer the above benefits of R.</p>
<p>If you are working at a midsized or small company, you can save it
tens of thousands of dollars by using R instead of e.g. SPSS - gaining
even more functions and flexibility. And all R enthusiasts can do as
much PR as they want (like I do here), because nobody is officially
associated or affiliated with R. It is really free.</p>
</li>
<li>
<p><strong>R is (nowadays) the preferred analysis software in
academic papers.</strong></p>
<p>At present, R is among the world's most powerful statistical languages,
and it is generally very popular in science (Bollmann <em>et al.</em>,
2017). For all the above reasons, the number of references to R as an
analysis method in academic papers <a href="https://r4stats.com/2014/08/20/r-passes-spss-in-scholarly-use-stata-growing-rapidly/" class="external-link">is
rising continuously</a> and has even surpassed SPSS for academic use
(Muenchen, 2014).</p>
<p>I believe that the thing with SPSS is that it has always had a great
user interface which is very easy to learn and use. Back when they
developed it, they had very little competition, let alone from R. R
didn't even have a professional user interface until the last decade
(called RStudio, see below). How people used R between the nineties and
2010 is almost completely incomparable to how R is being used now. The
language itself <a href="https://www.tidyverse.org/packages/" class="external-link">has been
restyled completely</a> by volunteers who are dedicated professionals in
the field of data science. SPSS was great when there was nothing else
that could compete. But now in 2022, I don't see any reason why SPSS
would be of any better use than R.</p>
</li>
</ul>
<p>To demonstrate the first point:</p>
<div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># not all values are valid MIC values:</span>
<span class="fu"><a href="../reference/as.mic.html">as.mic</a></span><span class="op">(</span><span class="fl">0.125</span><span class="op">)</span>
<span class="co"># Class &lt;mic&gt;</span>
<span class="co"># [1] 0.125</span>
<span class="fu"><a href="../reference/as.mic.html">as.mic</a></span><span class="op">(</span><span class="st">"testvalue"</span><span class="op">)</span>
<span class="co"># Class &lt;mic&gt;</span>
<span class="co"># [1] &lt;NA&gt;</span>
<span class="co"># the Gram stain is available for all bacteria:</span>
<span class="fu"><a href="../reference/mo_property.html">mo_gramstain</a></span><span class="op">(</span><span class="st">"E. coli"</span><span class="op">)</span>
<span class="co"># [1] "Gram-negative"</span>
<span class="co"># Klebsiella is intrinsically resistant to amoxicillin, according to EUCAST:</span>
<span class="va">klebsiella_test</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html" class="external-link">data.frame</a></span><span class="op">(</span>mo <span class="op">=</span> <span class="st">"klebsiella"</span>,
amox <span class="op">=</span> <span class="st">"S"</span>,
stringsAsFactors <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span>
<span class="va">klebsiella_test</span> <span class="co"># (our original data)</span>
<span class="co"># mo amox</span>
<span class="co"># 1 klebsiella S</span>
<span class="fu"><a href="../reference/eucast_rules.html">eucast_rules</a></span><span class="op">(</span><span class="va">klebsiella_test</span>, info <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span> <span class="co"># (the edited data by EUCAST rules)</span>
<span class="co"># mo amox</span>
<span class="co"># 1 klebsiella R</span>
<span class="co"># hundreds of trade names can be translated to a name, trade name or an ATC code:</span>
<span class="fu"><a href="../reference/ab_property.html">ab_name</a></span><span class="op">(</span><span class="st">"floxapen"</span><span class="op">)</span>
<span class="co"># [1] "Flucloxacillin"</span>
<span class="fu"><a href="../reference/ab_property.html">ab_tradenames</a></span><span class="op">(</span><span class="st">"floxapen"</span><span class="op">)</span>
<span class="co"># [1] "floxacillin" "floxapen" "floxapen sodium salt"</span>
<span class="co"># [4] "fluclox" "flucloxacilina" "flucloxacillin" </span>
<span class="co"># [7] "flucloxacilline" "flucloxacillinum" "fluorochloroxacillin"</span>
<span class="fu"><a href="../reference/ab_property.html">ab_atc</a></span><span class="op">(</span><span class="st">"floxapen"</span><span class="op">)</span>
<span class="co"># [1] "J01CF05"</span></code></pre></div>
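<p>The same functions also work on whole columns of a data set, which is how they are typically used for data cleaning. What follows is only a minimal sketch: the data frame <code>lab_data</code> and its column names are made up for illustration and are not part of the <code>AMR</code> package.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R">library(AMR)

# hypothetical raw laboratory data (names and values are assumptions)
lab_data &lt;- data.frame(organism = c("E. coli", "klebsiella"),
                       mic_raw  = c("0.125", "testvalue"),
                       drug     = c("floxapen", "floxapen"),
                       stringsAsFactors = FALSE)

# invalid MIC values become NA, valid ones get the &lt;mic&gt; class
lab_data$mic &lt;- as.mic(lab_data$mic_raw)
# look up the Gram stain and the official drug name for every row
lab_data$gramstain &lt;- mo_gramstain(lab_data$organism)
lab_data$drug_name &lt;- ab_name(lab_data$drug)

lab_data</code></pre></div>
<p>The raw columns are kept next to the cleaned ones here, so that any value that could not be interpreted remains traceable.</p>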
</div>
<div class="section level2">
<h2 id="import-data-from-spsssasstata">Import data from SPSS/SAS/Stata<a class="anchor" aria-label="anchor" href="#import-data-from-spsssasstata"></a>
</h2>
<div class="section level3">
<h3 id="rstudio">RStudio<a class="anchor" aria-label="anchor" href="#rstudio"></a>
</h3>
<p>To work with R, probably the best option is to use <a href="https://www.rstudio.com/products/rstudio/" class="external-link">RStudio</a>. It is an
open-source and free desktop environment which not only allows you to
run R code, but also supports project management, version management,
package management and convenient import menus to work with other data
sources. You can also install <a href="https://www.rstudio.com/products/rstudio/" class="external-link">RStudio Server</a> on a
private or corporate server, which brings nothing less than the complete
RStudio software to you as a website (at home or at work).</p>
<p>To import a data file, just click <em>Import Dataset</em> in the
Environment tab:</p>
<p><img src="https://github.com/msberends/AMR/raw/main/docs/import1.png"></p>
<p>If additional packages are needed, RStudio will ask you whether they
should be installed beforehand.</p>
<p>In the window that opens, you can define all options (parameters)
that should be used for import and you're ready to go:</p>
<p><img src="https://github.com/msberends/AMR/raw/main/docs/import2.png"></p>
<p>If you want variables with value labels to be imported as factors so that they resemble
SPSS more closely, use <code><a href="https://haven.tidyverse.org/reference/as_factor.html" class="external-link">as_factor()</a></code>.</p>
<p>The difference is this:</p>
<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="va">SPSS_data</span>
<span class="co"># # A tibble: 4,203 x 4</span>
<span class="co"># v001 sex status statusage</span>
<span class="co"># &lt;dbl&gt; &lt;dbl+lbl&gt; &lt;dbl+lbl&gt; &lt;dbl&gt;</span>
<span class="co"># 1 10002 1 1 76.6</span>
<span class="co"># 2 10004 0 1 59.1</span>
<span class="co"># 3 10005 1 1 54.5</span>
<span class="co"># 4 10006 1 1 54.1</span>
<span class="co"># 5 10007 1 1 57.7</span>
<span class="co"># 6 10008 1 1 62.8</span>
<span class="co"># 7 10010 0 1 63.7</span>
<span class="co"># 8 10011 1 1 73.1</span>
<span class="co"># 9 10017 1 1 56.7</span>
<span class="co"># 10 10018 0 1 66.6</span>
<span class="co"># # ... with 4,193 more rows</span>
<span class="fu">as_factor</span><span class="op">(</span><span class="va">SPSS_data</span><span class="op">)</span>
<span class="co"># # A tibble: 4,203 x 4</span>
<span class="co"># v001 sex status statusage</span>
<span class="co"># &lt;dbl&gt; &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt;</span>
<span class="co"># 1 10002 Male alive 76.6</span>
<span class="co"># 2 10004 Female alive 59.1</span>
<span class="co"># 3 10005 Male alive 54.5</span>
<span class="co"># 4 10006 Male alive 54.1</span>
<span class="co"># 5 10007 Male alive 57.7</span>
<span class="co"># 6 10008 Male alive 62.8</span>
<span class="co"># 7 10010 Female alive 63.7</span>
<span class="co"># 8 10011 Male alive 73.1</span>
<span class="co"># 9 10017 Male alive 56.7</span>
<span class="co"># 10 10018 Female alive 66.6</span>
<span class="co"># # ... with 4,193 more rows</span></code></pre></div>
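<p>To try this without an SPSS file at hand, a labelled vector can be built directly with <code>labelled()</code> from the <code>haven</code> package. This is a small sketch only: the vector <code>sex</code> and its labels are made up to mimic the column shown above.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R">library(haven)

# hypothetical labelled vector, mimicking the 'sex' column shown above
sex &lt;- labelled(c(1, 0, 1), labels = c(Female = 0, Male = 1))

# default: use the value labels as factor levels
as_factor(sex)
# keep both the numeric code and the label in each level
as_factor(sex, levels = "both")
# or drop the labels and keep the plain numeric codes
zap_labels(sex)</code></pre></div>
<p>Which of the three is most useful depends on the analysis: factors read better in tables and plots, while the plain codes may be easier to compare with the original SPSS syntax.</p>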
</div>
<div class="section level3">
<h3 id="base-r">Base R<a class="anchor" aria-label="anchor" href="#base-r"></a>
</h3>
<p>To import data from SPSS, SAS or Stata, you can use the <a href="https://haven.tidyverse.org/" class="external-link">great <code>haven</code> package</a>
yourself:</p>
<div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># download and install the latest version:</span>
<span class="fu"><a href="https://rdrr.io/r/utils/install.packages.html" class="external-link">install.packages</a></span><span class="op">(</span><span class="st">"haven"</span><span class="op">)</span>
<span class="co"># load the package you just installed:</span>
<span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://haven.tidyverse.org" class="external-link">haven</a></span><span class="op">)</span> </code></pre></div>
<p>You can now import files as follows:</p>
<div class="section level4">
<h4 id="spss">SPSS<a class="anchor" aria-label="anchor" href="#spss"></a>
</h4>
<p>To read files from SPSS into R:</p>
<div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># read any SPSS file based on file extension (best way):</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html" class="external-link">read_spss</a></span><span class="op">(</span>file <span class="op">=</span> <span class="st">"path/to/file"</span><span class="op">)</span>
<span class="co"># read .sav or .zsav file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html" class="external-link">read_sav</a></span><span class="op">(</span>file <span class="op">=</span> <span class="st">"path/to/file"</span><span class="op">)</span>
<span class="co"># read .por file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html" class="external-link">read_por</a></span><span class="op">(</span>file <span class="op">=</span> <span class="st">"path/to/file"</span><span class="op">)</span></code></pre></div>
<p>Do not forget about <code><a href="https://haven.tidyverse.org/reference/as_factor.html" class="external-link">as_factor()</a></code>, as mentioned above.</p>
<p>To export your R objects to the SPSS file format:</p>
<div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># save as .sav file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html" class="external-link">write_sav</a></span><span class="op">(</span>data <span class="op">=</span> <span class="va">yourdata</span>, path <span class="op">=</span> <span class="st">"path/to/file"</span><span class="op">)</span>
<span class="co"># save as compressed .zsav file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html" class="external-link">write_sav</a></span><span class="op">(</span>data <span class="op">=</span> <span class="va">yourdata</span>, path <span class="op">=</span> <span class="st">"path/to/file"</span>, compress <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></code></pre></div>
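<p>Writing and reading can be combined into a quick round trip to check that nothing gets lost. The sketch below assumes a made-up data set (<code>yourdata</code> with invented columns) and a temporary file. It also uses the <code>col_select</code> and <code>n_max</code> arguments of <code>read_sav()</code> to read only part of a file, which may help with large data sets.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R">library(haven)

# hypothetical example data; the column names are made up
yourdata &lt;- data.frame(id  = 1:5,
                       age = c(76.6, 59.1, 54.5, 54.1, 57.7))
yourdata$sex &lt;- labelled(c(1, 0, 1, 1, 0),
                         labels = c(Female = 0, Male = 1))

# write to a temporary .sav file
sav_file &lt;- tempfile(fileext = ".sav")
write_sav(data = yourdata, path = sav_file)

# read back only two columns and the first three rows
subset_data &lt;- read_sav(file = sav_file, col_select = c(id, sex), n_max = 3)
subset_data

# the value labels survive the round trip
is.labelled(subset_data$sex)</code></pre></div>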
</div>
<div class="section level4">
<h4 id="sas">SAS<a class="anchor" aria-label="anchor" href="#sas"></a>
</h4>
<p>To read files from SAS into R:</p>
<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># read .sas7bdat + .sas7bcat files:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_sas.html" class="external-link">read_sas</a></span><span class="op">(</span>data_file <span class="op">=</span> <span class="st">"path/to/file"</span>, catalog_file <span class="op">=</span> <span class="cn">NULL</span><span class="op">)</span>
<span class="co"># read SAS transport files (version 5 and version 8):</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_xpt.html" class="external-link">read_xpt</a></span><span class="op">(</span>file <span class="op">=</span> <span class="st">"path/to/file"</span><span class="op">)</span></code></pre></div>
<p>To export your R objects to the SAS file format:</p>
<div class="sourceCode" id="cb7"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># save as regular SAS file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_sas.html" class="external-link">write_sas</a></span><span class="op">(</span>data <span class="op">=</span> <span class="va">yourdata</span>, path <span class="op">=</span> <span class="st">"path/to/file"</span><span class="op">)</span>
<span class="co"># the SAS transport format is an open format </span>
<span class="co"># (required for submission of the data to the FDA)</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_xpt.html" class="external-link">write_xpt</a></span><span class="op">(</span>data <span class="op">=</span> <span class="va">yourdata</span>, path <span class="op">=</span> <span class="st">"path/to/file"</span>, version <span class="op">=</span> <span class="fl">8</span><span class="op">)</span></code></pre></div>
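<p>A transport file can be checked in the same way by reading it back into R. This is a sketch under assumptions: the data set and the temporary file name are invented for the example.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R">library(haven)

# hypothetical example data
yourdata &lt;- data.frame(id = 1:3, age = c(76.6, 59.1, 54.5))

# write a version 8 transport file to a temporary location
xpt_file &lt;- tempfile(fileext = ".xpt")
write_xpt(data = yourdata, path = xpt_file, version = 8)

# read it back to verify the contents
roundtrip &lt;- read_xpt(file = xpt_file)
roundtrip</code></pre></div>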
</div>
<div class="section level4">
<h4 id="stata">Stata<a class="anchor" aria-label="anchor" href="#stata"></a>
</h4>
<p>To read files from Stata into R:</p>
<div class="sourceCode" id="cb8"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># read .dta file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html" class="external-link">read_stata</a></span><span class="op">(</span>file <span class="op">=</span> <span class="st">"/path/to/file"</span><span class="op">)</span>
<span class="co"># works exactly the same:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html" class="external-link">read_dta</a></span><span class="op">(</span>file <span class="op">=</span> <span class="st">"/path/to/file"</span><span class="op">)</span></code></pre></div>
<p>To export your R objects to the Stata file format:</p>
<div class="sourceCode" id="cb9"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># save as .dta file, Stata version 14:</span>
<span class="co"># (supports Stata v8 until v15 at the time of writing)</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html" class="external-link">write_dta</a></span><span class="op">(</span>data <span class="op">=</span> <span class="va">yourdata</span>, path <span class="op">=</span> <span class="st">"/path/to/file"</span>, version <span class="op">=</span> <span class="fl">14</span><span class="op">)</span></code></pre></div>
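<p>Variable labels travel along with the data: <code>haven</code> keeps them in the <code>"label"</code> attribute of each column. The following sketch, with a made-up data set and label text, writes a labelled column to a temporary <code>.dta</code> file and reads it back.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R">library(haven)

# hypothetical example data
yourdata &lt;- data.frame(id = 1:3, age = c(76.6, 59.1, 54.5))
# a variable label is stored in the "label" attribute of a column
attr(yourdata$age, "label") &lt;- "Age at last follow-up (years)"

dta_file &lt;- tempfile(fileext = ".dta")
write_dta(data = yourdata, path = dta_file, version = 14)

# the label is available again after import
roundtrip &lt;- read_dta(file = dta_file)
attr(roundtrip$age, "label")</code></pre></div>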
</div>
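<p>Putting the pieces together, the workflow mentioned earlier (import an SPSS file, analyse it in R, export a table for sharing) could look roughly like this. It is a sketch under assumptions: the SPSS file is generated on the spot so that the code is self-contained, all names are invented, and the export uses a plain CSV file (which Excel can open) rather than a dedicated Excel package.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R">library(haven)

# hypothetical SPSS file, created here only to keep the sketch self-contained
spss_file &lt;- tempfile(fileext = ".sav")
example &lt;- data.frame(age = c(76.6, 59.1, 54.5, 54.1))
example$sex &lt;- labelled(c(1, 0, 1, 1), labels = c(Female = 0, Male = 1))
write_sav(data = example, path = spss_file)

# 1. import, and turn labelled values into factors
imported &lt;- as_factor(read_sav(file = spss_file))

# 2. analyse: mean age per sex
result &lt;- aggregate(age ~ sex, data = imported, FUN = mean)

# 3. export the resulting table to a CSV file
csv_file &lt;- tempfile(fileext = ".csv")
write.csv(result, file = csv_file, row.names = FALSE)</code></pre></div>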
</div>
</div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="pkgdown-sidebar">
<nav id="toc" data-toggle="toc"><h2 data-toc-skip>Contents</h2>
</nav>
</div>
</div>
<footer><div class="copyright">
<p></p>
<p>Developed by Matthijs S. Berends, Christian F. Luz, Dennis Souverein, Erwin E. A. Hassing.</p>
</div>
<div class="pkgdown">
<p></p>
<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.3.</p>
</div>
</footer>
</div>
</body>
</html>