
R For Beginners: A Video Tutorial on Installing and Using the Deducer Statistics Package


R For Beginners:  A Video Tutorial on Installing and Using the Deducer Statistics Package with the R Console

In previous tutorials I have discussed the use of the R Commander and Deducer statistical packages, which provide a menu-based GUI for R. In this video tutorial I will discuss downloading and installing the Deducer statistics package. This video is designed to support my previous tutorial on the same subject.

I have embedded the video below. I hope you find this tutorial a useful adjunct to installing and using the menu-based Deducer package.

This document is an embedded Word document. To view it full screen click on the icon in the lower right corner of the screen.

 

R For Beginners: Installing the JGR GUI On a Linux Platform


A Tutorial by D. M. Wiig

This is an embedded Word document. To view it full screen click on the icon in the lower right corner of the document.

Watch for more tutorials discussing R statistics on a Linux platform.

How to Install the Latest Version of R Statistics on Your Raspberry Pi


R for Beginners:  How to Install the Latest Version of R Statistics on Your Raspberry Pi

A tutorial by D. M. Wiig

One of the nice characteristics of open source software such as R is the rapid development of new releases and updates. While the base core remains stable for a period of time, there is a considerable amount of updating, adding, and removing of component packages. At the time of this writing the latest iteration is R version 3.3.1, “Bug in Your Hair.” If you are using a Windows platform you will likely go directly to the archive web site and download the latest distribution as a Windows executable installation package.

If you are using a Linux distribution such as Ubuntu or Debian, the process of adding software is usually accomplished via the menu-based installer. These software installers allow R and its dependencies to be downloaded from the community archive.

One of the disadvantages of using this approach is that the versions of some software in the community archives may not be updated to the latest version.  This is often the case with R as well as with many other software packages.

To ensure that you are downloading the latest R version you need to use the platform’s command line to install what is needed. You can add the URLs of some backport archives that are more likely to be kept up to date with current releases. As an example, in this tutorial I will use the R statistical software that I am running on my Raspberry Pi 3 board with the Raspbian OS and the new PIXEL desktop.

Regardless of which Linux distribution you are using, first open a command console from the desktop menu. Make sure the package lists are up to date by using the command:

pi@raspberrypi:~ $ sudo apt-get update

This refreshes the package lists so that the installer knows about the latest available versions; it does not by itself upgrade the packages that are already installed. If you are running a Raspbian distribution such as jessie you will need to edit the /etc/apt/sources.list file to add a backport to the latest version of R. Start the nano editor by using the command:

sudo nano /etc/apt/sources.list

This should produce the output as seen below:

pi@raspberrypi:~ $ sudo nano /etc/apt/sources.list

------------------------------------------------
GNU nano 2.2.6 File: /etc/apt/sources.list

deb http://mirrordirector.raspbian.org/raspbian/ jessie main contrib non-free r$
# Uncomment line below then 'apt-get update' to enable 'apt-get source'
deb-src http://archive.raspbian.org/raspbian/ jessie main contrib non-free rpi
deb http://archive.raspbian.org/raspbian/ stretch main
deb http://mirror.las.iastate.edu/CRAN/bin/linux/debian/ jessie main
deb http://mirror.las.iastate.edu/CRAN/bin/linux/ubuntu xenial/

[ Read 8 lines ]
^G Get Help ^O WriteOut ^R Read File ^Y Prev Page ^K Cut Text ^C Cur Pos
^X Exit ^J Justify ^W Where Is ^V Next Page ^U UnCut Text^T To Spell

As seen above, the file contains several lines listing the standard Raspbian archives to search.


If you are using a Debian distribution you would add the following line to the file:

deb http://mirror.las.iastate.edu/CRAN/bin/linux/debian/ jessie main

Replace the 'jessie' portion with the name of the specific Debian distribution you are using, and replace the 'mirror' portion with the CRAN mirror that you use. You also need to add the line that provides the URL of a Raspbian 'stretch' archive, which contains more recent versions of many different software packages. In my case I was looking for the latest R release, but you can search this archive for the latest version of any software package you are installing.

If you are using an Ubuntu distribution add a line with the appropriate changes for the specific Ubuntu distribution that you are using. 
Check with the documentation provided with your specific Linux distribution to see if there is also a 'stretch' archive maintained for new versions. 

Once these changes are made, use the ^O key command to write the file and then the ^X key command to exit the nano editor and return to the command line. You should now be able to issue the command:

pi@raspberrypi:~ $ sudo apt-get install r-base r-base-core r-base-dev

Once the download and install processes have completed you should now be able to invoke R from the command line or menu and see the latest version:

pi@raspberrypi:~ $ R

R version 3.3.2 RC (2016-10-23 r71578) -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: arm-unknown-linux-gnueabihf (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

 Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 
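Once you are at the R prompt you can also confirm the version from inside R rather than reading the banner. The lines below are a minimal sketch using the built-in getRversion() function and the R.version.string variable. The variable names 'required', 'current' and 'is_current' are my own, and the minimum version 3.3.1 is simply the release mentioned at the start of this tutorial.

```r
# Minimum version expected after the upgrade (the release named in this tutorial).
required <- "3.3.1"

# getRversion() returns a "numeric_version" object, so the comparison below is
# made component by component rather than as plain text.
current <- getRversion()
is_current <- current >= required

if (is_current) {
  message("R ", as.character(current), " is at least ", required)
} else {
  message("R ", as.character(current), " is older than ", required,
          "; check the entries in /etc/apt/sources.list")
}

# The same version string that appears in the start-up banner
print(R.version.string)
```

If the message reports an older version, the most likely cause is that the new archive line was not saved or that 'sudo apt-get update' was not run again after editing the file.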


For other Linux distributions you would add a line similar to the above examples in the /etc/apt/sources.list. Check the documentation for your specific Linux platform for further information about backport archives.

R For Beginners: Installing the latest version of R on a Linux platform



A tutorial by D. M. Wiig

One of the nice characteristics of open source software such as R is the rapid development of new releases and updates. While the base core remains stable for a period of time, there is a considerable amount of updating, adding, and removing of component packages. At the time of this writing the latest iteration is R version 3.3.1, “Bug in Your Hair.” If you are using a Windows platform you will likely go directly to the archive web site and download the latest distribution as a Windows executable installation package.

If you are using a Linux distribution such as Ubuntu or Debian, the process of adding software is usually accomplished via the menu-based installer. These software installers allow R and its dependencies to be downloaded from the community archive.

One of the disadvantages of using this approach is that the versions of some software in the archives may not be updated to the latest version.  This is often the case with R.

To ensure that you are downloading the latest R version you need to use the platform’s command line to install what is needed. Regardless of which Linux distribution you are using, first open a command console from the desktop menu. Make sure the package lists are up to date by using the command:

pi@raspberrypi:~ $ sudo apt-get update

This refreshes the package lists so that the installer knows about the latest available versions; it does not by itself upgrade the packages that are already installed. If you are running a Debian distribution such as jessie you will need to edit the /etc/apt/sources.list file to add a backport to the latest version of R. Open the file in the nano editor by using the command:

sudo nano /etc/apt/sources.list

This should produce the output as seen below:

pi@raspberrypi:~ $ sudo nano /etc/apt/sources.list

------------------------------------------------
GNU nano 2.2.6 File: /etc/apt/sources.list

deb http://mirrordirector.raspbian.org/raspbian/ jessie main contrib non-free r$
# Uncomment line below then 'apt-get update' to enable 'apt-get source'
deb-src http://archive.raspbian.org/raspbian/ jessie main contrib non-free rpi
deb http://archive.raspbian.org/raspbian/ stretch main
deb http://mirror.las.iastate.edu/CRAN/bin/linux/debian/ jessie main
deb http://mirror.las.iastate.edu/CRAN/bin/linux/ubuntu xenial/

[ Read 8 lines ]
^G Get Help ^O WriteOut ^R Read File ^Y Prev Page ^K Cut Text ^C Cur Pos
^X Exit ^J Justify ^W Where Is ^V Next Page ^U UnCut Text^T To Spell


If you are using a Debian distribution you would add the following line to the file:

deb http://mirror.las.iastate.edu/CRAN/bin/linux/debian/ jessie main

Replace the mirror portion with <URL of your favorite CRAN mirror>. Replace the 'jessie' portion with the name of the specific Debian distribution you are using.
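To make the two substitutions explicit, here is a small sketch that assembles the archive line from its parts. It is written in R only because that is the language of this series; the same line can of course be typed by hand. The variable names 'cran_mirror', 'codename' and 'repo_line' are my own, and the layout, including the leading 'deb' keyword, simply follows the entries shown in the sources.list listing. Check your mirror's bin/linux/debian page for the exact form it expects.

```r
# Both values are placeholders: substitute your own mirror and release name.
cran_mirror <- "http://mirror.las.iastate.edu/CRAN"  # mirror used in this tutorial
codename    <- "jessie"                              # Debian release name

# Assemble the entry in the same layout as the entries in the listing.
repo_line <- sprintf("deb %s/bin/linux/debian/ %s main", cran_mirror, codename)

# Print the finished line so it can be copied into /etc/apt/sources.list
writeLines(repo_line)
```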

If you are using an Ubuntu distribution add a line with the appropriate changes for the specific Ubuntu distribution that you are using.

Once these changes are made, use the ^O key command to write the file and then the ^X key command to exit the nano editor and return to the command line. You should now be able to issue the command:

pi@raspberrypi:~ $ sudo apt-get install r-base r-base-core r-base-dev

Once the download and install processes have completed you should now be able to invoke R from the command line or menu and see the latest version:

pi@raspberrypi:~ $ R

R version 3.3.2 RC (2016-10-23 r71578) -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: arm-unknown-linux-gnueabihf (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

 Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 


For other Linux distributions you would add a line similar to the above examples in the /etc/apt/sources.list. Check the documentation for your specific Linux platform for further information.

 

R for Beginners: Using R Commander in an Introductory Statistics Course



A tutorial by D. M. Wiig

As with previous tutorials in this series, this document is an embedded Word document. To view the document full screen click on the icon in the lower right corner of the window.

R for Beginners: Using R Commander for Basic t Tests and One Way ANOVA



A tutorial by D. M. Wiig

This post is contained in an embedded Word document.  To read it full screen click on the icon in the lower right corner of the document window.

I hope that you found this tutorial informative.  Stop back by to check for new installments.  I have many currently in the writing stage.

R for Beginners: Installing and Using the R Commander GUI, Part One


A tutorial by D.M. Wiig

This tutorial is posted as an embedded Word document.  To view the document full screen click on the button in the lower right corner of the window. Please note that you must be online for the full page Word document display to work.

R For Beginners: Installing and Using the R Console in a Windows Environment


An R tutorial by D. M. Wiig

This tutorial is posted as an embedded Word document. To view the document full screen click on the icon in the lower-right corner of the document window.

My next post covering installing and using the Rcommander GUI will be out in a day or two.

Using R to Create Ternary Diagrams: An Example Using 2016 Presidential Polling Data


An R Tutorial by D. M. Wiig

In previous tutorials I have discussed the basics of creating a ternary plot with the ggtern package, using a simple hypothetical data frame containing five values. In a subsequent tutorial I discussed an application of the technique by creating a ternary graph using election results from the British House of Commons from the last half of the 20th century. This type of plot creates a very nice visual of the effects of a third party on the election outcome.

In this tutorial I will discuss using the same technique as applied to recent polling data from the ongoing 2016 U.S. presidential campaign. Before discussing the current election campaign I am going to refresh your memory relative to using the ggtern package.

Before running the script in this tutorial make sure that the packages ggplot2 and ggtern are loaded into your R environment. Please also note that you will need a recent version of R, that is, version 3.1.x or newer. A very basic graph can be easily constructed. I will use the theoretical quantities Xa, Xb, and Xc to demonstrate a basic ternary diagram. In this simple example I will create a sample of n=5 by entering the data from the keyboard into a data frame ‘sampfile.’ To invoke the editor use the following code:

###################################################

#create a sample file of n=5

###################################################

sampfile <-data.frame(Xa=numeric(0),Xb=numeric(0),Xc=numeric(0))

sampfile <-edit(sampfile)

###################################################

This will open up a data entry sheet with three columns labeled Xa, Xb, and Xc. The numbers that are entered do not matter for purposes of this illustration. The table I entered is as follows:

    Xa    Xb    Xc
1  100   135   250
2   90   122   210
3   98    44   256
4  100    97    89
5   90    75    89
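If you would rather not type the values into the data editor, the same table can be built directly in code. This is a minimal sketch that uses only the base data.frame() function and the five rows shown above.

```r
# Build the five-case sample directly instead of using the data editor.
sampfile <- data.frame(
  Xa = c(100, 90, 98, 100, 90),
  Xb = c(135, 122, 44, 97, 75),
  Xc = c(250, 210, 256, 89, 89)
)

# Show the structure of the result: 5 observations of 3 numeric variables
str(sampfile)
```

Entering the data this way also makes the script repeatable, since the values are stored in the script itself rather than typed in each time.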

To produce a very basic ternary diagram with the above data set use the code segment:

##################################################

#do basic graph with sample data

##################################################

library(ggtern)  #load the ggtern package, which builds on ggplot2

ggtern(data=sampfile, aes(x=Xa,y=Xb, z=Xc)) + geom_point()

##################################################

This produces the graph seen below:

A Simple Ternary Diagram

  

The triangular representation of the dimensions Xa → Xb, Xc → Xa and Xb → Xc allows each case to be represented as a single point located relative to each of the three vectors. There are a large number of additions, modifications and tweaks that can be made to this basic pattern. In the rest of this tutorial I will discuss generating a more elaborate ternary diagram using polling data from the current U.S. presidential campaign.

The US has a two party dominant system with several minor parties that regularly contest elections. In the current presidential election campaign there are the two major party candidates as well as two minor party candidates for the Libertarian and Green parties that are being included in the numerous public opinion polls that are being done nationally.
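A ternary diagram locates each case by its share of the row total, so only the relative sizes of the three quantities matter. The sketch below computes those shares for the five-case sample using base R only. The name 'shares' is my own, and the calculation is meant to illustrate the geometry, not to describe what ggtern does internally.

```r
# The five-case sample from the basic example
sampfile <- data.frame(
  Xa = c(100, 90, 98, 100, 90),
  Xb = c(135, 122, 44, 97, 75),
  Xc = c(250, 210, 256, 89, 89)
)

# Divide every row by its own total so that Xa + Xb + Xc = 1 for each case.
# rowSums() has one entry per row, so each column is divided element by element.
shares <- sampfile / rowSums(sampfile)

# Shares rounded for display
print(round(shares, 3))

# Each row of shares should now add up to 1
print(rowSums(shares))
```

A case with a large Xc share, such as the third row, will sit close to the Xc corner of the triangle.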

For purposes of this example I have added the percentages for these two minor parties together. This results in three variables that are being plotted, the percentage for Clinton (Democrat), Trump (Republican), and for the combined Johnson (Libertarian) and Stein (Green). By plotting the three variables over time on a ternary diagram we can visualize any changes in the mixture of support indicated for the candidates.

The poll data used in this project were taken from the web site RealClearPolitics.com for the time period from July 29 to August 18.¹ It should be noted that the poll numbers were not necessarily from the same polling organization for each date but all polls used were listed as being national in scope with a Clinton v. Trump v. Johnson v. Stein format.

Before working through this tutorial make sure that you have the ggplot2 and ggtern packages loaded into your R environment.² I originally created the table shown below using Excel and then converted it into *.csv format before importing it into R Studio for analysis.³ The data can also be entered directly via the R data editor as shown in the previous example. The code segment below was used to load the *.csv file:

###################################################

#Enter data into spreadsheet and save as a *.csv file

#Load the data into a table using the read.table function

polldata <- read.table("d:/16electiondata.csv", header = TRUE, sep=",")

#Make sure the table is ok

View(polldata)

###################################################

date     clinton  trump  johnson/stein
17-Aug        41     35             10
16-Aug        43     37             15
14-Aug        42     37             12
11-Aug        43     40             10
10-Aug        44     40             13
9-Aug         44     38             14
8-Aug         50     37              9
7-Aug         45     37             12
5-Aug         39     35             17
4-Aug         43     34             15
2-Aug         42     38             13
1-Aug         45     37             14
30-Jul        46     41              8
29-Jul        37     37              6
25-Jul        39     41             15
21-Jul        38     35              0
19-Jul        39     40             15
18-Jul        45     43              6
17-Jul        42     37             18
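One detail worth checking after the import is the name of the last column. A heading such as johnson/stein is not a valid R variable name, so read.table converts it to johnson.stein when it reads the header. The sketch below shows this with the first three rows of the table. It assumes the heading in the *.csv file is written exactly as in the table above, and it reads the rows from a text string (the hypothetical 'csv_text') so that it runs without the file on drive d:.

```r
# First three rows of the poll table, written as comma separated text.
# Assumption: the file header is spelled exactly as in the table.
csv_text <- "date,clinton,trump,johnson/stein
17-Aug,41,35,10
16-Aug,43,37,15
14-Aug,42,37,12"

# Same call as for the file, but reading from the text string
polldata <- read.table(text = csv_text, header = TRUE, sep = ",")

# The slash in the heading has been replaced by a period
print(names(polldata))

# Confirm that the three percentage columns were read as numbers
str(polldata)
```

This is why the plotting code refers to the combined minor party column as johnson.stein.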

Once the data set is loaded use the following code to create the ternary diagram. Note that in this diagram we are using the base code as shown in the first tutorial with some additions that make the diagram easier to interpret such as the vector arrows and legend. The code segment is:

###################################################

#create ternary plot using percentage polled for each candidate for each polling period

#uses enhanced formatting for easier interpretation

#results of ggtern function are placed in variable ‘plot’ for rendering

###################################################

plot <- ggtern(data = polldata, aes(x = clinton, y = trump, z = johnson.stein)) +
  geom_point(aes(fill = date),
             size = 6,
             shape = 21,
             color = "black") +
  ggtitle("2016 U.S. Presidential Election Polls") +
  labs(fill = "Date") +
  theme_rgbw() +
  theme(legend.position = c(0,1),
        legend.justification = c(1, 1))

###################################################

To show the diagram simply use:

###################################################

#now plot the diagram

###################################################

plot

###################################################

The resulting ternary diagram is:

Ternary Diagram of the 2016 U.S. Presidential Election Polls

Each point on the graph represents the percentage of support for each of the three candidates by the location of the point on the 3-way graph axes. This R routine provides a quick and straightforward method for representing a 3-dimensional relationship in two dimensions.
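Because the dates are stored as text labels such as 17-Aug, a legend built from them is sorted alphabetically and not by time. One possible refinement, sketched below with three of the dates from the table, is to turn the column into a factor whose levels are in chronological order before plotting. The year 2016 is added only so the labels can be parsed, the name 'parsed' is my own, and the month abbreviations assume R is running in an English locale.

```r
# Three of the poll dates from the table, stored as text
polldata <- data.frame(date = c("17-Aug", "30-Jul", "17-Jul"),
                       stringsAsFactors = FALSE)

# Parse the labels as real dates (assumes English month abbreviations)
parsed <- as.Date(paste0(polldata$date, "-2016"), format = "%d-%b-%Y")

# Keep the original labels but order the factor levels by calendar date
polldata$date <- factor(polldata$date,
                        levels = unique(polldata$date[order(parsed)]))

# Levels now run from the earliest poll to the latest
print(levels(polldata$date))
```

With the levels ordered this way, the legend produced by aes(fill = date) should list the polls from earliest to latest.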

Code segments in this article were written using R Studio Version 0.98.993 running R version 3.1.1  in a Windows 7 environment.

Notes:

¹As indicated above the poll data used in this tutorial was located at http://realclearpolitics.com. This website is an excellent source of information about all aspects of American electoral politics.

² For additional information about ternary graphs see the website http://www.ggtern.com. See also the CRAN website at http://cran.r-project.org/web/packages/ggtern/ggtern.pdf.

³For information about using the IDE R Studio see the website https://www.rstudio.com