Tag Archives: r code

R For Beginners: Installing the latest version of R on a Linux platform

November 11, 2016 dmwiig

A tutorial by D. M. Wiig

One of the nice characteristics of open source software such as R is the rapid development of new releases and updates. While the base core remains stable for a period of time there is a considerable amount of updating, adding, and removing the component packages. At the time of this writing the latest iteration is R version 3.3.1, “Bug in Your Hair.” If you are using a Windows platform you will likely go directly to the archive web site and download the latest distribution as a Windows executable installation package.

If you are using a Linux distribution such as Ubuntu or Debian, the process of adding software is usually accomplished via the menu based installer. These software installers allow R and its dependencies to be downloaded from the community archive.

One of the disadvantages of using this approach is that the versions of some software in the archives may not be updated to the latest version. This is often the case with R.

To ensure that you are downloading the latest R version you need to use the platform’s command line to install what is needed. Regardless of which Debian-based distribution you are using, first open a command console from the desktop menu. Make sure the package lists are up to date by using the command:

pi@raspberrypi:~ $ sudo apt-get update

This refreshes the list of available packages so that later installs pull the newest versions the configured archives offer; it does not by itself upgrade anything already installed. If you are running a Debian distribution such as jessie you will need to edit the /etc/apt/sources.list file to add a CRAN backport repository that carries the latest version of R. Open the file in the nano editor with the command:

sudo nano /etc/apt/sources.list

This should produce the output as seen below:

pi@raspberrypi:~ $ sudo nano /etc/apt/sources.list

------------------------------------------------
GNU nano 2.2.6 File: /etc/apt/sources.list

deb http://mirrordirector.raspbian.org/raspbian/ jessie main contrib non-free r$
# Uncomment line below then 'apt-get update' to enable 'apt-get source'
deb-src http://archive.raspbian.org/raspbian/ jessie main contrib non-free rpi
deb http://archive.raspbian.org/raspbian/ stretch main
deb http://mirror.las.iastate.edu/CRAN/bin/linux/debian/ jessie main
deb http://mirror.las.iastate.edu/CRAN/bin/linux/ubuntu xenial/

[ Read 8 lines ]
^G Get Help ^O WriteOut ^R Read File ^Y Prev Page ^K Cut Text ^C Cur Pos
^X Exit ^J Justify ^W Where Is ^V Next Page ^U UnCut Text^T To Spell

If you are using a Debian distribution you would add the following line to the file:

deb http://mirror.las.iastate.edu/CRAN/bin/linux/debian/ jessie main

Replace the mirror portion with the URL of your favorite CRAN mirror. Replace the 'jessie' portion with the name of the specific Debian release you are using.

If you are using an Ubuntu distribution add a line modeled on the 'xenial' entry in the listing above, changing the mirror and the release name to match the specific Ubuntu release that you are using.

Once these changes are made, use the ^O key command to write the file and then the ^X key command to exit nano and return to the command line. Run sudo apt-get update once more so that apt reads the repository you just added. You should now be able to issue the command:

pi@raspberrypi:~ $ sudo apt-get install r-base r-base-core r-base-dev

Once the download and install processes have completed you should now be able to invoke R from the command line or menu and see the latest version:

pi@raspberrypi:~ $ R

R version 3.3.2 RC (2016-10-23 r71578) -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: arm-unknown-linux-gnueabihf (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

 Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 


For other Linux distributions that use apt you would add a line similar to the above examples to /etc/apt/sources.list. Check the documentation for your specific Linux platform for further information.
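Whichever distribution is used, the result can also be confirmed from inside R rather than by reading the startup banner. The lines below are only a minimal sketch using base R; the `target` value is an assumption taken from the release named in this tutorial and should be changed to whatever version you expect.

```r
# Minimal sketch: confirm which R version is running (base R only).
target <- "3.3.1"            # assumed target: the release named in this tutorial
installed <- getRversion()   # a "package_version" object for the running R

cat("Running:", R.version.string, "\n")

if (installed >= target) {
  cat("R is at least version", target, "\n")
} else {
  cat("R is older than", target, "so check the CRAN line in sources.list\n")
}
```

If the older-version message appears, the distribution archive is probably still being used instead of the CRAN repository.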

R Code Development, R Tutorials

R Video Tutorial For Beginners: Installing And Using the Rcommander GUI

November 7, 2016 dmwiig

A tutorial video by D. M. Wiig

In my recent series of tutorials for those interested in the R statistical programming language I have discussed both the installation and use of the R console and the R Commander statistics GUI. Before viewing the tutorial make sure the R Commander package has been downloaded into your R library via the Install Packages menu option. This procedure was discussed in the previously posted R Commander tutorial.

To accompany this first tutorial I have created a video that covers the initial installation of R Commander. The video is seen below:

Click the icon in the lower right side of the screen to view the tutorial in full screen mode.

I hope that you find this useful in your pursuit of learning about R statistics.

R Code Development, R Tutorials

R for Beginners: Using R Commander in an Introductory Statistics Course

September 28, 2016 dmwiig

R for beginners: Using R Commander in introductory statistics courses

A tutorial by D. M. Wiig

As with previous tutorials in this series, this document is an embedded Word document. To view the document full screen click on the icon in the lower right corner of the window.

R Code Development, R Tutorials

R for Beginners: Using R Commander, Graphing and Correlation

September 13, 2016 dmwiig

A tutorial by Douglas M. Wiig

Please note that this post is an embedded Word document. To read the document full screen click on the icon in the lower right portion of the document window.

R Code Development, R Tutorials

R for Beginners: Installing and Using the R Commander GUI, Part One

September 7, 2016 dmwiig

A tutorial by D.M. Wiig

This tutorial is posted as an embedded Word document. To view the document full screen click on the button in the lower right corner of the window. Please note that you must be online for the full page Word document display to work.

R Code Development, R Tutorials

R For Beginners: Installing and Using the R Console in a Windows Environment

September 2, 2016 dmwiig

An R tutorial by D. M. Wiig

This tutorial is posted as an embedded Word document. To view the document full screen click on the icon in the lower-right corner of the document window.

My next post covering installing and using the Rcommander GUI will be out in a day or two.

R Code Development, R Tutorials

R-Fiddle R Console and Data Editor: R Collaboration in the Cloud

February 12, 2016 dmwiig

R-Fiddle is a great tool to develop and test code segments or complete R programs. By accessing the R-Fiddle web site users have a fully functioning R console, code editor and discussion board all in one place. If a user has code uploaded that has been designated to share, other users can access the code and make suggestions or additions. Code can be run with full R support from your web browser.

Try the link below to test out R-Fiddle. I have uploaded a small program as a demo. Feel free to share your own projects, help others or try out code segments.

http://www.r-fiddle.org/#/embed?id=rtOt8yR3

Click on the link above to activate the R editor and R console.

R Tutorials

Ternary Diagrams Using R: An Example Using Election Outcomes

August 13, 2015 dmwiig

A tutorial by D. M. Wiig

In part one of this tutorial I discussed creating a ternary diagram using a simple data frame that contained five hypothetical cases. In this tutorial I will expand on that foundation by creating a more informative ternary diagram using live data.

A useful application of the ggtern package in social science research is creating a visual display of parliamentary election outcomes. Specifically we can use a ternary graph to examine the distribution of seats in the British House of Commons over a period of time. Although the UK elects the House of Commons from single-member constituencies using a first-past-the-post system rather than a proportional one, the division of seats among the parties still varies considerably from one national election to the next.

Since 1945, general elections in the UK have produced a division of seats among the Labour, Conservative, and various minor parties. To demonstrate how this division of seats can be shown over time, data were collected for all of the general elections from 1945 to 2015. These data show the percentage of the popular vote won by each party and the number of seats that party won (retrieved from http://www.ukpolitical.info). I have created a summary table of these results as follows:

Year   Con   Lab  LD+Other  SeatsCon  SeatsLab  SeatsOther
2015  36.9  30.4      32.7       331       232          95
2010  36.1  29        34.9       306       258          85
2005  35.2  32.4      32.4       355       198          92
2001  40.7  31.7      27.6       412       166          81
1997  43.2  30.7      26.1       418       165          76
1992  42.3  35.2      23.5       336       271          44
1987  42.2  30.8      27         375       229          48
1983  42.4  27.6      26.9       397       209          27
1979  43.9  36.9      15.8       339       268          28
1974  39.2  35.8      21.8       319       276          39
1974  37.1  37.9      20.1       301       296          38
1970  46.4  43         8.6       330       287          19
1966  47.9  41.9       8.5       363       253          25
1964  44.1  53.4      11.2       317       304          22
1959  49.4  43.8       5.9       365       258          19
1955  49.7  46.4       0         344       277          18
1951  48    48.8       2.5       321       295          18
1950  46.1  43.5       9.1       315       297          22
1945  47.8  39.8       1         393       213          57

The UK has a two party dominant system with a number of minor parties that regularly contest elections. Although seats are won constituency by constituency under first-past-the-post rather than allocated proportionally, these minor parties are still able to gain some representation in the Commons. For readers interested in learning more about political parties in the UK there are a number of resources readily available online and elsewhere.

For purposes of this example I have added the popular vote of all minor parties together in the ‘LD+Other’ column, and the number of seats gained in the ‘SeatsOther’ column. By plotting the three variables ‘SeatsCon’, ‘SeatsLab’, and ‘SeatsOther’ by year on a ternary diagram we can visualize any changes in the mixture of seats won for the three groups. Before working through this tutorial make sure that you have the ggplot2 and ggtern packages installed and loaded into your R environment.
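One way to confirm that the packages are available before going further is sketched below. It only checks and reports; the variable names `needed`, `is_installed`, and `not_installed` are illustrative and not part of either package.

```r
# Minimal sketch: check that the plotting packages are installed before loading.
needed <- c("ggplot2", "ggtern")

# requireNamespace() returns TRUE/FALSE without attaching the package
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
not_installed <- needed[!is_installed]

if (length(not_installed) > 0) {
  # Report what is missing instead of installing automatically
  message("Install first with: install.packages(c(",
          paste0('"', not_installed, '"', collapse = ", "), "))")
} else {
  library(ggplot2)
  library(ggtern)
}
```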

I originally created the table shown above using Excel and then imported it into RStudio for analysis. If you are not using RStudio you can enter the data via the R data editor as shown in the previous tutorial, or put the data into an Excel or LibreOffice spreadsheet, save it as a CSV file, and import it into R using the read.csv() function (read.spss(), discussed in earlier tutorials, is intended for SPSS files). You can also use any other method that you are familiar with to get the data into your R environment.
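As one more option, the table can be typed straight into a script. The sketch below reads the rows exactly as printed above into a data frame called ukvotedata, the name the plotting code expects. The column name `LDOther` is my own substitute for ‘LD+Other’, since a plus sign is awkward in an R column name, and `uk_text` is just a holding variable.

```r
# Minimal sketch: build ukvotedata from the summary table as printed above.
uk_text <- "
2015 36.9 30.4 32.7 331 232 95
2010 36.1 29 34.9 306 258 85
2005 35.2 32.4 32.4 355 198 92
2001 40.7 31.7 27.6 412 166 81
1997 43.2 30.7 26.1 418 165 76
1992 42.3 35.2 23.5 336 271 44
1987 42.2 30.8 27 375 229 48
1983 42.4 27.6 26.9 397 209 27
1979 43.9 36.9 15.8 339 268 28
1974 39.2 35.8 21.8 319 276 39
1974 37.1 37.9 20.1 301 296 38
1970 46.4 43 8.6 330 287 19
1966 47.9 41.9 8.5 363 253 25
1964 44.1 53.4 11.2 317 304 22
1959 49.4 43.8 5.9 365 258 19
1955 49.7 46.4 0 344 277 18
1951 48 48.8 2.5 321 295 18
1950 46.1 43.5 9.1 315 297 22
1945 47.8 39.8 1 393 213 57
"

# Blank lines in the text are skipped by default
ukvotedata <- read.table(
  text = uk_text,
  col.names = c("Year", "Con", "Lab", "LDOther",
                "SeatsCon", "SeatsLab", "SeatsOther")
)

str(ukvotedata)   # 19 elections, 7 columns
```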

Once the data set is loaded use the following code to create the ternary diagram. Note that in this diagram we are using the base code as shown in the first tutorial with some additions that make the diagram easier to interpret such as the vector arrows and legend. The code segment is:

###################################################
#create ternary plot using seats allocated by party for each election
#uses enhanced formatting for easier interpretation
#results of ggtern function are placed in 'plot' for rendering
###################################################
plot <- ggtern(data = ukvotedata, aes(x = SeatsCon, y = SeatsLab, z = SeatsOther)) +
  geom_point(aes(fill = Year), size = 4, shape = 21, color = "black") +
  ggtitle("Proportion of Seats Won 1945-2015") +
  labs(fill = "Year") +
  theme_rgbw() +
  theme(legend.position = c(0, 1), legend.justification = c(0, 1))
###################################################

To show the diagram simply use:

###################################################
#now plot the diagram
###################################################
plot
###################################################

The resulting ternary diagram is:

Each point on the graph represents the relative division of seats for one of the 19 elections in the table. The shading represents the year with the darkest being 1945 and the lightest 2015. The diagram clearly shows the trend toward more minor party representation and a move away from the two major parties over time. Indeed coalition governments resulted in several of the more recent elections due to the increase in minor party influence.
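To see what "relative division" means numerically, the seat counts for an election can be converted to shares that sum to one, which is the composition a ternary diagram places inside the triangle. The sketch below does this in base R for three rows copied from the table as printed; it illustrates the arithmetic only and is not a description of ggtern's internal code.

```r
# Minimal sketch: seat counts converted to proportions (three rows from the table).
seats <- data.frame(
  Year       = c(2015, 1997, 1945),
  SeatsCon   = c(331, 418, 393),
  SeatsLab   = c(232, 165, 213),
  SeatsOther = c(95, 76, 57)
)

counts <- as.matrix(seats[, c("SeatsCon", "SeatsLab", "SeatsOther")])

# margin = 1 divides each row by its own total
shares <- prop.table(counts, margin = 1)
rownames(shares) <- seats$Year

round(shares, 3)
rowSums(shares)   # every row sums to 1
```

A point near a corner of the triangle corresponds to a row where one of these shares is close to 1.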

My purpose here is not to discuss UK politics but to show how ternary diagrams can be used in a social science application. With the many additions and extensions that are being added to the ggtern package it can be a very powerful tool for graphical analysis.

R Tutorials

Ternary Diagrams Using R: The ggtern Package

August 10, 2015 dmwiig

A tutorial by Douglas M. Wiig

There are a number of very useful and popular graphics packages available for R such as lattice, ggplot, ggplot2 and others. Some of these offer general purpose graphics capabilities and others are more specialized. A recently developed extension to the ggplot2 package is ggtern. This package is essentially a wrapper for a number of functions that can be used to create a variety of ternary diagrams. Ternary diagrams are useful when analyzing the relationship among three factors or elements. A ternary diagram essentially represents the proportions of three related factors in two-dimensional space.

Before running the script in this tutorial make sure that the packages ggplot2 and ggtern are installed and loaded into your R environment. A basic graph can be easily constructed. I will use the theoretical quantities Xa, Xb, and Xc to demonstrate a basic ternary diagram. In this simple example I will create a sample of n=5 by entering the data from the keyboard into a data frame ‘sampfile.’ To invoke the editor use the following code:

###################################################
#create a sample file of n=5
###################################################
sampfile <- data.frame(Xa=numeric(0), Xb=numeric(0), Xc=numeric(0))
sampfile <- edit(sampfile)
###################################################

This will open up a data entry sheet with three columns labeled Xa, Xb, and Xc. The numbers that are entered do not matter for purposes of this illustration. The table I entered is as follows:

   Xa  Xb  Xc
1 100 135 250
2  90 122 210
3  98 144 256
4 100  97  89
5  90  75  89
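If you would rather not use the interactive editor (for example when running a script non-interactively), the same five cases can be typed directly into data.frame(). This is simply an alternative way to arrive at the same ‘sampfile’ object.

```r
# Minimal sketch: the same five cases entered directly, without edit().
sampfile <- data.frame(
  Xa = c(100, 90, 98, 100, 90),
  Xb = c(135, 122, 144, 97, 75),
  Xc = c(250, 210, 256, 89, 89)
)

print(sampfile)
```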

To produce a very basic ternary diagram with the above data set use the command:

##################################################
#do basic graph with sample data
##################################################
ggtern(data=sampfile, aes(x=Xa, y=Xb, z=Xc)) + geom_point()
##################################################

This produces the graph seen below:

As can be seen, the triangular representation of the dimensions Xa → Xb, Xc → Xa and Xb → Xc allows each case to be represented as a single point located relative to each of the three vectors. There are a large number of additions, modifications and tweaks that can be made to this basic pattern.

In the next tutorial I will discuss generating a more elaborate ternary diagram using election outcome data from British general elections. For more information about the ggtern package see the CRAN documentation and information as well as the web site http://www.ggtern.com for all of the latest news and developments.

R Tutorials

Using R: Random Sample Selection and One-Way ANOVA

August 7, 2015 dmwiig

A tutorial by Douglas M. Wiig

In the previous tutorial we looked at the hypothesis that one’s outlook on life is influenced by the amount of education attained. Using the GSS 2014 data file we looked at the education variable ‘educ’, and the outlook on life variable ‘life’, a measure of outlook on life as ‘DULL’, ‘ROUTINE’, or ‘EXCITING.’ We selected a subset for each response category and found that there appeared to be
differences among the mean level of education measured in years for each of the categories of outlook on life. To further examine this we will first randomly select a sample from the data file, look at the
mean education for each category of outlook on life, and evaluate the means using simple one-way ANOVA.

To randomly select a sample from a population of values we can use the sample() function. There are a number of options and variations of the function that are beyond the scope of this tutorial. Since we want to select cases (rows) by their position in the data file we can use the sample.int() function, which draws random integers from 1 to n. The general format of the function is:

sample.int(n, size = n, replace = FALSE)

where: n = the size of population the sample is from
size = the size of the sample
replace = FALSE if sampling without replacement; TRUE if sampling with replacement
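A small illustration of the two settings of replace may help. The numbers here (a population of 10, and the seed) are arbitrary choices for the demonstration and have nothing to do with the GSS file.

```r
# Minimal sketch: sample.int() with and without replacement.
set.seed(123)   # arbitrary seed so the draws can be repeated

# Without replacement: 4 distinct integers between 1 and 10
no_repl <- sample.int(10, size = 4, replace = FALSE)
print(no_repl)
anyDuplicated(no_repl) == 0    # TRUE, no value can repeat

# With replacement: 20 draws from only 10 values must repeat some of them
with_repl <- sample.int(10, size = 20, replace = TRUE)
print(with_repl)
anyDuplicated(with_repl) > 0   # TRUE, repeats are unavoidable here
```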

For this example I will select a sample of n=500 without replacement from the data file containing a total of 2538 cases. The sample data is loaded into a data frame as it is selected. This will be
accomplished in two steps. In the first step we will call the sample.int() function with the values to use for selecting the sample and put the resulting vector in ‘randsamp2.’ The code is:

randsamp2 <- sample.int(2538, size=500, replace=FALSE)

To select the sample make sure that the GSS2014 data file is loaded into the R environment. I previously loaded the data file into a data frame ‘gss14.’ To select the sample and load
it into a data frame ‘randgss2’ the code is:

randgss2 <- gss14[randsamp2,]
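For readers without the GSS file at hand, the same two-step pattern can be tried on a made-up data frame. Everything in this sketch (the `toy` data frame, its columns, and the seed) is hypothetical and stands in for ‘gss14’ only to show how the vector of random integers is used as a row index.

```r
# Minimal sketch: random row selection on a made-up data frame.
set.seed(1)   # arbitrary seed

# Hypothetical stand-in for gss14: 20 cases with an id and years of education
toy <- data.frame(id = 1:20,
                  educ = sample(8:20, 20, replace = TRUE))

# Step 1: draw 5 row numbers without replacement
rows <- sample.int(nrow(toy), size = 5, replace = FALSE)

# Step 2: use them as a row index; the empty column slot keeps all columns
toy_sample <- toy[rows, ]

print(toy_sample)
```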

Once the sample has been generated we can look at the mean years of education for each of the three responses for outlook on life. We do this by selecting a subset for each response. Use the following
code:

###################################################
#look at educ means by life by selecting 3 subsets from randgss2
###################################################
life12 <- subset(randgss2, life == "DULL", select=educ)
life22 <- subset(randgss2, life == "ROUTINE", select=educ)
life32 <- subset(randgss2, life == "EXCITING", select=educ)

Now run summary statistics for each subset to look at the means:

summary(life12)
summary(life22)
summary(life32)

We can now see that the means are as follows:

life12 = 13.0
life22 = 13.29
life32 = 14.51
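The three subsets can also be summarized in a single step with tapply(), which applies a function within each category. The data below are invented purely to show the call; they are not GSS values, and the resulting means are not the ones reported above.

```r
# Minimal sketch: group means in one call, using invented data.
toy <- data.frame(
  life = c("DULL", "DULL", "ROUTINE", "ROUTINE", "ROUTINE",
           "EXCITING", "EXCITING", "EXCITING"),
  educ = c(10, 12, 12, 14, NA, 14, 16, 18)
)

# na.rm = TRUE is passed through to mean() so the missing value is ignored
means <- tapply(toy$educ, toy$life, mean, na.rm = TRUE)
print(means)
```

With the real sample the equivalent call would be tapply(randgss2$educ, randgss2$life, mean, na.rm = TRUE).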

and we can generate a summary visual of the differences among the three subsets by doing a simple boxplot using:

###################################################
# do boxplots of the subsets to visualize differences
#boxplot using educ and life variables from the ‘randgss2’ data #frame
###################################################
boxplot(randgss2$educ ~ randgss2$life, main="Education and View on Life n = 500", xlab="View of Life", ylab="Years of Education")

The following graph will result:

As can be seen above there does appear to be a difference among these means, particularly for those who see life as ‘DULL.’ To see if these differences are significant an ANOVA will be run using the
simple one way ANOVA function aov(). The basic function is:

aov(formula, data = NULL)

For our example we use:

model2 <- aov(educ ~ life, data=randgss2)

which analyzes the mean education by category of outlook on life using the randgss2 sample of n=500. The results are stored in ‘model2.’ The output from this operation is shown using the summary() function. This produces the following output:

summary(model2)

             Df Sum Sq Mean Sq F value   Pr(>F)
life          2  171.5   85.77   10.42 4.08e-05 ***
Residuals   332 2732.2    8.23
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
165 observations deleted due to missingness
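The entries in this table are related by simple arithmetic, and it can be reassuring to reproduce them by hand. The sketch below starts from the rounded sums of squares and degrees of freedom printed above, so the results agree with the table only up to rounding; the variable names are my own.

```r
# Minimal sketch: rebuild the F test from the printed ANOVA table.
ss_life  <- 171.5;  df_life  <- 2     # between-group line
ss_resid <- 2732.2; df_resid <- 332   # residual (within-group) line

ms_life  <- ss_life / df_life         # mean square for life
ms_resid <- ss_resid / df_resid       # mean square for residuals

f_value <- ms_life / ms_resid         # F = MS between / MS within
p_value <- pf(f_value, df_life, df_resid, lower.tail = FALSE)

# Cases actually used: total degrees of freedom plus one
n_used <- df_life + df_resid + 1

round(c(F = f_value, p = p_value), 6)
n_used + 165   # cases used plus cases dropped for missingness gives 500
```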

This output shows that at least one of the means differs significantly from the others. To test this difference further we can use a pair-wise comparison of means to see which means differ significantly
from each other. There are several options available. We will use a basic Tukey HSD comparison. This is accomplished using:

##################################################
#run HSD on sample
TukeyHSD(model2)
##################################################

producing the following output:

Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = educ ~ life, data = randgss2, projections = TRUE)

$life
                      diff       lwr         upr     p adj
ROUTINE-EXCITING -1.074404 -1.833686 -0.31512199 0.0027590
DULL-EXCITING    -2.729412 -4.447386 -1.01143754 0.0006325
DULL-ROUTINE     -1.655008 -3.384551  0.07453525 0.0640910

By looking at the p value for each comparison it can be seen that both the ROUTINE-EXCITING and DULL-EXCITING means differ significantly at p ≤ .05, while the DULL-ROUTINE difference does not reach significance at that level.
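The whole sequence, from fitting the model to the pairwise comparisons, can be run without the GSS file on simulated data. The group sizes, means, standard deviation, and seed in this sketch are assumptions chosen only to give three groups of unequal size; the output will not match the GSS results shown above.

```r
# Minimal sketch: one-way ANOVA followed by Tukey HSD on simulated data.
set.seed(42)   # arbitrary seed

sim <- data.frame(
  life = factor(rep(c("EXCITING", "ROUTINE", "DULL"), times = c(60, 60, 20)),
                levels = c("EXCITING", "ROUTINE", "DULL")),
  educ = c(rnorm(60, mean = 14.5, sd = 3),   # assumed group means
           rnorm(60, mean = 13.3, sd = 3),
           rnorm(20, mean = 13.0, sd = 3))
)

fit <- aov(educ ~ life, data = sim)
summary(fit)

hsd <- TukeyHSD(fit)
print(hsd)

# Keep only the comparisons whose adjusted p value is at or below .05
hsd$life[hsd$life[, "p adj"] <= 0.05, , drop = FALSE]
```

Each row of hsd$life is one pair of categories, with the difference in means, the confidence limits, and the adjusted p value, laid out as in the output above.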

I might point out that if a researcher were using the GSS 2014 data file as we used here there would need to be more data preparation prior to running any analysis. For example, there is a fair amount of
missing data as indicated by NA in the raw data file. The missing data would need to be handled in some way. R has numerous functions and packages that can assist in resolving missing data issues of
various types, but a discussion of these is a subject for a future tutorial.
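As a first look before any fuller treatment, base R can at least count the missing values and show how many complete cases remain. The data frame below is invented for illustration. Note that aov() drops incomplete cases by default, which is why the summary above reports observations deleted due to missingness.

```r
# Minimal sketch: counting missing values and complete cases (invented data).
toy <- data.frame(
  educ = c(12, NA, 16, 14, NA),
  life = c("DULL", "ROUTINE", NA, "EXCITING", "ROUTINE")
)

# Number of NA values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Rows with no missing value in any column
complete <- toy[complete.cases(toy), ]
print(complete)
nrow(complete)
```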
8/7/15 Douglas M. Wiig http://dmwiig.net
