Tag Archives: r computing

How To: Download and install the latest version or R on your Linux Ubuntu OS

June 20, 2014 dmwiig Leave a comment

How To: Download and install the latest version or R on your Linux Ubuntu OS

(A Tutorial by D.M. Wiig)

I have several computers that use Linux operating systems and I have installed R on all of them. I use Debian on some of the machines and Ubuntu on others. When downloading R using the distribution’s package manager or from the command line I have notice that I will get versions of R ranging from 2.13.xx to 2.15.xxx depending on the Linux distribution. That has not been a problem until the release of the current version of R, version 3.0.3. Since this version is not backwards compatible with earlier releases it is necessary to upgrade to the new version to take advantage of new packages that are rapidly being developed as well as modification to existing packages to accommodate R 3.0.3. This tutorial will cover the installation of R 3.0.3 on the Ubuntu distribution of Linux.

When installing R 3.0.3 it is necessary to make sure that the current binaries are installed to your version of the Linux OS. If you are running a Ubuntu distribution you can edit the sources.list file on your computer to access the most up to date CRANs. Open a terminal program and enter the following from the command line:

$ cd /etc/apt/

$ dir

Make sure the file sources.list is in the directory and then edit the file opening the nano editor:

$ sudo nano sources.list

You should see a file in the editor that is similar to the file shown below:

—————————————————————————————————-

deb cdrom:[Kubuntu 11.10 _Oneiric Ocelot_ – Release i386 (20111012)]/ oneiric main restricted

deb http://streaming.stat.iastate.edu/CRAN/bin/linux/ubuntu precise/

# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to

# newer versions of the distribution.

deb http://us.archive.ubuntu.com/ubuntu/ precise main restricted

deb-src http://us.archive.ubuntu.com/ubuntu/ precise main restricted

## Major bug fix updates produced after the final release of the

## distribution.

deb http://us.archive.ubuntu.com/ubuntu/ precise-updates main restricted

deb-src http://us.archive.ubuntu.com/ubuntu/ precise-updates main restricted

## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu

## team. Also, please note that software in universe WILL NOT receive any

## review or updates from the Ubuntu security team.

deb http://us.archive.ubuntu.com/ubuntu/ precise universe

deb-src http://us.archive.ubuntu.com/ubuntu/ precise universe

deb http://us.archive.ubuntu.com/ubuntu/ precise-updates universe

deb-src http://us.archive.ubuntu.com/ubuntu/ precise-updates universe

## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu

^G Get Help ^O WriteOut ^R Read File ^Y Prev Page ^K Cut Text ^C Cur Pos

^X Exit ^J Justify ^W Where Is ^V Next Page ^U UnCut Text ^T To Spell

—————————————————————————

I have highlighted the line that I added to this file. This line will force Linux to access the CRAN for the latest version of R in a library that is not normally searched for updates if you have an earlier version of R installed. For an Ubuntu distribution change the line to one of the following depending on the distribution that you have installed:

http://<myfavorite-cran-mirror> /bin/linux/ubuntu saucy/

http://<myfavorite-cran-mirror> /bin/linux/ubuntu quantal/

http://<myfavorite-cran-mirror> /bin/linux/ubuntu precise/

http://<myfavorite-cran-mirror> /bin/linux/ubuntu lucid/

Replace <myfavorite-cran-mirror> with the CRAN repository of your choice found at the web site http://cran.r.project.org/mirrors.html. In my case as shown above I used a CRAN repository here in Iowa at Iowa State University. Once the line has been entered in your sources.list file press ctrl-o to save the file, and press ctrl-x to exit the editor. Be sure when you invoke nano that you have root privileges (by using sudo nano) or you will not be able to write out the modified file.

Once you have successfully modified the sources.list file proceed with the R 3.0.3 installation by issuing the command:

$ sudo apt-get update (to make sure all supporting files are current)

and then:

$ sudo apt-get install r-base

When the update runs you should see that R 3.0.x is downloaded and is being installed. After the installation is complete test it by issuing the command:

$ R

You will see the output as shown below:

———————————————

R version 3.0.3 (2014-03-06) — “Warm Puppy”

Platform: i686-pc-linux-gnu (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.

You are welcome to redistribute it under certain conditions.

Type ‘license()’ or ‘license()’ for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.

Type ‘contributors()’ for more information and

‘citation()’ on how to cite R or R packages in publications.

Type ‘demo()’ for some demos, ‘help()’ for on-line help, or

‘help.start()’ for an HTML browser interface to help.

Type ‘q()’ to quit R.

You are now up and running with the latest version of R. The process for installation of R 3.0.x is similar for Debian and Fedora distributions. Each of these will be covered in a future tutorial.

R Tutorials

Using R in Nonparametric Statistics: Basic Table Analysis, Part Three, Using assocstats and collapse.table

June 19, 2014 dmwiig Leave a comment

A tutorial by D.M. Wiig

As discussed in a previous tutorial one of the most common methods displaying and analyzing data is through the use of tables. In this tutorial I will discuss setting up a basic table using R and exploring the use of the assocstats function to generate several commonly used nonparametric measures of association. The assocstats function will generate the association measures of the Phi-coefficient, the Contingency Coefficient and Cramer’s V, in addition to the Likelihood Ratio and Pearson’s Chi-Squared for independence. Cramer’s V and the Contigency Coefficient are commonly applied to r x c tables while the Phi-coefficient is used in the case of dichotomous variables in a 2 x 2 table.

To illustrate the use of assocstats I will use hypthetical data exploring the relationship between level of education and average annual income. Education will be measured using the nominal categories “High School”, “College”, and “Graduate”. Average annual income will be measured using ordinal categories and expressed in thousands:

“< 25”; “25-50”; “51-100” and “>100”

Frequency counts of individuals that fall into each category are numeric.

In the first example a 4 x 3 table created with hypothetical frequencies as shown below:

Income Education
(thousands) High School College Graduate

<25 15 8 5

26-50 12 12 8

51-100 10 22 25

>100 5 10 32
The first table, table1, is entered into R as a data frame using the following commands:

#create 4 x 3 data frame
#enter table1 in frequency form
table1 <- data.frame(expand.grid(income=c(“<25″,”25-50″,”51-100″,”>100″), education=c(“HS”,”College”, “Graduate”)),count=c(15,12,10,5,8,12,22,10,5,8,25,32))

Check to make sure the data are in the right row and column categories. Notice that the data are entered in the ‘count’ list by columns.

> table1
income education count
1 <25 HS 15
2 25-50 HS 12
3 51-100 HS 10
4 >100 HS 5
5 <25 College 8
6 25-50 College 12
7 51-100 College 22
8 >100 College 10
9 <25 Graduate 5
10 25-50 Graduate 8
11 51-100 Graduate 25
12 >100 Graduate 32
>

If the stable structure looks correct generate the table, tab1, using the xtabs function:

> #create table tab1 from data.frame
> tab1 <- xtabs(count ~income + education, data=table1)
Show the table using the command:

>tab1
education
income HS College Graduate
<25 15 8 5
25-50 12 12 8
51-100 10 22 25
>100 5 10 32
>
Use the assocstats function to generate measures of association for the table. Make sure that you have loaded the vcd package and the vcdExtras packages. Run assocstats with the following commands:

> assocstats(tab1)
X^2 df P(> X^2)
Likelihood Ratio 31.949 6 1.6689e-05
Pearson 32.279 6 1.4426e-05

Phi-Coefficient : 0.444
Contingency Coeff.: 0.406
Cramer’s V : 0.314
>

The measures show an association between the two variables. My intent is not to provide an analysis of how to evaluate each of the measures. There are excellent sources of documention on each measure of association in the R CRAN Literature. Since the Phi-coefficient is designed primarily to measure association between dichotomous variables in a 2 x 2 table,collapse the 4 x 3 table using the collapse.table function to get a more accurate Phi-coefficient. Since we want to go from a 4 x 3 to a 2 x 2 table we essentially collapse the table in two stages. The first stage collapses the table to a 2 x 3 table by combining the “<25” with the “25-50” and the “51-100” with the “>100” categories of income.

The resulting 2 x 3 table is seen below:

Education
Income High School College Graduate

<50 27 20 13

>50 15 32 57

To collapse the table use the R function collapse.table to combine the “<25” and “26-50” categories and the “50-100” and “>100” categories as discussed above:

> #collapse table tab1 to a 2 x 3 table, table2
> table2 <-collapse.table(tab1, income=c(“<50″,”<50″,”>50″,”>50″))

View the resulting table, table2, with:

> table2
education
income HS College Graduate
<50 27 20 13
>50 15 32 57
>

Now collapse the table to a 2 x 2 table by combining the “College” and “Graduate” columns:
> #collapse 2 x 3 table2 to a 2 x2 table, table3
> table3 <-collapse.table(table2, education=c(“HS”,”College”,”College”))

View the resulting table, table3, with:

> table3
education
income HS College
<25 27 33
>100 15 89
>

Use the assocstats function to evaluated the 2 x 2 table:

> #use assocstats on the 2 x 2 table, table3
> assocstats(table3)
X^2 df P(> X^2)
Likelihood Ratio 18.220 1 1.9684e-05
Pearson 18.673 1 1.5519e-05

Phi-Coefficient : 0.337
Contingency Coeff.: 0.32
Cramer’s V : 0.337
>

There are many other table manipulation function available in the R vcd and vcdExtras packages and well as other packages to provide analysis of nonparametric data. This series of tutorials hopefully serves to illustrate some of the more basic and common table functions using these packages. The next tutorial looks at the use of the ca function to perform and graph the results of a basic Correspondence Analysis.

	Hydra Themes on R for Beginners: Some Simple C…
	Juan Carlos Rubio Po… on Ternary Diagrams Using R: An E…
	Nicholas Beltran on R Video Tutorial: Basic R Code…
	Ellena Field on Using R for Basic Cross Tabula…
	Dynamics Square on Thanks for Visiting This Blog

R Statistics and Programming

Tag Archives: r computing

How To: Download and install the latest version or R on your Linux Ubuntu OS

Using R in Nonparametric Statistics: Basic Table Analysis, Part Three, Using assocstats and collapse.table

Resources and Information About R Statistics and Programming

Share this:

Share this:

Resources and Information About R Statistics and Programming