Using R for Nonparametric Analysis, The Kruskal-Wallis Test Part Four: R Script and Some Notes on IDE’s


Using R for Nonparametric Analysis, The Kruskal-Wallis Test: R Script and Some Notes on IDE’s


A tutorial by Douglas M. Wiig


In the previous three parts of this tutorial I discussed using R to enter a data set and perform a nonparametric Kruska-Wallis test for ranked means. In this final part the commented script that was used in the first three parts is listed. 


If you are going to use R for the majority of your statistical analysis it is highly advisable that you investigate some of the IDE’s (Integrated Development Environments) that are available to assist in coding and debugging R script and creating R packages for personal use or distribution. I think one of the easiest to use is R Studio. R Studio is available in both free open source and commercial versions and can be downloaded at    There are versions available for Windows, various Linux distributions, and Mac OS.

The R studio console provides a number of useful tools that facilitate coding. The screen is divided into four sections with one section providing a code editor that features syntax highlighting, code completion and many other features such as line or block code execution. Another window contains R and all displays output, error messages and warnings when code is executed from the editor. A third window displays all of the current environmental variables that are active and can also show all currently loaded R packages. A fourth window can show graphic output from executed code, can be used to manage, download and install R packages, and can be used to access the CRAN database of online help. There are other useful tools that are too numerous to discuss here.


Another program that is worth looking at is RKWard which combines an IDE with a graphics GUI for R statistical analysis. Information and downloads for RKWard can be found at This program is also free and open source and can be run on a Windows platform, Mac OS, or various distributions of Linux. The program has been optimized for Linux. A discussion of these IDE’s is beyond the scope of this posting.


Shown below is the commented R script for all three parts of the Kruskal-Wallis tutorial. For ease of reading code portions are shown in bold print.


#packages that must be present in the global environment before running these scripts


#stats; graphics; grDevices; utils; datasets; methods; base




#code to enter data using the data editor


#KW data entry, define file kruskal as a data frame


kruskal <-data.frame()


#invoke the data editor


kruskal <-edit(kruskal)


#define group as containing 3 factors; tell R which data column goes with which factor


group <- factor(1,2,3)


#alternative data entry method


#Define factor Group as containing three categories


Group <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3)


#create a vector defined as authscore and enter values


authscore <- c(96,128,83,61,101,82,121,132,135,109,115,149,166,147)


#create data frame kruskal matching each group factor to individual scores


kruskal <- data.frame(Group, authscore)


#use the following line to look at the structure of the data frame created




#run the basic Kruskal-Wallis test




kruskal.test(authscore ~ Group, data=kruskal)




#the following code is used to conduct a post-hoc comparison of the ranked means


#it is useful to first do a simple boxplot for a visual comparison


#Use this script to save the boxplot graphic to a .png data file


#save output in pdf file authplot


#send output to screen and file




authplot <- boxplot(authscore ~ Group, data=kruskal, main=”Group Comparison”, ylab=”authscore”)


#save .png file




#now return all output to console




#make sure that the package PMCMR is loaded before running the following script




#use the ‘with’ function to pass the data from the kruskal data frame to the post hoc


#test script; specify the Tukey HSD method for determining significance of each


# pair of comparisions




with(kruskal, {



posthoc.kruskal.nemenyi.test(authscore, Group, “Tukey”)






#NOTE: if using a version of R < 3.xx then use the package pgirmess instead of PMCMR




#the following lines show the post hoc analysis using the pgirmess package


#note the function kuskalmc is used for the comparisons




authscore <- c(96,128,83,61,101,82,121,132,135,109,115,149,166,147)


Group <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3)


kruskalmc(authscore ~ Group, probs=.05, cont=NULL)





Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s