Using R in Nonparametric Statistics: Basic Table Analysis, Part Two

A Tutorial by D.M. Wiig

As discussed in a previous tutorial, one of the most common methods of displaying and analyzing data is through the use of tables. In this tutorial I will discuss setting up a basic table using R and explore the CrossTable() function that is available in the R ‘gmodels’ package. I will use the same hypothetical data table that I created in Part One of this tutorial, data that examines the relationship between income and political party identification among a group of registered voters. The variable “income” will be considered ordinal in nature and consists of categories of income in thousands as follows:

“< 25”; “25-50”; “51-100” and “>100”

Political party identification is nominal in nature with the following categories:

“Dem”, “Rep”, “Indep”

Frequency counts of individuals who fall into each category are numeric. In the first example we will create a table by entering the data as a matrix and displaying the results. When using this method it is a good idea to set up the table on paper before entering the data into R. This will help to make sure that all cases and factors are entered correctly. The table I want to generate will look like this:

          party
income    Dem  Rep  Indep
<25        15    5     10
25-50      20   15     15
51-100     10   20     10
>100        5   30     10

When using the CrossTable() function the data should be entered in matrix format. Enter the data from the table above as follows:

> #enter data as a matrix, creating the variable 'Partyid'
> #enter the frequencies; matrix() fills column by column (Dem, then Rep, then Indep)
> Partyid <- matrix(c(15,20,10,5, 5,15,20,30, 10,15,10,10), 4, 3)
> #enter the row and column dimension names and their category labels
> dimnames(Partyid) <- list(income=c("<25", "25-50", "51-100", ">100"), party=c("Dem", "Rep", "Indep"))
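Because matrix() fills column by column, the frequencies above have to be typed one party at a time. If it is easier to type them the way they read on paper, one row at a time, the byrow argument of matrix() can be used instead. The following is a minimal sketch of that alternative; the name Partyid_byrow is only an illustration and is not used elsewhere in this tutorial.

```r
# Hypothetical alternative: enter the counts row by row, as they appear on paper
Partyid_byrow <- matrix(c(15,  5, 10,
                          20, 15, 15,
                          10, 20, 10,
                           5, 30, 10),
                        nrow = 4, ncol = 3, byrow = TRUE,
                        dimnames = list(income = c("<25", "25-50", "51-100", ">100"),
                                        party  = c("Dem", "Rep", "Indep")))

# The column-wise entry used in this tutorial, with the same labels
Partyid <- matrix(c(15, 20, 10, 5,  5, 15, 20, 30,  10, 15, 10, 10), 4, 3,
                  dimnames = dimnames(Partyid_byrow))

# Check that both ways of entering the data give the same matrix
identical(Partyid, Partyid_byrow)
```

Either form can be passed to CrossTable(); the choice only affects how the numbers are typed in.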

To view the structure of the created data matrix use the command:

> str(Partyid)
 num [1:4, 1:3] 15 20 10 5 5 15 20 30 10 15 ...
 - attr(*, "dimnames")=List of 2
  ..$ income: chr [1:4] "<25" "25-50" "51-100" ">100"
  ..$ party : chr [1:3] "Dem" "Rep" "Indep"
>

To view the table use the command:

> Partyid
        party
income   Dem Rep Indep
  <25     15   5    10
  25-50   20  15    15
  51-100  10  20    10
  >100     5  30    10
>

Remember that R is case sensitive so make sure you use upper case if you named your variable ‘Partyid.’
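Before running any analysis it can be worth checking the entered matrix against the table set up on paper. One simple check, sketched here with base R functions only, is to compare the row and column totals. The name totals is just an illustrative variable.

```r
# Re-create the matrix from this tutorial so the example is self-contained
Partyid <- matrix(c(15, 20, 10, 5,  5, 15, 20, 30,  10, 15, 10, 10), 4, 3)
dimnames(Partyid) <- list(income = c("<25", "25-50", "51-100", ">100"),
                          party  = c("Dem", "Rep", "Indep"))

# Row totals (one per income category) and column totals (one per party)
rowSums(Partyid)
colSums(Partyid)

# The same totals attached to the table as a "Sum" row and column
totals <- addmargins(as.table(Partyid))
totals
```

If a total does not match the paper version, the most likely cause is a frequency typed in the wrong position of the c() vector.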

Once the table has been entered as a matrix it can be displayed with a number of available options using the CrossTable() function. In this example I will produce a table in SAS format (the default format) and display both observed and expected cell frequencies, the contribution of each cell to the chi-square total, and the results of the chi-square analysis. The script is:
> #make sure gmodels package is loaded
> require(gmodels)
> #CrossTable analysis
> CrossTable(Partyid,prop.t=FALSE,prop.r=FALSE,prop.c=FALSE,expected=TRUE,chisq=TRUE,prop.chisq=TRUE)

   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
| Chi-square contribution |
|-------------------------|

Total Observations in Table:  165

             | party
      income |       Dem |       Rep |     Indep | Row Total |
-------------|-----------|-----------|-----------|-----------|
         <25 |        15 |         5 |        10 |        30 |
             |     9.091 |    12.727 |     8.182 |           |
             |     3.841 |     4.692 |     0.404 |           |
-------------|-----------|-----------|-----------|-----------|
       25-50 |        20 |        15 |        15 |        50 |
             |    15.152 |    21.212 |    13.636 |           |
             |     1.552 |     1.819 |     0.136 |           |
-------------|-----------|-----------|-----------|-----------|
      51-100 |        10 |        20 |        10 |        40 |
             |    12.121 |    16.970 |    10.909 |           |
             |     0.371 |     0.541 |     0.076 |           |
-------------|-----------|-----------|-----------|-----------|
        >100 |         5 |        30 |        10 |        45 |
             |    13.636 |    19.091 |    12.273 |           |
             |     5.470 |     6.234 |     0.421 |           |
-------------|-----------|-----------|-----------|-----------|
Column Total |        50 |        70 |        45 |       165 |
-------------|-----------|-----------|-----------|-----------|

Statistics for All Table Factors

Pearson's Chi-squared test
------------------------------------------------------------
Chi^2 =  25.55608     d.f. =  6     p =  0.0002692734
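The expected frequencies and chi-square contributions in the output can be cross-checked without gmodels. Each expected count is (row total × column total) / grand total, and each cell contributes (observed − expected)² / expected to the chi-square statistic. The sketch below uses the base R function chisq.test() for this; the variable names test and contrib are only illustrative.

```r
# Re-create the matrix from this tutorial so the example is self-contained
Partyid <- matrix(c(15, 20, 10, 5,  5, 15, 20, 30,  10, 15, 10, 10), 4, 3)
dimnames(Partyid) <- list(income = c("<25", "25-50", "51-100", ">100"),
                          party  = c("Dem", "Rep", "Indep"))

# Pearson chi-square test of independence on the 4 x 3 table
test <- chisq.test(Partyid)

# Expected counts: (row total * column total) / grand total
round(test$expected, 3)

# Contribution of each cell to the chi-square total
contrib <- (test$observed - test$expected)^2 / test$expected
round(contrib, 3)

# Test statistic, degrees of freedom, and p-value
test$statistic
test$parameter
test$p.value
```

The degrees of freedom are (4 − 1) × (3 − 1) = 6, and the cell contributions sum to the chi-square statistic reported by CrossTable().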

As seen above, row and column marginal totals are displayed by default with the SAS format. There are other options available for the CrossTable() function. See the CRAN documentation for the gmodels package for a detailed description of all of the options available. In the next installment of this tutorial I will examine some of the measures of association that are available in R for nominal and ordinal data displayed in a table format.
