Dr. Mark Gardener

On this page...

Line Plots

Plot types

Time series

Custom axes

Using R for statistical analyses - More on graphs

This page is intended to be a help in getting to grips with the powerful statistical program called R. It is not intended as a course in statistics (see here for details about those). If you have an analysis to perform I hope that you will be able to find the commands you need here and copy/paste them into R to get going.

I run training courses in data management, visualisation and analysis using Excel and R: The Statistical Programming Environment. From 2013 courses will be held at The Field Studies Council Field Centre at Slapton Ley in Devon. Alternatively I can come to you and provide the training at your workplace. See details on my Courses Page.

On this page learn how to create line plots and add custom axes for graphs.


What is R?

R is an open-source (GPL) statistical environment modeled after S and S-Plus. The S language was developed at AT&T's Bell Laboratories, beginning in the mid-1970s. The R project was started by Robert Gentleman and Ross Ihaka (hence the name, R) of the Statistics Department of the University of Auckland in the early 1990s, and R was released under the GPL in 1995. It has quickly gained a widespread audience. It is currently maintained by the R core-development team, a hard-working, international team of volunteer developers. The R project web page is the main site for information on R. At this site are directions for obtaining the software, accompanying packages and other sources of documentation.

R is a powerful statistical program but it is first and foremost a programming language. Many routines have been written for R by people all over the world and made freely available from the R project website as "packages". However, the basic installation (for Linux, Windows or Mac) contains a powerful set of tools for most purposes.

Because R is a programming language it can seem a bit daunting; you have to type in commands to get it to work. However, it does have a Graphical User Interface (GUI) to make things easier. You can also copy and paste text from other applications into it (e.g. word processors). So, if you have a library of these commands it is easy to pop in the ones you need for the task at hand. That is the purpose of this web page; to provide a library of basic commands that the user can copy and paste into R to perform a variety of statistical analyses.


Line plots

Previously we learnt about bar charts (incl. histograms), box-whisker plots and scatter graphs. However, there may be occasions when we wish to display data as a line, perhaps to show a time series. There is no specific lineplot command in R so we must use other graph types and coerce the program to produce our line.


Use type = "b" to produce a plot with both points and lines, e.g. plot(x, y, type = "b").

Plot types

If we produce a plot we generally get a series of points. The default symbol for the points is an open circle but we can alter it using the pch = n parameter (see the section on scatter plots). Points are only one of the plot types that we can achieve in R (the default). We can use the parameter type = "type" to create other plots.

Command      Plot type
type = "p"   Produces points.
type = "l"   Produces line segments.
type = "b"   Produces points joined by line segments.
type = "o"   Similar to "b" but the points are overlaid onto the line.
type = "n"   Produces a graph with nothing in it! This can be used to create a graph frame that you add lines to later.

So for example we may type plot(x, y, type = "b") to produce a simple line plot with added points.
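To compare the plot types listed above, here is a minimal sketch that draws the same data once for each type. The x and y values are made up purely for illustration, and the 2 x 3 panel layout is just one convenient way to see the results side by side.

```r
# Made-up data purely for illustration (not from this page)
x <- 1:10
y <- c(3, 5, 4, 8, 7, 9, 12, 10, 14, 13)

# The five plot types described in the table
plot_types <- c("p", "l", "b", "o", "n")

# Arrange the plots in a 2 x 3 grid, remembering the old settings
old_par <- par(mfrow = c(2, 3))

for (pt in plot_types) {
  # Each panel is titled with the type that produced it
  plot(x, y, type = pt, main = paste0('type = "', pt, '"'))
}

# Restore the original graphics settings
par(old_par)
```

The last panel (type = "n") shows only the empty frame, which is what makes it useful as a starting point for adding lines later.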


Time Series

Rather than a series of x, y data you may have a single time series. Here is an example of data to illustrate.

vostok

   month  temp
1    Jan -32.0
2    Feb -47.3
3    Mar -57.2
4    Apr -62.9
5    May -61.0
6    Jun -70.6
7    Jul -65.5
8    Aug -68.2
9    Sep -63.2
10   Oct -58.0
11   Nov -42.0
12   Dec -30.4

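Here we have mean monthly temperatures for an Antarctic research station. The file was read in using the standard read.csv() command and so contains two columns: month is a factor and temp is a numeric variable (see the section on data types for more information). Note that read.csv() converts text columns to factors by default only in versions of R before 4.0.0; in later versions, add stringsAsFactors = TRUE to the command to reproduce the behaviour described here.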
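If you do not have the CSV file to hand, the data can be typed in directly. This is only a sketch of one way to do it: it uses R's built-in month.abb constant for the month names and wraps them in factor() so that the result matches the factor column described above.

```r
# Build the vostok data by hand instead of reading a CSV file.
# month.abb is a built-in constant: "Jan", "Feb", ..., "Dec".
vostok <- data.frame(
  month = factor(month.abb),  # a factor, as described in the text
  temp  = c(-32.0, -47.3, -57.2, -62.9, -61.0, -70.6,
            -65.5, -68.2, -63.2, -58.0, -42.0, -30.4)
)

# Check the structure: one factor column and one numeric column
str(vostok)
```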

If we attempt to plot both variables together, e.g. plot(temp ~ month), we get a horrid mess (try it and see). This is because month is a factor: R draws a box-plot for each level of the factor, with the levels in alphabetical order, rather than an x, y scatter plot.
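The following sketch shows where the mess comes from. It recreates the two variables by hand (an assumption made so that the example runs without the CSV file) and prints the factor levels, which R sorts alphabetically rather than in calendar order.

```r
# Recreate the two variables by hand for illustration
month <- factor(month.abb)
temp  <- c(-32.0, -47.3, -57.2, -62.9, -61.0, -70.6,
           -65.5, -68.2, -63.2, -58.0, -42.0, -30.4)

# The levels of the factor are sorted alphabetically, not by calendar
levels(month)

# With a factor on the x-axis, plot() draws one box per level,
# in the order of the levels
plot(temp ~ month)
```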

However, if we plot the temperature alone we get the beginnings of something sensible:

attach(vostok)
plot(temp)

So far so good. There appears to be a series of points and they are in the correct order. We can easily join the dots to make a line plot by adding type = "b" to the plot command (see the section on plot types). Notice how R has used default labels for the axes: temp for the y-axis is taken from the name of the variable, but Index is used for the x-axis because we have no reference (we only plotted a single variable).

What we need to do next is to alter the x-axis to reflect our month variable.



Custom axes

When we look at the time series plot produced above we see that the x-axis needs a bit of work. Since the plot was made from a single variable (temp) there are no values for x and R substitutes a numeric index.

We need to scrap the current axes and start again with our own. It is simple to produce a plot with no axes, merely add (axes= F) to the plot command like so:

plot(temp, axes= F)

However, R appends default labels to the axes so we need to get rid of those too:

plot(temp, axes= F, xlab= "", ylab= "")

That does the job. We are going to add axis labels of course, so we could have specified them now, but I use a pair of double quotes ("") to illustrate how to produce blank ones (setting xlab = F produces the label FALSE, so we have to use "").

To add an axis we use the axis() command. Axis 1 is the bottom of the plot (i.e. the x-axis), axis 2 is the left side of the plot (the y-axis). We can also specify the top (3) and the right side (4) if we wish. In its simplest form axis(n) adds in the axis specified with its default parameters. This won't do here because the default x-axis contains only index information. We need to tell R where to find the labels associated with the axis.
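As a quick illustration of the side numbers, this sketch draws a bare plot and then adds a default axis on each of the four sides. The y values are made up for the example.

```r
# Made-up values purely for illustration
y <- c(2, 4, 3, 6, 5)

# Start with a plot that has no axes and no axis labels
plot(y, axes = FALSE, xlab = "", ylab = "")

axis(1)  # bottom (x-axis)
axis(2)  # left (y-axis)
axis(3)  # top
axis(4)  # right

# Draw a border around the plot region
box()
```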

To generate an axis we need to specify the positions of the tick marks and the labels to be used. Here is what we need for our temperature example:

axis(1, at = 1:length(temp), labels = month)

 

Oops, that doesn't look right. The 1 specified the bottom (x) axis so that is okay, and the at = part specified the positions of the tick marks (this is right, there are 12), but the months are not displayed; we get numerical values instead.

The problem is that the month variable is not plain text but is regarded as a factor. The types of data are covered in the section on manipulating data. We have three sensible ways of altering the month column into a text variable.

We can read the original CSV data file in with an extra command to regard the month column as text (see the section on reading text from CSV files). To do that we would append as.is = 1 to the command (assuming that the month column was the first one) so:

vostok = read.csv(file.choose(), as.is = 1)

We could read the month column as row names instead:

vostok = read.csv(file.choose(), row.names= 1)

As a last resort we could create a new vector of text and type the names. This is rather tedious but if there were only a few it might be worth considering. See the section on manipulating data for ways to do this.
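A further possibility, not among the three listed above, is to leave the data as they are and convert the factor to text at the point of use with as.character(). The sketch below recreates the variables by hand (an assumption, so that it runs without the CSV file) and also shows the integer codes that a factor stores internally.

```r
# Recreate the two variables by hand for illustration
month <- factor(month.abb)
temp  <- c(-32.0, -47.3, -57.2, -62.9, -61.0, -70.6,
           -65.5, -68.2, -63.2, -58.0, -42.0, -30.4)

# A factor stores integer codes (positions in the sorted levels)
as.integer(month)

# as.character() recovers the month names as plain text
month_text <- as.character(month)

# Use the text version for the axis labels
plot(temp, axes = FALSE, xlab = "", ylab = "", type = "b")
axis(1, at = seq_along(temp), labels = month_text)
axis(2)
box()
```

Here seq_along(temp) gives the same positions as 1:length(temp).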

Assuming that we chose the option of setting the column as text using the as.is parameter we can now re-run our axis command so:

axis(1, at = 1:length(temp), labels = month)

If we had set the month column to be row names we would modify the axis() command slightly:

axis(1, at = 1:length(temp), labels = row.names(vostok))

Now we need to add in the y-axis and the axis labels. We could also add a title and perhaps the whole thing would look better if the dots were joined up to make a lineplot (which was after all the point of the exercise). Here is the whole series of commands from start to finish.

vostok= read.csv(file.choose(), as.is= 1)
attach(vostok)
plot(temp, axes=F, xlab="", ylab= "", type= "b")
axis(1, at = 1:length(temp), labels = month)
axis(2)
title(main= "Time Series", font.main=4, xlab= "Month", ylab= "Mean Temp C")
box()

The box() command merely adds a border around the plot. This looks a lot better. It is possible to alter the plot character and the colour of the lines; see the section on scatter plots for more information.
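For readers without the CSV file, here is a self-contained sketch of the same plot. The data are typed in by hand with the months stored as text, and the pch and col values are arbitrary choices made only to show where such settings go.

```r
# Build the data by hand, keeping the month column as plain text
vostok <- data.frame(
  month = month.abb,  # built-in constant: "Jan", ..., "Dec"
  temp  = c(-32.0, -47.3, -57.2, -62.9, -61.0, -70.6,
            -65.5, -68.2, -63.2, -58.0, -42.0, -30.4),
  stringsAsFactors = FALSE
)

# Line plot with no axes; pch and col are illustrative choices
plot(vostok$temp, axes = FALSE, xlab = "", ylab = "",
     type = "b", pch = 16, col = "blue")

# Custom x-axis with month names, default y-axis
axis(1, at = seq_len(nrow(vostok)), labels = vostok$month)
axis(2)

# Title, axis labels and a border
title(main = "Time Series", font.main = 4,
      xlab = "Month", ylab = "Mean Temp C")
box()
```

This version refers to the columns as vostok$temp and vostok$month rather than using attach(), which avoids leaving the data frame attached after the plot is made.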


 