R365: Day 35 – {zoo}

Most people in science have to deal with time series at some point or another: you measured glucose levels in blood over a period of weeks, you recorded the number of birds in a habitat over a summer, you monitored the efficiency of an enzyme as it was overwhelmed with substrate. Beyond science, people like economists monitor markets for trends over time (e.g., the number of new home buyers in the US from 2006 to 2012). Whatever your poison, time series are inescapable. We have already covered some of the tools that you can use to view and analyse time series, like ts(), acf(), and detrend(). While these tools are useful, they don’t always work if your time series is irregular (unevenly spaced observations, time data in different units, etc.). {zoo} is built for exactly that case: it stores each observation alongside an ordered index, so the time points do not have to be evenly spaced.
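To make "irregular" concrete, here is a minimal sketch of building a {zoo} series from unevenly spaced dates. The glucose values and dates are made up for illustration, and the object names (dates, glucose, g, w) are my own.

```r
library(zoo)

# Hypothetical glucose readings taken on unevenly spaced dates (made-up values)
dates   <- as.Date(c("2013-01-02", "2013-01-03", "2013-01-07", "2013-01-15"))
glucose <- c(5.1, 5.4, 6.0, 5.2)

# order.by supplies the index; the observations need not be evenly spaced
g <- zoo(glucose, order.by = dates)

# Gaps between consecutive observations, in days
gaps <- as.numeric(diff(time(g)))

# strict = TRUE asks whether all of the gaps are equal
is.regular(g, strict = TRUE)

# Subset by date range, keeping only observations that fall inside it
w <- window(g, start = as.Date("2013-01-03"), end = as.Date("2013-01-10"))
```

The point is that the index travels with the data, so subsetting with window() works on the actual dates and not on row positions.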

One thing {zoo} does not handle well is duplicated time stamps. If two data points share the same date, zoo() warns when the series is created (hence the suppressWarnings() in the example) and many of the {zoo} functions will not work on it. One way to handle that is to average the data points that share a time stamp, using aggregate():

> z <- suppressWarnings(zoo(1:8, c(1, 2, 2, 2, 3, 4, 5, 5)))
> z
1 2 2 2 3 4 5 5 
1 2 3 4 5 6 7 8 
> aggregate(z, identity, mean)
  1   2   3   4   5
1.0 3.0 5.0 6.0 7.5

or just take the last data point in each run of repeated time stamps:

> aggregate(z, identity, tail, 1)
1 2 3 4 5 
1 4 5 6 8
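The same idea applies when the index is a date. Below is a small sketch, with made-up bird counts, that first checks whether any dates are repeated and then averages the counts that share a date. The names d, counts, has_dups and daily are just for illustration.

```r
library(zoo)

# Made-up bird counts; two counts were taken on 2013-06-02
d <- as.Date(c("2013-06-01", "2013-06-02", "2013-06-02", "2013-06-03"))
counts <- suppressWarnings(zoo(c(12, 15, 19, 9), order.by = d))

# TRUE if any index value occurs more than once
has_dups <- anyDuplicated(index(counts)) > 0

# Average the counts that share a date; identity() keeps the dates as the grouping
daily <- aggregate(counts, identity, mean)
```

Checking for duplicates first is optional, but it is a cheap way to find out whether a series needs this treatment before other functions start complaining.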

You can even interpolate the duplicated time stamps, so that each repeat is moved to a point between its neighbours, if that is useful to you:

> time(z) <- na.approx(ifelse(duplicated(time(z)), NA, time(z)), na.rm = FALSE)
> z[!is.na(time(z))]
     1      2 2.3333 2.6667      3      4      5
     1      2      3      4      5      6      7
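That one-liner packs several steps together, so here is the same thing unrolled as a sketch. The intermediate names idx and z_unique are mine. Note that the last duplicate (the second 5) has nothing after it to interpolate towards, so it stays NA and is dropped, which is why only seven points come back.

```r
library(zoo)

z <- suppressWarnings(zoo(1:8, c(1, 2, 2, 2, 3, 4, 5, 5)))

idx <- time(z)                        # 1 2 2 2 3 4 5 5
idx[duplicated(idx)] <- NA            # 1 2 NA NA 3 4 5 NA

# Fill the interior NAs by linear interpolation; na.rm = FALSE keeps the
# trailing NA in place so the index stays the same length as the data
idx <- na.approx(idx, na.rm = FALSE)

time(z) <- idx

# Drop the observation whose time stamp could not be interpolated
z_unique <- z[!is.na(time(z))]
```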

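{zoo} also does not like to draw log-scaled axes (the example here uses the y axis): plotting with a log argument can produce warnings, which you can silence with suppressWarnings(). You can also get around it with a custom panel function that takes the log argument itself, so that it is not passed on to lines():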

> z <- zoo(1:100)
> plot(z, log = "y", panel = function(..., log) lines(...))


{zoo} can also read directly from .txt and .csv files with read.zoo(), but the files have to be formatted correctly. I had an issue when I had several different time series (e.g., sites A, B, C, D, and E) side by side in one file, each with its own date column. {zoo} read the first column as a date, but it did not interpret the other date columns as dates.
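Here is a sketch of both situations, using made-up data passed in as text so that it runs without a file. The first case is the layout read.zoo() expects, with one shared date column. The second is one possible workaround for the side-by-side layout (not necessarily the best one): read the file as an ordinary data frame, build one series per date/value pair, and merge them. The column names (dateA, siteA, and so on) are assumptions about how such a file might look.

```r
library(zoo)

# Case 1: one date column shared by every site; read.zoo() handles this directly
shared <- "date,siteA,siteB
2013-06-01,12,30
2013-06-02,15,28
2013-06-03,11,31"
z_shared <- read.zoo(text = shared, header = TRUE, sep = ",", format = "%Y-%m-%d")

# Case 2: each site has its own date column (hypothetical layout)
side_by_side <- "dateA,siteA,dateB,siteB
2013-06-01,12,2013-06-02,30
2013-06-03,15,2013-06-05,28
2013-06-08,11,2013-06-06,31"
raw <- read.csv(text = side_by_side, stringsAsFactors = FALSE)

# Build one zoo series per (date, value) pair
site_a <- zoo(raw$siteA, order.by = as.Date(raw$dateA))
site_b <- zoo(raw$siteB, order.by = as.Date(raw$dateB))

# merge() lines the series up on the union of their dates, filling gaps with NA
sites <- merge(siteA = site_a, siteB = site_b)
```

With five sites you would repeat the zoo() step for each pair of columns (or loop over them) before merging.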


I will talk more about {zoo} as things come up.


