R365: Day 7 – datasets package
After a nice weekend seeing Mimi and the newest (!) Choudhury, Becks, as well as a Super Bowl (Seahawks in a very lopsided win over the Broncos), I am almost out of time to write a post for day 7! I have always wanted to explore what sort of datasets are included in the datasets package, so I figured that I would devote this post to looking at what is available.

The datasets package contains all sorts of datasets that are used as examples in various other packages and functions. I first noticed this package when exploring how to plot data series over time (can’t remember the package), where the example was the body temperature of two beavers. How they got the body temperatures of two beavers, I have no idea. Some poor grad student in days gone by probably crawled into a den somewhere with some unsuspecting beavers and some unwanted thermometers and spent a cozy 24 hours with them. However they acquired the data, the example dataset helped to illustrate how plotting time series data can give a very basic visual understanding of changes in biology or behavior.
Using library(help = "datasets"), you can find a list of all of the datasets that are included within the package, including monthly airline passenger numbers 1949-1960, the locations and depths of earthquakes around Fiji, yearly numbers of “important” discoveries, and many more. Some of the datasets, including the infamous beavers dataset, are not stored under the name you might expect, so you cannot just type in plot(beavers) and get output. A quick Google reveals that the beaver data is split up into two data frames, beaver1 and beaver2, one per animal, and that #2 seems to be running a fever during a portion of the experiment.
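As a rough sketch of the browsing described above, something like the following lists what the package ships and pulls out the two beaver data frames. The combined `beavers` data frame and its `beaver` label column are my own illustration, not objects that exist in the package.

```r
# List every dataset shipped with the datasets package.
# data() returns an object whose $results matrix has "Item" and "Title" columns.
available <- data(package = "datasets")$results
head(available[, c("Item", "Title")])

# The beaver data live in two separate data frames, one per animal.
str(datasets::beaver1)
str(datasets::beaver2)

# Stack the two animals into one data frame.
# NOTE: the 'beaver' column is a hypothetical label added here for
# illustration; it is not part of the original data.
beavers <- rbind(
  cbind(beaver = "beaver1", datasets::beaver1),
  cbind(beaver = "beaver2", datasets::beaver2)
)

# Mean body temperature per animal, as a quick check on the "fever" idea.
beaver_means <- aggregate(temp ~ beaver, data = beavers, FUN = mean)
print(beaver_means)
```

Comparing the two means is only a crude first look; it says nothing about when during the recording any difference shows up.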
Here is a quick and dirty look at some of the graphs you can make while exploring the datasets package.
This graph shows the increase in quarterly earnings per share for the Johnson & Johnson company from 1960-1980 (datasets::JohnsonJohnson)
This graph shows the level of Lake Huron in feet from 1875 to 1972 (datasets::LakeHuron)
For this graph, I used a portion of the Orange dataset and plotted the age of orange trees against their circumference (datasets::Orange)
This graph shows some correlation between the arrest rates for murder and assault in US states (datasets::USArrests)
This graph (which I mentioned in the body of the text) shows the body temperatures of two different beavers (datasets::beaver1 and datasets::beaver2)
The graph that launched a thousand PowerPoints: the Mauna Loa measurements of atmospheric CO2 (datasets::co2)
Yearly number of “discoveries”, 1860 to 1959 (datasets::discoveries)
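For anyone who wants to redraw graphs like the ones captioned above, here is a minimal sketch using base graphics. The axis labels, titles, colors, and the choice of which Orange trees to keep are all my own assumptions; the original plotting code is not shown in this post.

```r
# Time series objects (class "ts") plot directly against time.
plot(datasets::JohnsonJohnson, ylab = "Earnings per share",
     main = "Johnson & Johnson quarterly earnings")
plot(datasets::LakeHuron, ylab = "Level (feet)", main = "Lake Huron")
plot(datasets::co2, ylab = "CO2 (ppm)", main = "Mauna Loa atmospheric CO2")
plot(datasets::discoveries, ylab = "Discoveries per year")

# Orange: keep a portion of the trees (which ones is an assumption here)
# and plot age against circumference.
orange_subset <- subset(datasets::Orange, Tree %in% c("1", "2"))
plot(age ~ circumference, data = orange_subset,
     xlab = "Circumference (mm)", ylab = "Age (days)")

# USArrests: assault arrests against murder arrests, one point per state.
plot(Assault ~ Murder, data = datasets::USArrests)
murder_assault_cor <- cor(datasets::USArrests$Murder,
                          datasets::USArrests$Assault)
print(murder_assault_cor)

# Beavers: body temperature by observation order, one line per animal.
# Plotting against the row index sidesteps the HHMM-style 'time' column.
b1 <- datasets::beaver1
b2 <- datasets::beaver2
plot(b1$temp, type = "l", ylim = range(c(b1$temp, b2$temp)),
     xlab = "Observation", ylab = "Body temperature (deg C)")
lines(b2$temp, col = "red")
legend("topleft", legend = c("beaver1", "beaver2"),
       col = c("black", "red"), lty = 1)
```

The time series datasets need no extra work because plot() dispatches on the "ts" class; the data frames need a formula or explicit columns.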