I ran into expand.grid() a couple of times recently while exploring examples, and I wanted to know what it did. The description of the function (“Create a data frame from all combinations of the supplied vectors or factors”) was kind of confusing. It made it sound like it would work like the regular data.frame() function, where you supply every column in full and the resulting data frame is exactly what you typed in.
Rather, you give expand.grid() vectors of lengths m, n, and o, and it gives you back a data frame with m×n×o rows, one for every combination of their values. Let's explore.
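As a quick sanity check of that row count, here is a minimal sketch using three small made-up vectors of lengths 2, 3, and 2. The names combos, x, y, and z and their values are only for illustration.

```r
# Three illustrative vectors of lengths 2, 3, and 2 (names and values are made up)
combos <- expand.grid(x = 1:2,
                      y = c("a", "b", "c"),
                      z = c(TRUE, FALSE))

nrow(combos)     # 2 * 3 * 2 = 12 rows, one per combination
ncol(combos)     # 3 columns, one per supplied vector
head(combos, 4)  # the first vector (x) cycles fastest
```

Handing the same unequal-length vectors to data.frame() would fail with an error about differing numbers of rows, which is the difference that tripped me up.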
Let's say you want to create a dataset that looks like you measured 20 individuals of each sex at 5 different weights, with heights drawn at random from a uniform distribution (that is what runif() gives you; rnorm() would give you a normal curve instead). We can make that dataset:
expand.grid(height = runif(20), weight = seq(100, 300, 50), sex = c("Male", "Female"))
And now you end up with a dataset that looks like this:
           height weight    sex
1   0.48631865298    100   Male
2   0.26785198110    100   Male
3   0.76407905715    100   Male
4   0.48805436585    100   Male
5   0.74689734611    100   Male
6   0.04256570549    100   Male
...
200 0.02533159009    300 Female
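One thing worth checking for yourself: runif(20) is evaluated once, so the same 20 heights get reused in every weight/sex combination rather than 200 fresh draws being made. Here is a small sketch that confirms the shape of the result. The set.seed() call and the name dat are my own additions, there only to make the example reproducible.

```r
set.seed(42)  # assumed seed, only so the random heights are reproducible

dat <- expand.grid(height = runif(20),
                   weight = seq(100, 300, 50),
                   sex = c("Male", "Female"))

nrow(dat)                   # 20 * 5 * 2 = 200 rows
length(unique(dat$height))  # 20 distinct heights, recycled across combinations
table(dat$weight, dat$sex)  # 20 rows in each of the 10 weight-by-sex cells
```

If you need an independent random value in every row, one option is to build the grid first and then add the column afterwards, for example dat$height <- runif(nrow(dat)).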
This is pretty nifty and might come in handy if you want your own training dataset.