R365: Day 23 – Neural Networks

I’ve wanted to explore neural networks for a while and today seems like a pretty good day to do it! Rainy, boring, and with a pint of pale ale, anything in R is entertaining. Artificial neural networks (ANNs) are a general class of models for finding relationships between inputs and outputs. The math behind neural networks can be a bit over my head, but the gist is that given a set of inputs, a specified number of “layers” in the model, and a (typically) non-linear activation function, you arrive at an output. This underlying process mimics how our own minds process data and come to (sometimes erroneous) causal conclusions.

Because an artificial neural network is fairly naïve about what sorts of inputs it is given, you can end up with some cock-eyed results if you don’t screen them yourself. Think of when you were a kid and some technology “magically” performed a task and your mind came to the simplest (and often most fanciful) conclusions. This screening may be especially relevant in biology, where lots of different factors, both biotic and abiotic, can impact the growth of a single species, but some interactions do not make sense (e.g., the number of grasshoppers does not meaningfully impact precipitation). For a really good, not completely dumbed-down visualization of ANNs, this page was pretty useful even for someone with little to no formal mathematical training.

I first heard of ANNs about two years ago when I was analyzing a set of data looking for a causal relationship between temperature and growth of a powdery mildew fungus. I had run a factorial experiment, and my committee member (and current advisor) Neil McRoberts suggested that I throw all of the data into an ANN and see what happened. I ended up going in another direction with the analysis, but I have always been interested. I recently re-read a review article in the journal Phytopathology about ANNs and wanted to revisit them.
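To make that hand-wavy description a little more concrete, here is a toy sketch of what a single hidden neuron computes: a weighted sum of its inputs, passed through a non-linear activation function. This is my own illustration, not anything from the {neuralnet} internals, and the weights and bias are made up.

#Toy forward pass for a single neuron with a logistic (sigmoid) activation
inputs  <- c(0.5, 1.2, -0.3)  #hypothetical input values
weights <- c(0.8, -0.4, 0.1)  #hypothetical learned weights
bias    <- 0.2                #hypothetical bias term

sigmoid <- function(z) 1 / (1 + exp(-z))  #squashes any number into (0, 1)

#Weighted sum of inputs plus bias, then the non-linear squash
print(sigmoid(sum(inputs * weights) + bias))

Training a network amounts to nudging those weights until the outputs match the training data; {neuralnet} handles all of that in the script below.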

This article gives a lucid and well-annotated example of how to use {neuralnet} to calculate the square root of an input.

I changed the script up a little bit so that it computes cubes instead of square roots, but really, props to the original coder for such nice annotation.

#Going to create a neural network to perform cubing
#Type ?neuralnet for more information on the neuralnet package
library(neuralnet)

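#(My addition, not in the original script) Fix the random seed so the
#training data, and therefore the fitted network, are reproducible
set.seed(42)
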
#Generate 50 random numbers uniformly distributed between 0 and 100
#And store them as a dataframe
traininginput <- as.data.frame(runif(50, min=0, max=100))
trainingoutput <- (traininginput)^3

#Column bind the data into one variable
trainingdata <- cbind(traininginput,trainingoutput)
colnames(trainingdata) <- c("Input","Output")

#Train the neural network
#Going to have one hidden layer with 10 neurons
#Threshold is a numeric value specifying the threshold for the partial
#derivatives of the error function as stopping criteria.
net.cube <- neuralnet(Output~Input,trainingdata, hidden=10, threshold=0.01)
print(net.cube)

#Plot the neural network
plot(net.cube)

#Test the neural network on some test data
testdata <- as.data.frame((1:10)^(1/3)) #Generate the cube roots of 1 through 10
net.results <- compute(net.cube, testdata) #Run them through the neural network

#Let's see what properties net.results has
ls(net.results)

#Let's see the results
print(net.results$net.result)

#Let's display a better version of the results
cleanoutput <- cbind(testdata,(testdata)^3,
                     as.data.frame(net.results$net.result))
colnames(cleanoutput) <- c("Input","Expected Output","Neural Net Output")
print(cleanoutput)
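As a quick sanity check (my own addition, not part of the original script), you can boil the comparison down to a single number with a root-mean-square error between the expected and predicted cubes:

#Root-mean-square error between expected and predicted outputs
rmse <- sqrt(mean((cleanoutput[,2] - cleanoutput[,3])^2))
print(rmse)

A small value here means the network recovered the cubing function reasonably well over the tested range.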