# More neurons in the hidden layer than predictive features in neural nets

*This article was originally published on **R-english – Freakonometrics**, and kindly contributed to R-bloggers.*


This week, we were talking about neural networks for the first time. In many illustrations of neural networks, the hidden layer has fewer neurons than there are predictive variables, but sometimes it can make sense to have *more* neurons in the hidden layer than predictive variables.

To illustrate, consider a simple example with a single variable \(x\) and a binary outcome \(y\in\{0,1\}\),

```
set.seed(12345)
n = 100
# three clusters of 100 points on [0,1], [1,2] and [2,3],
# with the middle cluster labeled y = 1
x = c(runif(n), 1 + runif(n), 2 + runif(n))
y = rep(c(0,1,0), each = n)
```

We should ensure that the observations are in the \([0,1]\) interval,

```
minmax = function(z) (z-min(z))/(max(z)-min(z))
xm = minmax(x)
df = data.frame(x=xm,y=y)
```

as we can visualize below,

`plot(df$x,rep(0,3*n),col=1+df$y)`

Here, the blue and the red dots (when \(y\) is either 0 or 1) are not linearly separable. The standard activation function in neural nets is the sigmoid

`sigmoid = function(x) 1 / (1 + exp(-x))`
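A quick sanity check of the properties used below: the sigmoid maps the real line into \((0,1)\), with \(\sigma(0)=1/2\), the symmetry \(\sigma(-x)=1-\sigma(x)\), and saturation towards 0 or 1 for large inputs,

```r
sigmoid = function(x) 1 / (1 + exp(-x))
sigmoid(0)                  # 0.5
sigmoid(5) + sigmoid(-5)    # 1 (up to rounding), by symmetry
# steep inputs saturate: values stay strictly inside (0,1)
sigmoid(c(-100, 100))
```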

Let us fit a neural network with two hidden neurons

```
library(nnet)
set.seed(1234)
model_nnet = nnet(y~x,size=2,data=df)
```

We can then get the weights (using `neuralweights` from the NeuralNetTools package), and we can visualize the two neurons

```
library(NeuralNetTools)
w = neuralweights(model_nnet)
x1 = cbind(1,df$x) %*% w$wts$"hidden 1 1"
x2 = cbind(1,df$x) %*% w$wts$"hidden 1 2"
b = w$wts$`out 1`
plot(sigmoid(x1), sigmoid(x2), col = 1+df$y)
```

Now the blue and the red dots (when \(y\) is either 0 or 1) are actually linearly separable.

`abline(a=-b[1]/b[3],b=-b[2]/b[3])`
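To build intuition for what the two hidden neurons are doing, here is a hand-built sketch (the weights are chosen by hand for this geometry, they are *not* the ones `nnet` estimates): two steep sigmoids, with thresholds at the cluster boundaries, send the three clusters near three corners of the unit square, where a linear rule isolates the middle cluster,

```r
sigmoid = function(x) 1 / (1 + exp(-x))
set.seed(12345)
n = 100
x = c(runif(n), 1 + runif(n), 2 + runif(n))
y = rep(c(0,1,0), each = n)
# two steep sigmoid units, with thresholds at the cluster boundaries 1 and 2
h1 = sigmoid(10 * (x - 1))   # ~0 on the first cluster, ~1 on the other two
h2 = sigmoid(10 * (x - 2))   # ~0 on the first two clusters, ~1 on the third
# in the (h1, h2) plane the middle cluster sits near (1,0),
# so the linear rule h1 - h2 > 1/2 recovers it
yhat = as.numeric(h1 - h2 > 1/2)
mean(yhat == y)   # essentially perfect classification
```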

If we do not specify the seed of the random generator, we can get a different outcome, since the model is clearly not identifiable: for instance, permuting the two hidden neurons yields exactly the same fitted probabilities, with different weights.

Now consider four clusters, with alternating labels,

```
set.seed(12345)
n=100
x=c(runif(n),1+runif(n),2+runif(n),3+runif(n))
y=rep(c(0,1,0,1),each=n)
xm = minmax(x)
df = data.frame(x=xm,y=y)
plot(df$x,rep(0,4*n),col=1+df$y)
```

Now we need more neurons, say three,

```
set.seed(321)
model_nnet = nnet(y~x,size=3,data=df)
w = neuralweights(model_nnet)
x1 = cbind(1,df$x)%*%w$wts$"hidden 1 1"
x2 = cbind(1,df$x)%*%w$wts$"hidden 1 2"
x3 = cbind(1,df$x)%*%w$wts$"hidden 1 3"
b = w$wts$`out 1`
library(scatterplot3d)
s3d = scatterplot3d(x=sigmoid(x1),
y=sigmoid(x2), z=sigmoid(x3),color=1+df$y)
```

but once again, we have been able to separate (linearly) the blue and the red points, this time in the space spanned by the three hidden neurons.
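The same hand-built intuition works here (again with weights chosen by hand, not the ones `nnet` estimates): one steep sigmoid per cluster boundary maps the four clusters near \((0,0,0)\), \((1,0,0)\), \((1,1,0)\) and \((1,1,1)\), and the alternating labels become linearly separable,

```r
sigmoid = function(x) 1 / (1 + exp(-x))
set.seed(12345)
n = 100
x = c(runif(n), 1 + runif(n), 2 + runif(n), 3 + runif(n))
y = rep(c(0,1,0,1), each = n)
# three steep sigmoid units, one per cluster boundary
h1 = sigmoid(10 * (x - 1))
h2 = sigmoid(10 * (x - 2))
h3 = sigmoid(10 * (x - 3))
# the alternating-sign linear rule h1 - h2 + h3 > 1/2
# picks out the second and fourth clusters
yhat = as.numeric(h1 - h2 + h3 > 1/2)
mean(yhat == y)   # essentially perfect classification
```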

Finally, consider

```
set.seed(123)
n=500
x1=runif(n)*3-1.5
x2=runif(n)*3-1.5
y = (x1^2+x2^2)<=1
x1m = minmax(x1)
x2m = minmax(x2)
df = data.frame(x1=x1m,x2=x2m,y=y)
plot(df$x1,df$x2,col=1+df$y)
```

and again, with three neurons (for two explanatory variables) we can linearly separate the blue and the red points,

```
set.seed(1234)
model_nnet = nnet(y~x1+x2,size=3,data=df)
w = neuralweights(model_nnet)
x1 = cbind(1,df$x1,df$x2)%*%w$wts$"hidden 1 1"
x2 = cbind(1,df$x1,df$x2)%*%w$wts$"hidden 1 2"
x3 = cbind(1,df$x1,df$x2)%*%w$wts$"hidden 1 3"
b = w$wts$`out 1`
library(scatterplot3d)
s3d <- scatterplot3d(x=sigmoid(x1), y=sigmoid(x2), z=sigmoid(x3),
color=1+df$y)
```
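To see geometrically why three neurons can work here: each hidden neuron behaves (approximately) like the indicator of a half-plane, and three half-planes can crudely enclose the disk. A hand-built sketch, with three steep sigmoid units tangent to the unit circle (a rough triangle, chosen by hand, nowhere near as good as the combination `nnet` finds),

```r
sigmoid = function(x) 1 / (1 + exp(-x))
set.seed(123)
n = 500
x1 = runif(n)*3 - 1.5
x2 = runif(n)*3 - 1.5
y  = (x1^2 + x2^2) <= 1
# three half-plane units, with boundary lines tangent to the unit circle
# (normal directions at angles 90, 210 and 330 degrees)
a = c(pi/2, 7*pi/6, 11*pi/6)
H = sapply(a, function(t) sigmoid(10 * (1 - cos(t)*x1 - sin(t)*x2)))
# inside the triangle all three units are ~1, so predict "inside the disk"
# when their sum exceeds 2.5
yhat = rowSums(H) > 2.5
mean(yhat == y)   # a crude approximation of the disk by a triangle
```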

Here, the hidden layer of the neural network plays the role of the kernel trick, as discussed in Koutroumbas, K. & Theodoridis, S. (2008). *Pattern Recognition*. Academic Press.
