SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries
Using SplineCam, we can investigate how the decision boundaries of neural networks evolve during training. Below, several neural network architectures are trained on different datasets. The architectures are chosen to probe how a network's decision boundary relates to its width and depth; specifically, I was interested in replicating the conclusions made here. Throughout, all networks use the ReLU activation function and are trained with the Adam optimizer, at learning rate 0.001, for 500 epochs; a sketch of this setup follows the list of architectures below.
- 5 layers of width 10.
- 5 layers of width 20.
- 10 layers of width 10.
- 10 layers of width 20.
- 10 layers of width 10, but with the central layer narrowed to width 3.
- 10 layers of width 20, but with the central layer narrowed to width 3.
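Below is a minimal sketch of this training setup. The helper names (`make_mlp`, `train`) are my own, not SplineCam's API; the hyperparameters (ReLU, Adam, learning rate 0.001, 500 epochs) come from the text, and the loss function is an assumption for a two-class problem.

```python
import torch
import torch.nn as nn

def make_mlp(depth, width, bottleneck=None, in_dim=2, out_dim=1):
    """Build a ReLU MLP; optionally narrow the central layer to `bottleneck`."""
    widths = [width] * depth
    if bottleneck is not None:
        widths[depth // 2] = bottleneck
    layers, prev = [], in_dim
    for w in widths:
        layers += [nn.Linear(prev, w), nn.ReLU()]
        prev = w
    layers.append(nn.Linear(prev, out_dim))  # logit output for binary labels
    return nn.Sequential(*layers)

def train(model, X, y, epochs=500, lr=1e-3):
    """Adam training loop matching the settings described above."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()  # assumed: binary classification
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X).squeeze(-1), y)
        loss.backward()
        opt.step()
    return model

# The six architectures compared above:
configs = [
    dict(depth=5, width=10),
    dict(depth=5, width=20),
    dict(depth=10, width=10),
    dict(depth=10, width=20),
    dict(depth=10, width=10, bottleneck=3),
    dict(depth=10, width=20, bottleneck=3),
]
models = [make_mlp(**c) for c in configs]
```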
From these animations we make the following observations.
- Bottlenecking increases the speed at which the decision boundaries form.
- More layers strengthen the decision boundaries.
- Bottlenecking also makes the boundaries more uniform, reducing overfitting along them.
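SplineCam computes the decision boundary exactly from the network's spline partition of input space; for comparison, the sketch below produces only a coarse grid-based approximation, which is a useful sanity check on the animations. All names here are my own illustration, not SplineCam's API.

```python
import numpy as np
import torch
import matplotlib.pyplot as plt

@torch.no_grad()
def plot_boundary(model, lims=(-2.0, 2.0), res=400):
    """Approximate the decision boundary as the zero level set of the logit."""
    xs = np.linspace(lims[0], lims[1], res)
    xx, yy = np.meshgrid(xs, xs)
    grid = torch.tensor(np.stack([xx.ravel(), yy.ravel()], axis=1),
                        dtype=torch.float32)
    logits = model(grid).squeeze(-1).reshape(res, res).numpy()
    # Contour at logit == 0, i.e. where the predicted class flips.
    plt.contour(xx, yy, logits, levels=[0.0], colors="k")
    plt.show()
```

Unlike this grid approximation, SplineCam's exact computation is resolution-independent, which is what makes the boundary animations trustworthy at fine scales.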
We now proceed with similar experiments on an alternative dataset.
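The post does not name this second dataset, so as a stand-in the snippet below uses a standard two-class 2D toy problem (scikit-learn's `make_moons`); any 2D binary dataset slots into the same pipeline, reusing the `make_mlp`, `train`, and `plot_boundary` helpers sketched earlier.

```python
import torch
from sklearn.datasets import make_moons

# Hypothetical alternative dataset; the actual one is not specified.
X_np, y_np = make_moons(n_samples=1000, noise=0.1, random_state=0)
X = torch.tensor(X_np, dtype=torch.float32)
y = torch.tensor(y_np, dtype=torch.float32)

model = train(make_mlp(depth=10, width=10, bottleneck=3), X, y)
plot_boundary(model)
```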
- 5 layers of width 10.
- 5 layers of width 20.
- 10 layers of width 10.
- 10 layers of width 10, but with the central layer narrowed to width 3.
- 10 layers of width 20, but with the central layer narrowed to width 3.
Experiments on this dataset reinforce the observations made on the first.