Adversarial Robustness Through the Lens of Singular Learning Theory
This project was inspired by Timaeus, who are progressing the field of developmental interpretability of neural networks using singular learning theory. I was motivated to investigate adversarial robustness by insightful discussions with Jesse Hoogland and by my previous interest in the connection between generalisation and robustness. This project was also supported by helpful conversations in the Developmental Interpretability Discord, most notably with Dr Daniel Murfet.