Intriguing properties of neural networks
Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus
tl;dr: The authors find that small perturbations to an image can cause a network to produce erroneous outputs (adversarial examples)
Understanding the basics of neural networks
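- There is no distinction between individual high-level units and random linear combinations of high-level units: the semantic information lies in the space of activations rather than in any individual unit
- DNNs learn input-output mappings that are fairly discontinuous. The authors show that adding a small, hardly perceptible perturbation to an image can cause the network to misclassify it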
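To make the second point concrete, here is a minimal sketch of how such a perturbation can be found. It is not the authors' setup: the paper uses box-constrained L-BFGS on deep networks, whereas this toy swaps in plain projected gradient descent on a hand-built logistic-regression classifier, with a squared L2 penalty in place of the paper's norm penalty. All names (`predict_prob`, `find_perturbation`, the weights `w`, the input `x`) and all hyperparameters are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_prob(w, b, x):
    """Probability of class 1 under a fixed logistic-regression classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def find_perturbation(w, b, x, target, c=0.01, lr=0.005, steps=200):
    """Minimize c * ||r||^2 + cross_entropy(x + r, target) over r,
    keeping x + r inside the valid input box [0, 1]."""
    r = [0.0] * len(x)
    for _ in range(steps):
        x_adv = [xi + ri for xi, ri in zip(x, r)]
        p = predict_prob(w, b, x_adv)
        # For logistic regression, d(cross-entropy)/d(input) = (p - target) * w.
        grad = [(p - target) * wi + 2.0 * c * ri for wi, ri in zip(w, r)]
        r = [ri - lr * gi for ri, gi in zip(r, grad)]
        # Projection step: clip the perturbed input back into [0, 1].
        r = [min(1.0, max(0.0, xi + ri)) - xi for xi, ri in zip(x, r)]
    return r


# Hypothetical 1,000-"pixel" input that the classifier assigns to class 1.
n = 1000
w = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
b = 0.0
x = [0.505 if i % 2 == 0 else 0.495 for i in range(n)]

p_clean = predict_prob(w, b, x)
r = find_perturbation(w, b, x, target=0.0)
p_adv = predict_prob(w, b, [xi + ri for xi, ri in zip(x, r)])

print("P(class 1) on clean input:    ", p_clean)
print("P(class 1) on perturbed input:", p_adv)
print("largest per-pixel change:     ", max(abs(ri) for ri in r))
```

In this toy setting each coordinate moves by less than 0.02 on a [0, 1] scale, yet the prediction flips from confidently class 1 to class 0, because many tiny changes aligned with the weights add up across a high-dimensional input.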
When I first read this in 2017, I found it fascinating. It's remarkable how this almost "accidental" discovery launched a whole subfield. If anything, I think this gives us a glimpse of how human discovery can proceed.
Test of time:
A seminal paper on adversarial examples and the field of machine learning security as a whole. With the benefit of many years of hindsight, I find this paper even more intriguing on rereading.