Intriguing properties of neural networks
Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus
[Paper Link](https://arxiv.org/abs/1312.6199)
tl;dr: The authors find that imperceptibly small, optimized perturbations of an image can cause a network to misclassify it (adversarial examples)
Previous Readings:
Understanding the basics of neural networks
Contributions:
- There is no distinction between individual high-level units and random linear combinations of units, suggesting the semantic information lives in the network's activation space as a whole rather than in individual coordinates
- DNNs learn input-output mappings that are fairly discontinuous: the authors can find, via optimization, an imperceptibly small perturbation that causes the network to misclassify an image (sketched below)
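The paper frames the second point as a constrained optimization: find the smallest perturbation r (in L2 norm) such that f(x + r) outputs the target label while x + r stays inside the valid pixel box [0, 1]^m, approximated in practice by box-constrained L-BFGS on c|r| + loss_f(x + r, target). Below is a minimal sketch of that objective in PyTorch, with plain gradient descent plus pixel clamping standing in for L-BFGS; `model`, `x` (a [0, 1]-valued image batch), and `target_class` are assumed names for illustration, not from the authors' code.

```python
import torch
import torch.nn.functional as F

def adversarial_example(model, x, target_class, c=0.1, steps=100, lr=0.01):
    """Search for a small perturbation r so that model(x + r) predicts target_class.

    Loosely follows the paper's objective: minimize c*||r|| + loss(x + r, target),
    keeping x + r in the valid pixel range [0, 1]. The paper uses box-constrained
    L-BFGS; plain gradient descent with clamping stands in for it here.
    """
    # Tiny random init: the gradient of ||r|| is undefined at r = 0.
    r = (1e-3 * torch.randn_like(x)).requires_grad_()
    optimizer = torch.optim.SGD([r], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        optimizer.zero_grad()
        x_adv = (x + r).clamp(0.0, 1.0)  # enforce the box constraint on pixels
        loss = c * r.norm() + F.cross_entropy(model(x_adv), target)
        loss.backward()
        optimizer.step()
    return (x + r).detach().clamp(0.0, 1.0)
```

The striking part is that the perturbation is found by ordinary gradient-based optimization against the network's own loss, which is why such examples turn out to be so easy to produce.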
Experiments:
[WIP]
My thoughts:
When I first read this in 2017, I found it fascinating.
It's remarkable how this almost "accidental" discovery launched a whole subfield.
If anything, I think this gives us a glimpse of how human discovery can proceed.
Test of time:
A seminal paper on adversarial examples and the field of machine learning security as a whole.
With the privilege of many years of hindsight, I find reading this paper again even more intriguing.