Intriguing properties of neural networks

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus

arXiv, 2013
Paper Link

tl;dr: The authors find that small perturbations to an image can cause a network to produce erroneous outputs (adversarial examples)

Previous Readings:
Understanding the basics of neural networks

Contributions:

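Individual high-level units are no more semantically meaningful than random linear combinations of units, which suggests the semantic information lives in the space of activations rather than in single units
DNNs learn input-output mappings that are fairly discontinuous: the authors can add a small, hardly perceptible perturbation to an image and cause the network to misclassify it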
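
To make the second contribution concrete, here is a minimal sketch of a small perturbation flipping a prediction. It is not the paper's method: the authors search for perturbations on deep networks with box-constrained L-BFGS, while this toy uses a linear binary classifier, where the smallest L2 perturbation has a closed form. The weights, bias, input, and helper names (score, predict, minimal_perturbation) are all hypothetical and chosen only for illustration.

```python
import math


def score(w, b, x):
    """Signed score of a linear binary classifier; positive means class 1."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Return the predicted class label (0 or 1)."""
    return 1 if score(w, b, x) > 0 else 0


def minimal_perturbation(w, b, x, overshoot=0.01):
    """Smallest L2 perturbation that pushes x just past the decision boundary.

    For a linear model, the closest boundary point is the orthogonal
    projection of x onto the hyperplane w.x + b = 0. The overshoot factor
    moves slightly beyond it so the label actually changes.
    """
    s = score(w, b, x)
    norm_sq = sum(wi * wi for wi in w)
    scale = -(1.0 + overshoot) * s / norm_sq
    return [scale * wi for wi in w]


# Hypothetical toy model and input (assumptions, not taken from the paper).
w = [0.5] * 100  # weights over a 100-dimensional "image"
b = -24.0
x = [0.5] * 100  # clean input, classified as class 1

r = minimal_perturbation(w, b, x)
x_adv = [xi + ri for xi, ri in zip(x, r)]

l2 = math.sqrt(sum(ri * ri for ri in r))
linf = max(abs(ri) for ri in r)

print("clean prediction:      ", predict(w, b, x))
print("adversarial prediction:", predict(w, b, x_adv))
print("L2 norm of perturbation:", l2)
print("largest per-coordinate change:", linf)
```

In this toy, every coordinate moves by only about 0.02 on inputs of magnitude 0.5, yet the predicted label changes. Many tiny per-coordinate changes add up across a high-dimensional input, which gives some intuition for why such perturbations can be hard to see.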

Experiments:
[WIP]

My thoughts:
When I first read this in 2017, I found it fascinating. It's remarkable how this almost "accidental" discovery launched a whole subfield. If anything, I think it gives us a glimpse of how human discovery can proceed.

Test of time:
This is a seminal paper on adversarial examples and on the field of machine learning security as a whole. With the privilege of many years of hindsight, I find rereading it even more intriguing.