Decision Boundary Geometries and Robustness of Neural Networks
If you have a question about this talk, please contact Adrià Garriga Alonso.
Adversarial examples are small perturbations to an input point that cause a Neural Network (NN) to misclassify it.
Recent research shows the existence of “universal adversarial perturbations” which, unlike earlier adversarial examples, are not specific to individual data points or network architectures. We will also discuss results that attempt to link this behaviour to the geometry of the decision boundaries learned by neural networks.
Adversarial inputs by themselves aren’t the main concern for the value alignment problem. However, the insight they can give about NN internals will be important if future AIs rely on NNs at all.
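As a rough illustration of the perturbations discussed above, here is a minimal sketch of the Fast Gradient Sign Method (FGSM), one standard way of constructing adversarial examples. It assumes PyTorch; the toy model, random input, and epsilon value are illustrative assumptions and are not taken from the talk materials or readings.

```python
# Minimal FGSM sketch: perturb an input in the direction that most increases
# the classification loss, keeping the perturbation bounded by epsilon.
# The model and data below are placeholders for demonstration only.
import torch
import torch.nn as nn

def fgsm_perturb(model, x, label, epsilon=0.03):
    """Return x plus a small loss-increasing perturbation (FGSM)."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), label)
    loss.backward()
    # Step by epsilon in the sign of the input gradient.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.detach().clamp(0.0, 1.0)

if __name__ == "__main__":
    # Toy linear classifier on 28x28 inputs, purely for illustration.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(1, 1, 28, 28)       # a random "image"
    label = torch.tensor([3])          # an arbitrary class label
    x_adv = fgsm_perturb(model, x, label)
    print((x_adv - x).abs().max())     # perturbation is bounded by epsilon
```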
Relevant readings:
The Robustness of Deep Networks: A Geometrical Perspective http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8103145&tag=1
Adversarial Spheres https://arxiv.org/abs/1801.02774
This talk is part of the Engineering Safe AI series.