Neural networks can be secretly trained to misbehave, according to a new research paper.
A team of New York University scientists has found that people can corrupt artificial intelligence systems by tampering with their training data, and such malicious amendments can be difficult to detect.
This method of attack could even be used to cause real-world accidents.
Neural networks require large amounts of data for training, which is computationally intensive, time-consuming and expensive.
Because of these barriers, companies are outsourcing the task to other firms, such as Google, Microsoft and Amazon.
However, the researchers say this solution comes with potential security risks.
"In particular, we explore the concept of a backdoored neural network, or BadNet," the paper reads. "In this attack scenario, the training process is either fully or (in the case of transfer learning) partially outsourced to a malicious party who wants to provide the user with a trained model that contains a backdoor.
"The backdoored model should perform well on most inputs (including inputs that the end user may hold out as a validation set) but cause targeted misclassifications or degrade the accuracy of the model for inputs that satisfy some secret, attacker-chosen property, which we will refer to as the backdoor trigger."
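To make the idea concrete, here is a minimal sketch of how training data could be tampered with to plant such a trigger. It is an illustration only, not the researchers' code: the function names (`stamp_trigger`, `poison_dataset`), the bright-square trigger, and the 10 percent poisoning rate are all assumptions chosen for the example.

```python
import random

def stamp_trigger(image, size=2, value=1.0):
    """Return a copy of a 2D image (list of rows) with a bright
    square stamped in the bottom-right corner (hypothetical trigger)."""
    stamped = [row[:] for row in image]
    height = len(stamped)
    for r in range(height - size, height):
        width = len(stamped[r])
        for c in range(width - size, width):
            stamped[r][c] = value
    return stamped

def poison_dataset(images, labels, target_label, fraction=0.1, seed=0):
    """Stamp the trigger on a random fraction of the images and
    relabel those samples as target_label. All other samples are
    copied unchanged. Returns (images, labels, poisoned_indices)."""
    rng = random.Random(seed)
    n_poison = int(len(images) * fraction)
    chosen = set(rng.sample(range(len(images)), n_poison))
    out_images, out_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            out_images.append(stamp_trigger(img))
            out_labels.append(target_label)
        else:
            out_images.append([row[:] for row in img])
            out_labels.append(lab)
    return out_images, out_labels, chosen
```

A model trained on the returned data sees mostly honest examples, which is why it can still score well on a clean validation set, while learning to associate the stamped pattern with the attacker's chosen label.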
In one instance, the researchers managed to train a system to misidentify a stop sign with a Post-it note stuck to it as a speed limit sign, which "could potentially [cause] an autonomous vehicle to continue through an intersection without stopping."
What's more, so-called 'BadNets' can be hard to detect.
"BadNets are stealthy, i.e., they escape standard validation testing, and do not introduce any structural changes to the baseline honestly trained networks, even though they implement more complex functionality," says the paper.
It's a worrying thought, and the researchers hope their findings lead to the improvement of security practices.
"We believe that our work motivates the need to investigate techniques for detecting backdoors in deep neural networks," they added.
"Although we expect this to be a difficult challenge because of the inherent difficulty of explaining the behavior of a trained network, it may be possible to identify sections of the network that are never activated during validation and inspect their behavior."
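The detection idea the researchers float can be sketched in a few lines. The example below is a rough illustration rather than a method from the paper: it assumes the activations of one layer have already been recorded for every validation input, and the function name `dormant_units` and the `threshold` parameter are hypothetical.

```python
def dormant_units(activations, threshold=0.0):
    """Return the indices of units whose output never rises above
    `threshold` on any validation input.

    activations: list with one entry per validation input, each a
    list of unit outputs (for example, post-ReLU values) for a layer.
    """
    if not activations:
        return []
    n_units = len(activations[0])
    # Peak output of each unit across the whole validation set.
    peak = [max(sample[u] for sample in activations) for u in range(n_units)]
    return [u for u, p in enumerate(peak) if p <= threshold]
```

Units flagged this way are only candidates for closer inspection. A unit that stays silent on clean validation data is not necessarily part of a backdoor, and the researchers themselves expect detection to be difficult.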