The upcoming European General Data Protection Regulation imposes strict rules on how to handle personal data, creating a challenge for AI companies everywhere.
Many people wonder why Europe is so keen to protect people's privacy. The reason dates back to WWII, when French and German governments used centralized citizen files to target, and subsequently deport, Jews and other ethnic minorities. Following this, the Universal Declaration of Human Rights included an entire article about the right to privacy, which gave birth to many laws in European countries. The demand for uniform privacy regulations across the continent led to the European General Data Protection Regulation (GDPR), which will take effect in May 2018.
In a nutshell, the GDPR forces companies offering a product or service to a European resident to follow Privacy by Design principles. It doesn't matter where in the world they are headquartered; if they want to do business in Europe, they will need to comply. The penalty for failing to do so is up to 4 percent of annual global turnover, which could amount to billions of dollars for large companies. This is why over 93 percent of US companies made compliance their top legal priority in 2017.
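As a rough sketch of the arithmetic behind that penalty: the regulation's top fine tier is capped at the higher of 4 percent of annual worldwide turnover or a flat 20 million euros. The flat amount comes from the regulation's text rather than from the figures quoted here, the amounts are in euros rather than dollars, and the function name and sample turnover are made up for illustration.

```python
def max_gdpr_fine_eur(annual_turnover_eur: int) -> int:
    """Upper bound of the top fine tier, in whole euros."""
    flat_cap = 20_000_000                          # flat ceiling in euros
    turnover_cap = annual_turnover_eur * 4 // 100  # 4 percent, integer math
    return max(flat_cap, turnover_cap)


# Hypothetical company with 100 billion euros in annual turnover.
print(max_gdpr_fine_eur(100_000_000_000))  # 4000000000
```

For a small company the flat ceiling dominates, which is why the 4 percent figure mostly matters to large ones.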
Although the GDPR applies to any use of personal data (defined as data that can identify someone directly or indirectly), it poses major challenges to artificial intelligence in particular, as machine learning algorithms often rely on user data for training.
In the GDPR there are four principles that make it virtually impossible to do AI as it is commonly practiced:
Explicit consent.
Companies will now have to ask for consent in simple terms, rather than burying it in legalese terms and conditions. This creates many challenges, in particular for cloud-based voice assistants. Voice is considered to be personal data, so devices that listen ambiently should in theory ask everyone in the room for consent before sending their voice to the cloud. Imagine the nightmare of having 10 people over for dinner, and having your Google Home device ask each of them for consent! One way to solve this is to process the user's voice directly on the device instead of sending it to the cloud, thereby avoiding the need for explicit consent.
Right to be forgotten.
This means that anyone can ask for their personal data to be completely deleted. While this may be easy for a user account in a database, what happens when the data was used to train a machine learning algorithm? One might argue that the user data is still present in the form of outlier nodes in the neural network, and thus that they haven't been "forgotten." The easiest way to solve this would be to retrain the models without the user data, which is quite a costly process. A better approach would be to find algorithms that can "unlearn" specific inputs without retraining over the entire dataset.
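To make the idea of "unlearning" concrete, here is a minimal sketch on a deliberately simple model, not a neural network. It assumes a toy nearest-centroid classifier that stores only per-label sums and counts, so one user's contribution can be subtracted without touching anyone else's data. The class and method names are hypothetical, and nothing here carries over directly to deep networks, where removing a single input is an open problem.

```python
from collections import defaultdict


class CentroidModel:
    """Toy classifier summarised by per-label feature sums and counts."""

    def __init__(self):
        self.sums = {}                    # label -> list of feature sums
        self.counts = {}                  # label -> number of examples
        self.by_user = defaultdict(list)  # user_id -> [(features, label)]

    def learn(self, user_id, features, label):
        total = self.sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        self.counts[label] = self.counts.get(label, 0) + 1
        self.by_user[user_id].append((list(features), label))

    def forget(self, user_id):
        """Subtract every example this user contributed, then drop the raw data."""
        for features, label in self.by_user.pop(user_id, []):
            total = self.sums[label]
            for i, value in enumerate(features):
                total[i] -= value
            self.counts[label] -= 1
            if self.counts[label] == 0:
                del self.sums[label]
                del self.counts[label]

    def centroid(self, label):
        n = self.counts[label]
        return [s / n for s in self.sums[label]]


model = CentroidModel()
model.learn("alice", [1.0, 2.0], "spam")
model.learn("bob", [3.0, 4.0], "spam")
model.forget("alice")

# Reference: a model retrained from scratch without alice's data.
retrained = CentroidModel()
retrained.learn("bob", [3.0, 4.0], "spam")
print(model.centroid("spam") == retrained.centroid("spam"))  # True
```

The trick only works because the model is a sum of independent per-example terms; the costly retraining described above remains the fallback whenever that property does not hold.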
Data portability.
I love this one because it means European residents will be able to access all the data a company has about them, and transfer it to another provider. The regulation states that the data subject can ask for her personal data to be transferred directly to a new provider, without hindrance, and in a machine-readable format. Just like you can switch mobile providers while keeping your phone number, you will be able to switch social networks or search engines without any loss of data. This breaks the personal data lock-in that many services are using to keep us captive. It also opens the door to a massive data exodus when companies mess up: they would lose their data and hence their ability to train their AIs, while new providers would gain more data and thus improve their own AIs. In effect, it would speed up the demise of bad companies by creating an exponentially increasing gap when people switch over.
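What a "machine-readable" export could look like is sketched below. The regulation does not prescribe a file format, so the choice of JSON, the in-memory `store` layout, and the `user_id` field name are all assumptions made for illustration.

```python
import json


def export_user_data(store, user_id):
    """Collect every record belonging to one user into a single JSON document.

    `store` is a hypothetical mapping of table name -> list of row dicts,
    where each row carries a "user_id" key.
    """
    records = {
        table: [row for row in rows if row.get("user_id") == user_id]
        for table, rows in store.items()
    }
    return json.dumps({"user_id": user_id, "records": records},
                      indent=2, sort_keys=True)


# Made-up data standing in for a provider's database.
store = {
    "posts": [
        {"user_id": "u1", "text": "hello"},
        {"user_id": "u2", "text": "bonjour"},
    ],
    "searches": [{"user_id": "u1", "query": "gdpr"}],
}
print(export_user_data(store, "u1"))
```

A receiving provider can load the result with `json.loads` and map the fields to its own schema, which is the part that would need agreement between providers in practice.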
Right to explanation.
This one is particularly tricky, as it states that European residents have a "right to explanation" when an automated decision is made about them. The logic behind it is to avoid discrimination and implicit bias by enabling people to go to court if they feel unfairly treated. But this would also effectively prohibit the use of deep learning, since it is currently a black box. Many researchers are working on explaining how neural networks make decisions, as this will be a requirement before we can hope for AI to enter areas such as medicine or law.
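For contrast with a black box, here is a minimal sketch of a decision that can be explained: a linear scoring model whose output is a sum of per-feature contributions. The feature names, weights, and threshold are invented, and whether such a breakdown would satisfy the regulation is a legal question this sketch does not answer.

```python
def explain_decision(weights, bias, features, threshold=0.0):
    """Score an applicant and list each feature's contribution to the score."""
    contributions = {
        name: weights[name] * value for name, value in features.items()
    }
    score = bias + sum(contributions.values())
    # Largest absolute contribution first, so the main reasons come first.
    reasons = sorted(contributions.items(),
                     key=lambda item: abs(item[1]), reverse=True)
    return {"approved": score >= threshold, "score": score, "reasons": reasons}


# Hypothetical credit-style model with two features.
weights = {"income": 2.0, "debt": -3.0}
result = explain_decision(weights, bias=1.0,
                          features={"income": 3.0, "debt": 1.0})
print(result["approved"])  # True
print(result["reasons"])   # [('income', 6.0), ('debt', -3.0)]
```

A deep network offers no such direct decomposition, which is the gap the explanation research mentioned above is trying to close.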
While the above regulations certainly introduce some new problems for those working in AI, the GDPR also creates a lot of opportunity -- by giving people control over their data, while breaking digital monopolies that prevent innovation beyond incumbent companies.
More than just ethically beneficial for Europe, the GDPR is also a way to reclaim digital territory, and make companies all around the world start respecting a fundamental human right that they have been ignoring for 70 years: privacy.