OpenAI has just over a week to comply with European data protection law, or the company faces huge fines, data deletion and even a ban, MIT Technology Review reports.
The deadline follows a series of investigations in EU countries and a temporary ban on the company's activities in Italy. But it will be almost impossible for OpenAI to comply with the rules, since the company collected content from the internet to train its AI models.
AI development is dominated by the paradigm that more training data is better. For example, OpenAI's GPT-2 model was trained on a dataset of 40 gigabytes of text. GPT-3, on which ChatGPT is based, was trained on 570 gigabytes. The volume of data used for GPT-4 remains unknown.
Several Western data protection authorities have recently launched investigations into how the company collects and processes the data underlying ChatGPT. According to them, OpenAI may have collected people’s personal data and used it without their consent.
Earlier, OpenAI launched a reward program for discovering vulnerabilities in its APIs, including the popular chatbot ChatGPT, with cash rewards of up to $20,000.