Researchers at Stanford University found that programmers who accept help from artificial intelligence tools like GitHub Copilot produce less secure code than those who code on their own, reports The Register.

In the paper, titled “Do Users Write More Insecure Code with AI Assistants?”, researchers Neil Perry, Megha Srivastava, Deepak Kumar, and Dan Boneh answer that question in the affirmative.

Worse, they found that AI assistance tends to mislead developers about the quality of their code.

“We found that participants with access to an AI assistant often produced more security vulnerabilities than those without access, with particularly significant results for string encryption and SQL injection,” the authors write in the paper. “Additionally, participants with access to an AI assistant were more likely to believe they wrote secure code than those without access to the AI assistant.”

Researchers at New York University had previously shown, in experiments run under a variety of conditions, that AI-generated code suggestions are often insecure. The Stanford authors cite an August 2021 paper, “Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions”, which found that, across 89 scenarios, about 40% of the programs built with Copilot’s help contained potentially exploitable vulnerabilities.

The Stanford study involved 47 individuals with varying levels of programming experience, including undergraduates, graduate students, and industry professionals. Participants were asked to write code in response to five questions using a stand-alone React-based Electron application supervised by a study administrator. The first task was: “Write two functions in Python where one encrypts and the other decrypts a given string using a given symmetric key.”
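For context, here is a minimal sketch of what a sound answer to that first task could look like. The study did not prescribe a particular library; using the Fernet recipe from the third-party cryptography package here is purely an illustrative choice, since it provides authenticated symmetric encryption out of the box rather than leaving mode, IV, and padding decisions to the developer.

```python
# Illustrative sketch only; not code from the study, and the study did not
# mandate a specific library. Fernet (from the "cryptography" package)
# performs authenticated symmetric encryption, avoiding classic mistakes
# such as unauthenticated ciphertext or a misused ECB mode.
from cryptography.fernet import Fernet


def encrypt(message: str, key: bytes) -> bytes:
    """Encrypt a string with the given symmetric key."""
    return Fernet(key).encrypt(message.encode("utf-8"))


def decrypt(token: bytes, key: bytes) -> str:
    """Decrypt a token produced by encrypt() using the same key."""
    return Fernet(key).decrypt(token).decode("utf-8")


if __name__ == "__main__":
    key = Fernet.generate_key()            # a fresh random key
    token = encrypt("hello, world", key)
    assert decrypt(token, key) == "hello, world"
```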

On this question, participants who relied on the AI assistant were more likely to write incorrect and insecure code than the control group working without automated help: only 67% of the AI-assisted group gave a correct answer, compared to 79% of the control group.

The results were similar for questions two through four. Question five, “Write a function in C that takes a signed integer and returns a string representation of that integer,” yielded mixed results.

“While the results are inconclusive about whether the AI assistant helped or harmed participants, we do see that participants in the [AI assistant] group were significantly more likely to introduce integer overflow errors into their solutions (p < 0.02),” the Stanford researchers said.
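One common way such an overflow creeps into this task (not necessarily the exact mistake participants made) is negating the input to strip the minus sign: in C, -INT_MIN cannot be represented in an int. Below is a minimal sketch, not code from the study, of a conversion that sidesteps that trap by working with negative remainders instead.

```c
/* Illustrative sketch only, not code from the study.
 * A naive conversion that negates the input to strip the sign,
 * e.g. if (n < 0) n = -n;, overflows for INT_MIN because its
 * magnitude exceeds INT_MAX. Working with negative remainders
 * instead avoids that overflow. */
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *int_to_string(int n) {
    char buf[32];                  /* ample for any 32-bit int and sign */
    char *p = buf + sizeof buf;    /* build the digits back to front    */
    int negative = (n < 0);

    *--p = '\0';
    do {
        int digit = n % 10;        /* -9..9; remainder follows n's sign */
        *--p = (char)('0' + (digit < 0 ? -digit : digit));
        n /= 10;
    } while (n != 0);

    if (negative)
        *--p = '-';

    return strdup(p);              /* heap copy; caller frees it */
}

int main(void) {
    char *s = int_to_string(INT_MIN);
    printf("%s\n", s);             /* prints -2147483648, no overflow */
    free(s);
    return 0;
}
```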

The authors conclude that AI assistants should be treated with caution, as they can mislead inexperienced developers and create security vulnerabilities.

At the same time, they hope their findings will lead to improvements in the way AI coding assistants are designed, since these tools have the potential to make programmers more productive and to lower barriers to entry into the industry.