More regulation could make the job of detecting whether academic writing has been generated by artificial intelligence easier, amid concerns that tools created for this purpose are suffering from low accuracy rates and inbuilt biases.
Universities worldwide have embraced the use of AI detectors to combat the rising concern that the likes of ChatGPT and its successor GPT-4 can help students cheat on assignments, although many remain wary as an increasing body of evidence shows that they struggle in real-world scenarios.
In a paper, researchers based across European universities concluded that “the available detection tools are neither accurate nor reliable and have a main bias towards classifying the output as human-written rather than detecting AI-generated text”. This followed another paper that showed that students whose second language was English were being disproportionately penalised because their vocabularies were more limited than native English speakers’.
A third paper, from academics at the University of Maryland, confirmed inaccuracy concerns and found that detectors could be easily outwitted by students using paraphrasing tools to rewrite text initially generated by large language models (LLMs).
One of that study’s authors, Soheil Feizi, assistant professor of computer science, said the flaws in the tools had already had a “real-world impact”, with many cases of students suffering “trauma” after being falsely accused of misconduct.
“The issue is that the ‘AI detection camp’ is quite powerful and is successful in muddying the water: they often evaluate their detection accuracy under unrealistic or very specific scenarios and don’t report the full spectrum of false positive and detection rates,” he added.
One of the detectors Dr Feizi tested was the model created by OpenAI, the company behind ChatGPT, which was recently shelved in a move that many viewed as evidence that detection could not be done.
Turnitin, whose detector generally scored higher than most in the studies but did not prove infallible, recently revealed that its tool has already been used 65 million times.
Annie Chechitelli, the company’s chief product officer, said the product was helping maintain “fairness and consistency in classrooms” but was also still “evolving” and the next step was to help educators better understand the numbers the detector produces and what this might indicate.
Swansea University was not yet using Turnitin, according to Michael Draper, a professor of legal education who also serves as the university’s academic integrity director.
He said he had “mixed feelings” about detection. “If you use a detection tool as a primary means of evidence when accusing a student of committing misconduct, then you are on a hiding to nothing,” he said.
“But I think using it as a first step is legitimate. You can then have an exploratory conversation with a student in relation to their submission. Some may volunteer they have used AI, or it will become clear they can’t adequately explain how they have arrived at their answer.”
Professor Draper said universities should consider asking students to submit a “research trail” alongside their final draft to show their workings out, which could form part of the assessment.
“These things can also be fabricated, but it is still a useful extra step in detection,” he said. “Anyway, it would be beneficial for students to develop this skill.”
AI detection was not going to go away, however, according to Professor Draper, who pointed to a recent voluntary commitment made in the US by many of the major companies creating LLMs to develop “robust technical mechanisms to ensure that users know when content is AI generated, such as a watermarking system”.
This, he said, would likely be followed by regulation if adequate detection methods were not produced voluntarily, in a “turning of the tide” against companies that “have a vested commercial interest in not having detection”.
“There is increasing recognition that we need to have the ability to differentiate between AI- and human-written text for a number of ethical and legal reasons. It is in everyone’s interest long term to know if something is AI generated or not,” Professor Draper said.
“Some people say detection will never keep up. That’s true when it’s an independent company trying to second-guess what will happen next, but when you have a commitment from the AI companies themselves to create a means of detection, you are on a much stronger wicket.”
Savvy and determined students will find ways around watermarking, but another issue was the blurring of the lines between AI and human writing as chatbots become embedded into everyday programs, according to Mike Sharples, emeritus professor at the Open University’s Institute of Educational Technology.
For example, “Copilot”, Microsoft’s soon-to-launch AI assistant, promises to be able to “shorten, rewrite or give feedback” on a user’s written work.
“Rather than generating an entire essay with AI, students will just press the ‘continue’ button or equivalent when they get stuck,” said Professor Sharples.
“Or use it to rewrite a section, or to suggest references. AI will become part of the workflow. It will become increasingly difficult for AI detectors to call out these ‘AI-assisted’ student assignments.”








