The race to prevent ‘the worst case scenario for machine learning’

Dave Willner has had a front-row seat to the evolution of the worst things on the internet.

He started working at Facebook in 2008, back when social media companies were making up their rules as they went along. As the company’s head of content policy, it was Willner who wrote Facebook’s first official community standards more than a decade ago, turning what he has said was an informal one-page list that mostly boiled down to a ban on “Hitler and naked people” into what is now a voluminous catalog of slurs, crimes and other grotesqueries that are banned across all of Meta’s platforms.

So last year, when the San Francisco artificial intelligence lab OpenAI was preparing to launch Dall-E, a tool that allows anyone to instantly create an image by describing it in a few words, the company tapped Willner to be its head of trust and safety. Initially, that meant sifting through all of the images and prompts that Dall-E’s filters flagged as potential violations – and figuring out ways to prevent would-be violators from succeeding.

It didn’t take long in the job before Willner found himself considering a familiar threat.

Just as child predators had for years used Facebook and other major tech platforms to disseminate pictures of child sexual abuse, they were now trying to use Dall-E to create entirely new ones. “I am not surprised that it was a thing that people would attempt to do,” Willner said. “But to be very clear, neither were the folks at OpenAI.”

For all of the recent talk of the hypothetical existential risks of generative AI, experts say it is this immediate threat – child predators using new AI tools already – that deserves the industry’s undivided attention.

In a newly published paper by the Stanford Internet Observatory and Thorn, a nonprofit that fights the spread of child sexual abuse online, researchers found that, since last August, there has been a small but meaningful uptick in the amount of photorealistic AI-generated child sexual abuse material circulating on the dark web. According to Thorn’s researchers, this has manifested for the most part in imagery that uses the likeness of real victims but visualizes them in new poses, being subjected to new and increasingly egregious forms of sexual violence. A majority of these images, the researchers found, have been generated not by Dall-E but by open-source tools that were developed and released with few protections in place.

In their paper, the researchers reported that less than 1% of child sexual abuse material found in a sample of known predatory communities appeared to be photorealistic AI-generated images. But given the breakneck pace of development of these generative AI tools, the researchers predict that number will only grow.

“Within a year, we’re going to be reaching very much a problem state in this area,” said David Thiel, the chief technologist of the Stanford Internet Observatory, who co-wrote the paper with Thorn’s director of data science, Rebecca Portnoff, and Thorn’s head of research, Melissa Stroebel. “This is absolutely the worst case scenario for machine learning that I can think of.”

‘We Trust People’

In 2003, Congress passed a law banning “computer-generated child pornography” – a rare instance of congressional future-proofing. But at the time, creating such images was both prohibitively expensive and technically complex.

The cost and complexity of creating these images had been steadily declining, but that changed decisively last August with the public debut of Stable Diffusion, a free, open-source text-to-image generator developed by Stability AI, a machine-learning company based in London.

In its earliest iteration, Stable Diffusion placed few limits on the kind of images its model could produce, including ones containing nudity. “We trust people, and we trust the community,” the company’s chief executive, Emad Mostaque, told The New York Times last fall.

In a statement, Motez Bishara, the director of communications for Stability AI, said that the company prohibited misuse of its technology for “illegal or immoral” purposes, including the creation of child sexual abuse material. “We strongly support law enforcement efforts against those who misuse our products for illegal or nefarious purposes,” Bishara said.

Because the model is open-source, developers can download and modify the code on their own computers and use it to generate, among other things, realistic adult pornography. In their paper, the researchers at Thorn and the Stanford Internet Observatory found that predators have tweaked those models so that they are capable of creating sexually explicit images of children, too. The researchers demonstrate a sanitized version of this in the report by modifying one AI-generated image of a woman until it looks like an image of Audrey Hepburn as a child.

Stability AI has since released filters that try to block what the company calls “unsafe and inappropriate content.” And newer versions of the technology were built using data sets that exclude content deemed “not safe for work.” But, according to Thiel, people are still using the older model to produce imagery that the newer one prohibits.

Unlike Stable Diffusion, Dall-E is not open-source and is only accessible through OpenAI’s own interface. The model was also developed with many more safeguards to prohibit the creation of even legal nude imagery of adults. “The models themselves have a tendency to refuse to have sexual conversations with you,” Willner said. “We do that mostly out of prudence around some of these darker sexual topics.”

Open Questions

Thorn has a tool called Safer, which scans images for child abuse and helps companies report them to the National Center for Missing and Exploited Children, which runs a federally designated clearinghouse of suspected child sexual abuse material. OpenAI uses Safer to scan content that people upload to Dall-E’s editing tool. That’s useful for catching real images of children, but Willner said that even the most sophisticated automated tools could struggle to accurately identify AI-generated imagery.

That is an emerging concern among child safety experts: that AI will not just be used to create new images of real children but also to make explicit imagery of children who do not exist.

That content is illegal on its own and will need to be reported. But this possibility has also led to concerns that the federal clearinghouse may become further inundated with fake imagery that would complicate efforts to identify real victims. Last year alone, the center’s CyberTipline received roughly 32 million reports.

“If we start receiving reports, will we be able to know?” said Yiota Souras, the general counsel of the National Center for Missing and Exploited Children. “Will they be tagged or be able to be differentiated from images of real children?”

At least some of those answers will need to come not just from AI companies, like OpenAI and Stability AI, but from companies that run messaging apps or social media platforms, like Meta, which is the top reporter to the CyberTipline.

Last year, more than 27 million tips came from Facebook, WhatsApp and Instagram. Already, tech companies use a classification system, developed by the industry alliance Tech Coalition, to categorize suspected child sexual abuse material by the victim’s apparent age and the nature of the acts depicted. The Thorn and Stanford researchers argue that these classifications should be broadened to also reflect whether an image was computer-generated.

In a statement to The New York Times, Meta’s global head of safety, Antigone Davis, said, “We’re working to be purposeful and evidence-based in our approach to AI-generated content, like understanding when the inclusion of identifying information would be most beneficial and how that information should be conveyed.” Davis said the company would be working with the National Center for Missing and Exploited Children to determine the best way forward.

Beyond the responsibilities of platforms, researchers argue that there is more that AI companies can be doing. Specifically, they could train their models to not create images of child nudity and to clearly identify images as generated by artificial intelligence as they make their way around the internet. This would mean baking a watermark into those images that is more difficult to remove than the ones either Stability AI or OpenAI have already put in place.

As lawmakers look to regulate AI, experts view mandating some form of watermarking or provenance tracing as key to fighting not only child sexual abuse material but also misinformation.

“You’re only as good as the lowest common denominator here, which is why you want a regulatory regime,” said Hany Farid, a professor of digital forensics at the University of California, Berkeley.

Farid is responsible for developing PhotoDNA, a tool introduced in 2009 by Microsoft, which many tech companies now use to automatically find and block known child sexual abuse imagery. Farid said tech giants were too slow to use that technology after it was developed, enabling the scourge of child sexual abuse material to openly fester for years. He is working with a number of tech companies to create a new technical standard for tracing AI-generated imagery. Stability AI is among the companies planning to follow this standard.

Another open question is how the court system will treat cases brought against creators of AI-generated child sexual abuse material – and what liability AI companies will have. Though the law against “computer-generated child pornography” has been on the books for two decades, it’s never been tested in court. An earlier law that tried to ban what was then referred to as virtual child pornography was struck down by the Supreme Court in 2002 for infringing on speech.

Members of the European Commission, the White House and the U.S. Senate Judiciary Committee have been briefed on Stanford and Thorn’s findings. It is critical, Thiel said, that companies and lawmakers find answers to these questions before the technology advances even further to include things like full motion video. “We’ve got to get it before then,” Thiel said.
