Not ‘safe anymore’: Singaporean ex-Twitter adviser warns of faltering fight against child sex exploitation
In place of human interventions, Twitter is now leaning heavily on automation to moderate harmful content posted by users, its new head of trust and safety Ella Irwin said earlier in December.
This, however, is not making Twitter a safer place to be, said the ex-employee. “If I were a bad actor, I would be testing the system now.”
These recent developments are a “huge concern”, said Eirliani Abdul Rahman, a Singaporean child safety activist and founding member of Twitter’s Trust and Safety Council.
“I don’t think it’s safe anymore for children online,” she told CNA.
“You cannot do content moderation just shooting from the hip. It is done through evidence-based work.”
In protest at Twitter’s declining online safety standards, she and two fellow advisers on the council resigned last week.
On Monday (Dec 12), Twitter dissolved the council, which it formed in 2016 with 100 members.
The company did not respond to CNA’s request for comment on its enforcement efforts against child sexual exploitation.
TWEETING IN CODE
The ex-employee outlined to CNA how automated machine-learning models often struggle to catch up with the evolving modus operandi of perpetrators of child sexual abuse material.
Trading such content on Twitter involves treading a fine line between being obvious enough to prospective “buyers” yet subtle enough to avoid detection.
In practice, this means speaking in ever-changing codewords to try to evade enforcement.
For example, “MAP” is a common acronym used to identify oneself as a “minor-attracted person”, but over time this has evolved into the use of the map emoji 🗺️ and even the term “cartographer”.
With fewer content moderators and domain specialists at Twitter to keep track of such changes, there is a danger that abusers will seize the opportunity to coordinate yet another set of codewords that automated systems and a smaller team cannot quickly pick up, said the ex-employee.
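To make that lag concrete, here is a minimal sketch of a static keyword filter of the kind such automated systems rely on. It is not Twitter’s actual tooling; the blocklist term is taken from the article’s example, and the function name is hypothetical.

```python
import re

# Hypothetical blocklist frozen at a point in time; the term comes from the
# article's example and is for illustration only.
KNOWN_CODEWORDS = {"map"}

def flags_tweet(text: str) -> bool:
    """Return True if the tweet contains any currently listed codeword as a whole word."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return bool(tokens & KNOWN_CODEWORDS)

# Once abusers coordinate a new euphemism, the filter silently misses it until
# a human specialist notices the shift and updates the list.
print(flags_tweet("self-described map"))           # True  - caught by the current list
print(flags_tweet("self-described cartographer"))  # False - newer codeword slips through
```

A filter like this is only as current as the specialists maintaining it, which is why shrinking that team widens the gap the ex-employee describes.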
Aside from keywords, image hashes – digital fingerprints of known child sexual abuse material – and account behaviour, such as interactions with sexually explicit accounts, are also used to identify violative accounts.
Manual reviews play a key role. For instance, moderators use criteria like the Tanner scale, a sexual maturity rating, to determine from physical characteristics whether an individual appearing in sexually explicit material is underage.
Rarer but highly harmful activity, such as the creation of new child sexual abuse material or the solicitation of sexual services from a minor, is especially hard to track down with automation.
This is because the real human beings behind those accounts talk in organic ways that are not scripted, and no image hash exists for new visual material, said the ex-employee.
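The hash check described above works roughly as follows. This is a minimal exact-match sketch under assumed names, not Twitter’s actual pipeline; production systems typically use perceptual hashes (such as PhotoDNA) that survive resizing and re-encoding, and the database entry here is a placeholder.

```python
import hashlib

# Placeholder standing in for a database of fingerprints of known abuse material.
KNOWN_ABUSE_HASHES = {
    "0" * 64,
}

def image_fingerprint(image_bytes: bytes) -> str:
    """Compute a digital fingerprint for an uploaded image."""
    return hashlib.sha256(image_bytes).hexdigest()

def matches_known_material(image_bytes: bytes) -> bool:
    """True only if this exact fingerprint already exists in the database.

    Newly created abuse material has no entry yet, which is why hash matching
    alone cannot catch it, as the ex-employee notes.
    """
    return image_fingerprint(image_bytes) in KNOWN_ABUSE_HASHES
```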
The ex-employee added that any automated process to identify violative tweets also needs to be nuanced enough not to take down “benign uses” of terms associated with child sexual exploitation.
“A victim drawing attention to their plight, having no easy way to do so and in a compromised situation or state of mind, might easily use problematic hashtags and keywords,” they said.
Failure to distinguish such uses of language could instead end up silencing and re-victimising those suffering from child sexual abuse, they said.
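As a toy illustration of that trade-off – not a description of Twitter’s rules – the sketch below routes keyword matches that also look like a cry for help to a human reviewer instead of removing them outright. Every phrase, label and routing rule in it is an assumption.

```python
# Hypothetical routing logic; the cue phrases and outcomes are assumptions.
ABUSE_KEYWORDS = {"map"}                                            # same toy blocklist as before
HELP_SEEKING_CUES = {"help me", "report", "i am being", "please"}   # assumed victim-outreach cues

def route_tweet(text: str) -> str:
    """Return a moderation outcome for a tweet, given the toy signals above."""
    lowered = text.lower()
    if not any(word in lowered.split() for word in ABUSE_KEYWORDS):
        return "no action"
    # A purely keyword-driven system would remove the tweet at this point.
    # A more nuanced process escalates possible victim outreach to a human
    # reviewer instead of silencing it automatically.
    if any(cue in lowered for cue in HELP_SEEKING_CUES):
        return "escalate to human review"
    return "remove and review account"

print(route_tweet("please help me report this map account"))  # escalate to human review
```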