Here’s something to consider: researchers who work with machine-learning artificial intelligence (AI) often don’t know exactly how their algorithms solve the problems they are given.
For example, AI can identify race from X-rays in ways that humans can’t see, and a Facebook AI started developing its own language. Joining them is probably everyone’s favorite text-to-image generator, DALLE-2.
Computer science doctoral student Giannis Daras noticed that the DALLE-2 system, which creates images based on text input prompts, returns nonsense words as text in some cases.
“A known limitation of DALLE-2 is that it struggles with text,” he wrote in a paper published on the preprint server arXiv. “For example, text prompts such as ‘image of the word airplane’ often result in generated images depicting garbled characters.”
“We found that this generated text was not random, but revealed a hidden vocabulary that the model seemed to be developing internally. For example, the model often generated airplanes when this garbled text was fed back in.”
In an example posted on Twitter, Daras explained that when DALLE-2 was asked to show a captioned conversation between two farmers, it showed them talking, but the speech bubbles were filled with what looked like utter nonsense.
Daras, however, had the idea of feeding these meaningless words back into the system to see whether the AI had given them a meaning of its own. When he did so, he found that the words did seem to mean something to the AI: the farmers were talking about vegetables and birds.
If Daras is right, he believes this would have security implications for text-to-image generators.
“The first security issue involves using these gibberish prompts as backdoor adversarial attacks or as a way to evade filters,” he wrote in the paper. “Currently, natural language processing systems filter text prompts that violate policy rules, and garbled prompts may be used to bypass these filters.”
“More importantly, absurd prompts that consistently generate images challenge our confidence in these large generative models.”
However, while other algorithms have been shown to create their own languages, the paper has not yet been peer-reviewed, and other researchers are questioning Daras’ claims. Research analyst Benjamin Hilton asked the generator to show two whales talking about food, with captions. When the first few results returned no legible text, garbled or otherwise, he kept going until one did.
“What do I think?” Hilton tweeted. “‘Evve waeles’ is either nonsense or a corruption of the word ‘whale’. Giannis was lucky when his whale said ‘Wa ch zod rea’, which happened to produce a picture of food.”
Additionally, adding other phrases such as “3D rendering” to the gibberish phrases yielded different results, suggesting that their meanings are not consistent.
At least in some cases, the “language” may be closer to noise. We’ll know more once the paper has been peer-reviewed, but it’s still possible that something is going on that we don’t understand.
Hilton added that the phrase “Apoploe vesrreaitais” returns images of birds every time, “so there must be something to it”.