Emoji are everywhere – and that includes your evidence.
What is an emoji? It is a small digital image or icon used to express an idea or an emotion. They are much like emoticons, but emoji are actual pictures instead of typographics. Originally meaning pictograph, the word emoji comes from Japanese e (絵, “picture”) + moji (文字, “character”); the resemblance to the English words emotion and emoticon is purely coincidental. The ISO 15924 script code for emoji is Zsye.
Preston Farley, a Special Investigator with the Federal Aviation Administration (FAA), believes “emoji will emerge as a prominent form of communication sooner rather than later,” and that there are potential ramifications for digital forensics examiners and investigators when it comes to analyzing and testifying about emoji.
Presenting at the Techno Security and Digital Investigations conference in Myrtle Beach in June 2019, Farley explained that emoji present two distinct challenges.
First, like all language, emoji can complicate communication. The recipient of a text with an emoji may not interpret it the way the sender intended. If that can happen between two people, then it can happen between the investigators, attorneys, judges, and jurors responsible for determining guilt or innocence in a trial.
Second, emoji in digital evidence can present additional, more technical complications for forensic examiners. Forensic tools don’t always render the emoji in question, or depending on the acquisition device, may not render them in the way the sender viewed them when choosing them.
At that point, Farley said, interpretation can end up being a matter of one witness’ word against another’s. Currently no legal documentation exists to help judges and juries interpret emoji. Some judges may strike emoji evidence entirely, refusing to allow it to be considered alongside other communication. And no scientific standards exist for the collection or analysis of emoji relative to other forms of digital communication.
Emoji: The Technical Basics
Emoji are tiny pictographs that consist of human and animal faces, symbols, flags, food items, plants, leaves, flowers and a variety of other items. First created in 1999 by a Japanese artist, emoji have come to rely on Unicode, which itself was developed in response to the need for non-English character sets. (For a more detailed description of how Unicode came to be, read this brief Medium article.)
Unicode’s code points are expressed in four- to six-digit hexadecimal code, preceded by a U+ (so that the final result is something like U+5449 or, for most emoji, a five-digit value such as U+1F600). In his presentation, Farley explained how the bytes that encode an emoji depend on endianness and on the operating systems on which they appear. In UTF-16, the Unicode byte order mark (“FE FF” for big endian or “FF FE” for little endian) tells the recipient computer which endianness to use, defaulting to big endian encoding if no byte order mark is present.
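To make the byte order point concrete, here is a minimal Python sketch using only the standard library. The choice of U+1F600 (grinning face) and the helper name decode_utf16 are illustrative assumptions, not anything taken from Farley’s presentation. It shows the same emoji serialized in both byte orders, and a decoder that honors a byte order mark and falls back to big endian when none is present.

```python
import codecs

# U+1F600 (GRINNING FACE) lies above U+FFFF, so UTF-16 stores it as a surrogate pair.
emoji = "\U0001F600"

big_endian = emoji.encode("utf-16-be")     # explicit byte order, no BOM written
little_endian = emoji.encode("utf-16-le")  # explicit byte order, no BOM written

print(f"U+{ord(emoji):04X}")   # the code point in U+ notation
print(big_endian.hex(" "))     # d8 3d de 00
print(little_endian.hex(" "))  # 3d d8 00 de


def decode_utf16(raw: bytes) -> str:
    """Decode UTF-16 bytes, honoring a BOM and assuming big endian without one.

    Hypothetical helper for illustration only.
    """
    if raw.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return raw[2:].decode("utf-16-be")
    if raw.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return raw[2:].decode("utf-16-le")
    return raw.decode("utf-16-be")            # no BOM: fall back to big endian


print(decode_utf16(codecs.BOM_UTF16_LE + little_endian) == emoji)
```

Reading the same four bytes with the wrong byte order yields either different characters or a decoding error, which is one way an emoji can go missing between the sender’s device and the examiner’s screen.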
The Unicode codespace has room for more than 1.1 million code points, any of which can be stored using the UTF-8, UTF-16, or UTF-32 encoding forms. Farley said the lexicon of Unicode emoji grows every year, and the standard includes even obscure symbols such as Egyptian hieroglyphs, alchemical symbols from the Middle Ages, and so on.
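As a rough illustration of how one code point maps to different byte sequences, the following Python sketch reports how many bytes each encoding form needs for a few sample characters. The sample characters and the helper name encoded_sizes are assumptions chosen for this example.

```python
import unicodedata

# Size of the Unicode codespace: U+0000 through U+10FFFF.
CODESPACE_SIZE = 0x10FFFF + 1


def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes each encoding form needs for one character."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


print(CODESPACE_SIZE)  # 1114112

# A Latin letter, an Egyptian hieroglyph, an alchemical symbol, and an emoji.
for ch in ("A", "\U00013000", "\U0001F700", "\U0001F600"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), encoded_sizes(ch))
```

The same emoji can therefore appear in evidence as three different byte patterns, depending on which encoding the application or operating system used to store it.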
Among the operating systems, said Farley, Apple’s Macintosh and iOS offer the best Unicode emoji and libraries. Until Windows 10, Microsoft came in as a “distant second,” and based on his 2018 research, Farley notes font libraries need to be installed for Linux even to make it into the game.
Because Unicode is supported by all operating systems, Unicode-based emoji render whether senders and recipients are on Apple, Windows, iOS, or Android platforms. Therein, however, lies the rub: Unicode only suggests — but doesn’t define — an emoji’s appearance. Apart from black-and-white and color standards, how the representation of, say, a “grinning face with smiling eyes” will work is largely up to the authors.
So, while recipients will see a Unicode-encoded emoji regardless of platform, what’s represented across Apple, Google, Windows, Samsung, LG, HTC, Twitter, Facebook, Mozilla, and others might look very different.
A 2018 study on emoji rendering differences stated: “Through a survey of 710 Twitter users who recently posted an emoji-bearing tweet, we found that at least 25% of respondents were unaware that the emoji they posted could appear differently to their followers. Additionally, after being shown how one of their tweets rendered across platforms, 20% of respondents reported that they would have edited or not sent the tweet.”
And that is if the emoji render at all. An emoji encoded on iOS that has no equivalent in Android, for example, might appear only as a blank block. That means that emoji available to a suspect may not render on a victim’s device, so that if the latter is all a forensic examiner has, they may be missing important contextual clues.
How Emoji Are Used
Most people are familiar with emoji from messaging via text and social media. They can contextualize sentiment by adding, say, a smile or a wink; or they might replace words altogether.
The most common usage isn’t the sole usage, however. Farley says Windows offers support for using an emoji as an account name, and among Apple products, emoji can even be used in password creation.
Of course, usernames and passwords aren’t subject to the same interpretive challenges as a text message or social media post. A somewhat humorous 2017 article described the author’s “ongoing struggle to master emoji”: to use these pictographs in a way that others can understand.
In part, that’s because emoji have no grammatical structure that might help to standardize their usage. One study concluded:
“In the case of emoji-only sequencing, they appear to lack the characteristics of complex grammar, instead relying on linear patterning motivated by the meanings of the emoji themselves…. In line with this, emoji appear to be effective for communicative multimodal interactions, often using text and image relationships similar to those between speech and gestures.”
Another pair of researchers concurred with the use of emoji to indicate nonverbal gestures. Gawne and McCulloch wrote: “To best understand emoji, we need to appreciate them for their current function…. They are not necessarily composed of meaningful units, nor do they necessarily build up into more complex units of meaning, like language does. Rather, like gesture, emoji are context-sensitive and have far more flexibility in use than language.”
That concept seems to be behind changes to community standards around the use of sexually charged emoji on Facebook and Instagram. Updated in July and enacted in September 2019 in an apparent effort to limit “sexual solicitation,” the standards ban “Suggestive Elements” such as “[commonly used] sexual emoji or emoji strings… alongside an implicit or indirect ask for nude imagery, sex or sexual partners, or sex chat conversations,” according to the New York Post.
“Communicative multimodal interactions” aren’t all suggestive, of course. In his presentation, Farley said emoji can be useful to communicate high-level ideas without language mastery between less literate people, or those communicating across languages and cultures.
However, what means one thing in one community can mean something totally different elsewhere. In a LinkedIn article, consultant Martin Nikel offered an example: “Emojii use in South Africa shows some significant differences in interpretation to other western cultures. Therefore a conversation between two people from different regional or cultural backgrounds may carry different intent and received understanding.” These issues may need to factor into an investigation.
Emoji And The Law
Eric Goldman, a Santa Clara University law professor and blogger, reported that the number of US cases referring to emoji as evidence increased from 33 in 2017 to 53 in 2018, which accounted for 30 percent of the all-time number of opinion references to emoji.
Although Goldman noted none of these rulings were “substantive” with regard to emoji, pairing this trend with the research on ambiguous grammar and cross-cultural communication makes it easier to see what Goldman called “lawsuits in the making.”
A 2017 article at The Fashion Law posed the following questions:
“When a text — such as a text message, email, or social media post — containing an emoji is presented as evidence, is the emoji significant and unambiguous enough to be presented to the jury? Are some emoji significant, but others not? And if they are important, how is a court to share such information with a jury? By sight? By sound?”
Joseph Remy, a prosecutor and National White Collar Crime Center (NW3C) Advisory Board member, says little case law is on the books currently to guide prosecutors, judges, or investigators on emoji evidence, but that other types of cases are instructive.
“Thirty years ago, the issue was with drugs and drug slang,” Remy explains. “Now we’re attacking the same problem, just in a different way.” In other words, emoji are very similar to the kind of coded language used in the drug trade.
Remy refers to a 2002 case, U.S. v. Garcia, 291 F.3d 127 (2002), regarding a coded verbal conversation. Prosecutors in that case brought in a government informant to testify that the conversation used code words from the asbestos removal industry, where both he and the defendant worked, to arrange a drug deal.
In Garcia, the defendant’s convictions were vacated because the witness hadn’t laid a sufficient foundation for how the defendant knew the code related to drugs and not asbestos. Citing U.S. v. Yannotti, 541 F.3d 112, 126 n.8 (2d Cir. 2008), the Second Circuit provided key advice to law enforcement regarding lay versus expert testimony:
“An undercover agent whose infiltration of a criminal scheme has afforded him particular perceptions of its methods of operation may offer helpful lay opinion testimony…even as to co-conspirators’ action that he did not witness directly. By contrast, an investigative agent who offers an opinion about the conduct or statements of conspirators based on his general knowledge of similar conduct learned through other investigations, review of intelligence reports, or other special training… must qualify as an expert.”
Using Garcia as guidance, Remy says, investigators are commonly asked to establish how their experience — the number of cases they’ve worked that involved particular elements, for instance — leads them to interpret evidence.
For example, the communications of child predators and traffickers are frequently coded. Emoji usually represent what the predator wants to do to their victim; or in a drug or human trafficking case, what they’re selling.
However, Remy says that since emoji generally make up only a small portion of the conversation, they can be interpreted based on an investigator’s experience and context. In other words, an investigator who works enough of the same kinds of cases involving emoji can gain the experience needed to interpret the coded language.
Even so, depending on the type of case, emoji evidence can be highly subjective, which was the crux of a free-speech case, Elonis v. U.S., 575 U.S. (2015). There, the U.S. Supreme Court held that convicting a person of making threats to others requires proof of the defendant’s subjective intent to threaten, not simply a reasonable person’s objective belief that they had been threatened.
Julia Greenberg, writing for Wired in 2015, stated:
“For investigators, attorneys, and jurors trying to determine, or prove, the intent of a phrase, as in the Elonis case, it’s often much more complicated than : ) means this past sentence was pleasing. Maybe the person was being polite. Maybe they were trying to dull a blow. Maybe it’s an evil grin and the person is being ironic. Without the context of who is using the symbol, who received it, and an understanding of how those two people—or the people in their community—typically use it, the intent may not immediately be clear.”
Indeed, Greenberg added, “…what we mean when we use language is never crystal clear, and never has been.” Or as Nikel put it for LinkedIn: “Acronyms and turns of phrase, intent, interpretation… how well does the reviewer really understand communication between two parties… even without emoji?”
These issues matter for emoji such as the gas pump, which Remy says referred to marijuana among teens in the northeastern US in 2015-2016, but which can also refer to a certain sex act in human trafficking cases or, more innocently, to flatulence or the simple act of pumping gas.
Yet emoji interpretation is unlikely to require qualified expert testimony. While Remy doesn’t write off the possibility for a case where emoji content is significant or material, Goldman was quoted in The Verge in early 2019 saying: “Emoji usually have dialects. They draw meaning from their context. You could absolutely talk about emoji as a phenomenon, but as for what a particular emoji means, you probably wouldn’t go to a linguist. You would probably go to someone who’s familiar with that community.”
That’s why a digital forensic examiner could testify to what a given hexadecimal code converts to, says Remy, but could not offer an opinion on what the emoji means. That interpretation would, again, be left to the investigator’s ability to put the emoji in the context of the surrounding conversation.
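The kind of conversion an examiner might testify to can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the bytes are taken to be UTF-8, and the function name describe_utf8_hex is hypothetical. It reports each code point and its official Unicode character name, and deliberately says nothing about what the sender meant by it.

```python
import unicodedata
from typing import List, Tuple


def describe_utf8_hex(hex_string: str) -> List[Tuple[str, str]]:
    """Decode hex-encoded UTF-8 bytes and list each code point with its Unicode name."""
    text = bytes.fromhex(hex_string).decode("utf-8")
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# F0 9F 94 AB is the UTF-8 encoding of U+1F52B.
print(describe_utf8_hex("f0 9f 94 ab"))  # [('U+1F52B', 'PISTOL')]
```

The name returned is the label assigned by the Unicode standard, not a description of how any particular platform draws the character.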
Even so, Farley says: “The domains of knowledge necessary are formidable…. if your case has a significant emoji-related component, you should probably be up to speed on what the forensic basis for them are… [and] plan on and practice being a digital concepts instructor as well.”
Encountering Emoji In Digital Evidence
Websites like EmojiStats.org and emojitracker.com track real-time emoji hits on platforms like Twitter and Facebook Messenger, and while they don’t offer any information regarding the emoji’s context, they can provide a good sense of the emoji you might be most likely to encounter.
As with mobile apps, however, you’ll often encounter some emoji that are unsupported by forensic tools. Farley’s presentation described how tools, relying as they do on operating systems’ encoding, may not properly render emoji. Even with Unicode support and interpretation, in order to be rendered, emoji have to be preloaded in sets. With no built-in palette of what the code means, these emoji can’t be represented on screen.
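One partial check an examiner could run is whether the Unicode database bundled with their own software even knows a given code point. The Python sketch below is an assumption-laden illustration, with a hypothetical function name. It flags characters that the local database classifies as unassigned (general category “Cn”). It does not test whether an installed font has a glyph for the character, which is a separate question.

```python
import unicodedata
from typing import List


def unknown_to_local_database(text: str) -> List[str]:
    """List code points the local Unicode database treats as unassigned (category Cn)."""
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cn"]


# Version of the Unicode Character Database bundled with this Python build.
print(unicodedata.unidata_version)

# An empty list means every character is at least known to this database.
print(unknown_to_local_database("ok \U0001F600"))
```

A non-empty result suggests the text contains characters newer than, or outside, the Unicode version the tool was built against, which is worth noting before concluding that a blank block in a report reflects what the user actually saw.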
This can affect how the investigator interprets what was said. For example, a gun could be represented as a water pistol and may or may not match what the user would see.
A potential additional wrinkle is the Unicode Private Use Area (PUA), which Farley said enables private parties to create their own emoji (often associated with proprietary logos). That makes it possible to use emoji no one else is familiar with.
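Private-use characters are at least easy to spot, even if their meaning is not. The following Python sketch, with a hypothetical function name and sample message, lists every character whose Unicode general category is “Co” (private use). What such a character looked like to the sender depends entirely on the fonts on the originating device.

```python
import unicodedata
from typing import List, Tuple


def find_private_use(text: str) -> List[Tuple[int, str]]:
    """Return (position, code point) for every private-use character in the text."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]


# U+F8FF sits in the Private Use Area; Apple fonts map it to the Apple logo.
print(find_private_use("meet \uf8ff at 5"))  # [(5, 'U+F8FF')]
```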
Calling the concept “passive cryptography,” Farley says the PUA requires too deep a level of technical knowledge for most bad actors to leverage it. “Encryption would accomplish nearly the same thing at this point and is much easier to use.”
Forensic tool support extends beyond rendering and reporting to keyword searches, and potentially other areas, such as artificial intelligence detection tools. “From a long-term perspective,” Farley adds, “as emoji are based on the Unicode standard, whatever the support ‘traditional’ languages receive, the same level of support should be accorded to emoji as well.” That includes live memory support to help capture emoji used in or as passwords, at least on Apple products.
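A keyword search for an emoji has to account for the fact that the same character may be stored in more than one encoding. The Python sketch below is one possible approach, not a description of how any forensic tool works. The function names, the sample buffer, and the choice of three encodings are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def emoji_search_patterns(emoji: str) -> Dict[str, bytes]:
    """Build the byte pattern for one emoji in several common encodings."""
    return {enc: emoji.encode(enc) for enc in ("utf-8", "utf-16-le", "utf-16-be")}


def find_emoji(raw: bytes, emoji: str) -> List[Tuple[int, str]]:
    """Return sorted (offset, encoding) pairs for every occurrence of the emoji."""
    hits = []
    for enc, pattern in emoji_search_patterns(emoji).items():
        start = 0
        while (pos := raw.find(pattern, start)) != -1:
            hits.append((pos, enc))
            start = pos + 1
    return sorted(hits)


fuel_pump = "\u26FD"  # U+26FD FUEL PUMP

# Hypothetical buffer: one UTF-16LE occurrence followed by one UTF-8 occurrence.
raw = b"\x00" * 4 + fuel_pump.encode("utf-16-le") + ("see " + fuel_pump).encode("utf-8")
print(find_emoji(raw, fuel_pump))  # [(4, 'utf-16-le'), (10, 'utf-8')]
```

Short byte patterns such as two-byte UTF-16 sequences can match unrelated binary data, so hits from a search like this would still need to be verified in context.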
No vendor representatives were available for comment as to the degree of support their tools do or don’t provide, though Farley speculates, “…there has not been much of a demand for this type of support by the end-user community…. If the [forensic] tools are rendering most of the emoji seen for most of the cases worked and the judicial system isn’t raising a ruckus, then everybody’s happy,” giving vendors no reason to spend limited resources on emoji artifacts.
Remy adds that emoji support may be viewed as “too big a task” across so many platforms, so that vendor support depends on a critical mass of cases demanding it. What would critical mass be? “When the contents of a message in digital format consists of a majority of the conversation in emoji such that the code cannot be interpreted using context, previous interaction and/or experience,” Remy says.
Right now, because those contents can often be interpreted based on other messages, vendor support isn’t needed. Remy believes it’s simply too soon to tell whether emoji communications are just trendy, or whether they’ll catch on. If they do, though, he anticipates that vendor support might take the form of a language pack, available for purchase like any language other than the examiner’s primary selected language.
Could forensic examiners code their own forensic tools? “If examiners are writing code, they must have a good grasp of what the software is doing with the evidence,” Farley says. “If they are unaware of why there are ‘strange’ characters mixed in with their Unicode and they ignore them, that’s a problem.
“For many examiners, using premade libraries like Pymoji.py will be sufficient because this shows they have a baseline understanding of what they are examining and will have the wherewithal to investigate anomalies such as Unicode characters which don’t render after a Python conversion.”
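Some of the “strange” characters Farley mentions are invisible code points that change how neighboring emoji are drawn. The Python sketch below uses only the standard library rather than any particular emoji package, and its function name and notes are illustrative assumptions. It breaks a string into code points and annotates zero width joiners, variation selectors, and skin tone modifiers.

```python
import unicodedata
from typing import List, Tuple


def explain_sequence(text: str) -> List[Tuple[str, str, str]]:
    """Return (code point, Unicode name, note) for each character in the text."""
    rows = []
    for ch in text:
        cp = ord(ch)
        if cp == 0x200D:
            note = "zero width joiner: combines neighbors into one glyph"
        elif 0xFE00 <= cp <= 0xFE0F:
            note = "variation selector: requests text or emoji presentation"
        elif 0x1F3FB <= cp <= 0x1F3FF:
            note = "skin tone modifier: alters the preceding emoji"
        else:
            note = ""
        rows.append((f"U+{cp:04X}", unicodedata.name(ch, "<unnamed>"), note))
    return rows


# A family emoji is commonly encoded as MAN + ZWJ + WOMAN + ZWJ + GIRL.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
for row in explain_sequence(family):
    print(row)
```

A tool that drops or ignores the joiners would show three separate people instead of the single family glyph the sender chose, which is the kind of anomaly worth investigating rather than dismissing.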
For the moment, emoji are used largely to enhance conversation, not to conduct it. In other words, they aren’t completely irrelevant as evidence, but because most digital messages still rely on words, they probably aren’t pivotal, either.
Still, as online communication continues to evolve, it makes sense for digital forensic examiners — and the investigators and prosecutors they work with — to be aware of the practical and legal challenges they may encounter in the future.