Many major news stories begin their journey to public consciousness via social media. Witness the cellphone video shot by a bystander showing the killing of George Floyd and the videos of the Capitol insurrection of Jan. 6, 2021. With its vast and growing palette of digital tools, open source intelligence has become a forensic art, applied in both journalism and criminal investigations.
To be clear, open source investigation does not make use of hacking. Open source refers to the ready public availability of the platforms and applications used in an investigation, from recognizable things like social media sites and internet search engines to more specialized tools like Fingerprinting Organizations with Collected Archives (used for metadata analysis) and Shodan (a search engine for internet-connected devices like refrigerators and robotic vacuum cleaners).
The open source revolution began with the simultaneous appearance of smartphones, social media platforms and an explosion of applications for gathering and analyzing digital material — all available for the first time between 2005 and 2008. Since then, the geography of online space has rapidly grown.
These now-indispensable tools for visual investigations include publicly available cellphone videos, satellite imagery and social media posts, but also aircraft trackers, maritime vessel trackers, weapons and munitions databases, archiving tools, webcams (in many public spaces worldwide), reverse image search tools and many others. (For a more complete list, BBC Africa Eye offers a useful forensics dashboard for journalists.) Reporting on such events as riots, rallies and presidential inaugurations? Crowd-counting tools (e.g., mapchecking.com) are now go-tos for a rough estimate of crowd numbers.
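The arithmetic behind crowd-counting tools like mapchecking.com is simple: multiply the area the crowd occupies by an assumed density. A minimal sketch of that logic, with the plaza dimensions and density figures below being illustrative assumptions rather than measurements from any real event:

```python
# Rough crowd estimate: occupied area (m^2) x assumed density (people/m^2).
# Densities are rules of thumb: ~1 person/m^2 for a loose crowd,
# ~2/m^2 for a moderate crowd, ~4/m^2 for a tightly packed one.
# All figures here are illustrative assumptions.

def estimate_crowd(area_m2: float, density_per_m2: float) -> int:
    """Return a rough headcount for a crowd filling area_m2."""
    return round(area_m2 * density_per_m2)

# Hypothetical rally: a 150 m x 60 m plaza.
area = 150 * 60                       # 9,000 square meters
low = estimate_crowd(area, 1.0)       # loose crowd
high = estimate_crowd(area, 4.0)      # packed crowd
print(f"Estimated attendance: {low:,} to {high:,}")
```

The wide low-to-high spread is the point: these tools produce bounded ranges, not precise counts, which is why reporters cite them only for rough estimates.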
Not all crowd sources are so formal, though. There’s a steady online presence of amateur sleuths contributing by monitoring air and marine traffic in search of anomalous activity. (Donald and Melania Trump’s Christmas 2018 flight to visit U.S. troops in Iraq under a false call sign, for example, was identified by amateur investigators while the plane was still in the air.) Other enthusiasts watch — and post information on — the movements of warships. And some brave souls dedicate their free time to archiving material posted by violent jihadists before content moderators remove it.
Then there are tools that haven’t yet made it to the mainstream. Tesla, for example, with its aspirations for fully self-driving cars, each with eight cameras pointing in different directions, adds another level of density to our visual saturation and could well become a source of video recordings for investigators. The landscape of digital investigative tools is constantly shifting as applications and platforms come and go, but the direction is toward specialization, richness and variety.
Some have taken the lead in adapting to this area of innovation. Major news outlets like The New York Times, CNN, BBC Africa Eye, The Guardian and Der Spiegel have teams dedicated to what they call visual forensics or visual investigations. In their “about” pages, they use words like “sleuthing,” “trust,” “rigor” and “accountability journalism” to describe their dedication to truth through digital forensics.
Other agencies combine the roles of news outlet and investigative non-governmental organization. ProPublica and The Intercept have used digital investigations in features on such topics as political corruption, police misconduct and corporate malfeasance.
Arguably the most influential journalistic nonprofit is the Amsterdam-based Bellingcat. Since its founding in 2014, Bellingcat has drawn the world’s attention with its innovative use of open source techniques in several high-profile investigations, including the downing of flight MH17 by a Russian missile over occupied Ukraine and the poisoning of Sergei Skripal by Russian agents in the United Kingdom.
Digital investigations by journalists do not have to be the exclusive prerogative of well-resourced outlets making use of specialized teams. Even outlets that shade into the realm of tabloids have made use of open source investigation techniques. The Daily Beast, with a series of online stories, revealed the identities of white supremacists who participated in the Charlottesville rally in 2017.
One of the main things open source investigations provide is the capacity to explore regions that would otherwise be difficult or impossible to access. You do not have to be a war reporter risking life and liberty to reach many sites of conflict. Travel to Timbuktu, Mali, the setting of my study of war crimes in my book “#HumanRights” and my novel “The Memory Seeker,” would have been perilous, even with the nearby presence of UN peacekeepers, but Google Street View still provided explorable imagery from 2018 and 2019 around the Djinguereber Mosque.
From this, it was possible for me to see not only the exterior of the mosque with its mud-daub construction, but such details as what people wore (who knew Crocs would find their way that far?), what street hawkers were selling, where people hung their laundry, what kind of motorcycles people were driving (including one in the process of being repaired at the roadside). The near total absence of women and girls in public spaces was telling.
Exploring this potential further, I found many images in Google Maps from Aleppo, Syria, with astonishing evidence of life continuing amid the destruction of war. Similar imagery was available from Mariupol and other cities in Ukraine. This may not all find its way into a journalist’s report, but a few minutes of perusal can still provide the beginning of a sense of a place.
Of course, there’s a risk in using open source material.
Journalists increasingly struggle against falsehoods intended to compromise their work. Whether you call them “deepfakes” or “misleading visuals,” the increasingly potent tools for pranks and propaganda are (or should be) constantly present in investigators’ minds. They kill stories and reputations. With the advent of powerful artificial intelligence platforms, faked imagery will become more difficult to detect, and the digital arms race between those creating faked imagery and those exposing it will intensify.
There is no complete set of rules for authenticating open source video. Authentication relies on classic investigative ingenuity: simple things like noting regional dialects in accents, comparing the hands on someone’s watch with the time stamp on the video, or spotting inconsistencies in the length of shadows that reveal a composite image cobbled together from different source material. Videos are most often geolocated by comparing features — buildings, roads, mountain ranges, trees — with satellite imagery from Google Earth and Maxar, as well as geolocated photographs from Google Maps.
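The shadow check, in particular, reduces to trigonometry: an object of height h under a sun at elevation e casts a shadow of length h / tan(e), so every object in a genuine single-exposure image should show roughly the same shadow-to-height ratio. A sketch of that consistency test (the scene objects and measurements are hypothetical, scaled as if taken from a video still):

```python
import math

def shadow_length(height_m: float, sun_elevation_deg: float) -> float:
    """Expected shadow length for an object under a given sun elevation."""
    return height_m / math.tan(math.radians(sun_elevation_deg))

def consistent(objects, tolerance=0.15):
    """Check whether measured shadow/height ratios agree within a tolerance.

    In a genuine single-exposure image every object shares one sun elevation,
    so the ratios should match; a mismatch suggests a composite.
    objects: list of (height_m, measured_shadow_m) pairs.
    """
    ratios = [shadow / height for height, shadow in objects]
    return max(ratios) - min(ratios) <= tolerance * max(ratios)

# Hypothetical measurements: a person, a signpost and a bollard.
scene = [(1.8, 3.1), (3.0, 5.2), (0.9, 1.55)]
print(consistent(scene))  # ratios all ~1.72: one light source, plausible
```

The tolerance absorbs measurement error from perspective and uneven ground; a ratio that is wildly off for one object is a red flag, not proof, and would prompt closer inspection.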
How can a video be authenticated by journalists and other investigators when the source is suspect? Surprisingly often, people implicate themselves in war crimes with videos uploaded to social media. A good example comes from the work of the International Criminal Court, assisted by investigators from Bellingcat, leading up to the indictment of Mahmoud al-Werfalli for the war crime of murder in 2017. While serving as leader of the Al-Saiqa brigade under the command of Khalifa Haftar in Libya, al-Werfalli posted videos of himself and his soldiers executing prisoners in seven “incidents” involving 33 victims. In the video referred to by the ICC as “Incident 7,” the prisoners are dressed in orange jumpsuits, kneeling, their hands tied behind their backs, as al-Werfalli stares into the camera and gives orders to his men to open fire. The video depicted a mass extrajudicial execution and a clear war crime. But how could investigators be sure the video wasn’t fake? And how could skeptical judges be persuaded to admit this new form of digital evidence?
The solution in this case offers a useful illustration of how authentication can be accomplished. Investigators downloaded videos of the murders from Facebook (using fbDown.net and Helper Tools for Instagram, among others), assembled video stills into a single panoramic image (with Agisoft Metashape), and posted a message to “the crowd” — using the crowd-verification platform Check — asking for help identifying the location. One Libyan participant identified it as an abandoned Chinese construction site on the outskirts of Benghazi. Satellite imagery (from TerraServer — alas, no longer available — and Google Earth Pro) confirmed the exact location of the site and the position of the camera used to film the incident.
Investigators did this simply by matching the precise location of walls and half-constructed apartments in the background of the video with satellite imagery. Going over satellite imagery from previous weeks revealed a cluster of spots on the ground in the view taken immediately after the incident that was not there the week before. Zooming in to a ground-level view and comparing it to the video imagery revealed the dark spots (likely blood) to be a precise match of the location of the bodies of prisoners killed in the incident. This was an important confirmation that the incident had taken place and the digital evidence was authentic.
Finally, SunCalc, a tool that calculates the time a video was taken from the shadows cast, confirmed that the executions took place at 6:37 a.m. on July 17, 2017. The ICC, in its indictment of al-Werfalli, referred specifically to the analysis of Incident 7, with its “indicia of authenticity,” as an important contribution to the investigation leading to the indictment. This story, like many others from the ICC, ends with impunity. Briefly detained pending extradition to the Netherlands, then released, al-Werfalli was later killed in battle, and the ICC formally terminated proceedings against him on June 15, 2022.
Nothing in the al-Werfalli investigation involved technology or skills beyond the reach of even local newsrooms. By definition, open source technology is available to anyone. And as for skill, a bit of training and some digital dexterity can produce astonishing results, to the point that amateurs are getting into the game.
A good example of engaged amateurs — which quickly became a story in itself — comes from the insurrection at the U.S. Capitol building on Jan. 6, which produced a mountain of visual evidence. When investigators put out calls to their followers to collect it, the response was overwhelming. Bellingcat, CitizenLab and the r/DataHoarders community on Reddit were among the organizations that reached out for evidence and received in return vast troves of visual data. The New York Times Visual Investigations team used such material, most of it filmed by rioters themselves, to track key participants and produce a groundbreaking story on the riot, complete with original visuals.
As open source material becomes more important, witnesses will likely take greater risks to provide it. The Whistle, based at Cambridge University, is one of the organizations dedicated to helping witnesses perform the difficult juggling act of recording useful material, preserving it and staying safe.
There’s also the growing challenge resulting from the arrival of ChatGPT and other large language models (LLMs), which have recently thrown open the doors of digital innovation and walked through with a brazenness that has taken many by surprise. It is difficult at this point to see how far and in what ways these publicly available AI tools will take us. Paths are already being forged by AI in war crimes investigations. Syrian Archive, with its 3.5 million videos, for example, has trained machine-learning algorithms to sift through its database in search of cluster munitions, using the same simple binary — cluster bomb/not cluster bomb — now applied toward medical diagnoses, as in cancer/not cancer. LLMs take the power of AI a step or two further, in ways that could well revolutionize journalism. They are, for example, able to take textual data from a wide range of sources, including social media posts, blogs, forums and news articles, to extract patterns, identify sentiments and predict future trends.
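The cluster bomb/not cluster bomb binary is just two-class classification. Syrian Archive’s actual pipeline runs trained vision models over video frames; the toy sketch below only illustrates the underlying idea, standing in invented three-number “feature vectors” for frames and a nearest-centroid rule for the model. Every number and label here is an assumption for illustration:

```python
# Toy illustration of binary classification ("cluster" vs. "not_cluster").
# Real systems use trained vision models on video frames; these invented
# feature vectors and the nearest-centroid rule are stand-ins.

def centroid(vectors):
    """Average the labeled examples of one class into a single point."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def classify(sample, positive_centroid, negative_centroid):
    """Label a sample by whichever class centroid it sits closer to."""
    if distance(sample, positive_centroid) < distance(sample, negative_centroid):
        return "cluster"
    return "not_cluster"

# Hypothetical labeled training examples:
cluster_examples = [[0.9, 0.8, 0.7], [0.85, 0.9, 0.6]]
other_examples = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.25]]
pos_c, neg_c = centroid(cluster_examples), centroid(other_examples)

print(classify([0.8, 0.75, 0.65], pos_c, neg_c))  # near the positives: "cluster"
```

The value for an archive of millions of videos is triage: a crude binary filter surfaces candidate frames for a human investigator, who makes the actual judgment.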
One should beware of using LLMs for information, though. They have a tendency to “hallucinate,” to fill in knowledge gaps with falsehoods. If, for example, I had relied on ChatGPT’s description of the landscape in the interior of British Columbia, I would have described the Thompson River as a lake. There would’ve been no coming back from that error.
A few years ago, Bellingcat founder Eliot Higgins told me, “We’re building the plane as we’re flying it.” Now, however, the model of journalism based on digital investigation appears fully airworthy. Visual investigations teams in major outlets are developing a new style of reporting, and many of their methods are available to everyone.
There may be no keeping up with all the emerging tools, but curiosity and experimentation with them can take journalists a long way toward deeper, more effective investigative reporting.
Ronald Niezen is a professor of practice in sociology and political science/international relations at the University of San Diego. He is the author of many books, including “#HumanRights” and the novel “The Memory Seeker,” for which he received training in open source investigation from Bellingcat, Berkeley’s Center for Human Rights and the International Institute for Criminal Investigations.