Scores of Facebook posts from the days before and after the January 6 Capitol Hill riots in Washington are missing.
The posts disappeared from Crowdtangle, a tool owned by Facebook that allows researchers to track what people are saying on the platform, according to academics from New York University and Université Grenoble Alpes.
The lost posts, which range from innocuous personal updates to potential incitement to violence to mainstream news articles, have been unavailable within Facebook’s transparency system since at least May 2021. The company told POLITICO that they were accidentally removed from Crowdtangle because of a limit on how Facebook allows data to be accessed via its technical transparency tools. It said that the error had now been fixed.
Facebook did not address the sizeable gap in its Crowdtangle data publicly until contacted by POLITICO, despite ongoing pressure from policymakers about the company’s role in helping spread messages, posts and videos about the violent insurrection, which killed five people. On Friday, U.S. lawmakers ordered the company to hand over reams of internal documents and data linked to the riots, including details on how misinformation, which targeted the U.S. presidential election, had spread.
It is unclear how many posts are still missing from Crowdtangle, when they will be restored, and whether the problem solely affects U.S. content or material from all of Facebook’s 2.4 billion users worldwide. The academics who discovered the problem estimate that tens of thousands of Facebook posts are currently missing.
The failure to disclose the lost posts, which went missing because of a technical error, comes at a difficult time for Facebook and its efforts to promote transparency around what people see within its network.
Following an internal battle, the company is currently dismantling the Crowdtangle team after researchers and journalists repeatedly used the tool to trace how far-right, extremist and false content circulated widely across both Facebook and Instagram. The tech giant also published its own report this month on what content was most widely viewed during the second quarter of this year, primarily highlighting viral spam and links to mainstream sites like YouTube.
But after the New York Times was handed details about the most widely viewed posts from the first three months of the year, Facebook was forced to disclose similar statistics for that period. They showed that misinformation around COVID-19 was still among the most popular content on the site despite the company’s efforts to clamp down on it.
The latest episode underscores longstanding concerns about transparency on Facebook.
“Researchers do assume that they are getting all the public content from Facebook pages that are indexed by Crowdtangle,” said Edelson. “Those assumptions have been violated in this case.”
In response to POLITICO, Facebook said it had now fixed the error related to the missing Crowdtangle data, and that all the original posts were still available directly via Facebook. A spokesperson also said that roughly 80 percent of the missing posts flagged by both NYU and Université Grenoble Alpes researchers should not have been available on Crowdtangle, because they had subsequently been deleted or made private by Facebook users. She declined to comment on how many posts, in total, had gone missing from the Crowdtangle platform.
“We appreciate the researchers bringing these posts to our attention,” said the Facebook spokesperson.
‘Something was clearly wrong’
The researchers first discovered the missing posts after comparing two versions of a Crowdtangle database of Facebook content produced by U.S. media outlets between September 2020 and January 2021.
After the Capitol Hill riots, the academics said they had planned to analyze what type of insurrection-related content Facebook had removed to meet its content moderation policies. But they soon discovered that up to 30 percent of the posts collected in the weeks around the January 6 riots, roughly from December 28, 2020 to January 11, 2021, were missing from the second Crowdtangle database compared to the original.
“We came up tens of thousands of posts short. We knew something was clearly wrong,” said Edelson. “We were able to find some of the posts that we couldn’t find on Crowdtangle, but we were able to find that they were still available on Facebook. That’s when we knew, OK, this isn’t us, there is some kind of real bug here.”
It is unclear how extensive the problem with the Crowdtangle data is.
Facebook did not comment on how many posts were still missing from the system, and POLITICO’s review of the academics’ work found that fewer than half of the roughly 50,000 missing posts were currently available via the transparency tool. The remaining Facebook content was no longer accessible, either because it had been deleted or made private on the global platform, and therefore was not automatically collected on Crowdtangle.
The academics flagged the issue to Facebook on August 3 — hours before the company suspended Edelson and two other researchers’ accounts, including their access to Crowdtangle, for their separate work around political ads.
The researchers said they had not heard back from the company about the missing data, even though academics, journalists and policymakers continue to use the transparency tool in efforts to uncover what happened during the Capitol Hill riots.
“Obviously, my situation with Facebook is not ideal. But I think even leaving aside questions of who has permission to access Crowdtangle data and other forms of Facebook transparency data, I think, at this point, Facebook has lost a tremendous amount of credibility,” said Edelson. “And I don’t really know how they are going to get it back.”