Facebook and its parent company Meta hold piles of sensitive user data gathered by tracking people across the internet, but have no clear idea where it is all stored, who precisely has access to it, or even what information it contains.
That is the high-level takeaway from a recently unsealed court transcript featuring the testimony of two senior Facebook engineers who were enlisted by a US district court to help clarify the company’s data retention practices.
Testimony from Facebook engineering director Eugene Zarashaw and software engineering manager Steven Elia was first reported by The Intercept.
Facebook made the experts available in an ongoing lawsuit sparked by the Cambridge Analytica data scandal, in which vast amounts of user data were secretly harvested and exploited by a company linked to Donald Trump’s 2016 presidential campaign.
A court-appointed special master, Daniel Garrie, has the unenviable task of figuring out where Facebook stores personal data in its 55 subsystems — something neither Zarashaw nor Elia could really answer.
Presented with a list of these systems, neither engineer recognized what they all were, let alone the data they collected.
When asked where Facebook stores a user’s activity on the platform, data obtained from third parties about user activities outside of Facebook, and other inferred user data, again, neither knew the answer.
“I don’t believe there is a single person who can answer that question,” Zarashaw told the court. “It would take a significant team effort to even be able to answer that question.”
Zarashaw added that Facebook tends to build pieces of infrastructure, “and then leave them running for anyone in the company to use them.” Other teams then “end up using other pieces of infrastructure as underlying storage,” making it difficult to fully account for who is doing what.
“It would take multiple teams on the advertising side to track down exactly the ... where the [user] data flows,” added Zarashaw. “I would be surprised if there was even one person who could conclusively answer this narrow question.”
Additionally, Zarashaw told the court that Facebook had a “somewhat odd engineering culture” in that it often didn’t generate documentation that others could refer to later.
“In fact, the code is its own design document,” the engineer said. “For what it’s worth, that [was] terrifying for me when I first arrived too.”
A spokesperson for Meta vehemently disputed the idea that the company’s internal data tracking policies are haphazard.
“Our systems are sophisticated, and it should come as no surprise that no single engineer in the company can answer every question about where every piece of user information is stored,” the company said in a statement.
“We have one of the most comprehensive privacy programs in place to oversee data usage across our operations and to carefully manage and protect people’s data. We have made, and continue to make, significant investments to meet our privacy commitments and obligations, including extensive data controls.”
The transcript reinforces an April Motherboard report, based on a leaked internal document from the company’s Ads and Commercial Products team, which suggested that Facebook is structurally unable to adequately regulate user data because of how the business is built.
“We do not have an adequate level of control and explainability over how our systems use data, and therefore cannot confidently make controlled policy changes or external commitments such as ‘we will not use data X for purpose Y,’” the document reads. “And yet, that’s exactly what regulators expect of us, increasing our risk of errors and misrepresentations.”
The 15-page document compares Facebook user data to an ink bottle that has been emptied into a body of water.
“You pour this ink into a lake of water (our open data systems, our open culture)…and it flows…everywhere. How do you put this ink back in the bottle? How do you organize it again, so that it only flows to the authorized places of the lake?”