The Big Data Problem (No, Not That One)

Jan 31, 2022 | children online, Internet Safety

I’ve written a lot about the big data problems that beset the digital environment; about the lack of transparency in the way our data is collected and how it is used (and how Big Tech are belatedly trying to show they now care about our privacy); about the need to educate children about how their private data may be harvested.

But one feature of data collection has gone somewhat under the radar: the data that Big Tech hold on how we use their platforms. This data would be of enormous use to academics and researchers in working out what impact using the platforms might be having on our mental health, or on our safety. And strangely enough, Big Tech aren’t all that keen to release it.

Calls to release data to researchers

One of the recent recommendations from the Royal Society’s report on the online information environment was that “social media platforms should establish ways to allow independent researchers access to data in a privacy compliant and secure manner”.

Academics studying the impact of digital technologies on the mental health of adults and children have explained how difficult it is to work out what is really going on, for example in the area of cyberbullying, because so much of the data they need resides with social media platforms.

“If I wanted to study bullying in 1980, I could go to a school, a playground, the local bowling alley, and I could basically capture a snapshot in the life of an adolescent. If I want to study bullying in 2021, I couldn’t get a full picture of exactly what’s happening because the social media companies hold all the data.”

Professor Andrew Przybylski, Oxford University

UK academics were delighted to see in the report from the Joint Committee reviewing the UK’s Draft Online Safety Bill a recommendation that online service providers be ‘encouraged’ to share relevant data with external researchers studying online safety.

Glad the work paid off and the Joint Committee Report on the Online Safety Bill now highlights the need for researchers to access social media data to understand Online Harms. It also suggests Ofcom should get the powers to enable this to take place. https://t.co/eRYAhOfun3 pic.twitter.com/oybawPTc6I
— Amy Orben (@OrbenAmy) December 14, 2021

But encouraging is not mandating, and no one is holding their breath that any of this data will be released to researchers anytime soon.

The recent Instagram research leaks by Facebook whistleblower Frances Haugen lifted the lid on just how eye-opening, and valuable, access to internal user data could be.

Frances Haugen, Facebook whistleblower, testifying to UK Parliament

Haugen shared data from internal Facebook presentations on how teens use Instagram, which indicated how its algorithms were deliberately leading them to anorexia-related content and detrimentally affecting their mental health. Campaigner Ian Russell, father of 14-year-old Molly Russell who took her life in November 2017 after viewing suicide and self-harm content on Instagram, has gone further and said that the teen’s use of the app “helped kill my daughter”.

One of the most pressing concerns of the parents I speak to in schools all over the world is how much use of social media might be negatively impacting their children. And I have to say that we don’t really know.

We have plenty of studies showing a strong correlation between heavy use of social media and negative mental health outcomes, but nothing that shows definitive causation. And that’s because the data that academics really need resides behind the opaque walls of the social media platforms themselves.

Big Tech issue vehement denials whenever the latest study appearing to ‘prove’ harm caused by use of their platforms hits the press, but none of them are backing those denials up with data of their own. Which would make any reasonable observer wonder: why not? Better science on the issue will only be possible with better data, and Facebook (Meta) et al. hold the key.

My Brain Has Too Many Tabs Open by Tanya Goodin

For more on data privacy, over-sharing and other problems of the digital world, pick up a copy of my new book My Brain Has Too Many Tabs Open (or buy one for someone who needs it).
