Business

Implement differential privacy to power up data sharing and cooperation

Published February 24, 2022

Ram Iyer

Maxime Agostini
Contributor

Maxime Agostini is the co-founder and CEO of Sarus, a privacy company supported by Y Combinator that lets organizations leverage confidential data for analytics and machine learning.

Tianhui Michael Li is the founder of The Data Incubator, an eight-week fellowship to help Ph.D.s and postdocs transition from academia into industry. It was acquired by Pragmatic Institute. Previously, he headed monetization data science at Foursquare and has worked at Google, Andreessen Horowitz, J.P. Morgan, and D.E. Shaw.

Traditionally, companies have relied upon data masking, sometimes called de-identification, to protect data privacy. The basic idea is to remove all personally identifiable information (PII) from each record. However, a number of high-profile incidents have shown that even supposedly de-identified data can compromise consumer privacy.

In 1996, an MIT researcher identified the health records of the then-governor of Massachusetts in a supposedly masked dataset by matching them against public voter registration data. In 2006, UT Austin researchers combined a supposedly anonymous dataset that Netflix had made public with data from IMDb to re-identify films watched by thousands of individuals.

In a 2022 Nature article, researchers used AI to fingerprint and re-identify more than half of the mobile phone records in a supposedly anonymous dataset. These examples all highlight how “side” information can be leveraged by attackers to re-identify supposedly masked data.

These failures led to differential privacy. Instead of sharing data, companies would share data processing results combined with random noise. The noise level is set so that the output does not tell a would-be attacker anything statistically significant about a target: The same output could have come from a database with the target or from the exact same database but without the target. The shared data processing results do not disclose information about anybody, hence preserving privacy for everybody.
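To make the idea concrete, here is a minimal sketch of a noisy count using the Laplace mechanism, one standard way of adding calibrated noise. The function name `dp_count`, the toy age lists and the choice of epsilon are assumptions made for this illustration; they do not come from any particular product. It is a teaching example only: naive floating-point noise sampling like this has known weaknesses and should not be used to protect real data.

```python
import random


def dp_count(values, predicate, epsilon, rng):
    """Return a noisy count of the items in `values` that satisfy `predicate`.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    scale = 1.0 / epsilon
    # The difference of two exponentials with mean `scale` follows a
    # Laplace distribution centered on 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


def over_50(age):
    return age > 50


rng = random.Random(0)  # seeded only so the illustration is repeatable

# Hypothetical database with the target (age 61) and the same one without.
ages_with_target = [34, 45, 29, 61, 52, 38]
ages_without_target = [34, 45, 29, 52, 38]

print(dp_count(ages_with_target, over_50, epsilon=0.5, rng=rng))
print(dp_count(ages_without_target, over_50, epsilon=0.5, rng=rng))
```

The two true counts differ by one, but the noise is typically larger than that gap at this epsilon, so a single output is nearly as likely under either database. A smaller epsilon means more noise and stronger protection; a larger epsilon means more accurate answers and weaker protection.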

To implement differential privacy, one should not start from scratch, as any implementation mistake could be catastrophic for the privacy guarantees.

Operationalizing differential privacy was a significant challenge in the early days. The first applications were primarily the province of organizations with large data science and engineering teams, such as Apple, Google or Microsoft. As the technology becomes more mature and its cost decreases, how can all organizations with modern data infrastructures leverage differential privacy in real-life applications?

Differential privacy applies to both aggregates and row-level data

When the analyst cannot access the data, it is common to use differential privacy to produce differentially private aggregates. The sensitive data is accessible through an API that only outputs privacy-preserving noisy results. This API may perform aggregations on the whole dataset, from simple SQL queries to complex machine learning training tasks.

A typical setup for leveraging personal data with differential privacy guarantees. Image Credits: Sarus
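The API setup described above can be sketched as a small gateway object that holds the raw rows and only ever returns noisy aggregates. Everything here is hypothetical: the class `PrivateQueryGateway`, its method names and the row layout are invented for illustration, and the budget accounting (adding up the epsilon spent by each query and refusing further queries once a total is reached) is an assumption this sketch adds rather than something the article specifies. It also assumes each person contributes exactly one row.

```python
import random


class PrivateQueryGateway:
    """Hypothetical gateway: keeps raw rows private, returns noisy aggregates."""

    def __init__(self, rows, total_epsilon, seed=None):
        self._rows = list(rows)          # never handed back to the analyst
        self._remaining = total_epsilon  # budget left, spent query by query
        self._rng = random.Random(seed)

    def _laplace(self, scale):
        # Difference of two exponentials with mean `scale` is Laplace(0, scale).
        rate = 1.0 / scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def _spend(self, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self._remaining:
            raise PermissionError("privacy budget exhausted")
        self._remaining -= epsilon

    def count(self, predicate, epsilon):
        """Noisy number of rows matching `predicate` (sensitivity 1)."""
        self._spend(epsilon)
        true_count = sum(1 for row in self._rows if predicate(row))
        return true_count + self._laplace(1.0 / epsilon)

    def bounded_sum(self, column, lower, upper, epsilon):
        """Noisy sum of `column`, with each value clamped to [lower, upper]."""
        self._spend(epsilon)
        total = sum(min(max(row[column], lower), upper) for row in self._rows)
        # Clamping caps how much any single row can move the sum.
        sensitivity = max(abs(lower), abs(upper))
        return total + self._laplace(sensitivity / epsilon)


# Toy rows, invented for the example.
rows = [
    {"age": 30, "spend": 10.0},
    {"age": 60, "spend": 500.0},
    {"age": 45, "spend": 75.0},
]
gateway = PrivateQueryGateway(rows, total_epsilon=1.0, seed=7)
print(gateway.count(lambda row: row["age"] > 40, epsilon=0.5))
print(gateway.bounded_sum("spend", lower=0.0, upper=100.0, epsilon=0.5))
```

After these two queries the budget of 1.0 is used up, so any further call raises `PermissionError`. Without such a cap, an analyst could repeat the same query many times and average the noise away.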

One of the disadvantages of this setup is that, unlike with data masking techniques, analysts no longer see individual records to “get a feel for the data.” One way to mitigate this limitation is to provide differentially private synthetic data: the data owner produces fake records that mimic the statistical properties of the original dataset.
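One simple way to produce such synthetic data, shown here only as an illustration, is to build a histogram of the real values, add Laplace noise to each bin count, and then sample fake values from the noisy histogram. The function `dp_synthetic_sample`, the bin edges and the toy ages are assumptions made for this sketch; real systems generally use far more sophisticated generative models, and the article does not say which method any given vendor uses. The bins must be fixed in advance rather than derived from the data, and each person is assumed to contribute one value.

```python
import random


def dp_synthetic_sample(values, bins, epsilon, n_samples, seed=None):
    """Draw fake values from a Laplace-noised histogram of `values`.

    `bins` is a list of (low, high) half-open intervals chosen without
    looking at the data. One person lands in one bin, so adding or
    removing that person changes a single count by 1 (sensitivity 1).
    """
    rng = random.Random(seed)
    rate = epsilon  # Laplace scale is 1 / epsilon, so the exponential rate is epsilon
    noisy = []
    for low, high in bins:
        count = sum(1 for v in values if low <= v < high)
        noise = rng.expovariate(rate) - rng.expovariate(rate)
        # Clamping at zero is post-processing and does not weaken the guarantee.
        noisy.append(max(0.0, count + noise))
    if sum(noisy) == 0:
        noisy = [1.0] * len(bins)  # fall back to a uniform histogram
    chosen = rng.choices(bins, weights=noisy, k=n_samples)
    # Fake values are drawn uniformly inside each selected bin.
    return [rng.uniform(low, high) for low, high in chosen]


# Toy ages and bins, invented for the example.
real_ages = [23, 25, 31, 37, 44, 45, 47, 52, 58, 63]
age_bins = [(0, 20), (20, 40), (40, 60), (60, 80)]
fake_ages = dp_synthetic_sample(real_ages, age_bins, epsilon=1.0, n_samples=10, seed=3)
print([round(age, 1) for age in fake_ages])
```

The fake records let an analyst explore realistic-looking rows, while everything they see is derived only from the noisy counts and not from any individual's actual value.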
