The Evolution of Privacy Enhancing Technologies (PETs) Trends in 2022

January 18, 2022
by Merry Marwig, CIPP/US

This post is part of G2’s 2022 digital trends series. Read more about G2’s perspective on digital transformation trends in an introduction from Tom Pringle, VP, market research, and additional coverage on trends identified by G2’s analysts.

Analyzing risky data in the cloud with privacy-enhancing technologies (PETs)   

2022 TRENDS PREDICTION

Data privacy technology is swiftly adapting to meet B2B companies’ data privacy operationalization needs. Innovation in data privacy software and privacy-enhancing technologies (PETs) will grow at roughly seven times the rate of most technologies in 2022.

PETs such as differential privacy and homomorphic encryption will find new use cases at enterprise-level corporations, beyond their current, mostly academic or governmental, applications.

The growth of PETs is especially important given that recently proposed federal online privacy legislation in the US includes exceptions for privacy-preserving computing methods.

One issue companies face is how to perform data analysis on risky datasets, i.e., datasets that include sensitive information, especially when that data is stored in the cloud, while securing the data and honoring the privacy requests of the people it concerns.

Failing to secure risky datasets can allow individuals to be identified via personally identifiable information (PII), with adverse effects such as damage to their reputation, insurability, employment, or financial status, or even personal harm.

Examples of personally identifiable information (PII)

Companies can mitigate these risks by using synthetic data: artificial datasets created to mimic real datasets. But those who require data analysis on the actual datasets can utilize two newer PETs, differential privacy and homomorphic encryption, which give companies tools beyond the existing data masking, de-identification, and encryption tools used to protect sensitive data.
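To make the synthetic data idea concrete, here is a deliberately naive Python sketch (not from the original post; the income figures are hypothetical). It fits a simple distribution to one sensitive column and samples artificial values from it. Production synthetic data tools model joint distributions and correlations across many columns, not a single column in isolation.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical sensitive column: annual incomes from a real dataset.
real_incomes = np.array([42_000, 58_000, 61_000, 75_000, 120_000])

# Naive approach: fit a log-normal distribution to the real values and
# sample fresh, artificial values from it. The synthetic column mimics
# the real data's shape and scale without containing any real record.
log_mu = np.log(real_incomes).mean()
log_sigma = np.log(real_incomes).std()
synthetic_incomes = rng.lognormal(mean=log_mu, sigma=log_sigma, size=1_000)

print(round(synthetic_incomes.mean()))  # comparable scale, zero real rows
```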

Analyzing risky datasets with differential privacy

Differential privacy builds on the business case for anonymizing datasets but does so in a more secure way.

The problem with de-identified datasets, where some identifying information is removed from the dataset, is that the data within them can sometimes be reidentified. Some typical privacy-related attacks on de-identified datasets are:

  • Differencing attack: A bad actor uses background knowledge about a person to determine whether their data is included in a dataset and to learn additional, often sensitive, information about them (a toy example follows this list).
  • Reconstruction attack: Someone combines auxiliary datasets or published aggregate statistics to reconstruct individual records from the original dataset.
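To make the differencing attack concrete, here is a toy Python sketch (the names and data are hypothetical). An attacker who knows one person is in the dataset, and who has seemingly harmless aggregate-only access, can isolate that person's sensitive value with just two count queries:

```python
# Toy dataset behind an "aggregate queries only" interface.
patients = [
    {"name": "Alice", "has_condition": True},
    {"name": "Bob",   "has_condition": False},
    {"name": "Carol", "has_condition": True},
]

def count_with_condition(rows):
    return sum(r["has_condition"] for r in rows)

# The attacker knows Alice is in the dataset and asks two counts:
everyone = count_with_condition(patients)
everyone_but_alice = count_with_condition(
    [r for r in patients if r["name"] != "Alice"])

# The difference between the two aggregates leaks Alice's value.
print("Alice has the condition:", bool(everyone - everyone_but_alice))
```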

In one famous real-world case, in the mid-1990s, the Massachusetts Group Insurance Commission released what it thought was an anonymized healthcare dataset showing hospital visits of state employees. A graduate student reidentified the governor’s records by cross-referencing the dataset with publicly available voter registration rolls.

What is Differential Privacy?    

Differential privacy solves reidentification problems by introducing noise, or randomized results, into the dataset while still maintaining the validity of the analytical results. Introducing noise does not eliminate reconstruction or differencing risks, but it makes it nearly impossible to identify specific people’s data within the dataset with certainty. 

Differential privacy is made possible by complex mathematics. At a fundamental level, however, it works like this:

Quote by Ulfar Erlingsson, a security researcher at Google, explaining differential privacy
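As a minimal sketch of the underlying idea (an illustration, not from the original post), the widely used Laplace mechanism adds random noise scaled to a query's sensitivity. The dataset and epsilon value below are illustrative:

```python
import numpy as np

rng = np.random.default_rng()

def dp_count(rows, predicate, epsilon):
    """Differentially private count via the Laplace mechanism.

    A counting query has sensitivity 1: adding or removing one person
    changes the true count by at most 1. Noise drawn from
    Laplace(0, sensitivity/epsilon) masks any single individual's
    contribution; smaller epsilon means more noise and more privacy.
    """
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

patients = [{"has_condition": True}, {"has_condition": False},
            {"has_condition": True}]
print(dp_count(patients, lambda r: r["has_condition"], epsilon=0.5))
```

Run twice, the same query returns slightly different answers, which is exactly what blunts the differencing attack sketched earlier.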

Using differential privacy, organizations can more securely analyze risky datasets. For example, Tumult Labs used differential privacy to analyze Internal Revenue Service (IRS) income datasets to see how a college degree impacts a person’s earning potential. Combining a person’s income with the name of the university they attended makes a reconstruction attack easy by cross-referencing this information against other datasets. However, using differential privacy to insert noise into the dataset adds an element of uncertainty.

Large commercial organizations use differential privacy as well: Amazon uses it to analyze customers’ personalized shopping preferences, Facebook uses it for behavioral advertising targeting analysis while complying with global data privacy regulations, and Apple uses it to collect information on new words people type that are not yet in Apple’s dictionaries.

Analyzing risky datasets with homomorphic encryption

Another way to protect sensitive data when analyzing datasets is through encryption.

Most encryption techniques today focus on data in transit and data at rest. So sensitive data is secure when it is being transmitted or sitting in cloud storage. But what about data in use? To use encrypted data, it needs to be decrypted, analyzed, and re-encrypted. And any time that sensitive data is unencrypted, it becomes a security risk.
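To illustrate the gap, here is a short sketch using the third-party cryptography package (one conventional choice; any standard symmetric cipher shows the same pattern). The data is protected at rest, but computing on it opens a plaintext window:

```python
from cryptography.fernet import Fernet  # pip install cryptography

key = Fernet.generate_key()
vault = Fernet(key)

stored = vault.encrypt(b"salary=120000")   # data at rest: protected

# Data in use: to compute on it, we must decrypt first...
plaintext = vault.decrypt(stored)          # sensitive bytes now in memory
salary = int(plaintext.split(b"=")[1]) * 2
# ...then re-encrypt. The window in between is the security risk.
stored = vault.encrypt(b"salary=%d" % salary)
```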

What is Homomorphic Encryption?

Homomorphic encryption allows data to remain encrypted while it is being analyzed. This enables users to store encrypted data and run operations on it without decrypting it. It also allows users to query the dataset confidentially, without revealing their intent. Because the encryption preserves the data’s structure, computations on the encrypted data decrypt to the same results as the same computations on the unencrypted data.

There are different types of homomorphic encryption, which differ in which mathematical operations they support and how many times those operations can be applied:

  • Partially homomorphic encryption: This allows one mathematical operation, either addition or multiplication, to be performed an unlimited number of times on the dataset (see the toy sketch after this list).
  • Somewhat homomorphic encryption: This allows both addition and multiplication, but only up to a certain complexity, i.e., a limited number of times.
  • Fully homomorphic encryption: This supports both addition and multiplication an unlimited number of times; schemes exist today, but practical, performant implementations are still maturing.
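As a toy illustration of the partially homomorphic case, textbook unpadded RSA is homomorphic under multiplication: multiplying two ciphertexts produces a valid ciphertext of the product of the two plaintexts. The tiny key below is for demonstration only; real deployments use large keys and padding, which deliberately removes this property:

```python
# Textbook (unpadded) RSA: Enc(a) * Enc(b) mod n == Enc(a * b mod n).
p, q = 61, 53                 # toy primes; never use sizes like this
n = p * q                     # 3233
phi = (p - 1) * (q - 1)       # 3120
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent (Python 3.8+ modular inverse)

def encrypt(m): return pow(m, e, n)
def decrypt(c): return pow(c, d, n)

a, b = 7, 12
product_ct = (encrypt(a) * encrypt(b)) % n   # multiply ciphertexts only
assert decrypt(product_ct) == (a * b) % n    # decrypts to the product, 84
print(decrypt(product_ct))
```

The server holding only the ciphertexts can compute the encrypted product without ever seeing 7, 12, or 84 in the clear.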

Homomorphic encryption is well suited for companies that store encrypted data in the cloud, as it does not put the data at risk during analysis. Other use cases include healthcare-related analysis, such as sharing sensitive patient records with researchers, or analyzing data from highly regulated industries like financial services.

Homomorphic encryption does have drawbacks, namely its speed, or lack thereof. But more and more companies are now investing in this space. For example, IBM released a homomorphic encryption toolkit in 2020. Microsoft offers an open-source homomorphic encryption library, Microsoft SEAL, as well. Other companies in the homomorphic encryption space include Enveil and Zama.

Security is the first priority for today’s software buyers 

The B2B buying market as a whole has signaled that it is finally getting serious about security. For example, in a recent report by G2 on software buyer behavior, security is cited as the topmost factor for mid-market and enterprise technology buyers. And when markets demand security and privacy technology, innovation delivers.

The three factors which are most important when purchasing software for small business, mid-market, and enterprise

While PETs like differential privacy and homomorphic encryption have historically been used only by governments, academic researchers, and the largest companies, I believe these tools will become mainstream for enterprise companies that handle sensitive customer data. And we will start seeing increased B2B-related innovations and commercialization of this technology in the coming year. 

Innovation in data privacy technology is growing swiftly. MIT released a comprehensive study of technological change highlighting the fastest-improving technologies. Most technology improves at a rate of about 25% per year. A query on MIT’s search portal showed that data privacy technology is improving at 178% annually, roughly seven times the typical rate. In comparison, one of the most talked-about technology areas, cloud computing, is improving only somewhat faster, at 229% annually.

Graph showing data privacy tech is innovating...7X faster than most tech

Given the evident interest and investments in cloud computing, I think we ought to pay some attention to these PETs, as well.



Merry Marwig, CIPP/US

Merry Marwig is a senior research analyst at G2 focused on the privacy and data security software markets. Using G2’s dynamic research based on unbiased user reviews, Merry helps companies best understand what privacy and security products and services are available to protect their core businesses, their data, their people, and ultimately their customers, brand, and reputation. Merry's coverage areas include: data privacy platforms, data subject access requests (DSAR), identity verification, identity and access management, multi-factor authentication, risk-based authentication, confidentiality software, data security, email security, and more.