
Data Tokenization

by Merry Marwig, CIPP/US
What is data tokenization and why is it important? Our G2 guide can help you understand data tokenization, how it’s used by industry professionals, and the benefits of data tokenization.

What is data tokenization?

Data tokenization is a process applied to datasets to protect sensitive information, most commonly data at rest. This technique replaces sensitive data with non-sensitive, stand-in data known as token data. The token data will be in the same format as the original data. For example, tokenized credit card numbers would still show in a sixteen-digit format.

With data tokenization, non-sensitive token data remains in the dataset, while the token’s reference to the original sensitive data is often stored securely outside of the system in a token server. When the original sensitive data is needed again, the token data’s relationship with the original sensitive data can be looked up on the token server; this is called a detokenization process.
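The tokenize and detokenize round trip described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `TokenVault` class and its method names are hypothetical, the "token server" is an in-memory dictionary, and a real deployment would keep the mapping in a hardened, access-controlled store.

```python
import secrets

DIGITS = "0123456789"


class TokenVault:
    """Hypothetical in-memory stand-in for a token server (illustration only)."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return a random token with the same format as `value`."""
        if not any(ch in DIGITS for ch in value):
            raise ValueError("value must contain at least one digit")
        # Reuse the existing token so one value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        while True:
            # Swap every digit for a random digit; keep separators such as
            # dashes so the token has the same layout as the original.
            token = "".join(
                secrets.choice(DIGITS) if ch in DIGITS else ch for ch in value
            )
            if token != value and token not in self._token_to_value:
                break
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._token_to_value[token]


vault = TokenVault()
token = vault.tokenize("4111111111111111")  # sixteen random digits
original = vault.detokenize(token)          # the original card number
```

Because the token is random, it reveals nothing about the original value; only a lookup in the vault can reverse it.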

Types of data tokenization

Companies have two options for data tokenization, which differ in how quickly tokens can be detokenized.

  • Vault tokenization: Vault tokenization stores the relationship between the original sensitive data and its corresponding token in a secure, separate token server vault, which is referenced when the original data is needed. Detokenizing this data can take a while, so if detokenization has to happen at scale, companies can consider vaultless tokenization.
  • Vaultless tokenization: Vaultless tokenization does not use a token server vault. Instead, it keeps the data where it is and applies tokenization using cryptographic devices. Companies might choose this method to avoid the long processing times needed to look up relationships in a token server vault.
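To make the vaultless option more concrete, the sketch below derives a token from the value and a secret key, so detokenization is a computation rather than a vault lookup. It is a toy Feistel construction over digit strings written for illustration. It is not the NIST FF1 algorithm and has not been analyzed for security. The function names and the demo key are assumptions, and in practice the key would be held in a cryptographic device rather than in application code.

```python
import hashlib
import hmac

ROUNDS = 8  # even, so both halves are back to their original lengths at the end


def _round_value(key: bytes, round_no: int, data: str, width: int) -> int:
    """Keyed pseudo-random number with at most `width` decimal digits."""
    digest = hmac.new(key, f"{round_no}:{data}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % 10**width


def _check(digits: str) -> None:
    if len(digits) < 2 or not all(ch in "0123456789" for ch in digits):
        raise ValueError("expected a string of at least two ASCII digits")


def tokenize(digits: str, key: bytes) -> str:
    """Map a digit string to a same-length digit string using only the key."""
    _check(digits)
    half = len(digits) // 2
    a, b = digits[:half], digits[half:]
    for i in range(ROUNDS):
        width = len(a)
        c = (int(a) + _round_value(key, i, b, width)) % 10**width
        a, b = b, str(c).zfill(width)
    return a + b


def detokenize(token: str, key: bytes) -> str:
    """Invert tokenize() by running the rounds backwards with the same key."""
    _check(token)
    half = len(token) // 2
    a, b = token[:half], token[half:]
    for i in reversed(range(ROUNDS)):
        width = len(b)
        prev_a = (int(b) - _round_value(key, i, a, width)) % 10**width
        a, b = str(prev_a).zfill(width), a
    return a + b


demo_key = b"demo-key-not-for-real-use"  # hypothetical key for the example
token = tokenize("4111111111111111", demo_key)
restored = detokenize(token, demo_key)  # "4111111111111111"
```

No mapping table is stored anywhere, which is what removes the vault lookup. The trade-off is that protection now rests entirely on keeping the key secret.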

Benefits of using data tokenization

Data tokenization techniques are most commonly used in the payment processing industry. The Payment Card Industry Data Security Standard (PCI DSS) requires that sensitive data such as credit card numbers be protected, and tokenization is recognized as a method to achieve this. However, data tokenization can be used to protect any kind of sensitive data.

One common use case for data tokenization methods is protecting patients' health information.

Impacts of using data tokenization

The most common impact of using data tokenization techniques is better protection of sensitive information.

  • Reduce threat vectors: Tokenizing data reduces the ability of bad actors to misuse sensitive data.
  • Reduce the need for advanced security controls: Using tokenized data can reduce the amount of sensitive data that requires more advanced security controls.

Data tokenization vs. data masking

Data tokenization is more commonly used to protect data at rest. It can introduce wait times when the relationship between the token data and the original sensitive data has to be looked up in the token server vault.

Data masking is more commonly used to protect data in use, typically in non-production environments such as testing, or in applications used by employees with limited access to actual data.
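A small sketch can make the contrast concrete. Masking typically hides part of a value for display and keeps no mapping back to the original, so there is nothing to look up later. The function name and parameters below are hypothetical and chosen only for illustration.

```python
def mask_card_number(card_number: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide every digit except the last `visible` ones; keep separators."""
    total_digits = sum(ch.isdigit() for ch in card_number)
    to_hide = max(total_digits - visible, 0)
    masked = []
    for ch in card_number:
        if ch.isdigit() and to_hide > 0:
            masked.append(mask_char)
            to_hide -= 1
        else:
            masked.append(ch)
    return "".join(masked)


print(mask_card_number("4111-1111-1111-1234"))  # ****-****-****-1234
```

Unlike a token, the masked string cannot be turned back into the original card number, which suits screens and test data where the real value is never needed.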

Merry Marwig, CIPP/US

Merry Marwig is a senior research analyst at G2 focused on the privacy and data security software markets. Using G2’s dynamic research based on unbiased user reviews, Merry helps companies best understand what privacy and security products and services are available to protect their core businesses, their data, their people, and ultimately their customers, brand, and reputation. Merry's coverage areas include: data privacy platforms, data subject access requests (DSAR), identity verification, identity and access management, multi-factor authentication, risk-based authentication, confidentiality software, data security, email security, and more.
