Hashing data for Audience Studio

Many customers wish to bring their CRM records into the platform to match them with the anonymous data already there. A convenient key for matching these records is an email address. Because Audience Studio does not allow personally identifiable information (PII) to be imported into the platform, we recommend hashing email addresses so that they can be ingested. Hashed email addresses are also typically used to identify unique users.

Hashing is the process of taking a piece of data, in this case an email address, and passing it through a cryptographic function that always produces the same output for the same input, and whose output does not resemble the original data. It is also important to note that it is not possible to take the hashed output and “decode” it to reveal the original value. The marketing industry has settled on the SHA-256 hashing function as the standard for companies that want to match their customers' email addresses without revealing them to other parties.

Hashing issues

While hashing the same value will always give you the same output, it is possible to get bad matches for a number of reasons.
For example, the upper and lower case forms of a letter produce different outputs, as they are not considered to be the same character.

"a" = ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb

vs

"A" = 559aead08264d5795d3909718cdd05abd49572e84fe55590eef31a88a08fdffd
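
The difference above can be reproduced with a short script. This is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex and the choice of UTF-8 encoding are assumptions made for illustration and are not part of Audience Studio.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the lowercase hexadecimal SHA-256 digest of a string.

    Assumption: the string is encoded as UTF-8 before hashing.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


lower = sha256_hex("a")
upper = sha256_hex("A")

print(lower)           # digest of lowercase "a"
print(upper)           # digest of uppercase "A"
print(lower == upper)  # the two digests do not match
```

Because a single differing character changes the whole digest, two records that differ only in letter case will not match after hashing.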

Hashing Best Practices

To get the most consistent hashed values, and for compatibility with other systems and partners, we recommend the following steps:

  • Trim any whitespace characters from either side of the email (this can be spaces or newline characters)
  • Only include the email address
  • Convert the email to lowercase
  • Do not add any salt
  • Hash with the SHA-256 algorithm and output the result as lowercase hexadecimal
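
The steps above could be sketched as follows. This is an illustrative example only, assuming Python and its standard hashlib module; the function name hash_email and the use of UTF-8 encoding are assumptions and do not come from Audience Studio.

```python
import hashlib


def hash_email(raw_email: str) -> str:
    """Normalize an email address and return its SHA-256 digest.

    Hypothetical helper: the name and the UTF-8 encoding are assumptions.
    """
    # Trim whitespace (spaces, tabs, newlines) from both ends.
    trimmed = raw_email.strip()
    # Convert to lowercase so that case differences do not change the hash.
    normalized = trimmed.lower()
    # Hash without any salt; hexdigest() returns lowercase hexadecimal.
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


print(hash_email(" TEST@example.com "))
```

Inputs that differ only in surrounding whitespace or letter case, such as " TEST@example.com " and "test@example.com", produce the same digest with this sketch.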

Example

To verify that you have implemented the steps correctly, check that the email address below generates the displayed value:

" TEST@example.com " = 973dfe463ec85785f5f95af5ba3906eedb2d931c24e69824a89ea65dba4e813b

More examples

The following email addresses should all be converted to test@example.com prior to hashing:
