Perform Identity Resolution in Snowflake
LiveRamp's Identity Resolution application in Snowflake allows for the translation of various identifiers to RampIDs. This allows you to resolve device identifiers or email addresses to a persistent pseudonymous identifier for persons and households. Users can also input an individual-based RampID and get back any household-based RampID that might be associated with that individual.
Note
For more information about RampID, see "RampID Methodology".
The following identifiers can be resolved:
Cookies
MAIDs (mobile device IDs)
CTV IDs (Connected TV Device IDs)
Email addresses (SHA-256 hashed)
Person-based, maintained RampIDs (for resolution to household RampIDs)
These capabilities are available within Snowflake through a native app, which creates a share to your account, opening up a view to query the reference data set from within your own Snowflake environment. See "LiveRamp Embedded Identity in Snowflake" for more information.
Overall Steps
After you've set up the identity resolution native app in Snowflake (see "Set Up a Native App" for instructions), perform the following steps to perform identity resolution:
Verify that you've performed all the native app setup tasks in the completion checklist.
Prepare and deploy the input and metadata tables for identity resolution.
Perform identity resolution.
See the sections below for information on performing these tasks.
Completion Checklist for Native App Setup
Before performing a device resolution operation for the first time, verify that you've completed the following tasks to set up an identity resolution native app in Snowflake:
Note
For instructions on completing these tasks, see "Set Up a Native App".
Accepted the terms and conditions of the application.
Installed the appropriate LiveRamp Snowflake native application.
Updated the firewall configuration for authentication calls.
Created and granted permissions to a role that gives the App Share database access to the appropriate tables.
Created logging and metrics tables, added them to a share, and shared that share back to LiveRamp.
After these tasks have been completed, you are ready to prepare the tables and perform the operation. See the sections below for more information.
Prepare the Tables for Identity Resolution
Performing identity resolution requires two tables:
A metadata table, indicated in the code as {{resolution_meta_table}}.
Note
A metadata table can be reused for multiple operations, but a separate metadata table must be prepared for each different identity resolution operation you want to perform. For example, if you’re going to perform identity resolution on both MAIDs and hashed emails, you’ll need a different metadata table for each operation.
An input table, indicated in the code as {{resolution_input_table}}.
Note
An input table needs to be prepared for each identity resolution operation.
To create the input and metadata tables:
Click in the Create input and metadata table procedure.
Click
.Make any necessary changes to the variables:
{{SNOWFLAKE_CONSUMER_INPUT_DATABASE}}
{{SNOWFLAKE_CONSUMER_SCHEMA}}
{{resolution_input_table}}
{{resolution_meta_table}}
You can create these tables inside Snowflake or import the tables into your database using Snowflake’s standard methods. The {{...}} variables may be substituted with your own values. Be sure to reference the names correctly in the metadata table (which has as its default name {{resolution_meta_table}}) and that the column names also match up correctly.
When creating tables, keep the following guidelines in mind:
Every column name must be unique in a table.
Try not to use additional columns in the input tables required for the identity resolution operation.
Having extra columns slows down processing.
Table Naming Guidelines
When naming tables, follow these guidelines:
Table names must use ASCII characters and not contain either spaces or special characters such as !@#$%.
You can use underscores “_” within the name, but not as the initial character.
Consider using the following elements in your table names: type of data or description, a date or timestamp, and an identity designation.
For example, the table name
Identity_TwoButtonSuitsCampaign_impressions_2022-06-01
contains all three element types.
Metadata Table Columns and Descriptions
A metadata table can be reused for multiple operations, but a separate metadata table must be prepared for each different identity resolution operation you want to perform. For example, if you’re going to perform identity resolution on both MAIDs and hashed emails, you’ll need a different metadata table for each operation.
Metadata column names must match those shown in the table below. The column names are not case sensitive, and should not be enclosed in single or double quotation marks.
Column | Description |
---|---|
| Enter either an existing |
| Password/secret for the |
|
|
| Options include:
NoteCurrently, different identifiers have to be separated into separate input tables and only one option above can be chosen for each metadata table. |
| Enter the column name for the input table column which contains the IDs to be resolved. |
| For email resolution only, you can specify the maximum number of RampIDs to be returned for each hashed email address. Enter an integer between 1 and 15 to specify the maximum number of RampID results returned per input identifier (to return only the “best match”, returning 1 RampID is sufficient). For identifiers other than hashed email, do not include this column. |
Input Table Columns and Descriptions
The column names for the input table can be whatever you want to use, as long as the column name for the identifiers matches the values specified in the “TARGET_COLUMN” column of the metadata table.
Column | Sample | Description |
---|---|---|
|
| Can be one of the following identifiers: cookie, MAID, CTV ID, or SHA-256 hashed email (for resolution to RampID), or maintained RampID (for resolution to household RampID). |
| Male | For email address resolution, you can include columns with attribute data. These columns will be returned in the output table. Any attribute columns included in an input table used for device resolution will not be returned in the output table. |
The output table is created by the operation that you run. For an example, see the sections below.
Formatting Guidelines for Email Address Hashing
Follow these best practices for hashing email addresses:
Email addresses should be uppercased prior to hashing
Use SHA-256, hex-encoding string to be lowercased, character set UTF-8
Perform Identity Resolution for Device Identifiers
You perform a device resolution operation by running the device resolution procedure and then checking that the output has succeeded. You then open the output table to check the results.
Note
To perform identity resolution for email addresses, see the “Perform Identity Resolution for Email Addresses” section below.
To resolve the device identifiers in the input table:
Select the input table containing the identifiers to resolve in the left-hand Database Objects navigation pane.
Click on the magnifying glass icon tab shown in the figure below.
Figure 6. Device resolution results.Run the resolution operation specific for your database and schema to convert your input table values using the parameters in the metadata table for conversion to RampIDs in the output table.
Locate the
lr_resolution_and_trancoding
procedure shown below and click on that code block.call <native_app_database_name>.lr_app_schema.lr_resolution_and_transcoding ( ‘<consumer_database_name>.<consumer_schema_name>.<INPUT_TABLE>’, ‘<consumer_database_name>.<consumer_schema_name>.<META_TABLE>’, ‘<OUTPUT_TABLE>’, ‘<consumer_database_name>.<consumer_schema_name>.<LOGGING_TABLE>’, ‘<consumer_database_name>.<consumer_schema_name>.<METRICS_TABLE>’ );
Ensure that you have correctly entered the names of the database objects you are using into the procedure above.
Click
.The resolution operation runs to completion.
Open the LiveRamp app worksheet.
Locate the
Check for output
procedure shown below and click in that block.// Check for output /* Check for output */ call <native_app_database_name>.lr_app_schema.check_for_output( '<OUTPUT_TABLE>' ); /* END SECTION */
Click Run.
Wait until Snowflake returns a status message of
success
orerror
.Once the app returns a
success
message, the output should be displayed in the native app database underlr_app_schema
.
The results end up in the output table in the same database, which is the COOKIES_OUTPUT
, as shown in the example in step 2 above. The output table has the following fields: DEVICE_ID
(original input), and RAMPID
(converted values).
Column | Sample | Description |
---|---|---|
|
| Original identifier passed in. |
|
| Returns the resolved RampID in your domain encoding. |
Perform Identity Resolution for Email Addresses
The email resolution process allows for translation of a known identifier (SHA-256 hashed email) to a pseudonymous identifier (RampID) while minimizing the risk of re-identification.
The identity resolution process for email addresses operates similarly to device resolution with a key difference in the data output: instead of returning the original identifier mapped to its associated RampID, in email address resolution any attribute columns are returned but the original hashed email addresses are removed from the output.
Note
To perform identity resolution for device identifiers, see the “Perform Identity Resolution for Device Identifiers” section above.
Data running through email resolution will pass through a privacy filter which removes the email addresses and reswizzles the table, meaning any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.
To resolve email identifiers:
Select the input table containing the emails addresses to resolve in the left-hand Database Objects navigation pane.
Click on the magnifying glass icon tab.
Run the resolution operation specific for your database and schema to convert your input table values using the parameters in the metadata table for conversion to RampIDs in the output table.
Locate the
lr_resolution_and_trancoding
procedure shown below and click on that code block.call <native_app_database_name>.lr_app_schema.lr_resolution_and_transcoding ( ‘<consumer_database_name>.<consumer_schema_name>.<INPUT_TABLE>’, ‘<consumer_database_name>.<consumer_schema_name>.<META_TABLE>’, ‘<OUTPUT_TABLE>’, ‘<consumer_database_name>.<consumer_schema_name>.<LOGGING_TABLE>’, ‘<consumer_database_name>.<consumer_schema_name>.<METRICS_TABLE>’ );
Ensure that you have correctly entered the names of the database objects you are using into the procedure above.
Click
.The resolution operation runs to completion.
Open the LiveRamp app worksheet.
Locate the
Check for output
procedure shown below and click in that block./* Check for output */ call <native_app_database_name>.lr_app_schema.check_for_output( '<OUTPUT_TABLE>' ); /*
Click
.
Wait until Snowflake returns a status message of success
or error
. Once the app returns a success
message, the output should be displayed in the native app database under lr_app_schema
.
The results end up in the output table in the same database, with the following fields: RAMPID
(resolved email data), and ATTRIBUTES
(based on other data passed through the service).
Column | Sample | Description |
---|---|---|
Ramp_ID | XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRDXYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMT | Returns the resolved RampID in the client’s domain. |
Attribute 1 | Male | Any attribute columns passed through the service are returned. |
Privacy Filter
To minimize the risk of re-identification (the ability to tie an email address directly to a RampID), the service includes the following processes:
Column Values: The process evaluates the combination of all the column values on a per row basis for unique values. If a particular combination of column values occurs 3 or fewer times, the rows containing those column values will not be matchable and will not be returned in the output table.
>5% of the table unmatchable: If, based on column value uniqueness, >5% of the file rows are unmatchable, the job will fail.
Number of Unique RampIDs: If fewer than 100 unique RampIDs would be returned, the job will fail.
Reswizzle full table: Upon completion, the full table will be reswizzled to return the rows
RampID | attribute 1 | attribute 2 | attribute n
in a different order than what was submitted in the input table.