Perform Identity Resolution in Snowflake
LiveRamp's Identity Resolution application in Snowflake allows for the translation of various identifiers to RampIDs. This allows you to resolve personally-identifiable information (PII) or device identifiers to a persistent pseudonymous identifier for persons and households. You can also input an individual-based RampID and get back any household-based RampID that might be associated with that individual.
LiveRamp's Identity Resolution application in Snowflake allows for the translation of various identifiers to RampIDs. This allows you to resolve personally-identifiable information (PII) or device identifiers to a persistent pseudonymous identifier for persons and households. You can also input an individual-based RampID and get back any household-based RampID that might be associated with that individual.
Note
For more information about RampID, see "RampID Methodology".
Once you've translated your data to RampIDs, you can then share that data to your LiveRamp account for activation. For more information, see "Share Data from Snowflake to Your LiveRamp Account".
The following identifiers can be resolved:
Names
Postal addresses
Email addresses
Phone numbers
Cookies
MAIDs (mobile device IDs)
CTV IDs (Connected TV Device IDs)
CIDs (custom identifiers)
Person-based, maintained RampIDs (for resolution to household RampIDs)
These capabilities are available within Snowflake through a native app, which creates a share to your account, opening up a view to query the reference data set from within your own Snowflake environment. See "LiveRamp Embedded Identity in Snowflake" for more information.
Overall Steps
After you've set up the identity resolution native app in Snowflake (see "Set Up a Native App" for instructions), perform the following steps to perform identity resolution:
Verify that you've performed all the native app setup tasks in the completion checklist.
Prepare and deploy the appropriate input and metadata tables for identity resolution.
To resolve PII identifiers (such as name, address, phone, and/or email), follow the instructions on input table creation in the “Input Table Columns for PII Resolution” section below.
To resolve SHA-256 hashed email addresses only, follow the instructions on input table creation in the “Input Table Columns for Email-Only Resolution” section below.
Note
For email-only data, this execution type may be more performant than the PII resolution process, depending on your warehouse set up.
To resolve device identifiers (such as mobile device IDs, CTV IDs, or cookies) or to resolve an individual RampID to a household RampID, follow the instructions on input table creation in the “Input Table Columns for Device Resolution” section below.
To resolve custom identifiers (CIDs), follow the instructions on input table creation in the “Input Table Columns for CID Resolution” section below.
Perform the appropriate identity resolution process, depending on the identifiers being resolved:
View the output table.
See the sections below for information on performing these tasks.
Completion Checklist for Native App Setup
Before performing an identity resolution operation for the first time, verify that you've completed the following tasks to set up an identity resolution native app in Snowflake:
Note
For instructions on completing these tasks, see "Set Up a Native App".
Accepted the terms and conditions of the application.
Installed the appropriate LiveRamp Snowflake native application.
Updated the firewall configuration for authentication calls.
Created and granted permissions to a role that gives the App Share database access to the appropriate tables.
Created logging and metrics tables, added them to a share, and shared that share back to LiveRamp.
After these tasks have been completed, you are ready to prepare the tables and perform the operation. See the sections below for more information.
Prepare the Tables for Identity Resolution
Performing identity resolution requires two tables:
A metadata table, indicated in the code as
$customer_meta_table_name
.Note
A metadata table can be reused for multiple operations, but a separate metadata table must be prepared for each different identity resolution operation you want to perform. For example, if you’re going to perform identity resolution on both MAIDs and hashed emails, you’ll need a different metadata table for each operation.
An input table, indicated in the code as
$customer_input_table_name
.Note
An input table needs to be prepared for each identity resolution operation and can only contain one type of identifiers.
To create the input and metadata tables:
Click in the Create input and metadata table procedure.
set customer_db_name = 'RESOLUTION_DEMO'; set customer_schema_name=concat($customer_db_name,'.','public'); set customer_input_table_name=concat($customer_schema_name,'.','RESOLUTION_INPUT_TABLE'); set customer_meta_table_name=concat($customer_schema_name>'.','RESOLUTION_META_TABLE'); set customer_metrics_table_name=concat($customer_schema_name,'.','METRICS_TABLE'); set customer_logging_table_name=concat($customer_schema_name,'.','LOGGING_TABLE'); set customer_output_table_name='RESOLUTION_OUTPUT_TABLE';
Make any necessary changes to the variables:
RESOLUTION_DEMO
: The name of your database.public
: The name of the schema that holds the tables for identity resolution.RESOLUTION_INPUT_TABLE
: The name of the input table to use for the operation.RESOLUTION_META_TABLE
: The name of the metadata table to use for the operation.METRICS_TABLE
: The name of the metrics table to use for the operation.LOGGING_TABLE
: The name of the logging table to use for the operation.RESOLUTION_OUTPUT_TABLE
: The name of the output table that will be created after the identity resolution operation has been run.
Click
.
You can create these tables inside Snowflake or import the tables into your database using Snowflake’s standard methods. The variables should be replaced with your own values. Make sure to reference the names correctly in the metadata table and make sure that the column names also match up correctly.
When creating tables, keep the following guidelines in mind:
Every column name must be unique in a table.
Try not to use additional columns in the input tables required for the identity resolution operation.
Having extra columns slows down processing.
Per Snowflake guidelines, table names cannot begin with a number.
Metadata Table Columns and Descriptions
A metadata table can be reused for multiple operations, but a separate metadata table must be prepared for each different identity resolution operation you want to perform. For example, if you’re going to perform identity resolution on both MAIDs and hashed emails, you’ll need a different metadata table for each operation.
Metadata column names must match those shown in the table below. The column names are not case sensitive, and should not be enclosed in single or double quotation marks.
See the table below for a list of the metadata table columns and descriptions:
Example | ||
---|---|---|
| liveramp_client | |
| 84159be2-ab93-4bf8-24c9-2g123ef08815 | |
|
|
|
|
| Options include:
NoteCurrently, each identifier type has to be separated into its own input table (including CID types) and only one option above can be chosen for each metadata table. |
| { “name”: [ “FIRSTNAME”, “LASTNAME” ], “streetAddress”: [ “ADDRESSLINE” ], “city”: ”CITY”, “state”: “STATE”, “zipCode”: “ZIPCODE” } | Enter the column name for the input table column(s) which contains the IDs to be resolved. For PII resolution, enter a string that maps each identifier type to the column names for that identifier, separated by commas and enclosed with curly brackets, as shown in the example. When including multiple column names per identifier type, enclose those column names in straight brackets. Inputs for PII can include:
NoteIf resolving street addresses, zipcode is a required field along with either name or address. |
| 5 | For email or PII resolution only, you can specify the maximum number of RampIDs to be returned. Enter an integer between 1 and 10 to specify the maximum number of RampID results returned per input identifier (to return only the “best match”, returning 1 RampID is sufficient). For other identifiers (device identifiers or CIDs), do not include this column. |
Input Table Columns and Descriptions
The column names for the input table can be whatever you want to use, as long as the column name(s) for the identifiers matches the values specified in the TARGET_COLUMN
column of the metadata table. Do not use any column names that are the same as the columns names returned in the output table for the identity resolution operation you're going to run.
See the sections below for suggested input table columns and descriptions for each resolution type.
The output table is created by the operation that you run. For an example, see the sections in "View the Output Table" below.
Input Table Columns for PII Resolution
The PII resolution process operates similarly to device identifier and CID resolution, with a key difference in the data output: instead of returning the original identifier mapped to its associated RampID, data running through PII resolution will pass through a privacy filter which removes the PII and reswizzles the table. Because of this, any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.
Note
This offering includes running our Identity Graph on demand. Our testing indicates that a warehouse size of 4XL that is not being used for any other operations performs best for average workloads, but this is dependent on your individual setup. Contact your LiveRamp representative to understand the performance implications of your setup.
When resolving email data only, using the email-only resolution operation can provide higher throughput compared to full PII resolution. This is dependent on warehouse and setup, so talk with your LiveRamp team to determine the best approach for the use case.
These column names cannot be used in the input table for PII resolution:
RampID
Rank
Filter Name
See the table below for a list of the suggested input table columns and descriptions for PII resolution.
Suggested Column Name | Example | |
---|---|---|
| John | You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as “Name”). |
| Doe | You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as “Name”). |
| 123 Main St | |
| Apt 1 | You can include separate Address 1 and Address 2 columns or you can combine all street address information in one column (such as “Address”). |
| Smalltown | When matching on address, City is optional. |
| CA |
|
| 12345 |
|
| john@email.com |
|
| 555-123-4567 |
|
| For PII resolution, you can include columns with attribute data. These columns will be returned in the output table (for more information, see the "View the PII Resolution Output Table" section below). |
Input Table Columns for Email-Only Resolution
The email-only resolution process operates similarly to device identifier and CID resolution, with a key difference in the data output: instead of returning the original identifier mapped to its associated RampID, data running through email-only resolution will pass through a privacy filter which removes the PII and reswizzles the table. Because of this, any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.
Note
When resolving email data only, using email-only resolution can provide higher throughput compared to full PII resolution. This is dependent on warehouse and setup, so talk with your LiveRamp team to determine the best approach for the use case.
To perform identity resolution across additional PII touchpoints, see the “View the PII Resolution Output Table” section above.
See the table below for a list of the suggested input table columns and descriptions for email-only resolution.
Suggested Column Name | Example | Description |
---|---|---|
| a64bleicQ3MEYck2YtNUIyco1500QjJGLUszlUGlNjgtQjQ3QUEwMTNEMTA1CgNE |
|
| Male | For email address resolution, you can include columns with attribute data. These columns will be returned in the output table (for more information, see the "View the Email-Only Resolution Output Table" section below). |
Input Table Columns for Device Resolution
The device resolution operation can be used for the following purposes:
To translate device identifiers (cookies, MAIDs, and CTV IDs) into individual RampIDs
To translate individual RampIDs into their associated household RampIDs
See the tables below for a list of the suggested input table columns and descriptions for these device resolution options.
Note
Each device resolution input table should contain only one identifier column (either a device identifier or a maintained RampID).
You can include columns with attribute data, but these columns will not be returned in the output table.
See the table below for a list of the suggested input table columns and descriptions for translating device identifiers.
Suggested Column Name | Example | Description |
---|---|---|
| 1f4d256c-1f08-41f6-a108-bbe511de9497 | Can be one of the following identifiers:
|
See the table below for a list of the suggested input table columns and descriptions for translating individual RampIDs into their associated household RampIDs.
Suggested Column Name | Example | Description |
---|---|---|
| XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd |
|
Input Table Columns for CID Resolution
See the table below for a list of the suggested input table columns and descriptions for CID resolution.
Note
You can include columns with attribute data, but these columns will not be returned in the output table.
Suggested Column Name | Example | Description |
---|---|---|
| b916clarib la1;blNj10gtQjQ3QUEwMTNEMTcaktboEc0g9022cxoiaklr20185 | The CID for translation to a RampID |
Perform the Identity Resolution Operation
Once you’ve prepared the metadata table and the input table, you’re ready to perform the identity resolution operation.
You perform an identity resolution operation by running the identifier resolution procedure shown below, which includes checking that the output has succeeded. You can then view the output table to check the results.
The output tables vary somewhat, depending on the type of identfiers being resolved.
To perform the identity resolution operation:
On your worksheet, switch to the native app database and LR_APP_SCHEMA under it.
Locate the
lr_resolution_and_trancoding
procedure shown below and click on that code block.call lr_resolution_and_transcoding ( $customer_input_table_name, $customer_meta_table_name, $output_table_name, $customer_logging_table_name, $customer_metrics_table_name );
Click
.The resolution operation runs to completion.
// Check for output /* Check for output */ - call check_for_output( $output_table_name ); /* END SECTION */
Click
.Once Snowflake returns a status message of
success
, you can view the output table in the native app database underlr_app_schema
. See the sections below for more information.Note
If an
error
status message is returned, run theCheck for output
procedure again.
The results end up in the output table in the same database, with the fields shown in the appropriate section below.
View the Output Table
The identity resolution results end up in the output table in the same database you specified previously.
Once you've confirmed that the output table has been generated, see the appropriate section below for information on the output table format for the type of identity resolution operation that was run.
View the PII Resolution Output Table
The PII resolution process passes the input table through a privacy filter which removes the PII and reswizzles the table (in addition to other operations). Because of this, any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.
Identity resolution of PII provides supplemental match metadata for additional insight into customer data that can provide powerful signals for making decisions based on RampIDs.
For PII resolution, the output table includes the fields shown in the table below.
Column | Sample | Description |
---|---|---|
| XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd | Returns the resolved RampID in your domain encoding. |
| Male | Any attribute columns passed through the service are returned. |
| 1 | Provides insight on the “best” RampID for any given input (the lower the number, the “better” the RampID). If no maintained RampID is found, this value will be "null". |
| name_phone | Returns the filter name where the match occurred, which will be one of the following options:
If no maintained RampID is found, this value will be "null". |
View the Email-Only Resolution Output Table
The email-only resolution process operates similarly to device identifier and CID resolution, with a key difference in the data output: instead of returning the original identifier mapped to its associated RampID, data running through email-only resolution will pass through a privacy filter which removes the PII and reswizzles the table. Because of this, any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.
For email-only resolution, the results end up in the output table in the same database, with the following fields (as shown below):
RAMPID
(resolved email data)ATTRIBUTES
(based on other data passed through the service).
Column | Sample | Description |
---|---|---|
| XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd | Returns the resolved RampID in your domain encoding. |
| Male | Any attribute columns passed through the service are returned. |
View the Device Identifier Resolution Output Table
For device identifier resolution, the results end up in the output table in the same database, with the following fields (as shown below):
DEVICE_ID
(original input)RAMPID
(converted values).
| 93abc799-a0a5-40b5-80dd-d2ab61d4d072 | |
| XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd |
View the CID Resolution Output Table
For CID resolution, the results end up in the output table in the same database, with the following fields (as shown below):
CID_ID
(original input)RAMPID
(converted values).
Column | Sample | Description |
---|---|---|
| 93abc799-a0a5-40b5-80dd-d2ab61d4d072 | Original identifier passed in. |
| XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd | Returns the resolved RampID in your domain encoding. |
Privacy Filter
To minimize the risk of re-identification (the ability to tie PII directly to a RampID), the service includes the following processes when resolving PII identifiers (PII resolution or email-only resolution):
Column Values: The process evaluates the combination of all the column values on a per row basis for unique values. If a particular combination of column values occurs 3 or fewer times, the rows containing those column values will not be matchable and will not be returned in the output table.
>5% of the table unmatchable: If, based on column value uniqueness, >5% of the file rows are unmatchable, the job will fail.
Number of Unique RampIDs: If fewer than 100 unique RampIDs would be returned, the job will fail.
Reswizzle full table: Upon completion, the full table will be reswizzled to return the rows
RampID | attribute 1 | attribute 2 | attribute n
in a different order than what was submitted in the input table.