Perform Identity Resolution in Snowflake

Abstract

LiveRamp's Identity Resolution capability in Snowflake allows for the translation of various identifiers to RampIDs. This allows you to resolve personally-identifiable information (PII) or device identifiers to a persistent pseudonymous identifier for persons and households. You can also input an individual-based RampID and get back any household-based RampID that might be associated with that individual.

Note

For more information about RampID, see "RampID Methodology".

Once you've translated your data to RampIDs, you can then share that data to your LiveRamp account for activation. For more information, see "Share Data from Snowflake to Your LiveRamp Account".

The following identifiers can be resolved:

Names
Postal addresses
Email addresses
Phone numbers
Cookies
MAIDs (mobile device IDs)
CTV IDs (Connected TV Device IDs)
Datavant Tokens
CIDs (custom identifiers)
Person-based, maintained RampIDs (for resolution to household RampIDs)

Based on the type of identifier you’re resolving, you might receive one RampID per identifier or multiple RampIDs per identifier:

For PII touchpoints (in PII or email resolution), you can choose to receive from 1 to 10 RampIDs (if available).
For cookie and mobile device ID resolution, typically one RampID is returned (but shared touchpoints with more than one RampID can exist).
For CTV identifiers, it is common to receive multiple individual RampIDs per identifier.
For Datavant Tokens, multiple RampIDs can be returned per row. Additional RampID matches are more likely to be found as the number of Datavant Tokens provided per row increases.
For existing CID syncs, the number of RampIDs will vary.
When resolving individual RampIDs to household RampIDs, only one household RampID is returned.
When resolving a universe dataset with deconfliction, multiple touchpoints can be used and RampIDs will be returned that are ranked most relevant based on how active the linkages are in the digital ecosystem and other factors. This can be particularly useful for large datasets where overconnected RampIDs can introduce noise into analytics. For more information, see the “Deconfliction Options” section below.

These capabilities are available within Snowflake through a native app, which creates a share to your account, opening up a view to query the reference data set from within your own Snowflake environment. See "LiveRamp Embedded Identity in Snowflake" for more information.

Performing identity resolution with the LiveRamp Identity native app requires the creation of two tables:

A metadata table, indicated in the sample SQL as customer_meta_table_name.
An input table, indicated in the sample SQL as customer_input_table_name.

Deconfliction Options

Deconfliction refers to an optional identity resolution configuration that optimizes the fit of the LiveRamp graph to your definition of a customer, which is particularly beneficial for universe datasets.

Deconfliction attempts to reduce the number of “conflicting” RampIDs for each individual record in a universe dataset. A RampID is considered conflicting if the same RampID is linked to the record for another user (such as a family member) in your system. Deconfliction attempts to include only the RampIDs that are most relevant to an individual record and reduce the number of “shared” RampIDs.

To utilize deconfliction, you’ll need to provide a CID (custom ID) for each record, and for the configuration to drive the highest value, the full universe is required to deconflict across the table. The output you receive will include the best fidelity RampIDs for each CID.

There are four deconfliction options you can choose from:

Standard: This configuration returns the RampIDs that are determined to be most relevant and removes CIDs that are determined to be duplicates (based on linking to the same RampID). This is the default option and covers most advertiser use cases.
Maximized first-party fidelity: This configuration returns the RampIDs that are determined to be most relevant but preserves additional CIDs even if there are RampIDs that indicate LiveRamp could consolidate them. This option is ideal for publishers and other data owners.
Household expansion: This configuration allows clients with a household-based CID to resolve and deconflict with individual RampIDs that represent the full household. This option is ideal for retailers with household-based loyalty programs, as an example.
Event integrity: This configuration preserves all CIDs, meaning identity conflicts are minimized but not eliminated. This option is ideal for CIDs that represent transaction data, as an example.

For more information on deconfliction, see “Using Deconfliction on a Universe Dataset”.

Overall Steps

After you've set up the LiveRamp Identity native app in Snowflake (see "Set Up the LiveRamp Native App in Snowflake" for instructions), perform the following steps to perform identity resolution:

Create the input table(s) for the appropriate identity resolution operation. When resolving a full universe dataset with deconfliction, mixed job types are an option. This allows for multiple input tables and/or multiple identifiers to be processed in one call. For more information, see the “Input Table Columns for a Universe Dataset with Deconfliction” section below.
Note
An input table needs to be prepared for each identity resolution operation and can only contain one type of identifier. When resolving a full universe dataset with deconfliction, mixed job types are an option. This allows for multiple input tables and/or multiple identifiers to be processed in one call. For more information, see the “Input Table Columns for Universe Dataset Resolution with Deconfliction” section below.
Specify the variables to be used in the calls.
Create the metadata table for the appropriate identity resolution operation.
Note
A metadata table can be reused for multiple operations, but a separate metadata table must be prepared for each different job type you want to perform. For example, if you’re going to perform identity resolution on MAIDs and hashed emails, you’ll need a different metadata table for each operation. If you plan to resolve your entire universe with deconfliction, the metadata table can include configurability for multiple input tables and identifier types. For more information, see the “Create the Metadata Table for Universe Dataset Resolution with Deconfliction” section below.
Set up permissions for the tables to be used for identity resolution.
Perform the appropriate identity resolution process, depending on the identifiers being resolved.
View the output table.

See the sections below for information on performing these tasks.

Note

The LiveRamp Identity native app is parameterized and relies on variables set by the user with the sample SQL in the Execution worksheet. When executing an identity resolution operation, those variables must be set during each active session.

Create the Input Table for Identity Resolution

An input table needs to be prepared for each identity resolution operation. For universe dataset resolution, one or multiple tables can be used in the process. For more information, see the “Input Table Columns for Universe Dataset Resolution with Deconfliction” section below.

When creating tables, keep the following guidelines in mind:

The column names for the input table can be whatever you want to use, as long as the names match the values specified in the metadata table.
Do not use any column names that are the same as the column names returned in the output table for the identity resolution operation you're going to run.
Every column name must be unique in a table.
Try not to use additional columns in the input tables required for the identity resolution operation as having extra columns slows down processing.
Per Snowflake guidelines, table names cannot begin with a number.

See the sections below for suggested input table columns and descriptions for each resolution type.

The output table is created by the operation that you run. For an example, see the sections in "View the Output Table" below.

Input Table Columns for PII Resolution (Without Deconfliction)

The standard PII resolution process passes the data through a privacy filter which removes the PII and reswizzles the table. Because of this, any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.

Note

This offering includes running our Identity Graph on demand. Our testing indicates that a warehouse size of 2XL that is not being used for any other operations performs best for average workloads, but this is dependent on your individual setup. Contact your LiveRamp representative to understand the performance implications of your setup.
When resolving email data only, using the email-only resolution operation can provide higher throughput compared to full PII resolution. This is dependent on warehouse and setup, so talk with your LiveRamp team to determine the best approach for the use case.
Utilizing hashed attributes requires a LiveRamp Data Ethics review and an attestation. We will also work with your team to confirm the separation of known and pseudonymous data prior to enabling permissions.

These column names cannot be used in the input table for PII resolution:

RampID
__lr_rank
__lr_filter_ name

See the table below for a list of the suggested input table columns and descriptions for PII resolution.

Suggested Column Name	Example	Description
`first_name`	John	You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as "Name").
`last_name`	Doe	You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as "Name").
`address_1`	123 Main St
`address_2`	Apt 1	You can include separate Address 1 and Address 2 columns or you can combine all street address information in one column (such as "Address").
`city`	Smalltown	When matching on address, City is optional.
`state`	CA	When matching on address, State is optional. If including State, must be a two-character, capitalized abbreviation ("CA", not "California" or "Ca").
`zip`	12345	Required when matching on addresses. Can be in 5-digit format or 9-digit format (ZIP+4).
`email`	john@email.com	Plaintext emails only. Only one email per input row is permitted. Other emails must be dropped or included in an additional row. If you include an additional row, repeat the values for the name fields for the best match rates. All emails must meet these requirements: Have characters before and after the "@" sign Contain a period character (".") Have characters after the period character Examples of valid emails include: a@a.com A@A.COM email@account.com EMAIL@ACCOUNT.COM email@sub.domain.com EMAIL@SUB.DOMAIN.COM
`phone`	555-123-4567	Plaintext phone numbers only. Only one phone number per input row is permitted. Other phone numbers must be dropped or included in an additional row. If you include an additional row, repeat the values for the name fields for the best match rates. All phone numbers must meet these requirements: Can be more than 10 characters if leading numbers over 10 characters are “0” or “1” If no leading numbers are used, must be 10 characters long Can contain hyphens ("-"), parentheses ("(" or ")"), plus signs ("+"), and periods (".") Examples of valid phone numbers include: 8668533267 866.853.3267 (866) 853-3267 8668533267 +1 (866) 853-3267 +18668533267 18668533267 1111111118668533267 08668533267 Examples of invalid phone numbers include: 987654321 (fewer than 10 characters) 98765432109 (more than 10 characters) 1234567890 (after removing the leading "1", less than 10 characters remain) 0987654321 (after removing the leading "0", less than 10 characters remain)
`attribute_1`		For PII resolution, you can include columns with attribute data. These columns will be returned in the output table (for more information, see the "View the PII Resolution Output Table (Without Deconfliction)" section below). If you specify that an attribute column should be hashed, it will appear in the output table with a prefix of "hashed_". The input table must not include a column with the same name as the name of the hashed column in the output table.

Input Table Columns for Email-Only Resolution

The standard email-only resolution process operates similarly to PII resolution. Any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.

Note

When resolving email data only, using email-only resolution can provide higher throughput compared to full PII resolution. Talk with your LiveRamp team to determine the best approach for the use case.
To perform identity resolution across additional PII touchpoints, see the “View the PII Resolution Output Table (Without Deconfliction)” section above.

See the table below for a list of the suggested input table columns and descriptions for email-only resolution.

Suggested Column Name	Example	Description
`hashed_email`	8c9775a5999b5f0088008c0b26d7fe8549d5c80b0047784996a26946abac0cef	SHA-256, MD5, and SHA-1 hashed emails accepted. Email addresses should be lowercased and UTF-8 encoded prior to hashing. After hashing, convert the resulting hash into lowercase hexadecimal representation. For an example of hashing in Snowflake, see the "Snowflake Hashing Example" section below.
`attribute_1`	Male	For email address resolution, you can include columns with attribute data. These columns will be returned in the output table (for more information, see the "View the Email-Only Resolution Output Table" section below).

Input Table Columns for Device ID Resolution

The device ID resolution operation can be used for the following purposes:

To translate device identifiers (cookies, MAIDs, and CTV IDs) into individual RampIDs
To translate individual RampIDs into their associated household RampIDs

See the tables below for a list of the suggested input table columns and descriptions for these device ID resolution options.

Note

Each device ID resolution input table should contain only one identifier column (either a device identifier or a maintained RampID).
You can include columns with attribute data, but these columns will not be returned in the output table.

See the table below for the suggested input table column and description for translating device identifiers.

Suggested Column Name	Example	Description
`device_identifier`	1f4d256c-1f08-41f6-a108-bbe511de9497	Can be one of the following identifiers: Cookie MAID CTV ID

See the table below for a list of the suggested input table columns and descriptions for translating individual RampIDs into their associated household RampIDs.

Suggested Column Name	Example	Description
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	The RampID for translation to a Household RampID. Must be a maintained RampID (to have an associated with a Household RampID).

Input Table Columns for Datavant Token Resolution

See the table below for the suggested input table column and description for Datavant Token Resolution. This process enables the retrieval of an existing Datavant Token to RampID mapping, hosted by LiveRamp.

Note

You can include columns with attribute data, but these columns will not be returned in the output table.

Suggested Column Name	Example	Description
`token_{token_number}`	5qkWMLqO2fXdcO/G+bhU6+ppJEfFRONVE9EwdLk7qy0=	A Datavant transit token from your site to ‘liveramp’. These are created by transforming your site-specific tokens to transit tokens using Datavant’s Transform functionality. The number of token columns are determined by the configuration you’ve set up in Datavant’s portal. The following token types are accepted: token_1 token_2 token_3 token_4 token_6 token_7 token_8 token_9 token_12 token_14 token_15 token_17 token_18 token_20 token_21 token_22 token_23 token_24 token_25 token_26 token_29 token_30
`attribute_1`	orange	For Datavant resolution, you can include columns with attribute data. These columns will be returned in the output table.

Suggested Column Name

Example

Description

token_{token_number}

5qkWMLqO2fXdcO/G+bhU6+ppJEfFRONVE9EwdLk7qy0=

A Datavant transit token from your site to ‘liveramp’. These are created by transforming your site-specific tokens to transit tokens using Datavant’s Transform functionality. The number of token columns are determined by the configuration you’ve set up in Datavant’s portal.

The following token types are accepted:

token_1
token_2
token_3
token_4
token_6
token_7
token_8
token_9
token_12
token_14
token_15
token_17
token_18
token_20
token_21
token_22
token_23
token_24
token_25
token_26
token_29
token_30

attribute_1

orange

For Datavant resolution, you can include columns with attribute data. These columns will be returned in the output table.

Input Table Columns for CID Matching

See the table below for the suggested input table column and description for CID matching. This process enables the retrieval of an existing CID to RampID mapping, hosted by LiveRamp.

Note

You can include columns with attribute data, but these columns will not be returned in the output table.

Suggested Column Name	Example	Description
`cid`	b916clarib la1;blNj10gtQjQ3QUEwMTNEMTcaktboEc0g9022cxoiaklr20185	The CID for translation to a RampID

Input Table Columns for Universe Dataset Resolution with Deconfliction

This process allows for resolution of a universe dataset with a deconfliction configuration, minimizing conflicts across the entire dataset. The output from this process is a deconflicted hashed CID to RampID mapping.

When resolving a universe dataset, you can use one input table or multiple input tables. Each input table can include one identifier type or multiple identifier types (including PII, hashed email, or MAIDs). The examples below can be used for the specific situations listed but you can create an input table that uses any combination of these identifiers.

Note

You can include columns with attribute data, but these columns will not be returned in the output table.

The requirements for each identifier type (listed in the relevant sections above) apply here. For example, any PII-based records must include zip in order to match on address and hashed emails must be lowercased and UTF-8 encoded prior to hashing.

.See the table below for a list of the suggested input table columns for a job that will contain both hashed emails and MAIDs with deconfliction.

Suggested Column Name	Example	Description
`cid`	g221lariab la8;blNj10gtQjQ3QUEwMTNEMTcaktboEc0g9022cxoiaklr91054	A custom identifier that represents an individual in the dataset. Data format is UTF-8 compliant alphanumeric string of up to 256 characters
`hashed_email`	8c9775a5999b5f0088008c0b26d7fe8549d5c80b0047784996a26946abac0cef	SHA-256, MD5, and SHA-1 hashed emails accepted. Email addresses should be lowercased and UTF-8 encoded prior to hashing. After hashing, convert the resulting hash into lowercase hexadecimal representation. For an example of hashing in Snowflake, see the "Snowflake Hashing Example" section below.
`device_identifier`	1f4d256c-1f08-41f6-a108-bbe511de9497	MAIDs are the only supported device identifier at this time

See the table below for a list of the suggested input table columns for a job that will contain plaintext PII.

Suggested Column Name	Example	Description
`cid`	`cid`	A custom identifier that represents an individual in the dataset. Data format is UTF-8 compliant alphanumeric string of up to 256 characters.
`first_name`	John	You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as "Name").
`last_name`	Doe	You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as "Name").
`address_1`	123 Main St
`address_2`	Apt 1	You can include separate Address 1 and Address 2 columns or you can combine all street address information in one column (such as "Address").
`city`	Smalltown	When matching on address, City is optional.
`state`	CA	When matching on address, State is optional. If including State, must be a two-character, capitalized abbreviation ("CA", not "California" or "Ca").
`zip`	12345	Required when matching on addresses. Can be in 5-digit format or 9-digit format (ZIP+4).
`email`	john@email.com	Plaintext emails only. Only one email per input row is permitted. Other emails must be dropped or included in an additional row. If you include an additional row, repeat the values for the name fields for the best match rates. All emails must meet these requirements: Have characters before and after the “@” sign Contain a period character (“.”) Have characters after the period character Examples of valid emails include: a@a.com A@A.COM email@account.com EMAIL@ACCOUNT.COM email@sub.domain.com EMAIL@SUB.DOMAIN.COM
`phone`	555-123-4567	Plaintext phone numbers only. Only one phone number per input row is permitted. Other phone numbers must be dropped or included in an additional row. If you include an additional row, repeat the values for the name fields for the best match rates. All phone numbers must meet these requirements: Can be more than 10 characters if leading numbers over 10 characters are “0” or “1” If no leading numbers are used, must be 10 characters long Can contain hyphens ("-"), parentheses ("(" or ")"), plus signs ("+"), and periods (".") Examples of valid phone numbers include: 8668533267 866.853.3267 (866) 853-3267 8668533267 +1 (866) 853-3267 +18668533267 18668533267 1111111118668533267 08668533267 Examples of invalid phone numbers include: 987654321 (fewer than 10 characters) 98765432109 (more than 10 characters) 1234567890 (after removing the leading "1", less than 10 characters remain) 0987654321 (after removing the leading "0", less than 10 characters remain)

Snowflake Hashing Example

The following code snippet shows an example of hashing emails in Snowflake:

-- SHA-1
SELECT SHA1(TRIM(LOWER('  LiveRamp@example.com  '))); 
-- Expected result: 91ac4ee2ca1782581f12d865a6779eb179f8b22a

-- MD5
SELECT MD5(TRIM(LOWER('  LiveRamp@example.com  ')));
-- Expected result: 39c324aa0c7a3ee896884fe0cf086f0c

-- SHA256
SELECT SHA2(TRIM(LOWER('  LiveRamp@example.com  '))); 
-- Expected result: 28324c709525ec8eda8aac51dfb36730262bc3051402250131c4c81fa453df8c

Specify the Variables

To specify the variables to be used for the operation:

Open the Execution Steps worksheet with the sample SQL for execution.
Update the following variables in the sample SQL that is shown below and then run the SQL:
- DATABASE: The name of your database.
- PUBLIC: The name of the schema that holds the tables.
- INPUT_TABLE: The name of the input table(s) to use for the operation.
- META_TABLE: The name of the metadata table to use for the operation.
- OUTPUT_TABLE: The name of the output table that will be created after the operation has been run.
- IDENTITY_RESOLUTION_AND_TRANSCODING: The name of the database the native app is loaded to.

--Update this section with the appropriate variables
set customer_db_name = 'DATABASE';
set customer_schema_name = concat($customer_db_name, '.', 'PUBLIC');
set customer_input_table_name = concat($customer_schema_name, '.', 'INPUT_TABLE');
-- If there are multiple input tables add additional variables
set customer_input_table_name_2 = concat($customer_schema_name, '.', 'INPUT_TABLE_2');
set customer_meta_table_name = concat($customer_schema_name, '.', 'META_TABLE');
set output_table_name = 'OUTPUT_TABLE';

-- Name of the installed application
set application_name = 'IDENTITY_RESOLUTION_AND_TRANSCODING';

Create the Metadata Table

A metadata table can be reused for multiple operations, but a separate metadata table must be prepared for each different identity resolution operation you want to perform. For example, if you’re going to perform identity resolution on both MAIDs and hashed emails, you’ll need a different metadata table for each operation.

Note

For the universe dataset with deconfliction operation, the metadata table can be reused for multiple operations and can include multiple identifier types. For example, if the universe dataset includes both MAIDs and hashed emails, you can create one metadata table that specifies multiple input tables.

See the sections below for instructions on creating the metadata table.

Create the Metadata Table for PII Resolution (Without Deconfliction)

To create the metadata table for PII resolution:

Update the following variables in the sample SQL from the Execution worksheet shown below and then run the SQL:
- <client_id>: Enter either an existing client ID or a new one provided in implementation.
- <client_secret>: Enter the password/secret for the client ID.
- <up to 4 name column names>: Enter the names of the columns in the input table to be used for the “name” element. Each input table column name should be enclosed in double-quotes. Enter a maximum of 4 name columns. If entering multiple column names, separate the column names with commas.
- <up to 7 address column names>: Enter the names of the columns in the input table to be used for the "address" element. Each input table column name should be enclosed in double-quotes. Enter a maximum of 7 address columns. If entering multiple column names, separate the column names with commas.
- <city column>: Enter the name of the column to be used for the "city" element.
- <state column>: Enter the name of the column to be used for the "state" element.
- <zipcode column>: Enter the name of the column to be used for the "zipcode" element.
- <phone column>: Enter the name of the column to be used for the "phone" element.
- <email column>: Enter the name of the column to be used for the "email" element.
- up to 10 hashed attribute column names: Enter the names of the attribute column(s) in the input table that should be passed through to the output in hashed format. Each input table column name should be enclosed in double quotes. Enter a maximum of 10 hashed attribute columns. If entering multiple column names, separate the column names with commas.

Note

You can specify the number of RampIDs to be returned for each set of PII, to a maximum of 10 RampIDs per PII set. By default, 1 RampID will be returned per PII set. To change this, replace the “1” in the “as limit_val” line at the bottom of the SQL sample shown below with your desired number.

-- FORFor PII Update the parameters here for the metadata table
-- Not all PII types need to be present, remove unused entries from the target_columns JSON
create or replace table identifier($customer_meta_table_name) as
select
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_id>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_id,
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_secret>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_secret,
    'resolution' as execution_mode,
    'pii' as execution_type,
    parse_json($$
    {
      "name": ["<up to 4 name column names>"],
      "streetAddress": ["<up to 7 address column names>"],
      "city": "<city column>",
      "state": "<state column>",
      "zipCode": "<zipcode column>",
      "phone": "<phone column>",
      "email": "<email column>",
      "hashedAttributes": ["<up to 10 hashed attribute column names>"]
    }
    $$) as target_columns,
    1 as limit;

Example of PII Resolution Metadata Table SQL

The populated SQL with the suggested input column names might look like the example shown below:

-- FOR PII Update the parameters here for the metadata table
-- Not all PII types need to be present, remove unused entries from the target_columns JSON
create or replace table identifier($customer_meta_table_name) as
select
    TO_VARCHAR(DECRYPT(ENCRYPT('liveramp_client', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_id,
    TO_VARCHAR(DECRYPT(ENCRYPT('84159be2-ab93-4bf8-24c9-2g123ef08815', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_secret,
    'resolution' as execution_mode,
    'pii' as execution_type,
    parse_json($$
    {
      "name": ["first_name",”"last_name”"],
      "streetAddress": ["address_1",”"address_2"],
      "city": "city",
      "state": "state",
      "zipCode": "zip",
      "phone": "phone",
      "email": "email",
      "hashedAttributes": ["cid"]
    }
    $$) as target_columns,
    1 as limit;

Create the Metadata Table for Email Resolution

To create the metadata table for email resolution:

Update the following variables in the sample SQL from the Execution worksheet shown below and then run the SQL:
- <client_id>: Enter either an existing client ID or a new one provided in implementation.
- <client_secret>: Enter the password/secret for the client ID.
- <column to be resolved>: Enter the name of the column containing the email addresses to be resolved.

Note

You can specify the number of RampIDs to be returned for each email, to a maximum of 10 RampIDs per email. By default, 1 RampID will be returned per email. To change this, replace the "1" in the "as limit_val" line at the bottom of the SQL sample shown below with your desired number.

-- email
-- FOR EMAIL Update the parameters here for the metadata table
create or replace table identifier($customer_meta_table_name) as
select
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_id>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_id,
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_secret>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_secret,
    'resolution' as execution_mode,
    'email' as execution_type,
    '<column to be resolved>' as target_column,
    1 as limit;

Create the Metadata Table for Device ID Resolution

To create the metadata table for device ID resolution:

Update the following variables in the sample SQL from the Execution worksheet shown below and then run the SQL:
- <client_id>: Enter either an existing client ID or a new one provided in implementation.
- <client_secret>: Enter the password/secret for the client ID.
- <execution_type>: Enter the execution type:
  - Enter Cookies to resolve cookies
  - Enter MAID to resolve mobile device IDs
  - Enter CTV to resolve CTV IDs
  - Enter HHLink to resolve individual RampIDs into household RampIDs
- <column to be resolved>: Enter the name of the column containing the device identifiers (cookies, MAIDs, or CTV IDs) or RampIDs to be resolved.

--FOR DEVICE OR CID RESOLUTION Update the parameters here for the metadata table
create or replace table identifier($customer_meta_table_name) as
select
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_id>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_id,
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_secret>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_secret,
    'resolution' as execution_mode,
    '<execution_type>' as execution_type,
    '<column to be resolved>' as target_column;

Create the Metadata Table for Datavant Token Resolution

To create the metadata table for Datavant Token resolution:

Update the following variables in the sample SQL from the Execution worksheet shown below and then run the SQL:
- <client_id>: Enter either an existing client ID or a new one provided in implementation.
- <client_secret>: Enter the password/secret for the client ID.
- <token_type>: Enter type of token. Valid values are the following:
  - token_1
  - token_2
  - token_3
  - token_4
  - token_6
  - token_7
  - token_8
  - token_9
  - token_12
  - token_14
  - token_15
  - token_17
  - token_18
  - token_20
  - token_21
  - token_22
  - token_23
  - token_24
  - token_25
  - token_26
  - token_29
  - token_30
- <token_col>: Enter the name of the token column that contains tokens of the token type

--FOR DEVICE OR CID RESOLUTION Update the parameters here for the metadata table
create or replace table identifier($customer_meta_table_name) as
select
    TO_VARCHAR(DECRYPT(ENCRYPT('', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_id,
    TO_VARCHAR(DECRYPT(ENCRYPT('', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_secret,
    'resolution' as execution_mode,
    'datavant' as execution_type,
    parse_json($$
    {
      "": "",
	"": "",
      ... (add key value pair for each token type to token column name being provided)
    }
    $$) as target_columns;

Create the Metadata Table for CID Matching

To create the metadata table for CID matching to an existing CID sync with LiveRamp:

Update the following variables in the sample SQL from the Execution worksheet shown below and then run the SQL:
- <client_id>: Enter either an existing client ID or a new one provided in implementation.
- <client_secret>: Enter the password/secret for the client ID.
- <execution_type>: Enter the CID name provided to you by LiveRamp and specific to the mapping created.
- <column to be resolved>: Enter the name of the column containing the CIDs to be mapped.

--FOR DEVICE OR CID RESOLUTION Update the parameters here for the metadata table
create or replace table identifier($customer_meta_table_name) as
select
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_id>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_id,
    TO_VARCHAR(DECRYPT(ENCRYPT('<client_secret>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8') as client_secret,
    'resolution' as execution_mode,
    '<execution_type>' as execution_type,
    '<column to be resolved>' as target_column;

Create the Metadata Table for Universe Dataset Resolution with Deconfliction

To create the metadata table for universe dataset resolution with deconfliction, including the ability to use one or multiple input tables:

Update the following variables in the sample SQL from the Execution worksheet shown below and then run the SQL:
- <client_id>: Enter either an existing client ID or a new one provided in implementation.
- <client_secret>: Enter the password/secret for the client ID.
- inputs: Enter the details regarding the input tables and target columns. The sample SQL shows two input tables, one containing PII and the other containing hashed emails and MAIDs as specified by the targetColumns. Additional tables can be added as separate entries in the inputs array. Each table can have a mix of any of the supported column types, although all must contain a cid column, and the name of the cid column must be the same in all tables.
- <deconflictionConfig>: Enter one of the four deconfliction options listed below:
  - standard: This configuration returns the RampIDs that are determined to be most relevant and removes CIDs that are determined to be duplicates (based on linking to the same RampID). This is the default option and covers most advertiser use cases.
  - maximized_1P_fidelity: This configuration returns the RampIDs that are determined to be most relevant but preserves additional CIDs even if there are RampIDs that indicate LiveRamp could consolidate them. This option is ideal for publishers and other data owners.
  - hh_expansion: This configuration allows clients with a household-based CID to resolve and deconflict with individual RampIDs that represent the full household. This option is ideal for retailers with household-based loyalty programs, as an example.
  - event_integrity: This configuration preserves all CIDs, meaning identity conflicts are minimized but not eliminated. This option is ideal for CIDs that represent transaction data, as an example.
  Note
  For more information on deconfliction, see “Using Deconfliction on a Universe Dataset”.

CREATE OR REPLACE TABLE identifier($customer_meta_table_name) as
SELECT {
       'clientId': TO_VARCHAR(DECRYPT(ENCRYPT('<client_id>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8'),
       'clientSecret': TO_VARCHAR(DECRYPT(ENCRYPT('<client_secret>', 'HideFromLogs'), 'HideFromLogs'), 'utf-8'),
       'executionMode': 'resolution',
       'executionType': 'cid_mapping',
       'inputs': [
           {
               'inputTable': $customer_input_table_name,
               'targetColumns': {
                   -- For PII deconfliction  
                   'name': ['<up to 4 name column names>'],
                   'streetAddress': ['<up to 7 address column names>'],
                   'city': '<city column>',
                   'state': '<state column>',
                   'zipCode': '<zipcode column>',
                   'phone': '<phone column>',
                   'email': '<email column>',
                   -- CID column
                   'cid': 'cid_col'
               }
           },
           {
               'inputTable': $customer_input_table_name_2,
               'targetColumns': {
                   -- For Hashed Email deconfliction
                   'hashed_email': '<hashed email column>',
                   -- For Maid deconfliction
                   'maid': '<maid column>',
                   -- CID column
                   'cid': 'cid_col'
               }
           }
       ],
       'config': {'deconflictionConfig': '<deconfliction configuration>'},
       'outputTable': $output_table_name
   }::variant as config;

Set Up Permissions

To set up the permissions for the tables used for translation, run the SQL in the Execution Steps worksheet shown below:

Note

This SQL utilizes the variables that were set up in the “Specify the Variables” section above.

--The remainder of the commands should be run for ALL JOB TYPES, please switch to the Native App database and schema and execute the procedure.  Once completed, please run check_for_output for the output table to be written in the appropriate Job Schema

grant usage on database identifier ($customer_db_name) to application identifier($application_name);
grant usage on schema identifier ($customer_schema_name) to application identifier($application_name);
grant select on table identifier ($customer_input_table_name) to application identifier($application_name);
--If there are multiple input tables grant permission for all of them here
grant select on table identifier ($customer_input_table_name_2) to application identifier($application_name)
grant select on table identifier ($customer_meta_table_name) to application identifier($application_name);


use database identifier ($application_name);
use schema lr_app_schema;

Perform the Identity Resolution Operation

Once you’ve completed the previous steps, you’re ready to perform the identity resolution operation.

You perform an identity resolution operation by running the identifier resolution procedure shown below. You can then view the output table to check the results.

The output tables vary somewhat, depending on the type of identifiers being resolved.

To perform the identity resolution operation:

Locate the appropriatelr_resolution_and_trancoding procedure shown below and run that SQL:
- For all job types except resolving a universe dataset:
```
call lr_resolution_and_transcoding(
	$customer_input_table_name,
	$customer_meta_table_name,
	$output_table_name,
);
```
- For resolving a universe dataset:
```
call lr_resolution_and_transcoding($customer_meta_table_name);
```
The operation runs to completion.

Once the app returns a success message, the output should be displayed in the native app database under lr_app_schema.

If Snowflake returns a status message of Error, check the error message for any information to help you fix the issue and then try running the operation again. For some issues, the error message will direct you to contact LiveRamp Support. For more information, see "Snowflake Operation Error Codes".

The results end up in the output table in the same database, with the fields shown in the appropriate section below.

View the Output Table

The identity resolution results end up in the output table in the application database under the schema “lr_job_schema”.

Once you've confirmed that the output table has been generated, see the appropriate section below for information on the output table format for the type of identity resolution operation that was run.

If for any reason you need to drop the output table, update the parameters in the following command and run:

call DROP_OUTPUT_TABLE(
    '<table_name>'
);

View the PII Resolution Output Table (Without Deconfliction)

The PII resolution process passes the input table through a privacy filter which removes the PII and reswizzles the table (in addition to other operations). Because of this, any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.

Identity resolution of PII provides supplemental match metadata for additional insight into customer data that can provide powerful signals for making decisions based on RampIDs.

For PII resolution, the output table includes the fields shown in the table below.

Column	Example	Description
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	Returns the resolved RampID in your domain encoding.
`attribute_1`	Male	Any attribute columns passed through the service are returned.
`hashed_cid`	63889cfb9d3cbe05d1bd2be5cc9953fd	Any hashed attribute columns passed through the service are returned with their values MD5 hashed.
`__lr_rank`	1	Provides insight on the match cascade level associated with the identifiers. If no maintained RampID is found, this value will be "null".
`__lr_filter_name`	name_phone	Returns the filter name where the match occurred, which will be one of the following options: `name_address_zip` `name_email` `name_phone` `partial_name_email` `partial_name_phone` `strict_name` (name + zip) `email` `phone` `last_name_address` If no maintained RampID is found, this value will be "null".

View the Email-Only Resolution Output Table

The email-only resolution process operates similarly to PII resolution. Any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.

For email-only resolution without deconfliction, the output table includes the fields shown in the table below.

Column	Example	Description
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	Returns the resolved RampID in your domain encoding.
`attribute_1`	Male	Any attribute columns passed through the service are returned.

View the Device Identifier Resolution Output Table

For device identifier resolution, the output table includes the fields shown in the table below:

Column	Example	Description
`device_id`	93abc799-a0a5-40b5-80dd-d2ab61d4d072
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd

View the Datavant Token Resolution Output Table

For Datavant Token resolution, the output table includes the fields shown in the table below:

Column	Example	Description
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd

View the CID Matching Output Table

For CID matching against an existing CID sync with LiveRamp, the output table includes the fields shown in the table below:

Column	Sample	Description
`cid_id`	93abc799-a0a5-40b5-80dd-d2ab61d4d072	Original identifier passed in.
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	Returns the resolved RampID in your domain encoding.

View the Universe Dataset Resolution Output Table

For universe dataset resolution with deconfliction, the output table includes the fields shown in the table below.

Column	Example	Description
`hashed_cid`	93abc799-a0a5-40b5-80dd-d2ab61d4d072	The CID passed into the process will be returned as an MD5-hashed CID.
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	Returns the resolved and deconflicted RampID in your domain encoding.

For household expansion deconfliction configurations, additional metadata is provided to indicate RampIDs returned that are part of the same household, but not necessarily the same person-based identifier from the PII input. The output table for this configuration type includes the fields as shown in the table below.

Column

Example

Description

hashed_cid

93abc799-a0a5-40b5-80dd-d2ab61d4d072

The CID passed into the process will be returned as an MD5-hashed CID.

RampID

XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd

Returns the resolved and deconflicted RampID in your domain encoding.

is_appended

TRUE

Returns the value “TRUE” for RampIDs appended because they belong to the same household but are not directly associated with the PII input.

Returns the value “FALSE” if the RampID represents the individual associated with the PII input.

Privacy Filter

To minimize the risk of re-identification (the ability to tie PII directly to a RampID), the service includes the following processes when resolving PII identifiers (PII resolution or email-only resolution):

Column Values: The process evaluates each column value on a per-row basis for unique values. If any attribute occurs 3 or fewer times, the rows containing those column values will not be matchable and will not be returned in the output table.
Note
This check does not apply to hashed attributes.
>5% of the table unmatchable: If based on column value uniqueness, >5% of the file rows are unmatchable, the job will fail.
Number of Unique RampIDs: If fewer than 100 unique RampIDs would be returned, the job will fail.
Reswizzle full table: Upon completion, the full table will be reswizzled to return the rows RampID | attribute_1 | attribute_2 | attribute_n in a different order than what was submitted in the input table.

Note

When resolving a universe dataset with deconfliction, attributes are not preserved so the privacy filter is not applied. Any PII and hashed email input to a universe dataset with deconfliction still requires at least 100 unique input rows per identifier per file.

In this section:

Perform Identity Resolution in Snowflake

Note

Deconfliction Options

Overall Steps

Note

Note

Note

Create the Input Table for Identity Resolution

Input Table Columns for PII Resolution (Without Deconfliction)

Note

Input Table Columns for Email-Only Resolution

Note

Input Table Columns for Device ID Resolution

Note

Input Table Columns for Datavant Token Resolution

Note

Input Table Columns for CID Matching

Note

Input Table Columns for Universe Dataset Resolution with Deconfliction

Note

Snowflake Hashing Example

Specify the Variables

Create the Metadata Table

Note

Create the Metadata Table for PII Resolution (Without Deconfliction)

Note

Example of PII Resolution Metadata Table SQL

Create the Metadata Table for Email Resolution

Note

Create the Metadata Table for Device ID Resolution

Create the Metadata Table for Datavant Token Resolution

Create the Metadata Table for CID Matching

Create the Metadata Table for Universe Dataset Resolution with Deconfliction

Note

Set Up Permissions

Note

Perform the Identity Resolution Operation

View the Output Table

View the PII Resolution Output Table (Without Deconfliction)

View the Email-Only Resolution Output Table

View the Device Identifier Resolution Output Table

View the Datavant Token Resolution Output Table

View the CID Matching Output Table

View the Universe Dataset Resolution Output Table

Privacy Filter

Note

Note

Search results