Skip to main content

Perform Identity Resolution in Snowflake

LiveRamp's Identity Resolution application in Snowflake allows for the translation of various identifiers to RampIDs. This allows you to resolve device identifiers or email addresses to a persistent pseudonymous identifier for persons and households. Users can also input an individual-based RampID and get back any household-based RampID that might be associated with that individual.

Note

For more information about RampID, see "RampID Methodology".

The following identifiers can be resolved:

  • Cookies

  • MAIDs (mobile device IDs)

  • CTV IDs (Connected TV Device IDs)

  • Email addresses (SHA-256 hashed)

  • Person-based, maintained RampIDs (for resolution to household RampIDs)

These capabilities are available within Snowflake through a native app, which creates a share to your account, opening up a view to query the reference data set from within your own Snowflake environment. See "LiveRamp Embedded Identity in Snowflake" for more information.

Overall Steps

After you've set up the identity resolution native app in Snowflake (see "Set Up a Native App" for instructions), perform the following steps to perform identity resolution:

  1. Verify that you've performed all the native app setup tasks in the completion checklist.

  2. Prepare and deploy the input and metadata tables for identity resolution.

  3. Perform identity resolution.

See the sections below for information on performing these tasks.

Completion Checklist for Native App Setup

Before performing a device resolution operation for the first time, verify that you've completed the following tasks to set up an identity resolution native app in Snowflake:

Note

For instructions on completing these tasks, see "Set Up a Native App".

  • Accepted the terms and conditions of the application.

  • Installed the appropriate LiveRamp Snowflake native application.

  • Updated the firewall configuration for authentication calls.

  • Created and granted permissions to a role that gives the App Share database access to the appropriate tables.

  • Created logging and metrics tables, added them to a share, and shared that share back to LiveRamp.

After these tasks have been completed, you are ready to prepare the tables and perform the operation. See the sections below for more information.

Prepare the Tables for Identity Resolution

Performing identity resolution requires two tables:

  • A metadata table, indicated in the code as {{resolution_meta_table}}.

    Note

    A metadata table can be reused for multiple operations, but a separate metadata table must be prepared for each different identity resolution operation you want to perform. For example, if you’re going to perform identity resolution on both MAIDs and hashed emails, you’ll need a different metadata table for each operation.

  • An input table, indicated in the code as {{resolution_input_table}}.

    Note

    An input table needs to be prepared for each identity resolution operation.

To create the input and metadata tables:

  1. Click in the Create input and metadata table procedure.

  2. Click Run.

  3. Make any necessary changes to the variables:

    {{SNOWFLAKE_CONSUMER_INPUT_DATABASE}}

    {{SNOWFLAKE_CONSUMER_SCHEMA}}

    {{resolution_input_table}}

    {{resolution_meta_table}}

You can create these tables inside Snowflake or import the tables into your database using Snowflake’s standard methods. The {{...}} variables may be substituted with your own values. Be sure to reference the names correctly in the metadata table (which has as its default name {{resolution_meta_table}}) and that the column names also match up correctly.

When creating tables, keep the following guidelines in mind:

  • Every column name must be unique in a table.

  • Try not to use additional columns in the input tables required for the identity resolution operation.

  • Having extra columns slows down processing.

Table Naming Guidelines

When naming tables, follow these guidelines:

  • Table names must use ASCII characters and not contain either spaces or special characters such as !@#$%.

  • You can use underscores “_” within the name, but not as the initial character.

  • Consider using the following elements in your table names: type of data or description, a date or timestamp, and an identity designation.

    For example, the table name Identity_TwoButtonSuitsCampaign_impressions_2022-06-01 contains all three element types.

Metadata Table Columns and Descriptions

A metadata table can be reused for multiple operations, but a separate metadata table must be prepared for each different identity resolution operation you want to perform. For example, if you’re going to perform identity resolution on both MAIDs and hashed emails, you’ll need a different metadata table for each operation.

Metadata column names must match those shown in the table below. The column names are not case sensitive, and should not be enclosed in single or double quotation marks.

Column

Description

CLIENT_ID

Enter either an existing CLIENT_ID or a new one provided in implementation.

CLIENT_SECRET

Password/secret for the CLIENT_ID.

EXECUTION_MODE

Resolution.

EXECUTION_TYPE

Options include: 

  • Cookies

  • MAID

  • CTV

  • Email

  • HHLink (to resolve individual RampIDs into household RampIDs)

Note

Currently, different identifiers have to be separated into separate input tables and only one option above can be chosen for each metadata table.

TARGET_COLUMN

Enter the column name for the input table column which contains the IDs to be resolved.

LIMIT

For email resolution only, you can specify the maximum number of RampIDs to be returned for each hashed email address.

Enter an integer between 1 and 15 to specify the maximum number of RampID results returned per input identifier (to return only the “best match”, returning 1 RampID is sufficient).

For identifiers other than hashed email, do not include this column.

Table 4. Metadata table columns.


Input Table Columns and Descriptions

The column names for the input table can be whatever you want to use, as long as the column name for the identifiers matches the values specified  in the “TARGET_COLUMN” column of the metadata table.

Column

Sample

Description

Device identifier , hashed email, or RampID

1f4d256c-1f08-41f6-a108-bbe511de9497

Can be one of the following identifiers: cookie, MAID, CTV ID, or SHA-256 hashed email (for resolution to RampID), or maintained RampID (for resolution to household RampID).

Attribute 1

Male

For email address resolution, you can include columns with attribute data. These columns will be returned in the output table.

Any attribute columns included in an input table used for device resolution will not be returned in the output table.

Table 5. Input table columns for resolution.


The output table is created by the operation that you run. For an example, see the sections below.

Formatting Guidelines for Email Address Hashing

Follow these best practices for hashing email addresses:

  • Email addresses should be uppercased prior to hashing

  • Use SHA-256, hex-encoding string to be lowercased, character set UTF-8

Perform Identity Resolution for Device Identifiers

You perform a device resolution operation by running the device resolution procedure and then checking that the output has succeeded. You then open the output table to check the results.

Note

To perform identity resolution for email addresses, see the “Perform Identity Resolution for Email Addresses” section below.

To resolve the device identifiers in the input table:

  1. Select the input table containing the identifiers to resolve in the left-hand Database Objects navigation pane.

  2. Click on the magnifying glass icon tab shown in the figure below.

    Device resolution results.
    Figure 6. Device resolution results.


    Run the resolution operation specific for your database and schema to convert your input table values using the parameters in the metadata table for conversion to RampIDs in the output table.

  3. Locate the lr_resolution_and_trancoding procedure shown below and click on that code block.

    call <native_app_database_name>.lr_app_schema.lr_resolution_and_transcoding
    (
        ‘<consumer_database_name>.<consumer_schema_name>.<INPUT_TABLE>’,
        ‘<consumer_database_name>.<consumer_schema_name>.<META_TABLE>’,
        ‘<OUTPUT_TABLE>’,
        ‘<consumer_database_name>.<consumer_schema_name>.<LOGGING_TABLE>’,
        ‘<consumer_database_name>.<consumer_schema_name>.<METRICS_TABLE>’
    );
  4. Ensure that you have correctly entered the names of the database objects you are using into the procedure above.

  5. Click Run.

    The resolution operation runs to completion.

  6. Open the LiveRamp app worksheet.

  7. Locate the Check for output procedure shown below and click in that block.

    // Check for output
    /*
    Check for output
    */
    call <native_app_database_name>.lr_app_schema.check_for_output(	'<OUTPUT_TABLE>'
    );
    /*
    END SECTION
    */
  8. Click Run.

  9. Wait until Snowflake returns a status message of success or error.

    Once the app returns a success message, the output should be displayed in the native app database under lr_app_schema.

The results end up in the output table in the same database, which is the COOKIES_OUTPUT, as shown in the example in step 2 above. The output table has the following fields: DEVICE_ID (original input), and RAMPID (converted values).

Column

Sample

Description

Device_id

93abc799-a0a5-40b5-80dd-d2ab61d4d072

Original identifier passed in.

Ramp_ID

XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRDXYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMT

Returns the resolved RampID in your domain encoding.

Table 6. Output table columns for device resolution.


Perform Identity Resolution for Email Addresses

The email resolution process allows for translation of a known identifier (SHA-256 hashed email) to a pseudonymous identifier (RampID) while minimizing the risk of re-identification.

The identity resolution process for email addresses operates similarly to device resolution with a key difference in the data output: instead of returning the original identifier mapped to its associated RampID, in email address resolution any attribute columns are returned but the original hashed email addresses are removed from the output.

Note

  • To perform identity resolution for device identifiers, see the “Perform Identity Resolution for Device Identifiers” section above.

  • Data running through email resolution will pass through a privacy filter which removes the email addresses and reswizzles the table, meaning any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.

To resolve email identifiers:

  1. Select the input table containing the emails addresses to resolve in the left-hand Database Objects navigation pane.

  2. Click on the magnifying glass icon tab.

    Run the resolution operation specific for your database and schema to convert your input table values using the parameters in the metadata table for conversion to RampIDs in the output table.

  3. Locate the lr_resolution_and_trancoding procedure shown below and click on that code block.

    call <native_app_database_name>.lr_app_schema.lr_resolution_and_transcoding
    (
        ‘<consumer_database_name>.<consumer_schema_name>.<INPUT_TABLE>’,
        ‘<consumer_database_name>.<consumer_schema_name>.<META_TABLE>’,
        ‘<OUTPUT_TABLE>’,
        ‘<consumer_database_name>.<consumer_schema_name>.<LOGGING_TABLE>’,
        ‘<consumer_database_name>.<consumer_schema_name>.<METRICS_TABLE>’
    );
  4. Ensure that you have correctly entered the names of the database objects you are using into the procedure above.

  5. Click Run.

    The resolution operation runs to completion.

  6. Open the LiveRamp app worksheet.

  7. Locate the Check for output procedure shown below and click in that block.

    /*
    Check for output
    */
    call <native_app_database_name>.lr_app_schema.check_for_output(	'<OUTPUT_TABLE>'
    );
    /*
    
  8. Click Run.

Wait until Snowflake returns a status message of success or error. Once the app returns a success message, the output should be displayed in the native app database under lr_app_schema.

The results end up in the output table in the same database, with the following fields: RAMPID (resolved email data), and ATTRIBUTES (based on other data passed through the service).

Column

Sample

Description

Ramp_ID

XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRDXYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMT

Returns the resolved RampID in the client’s domain.

Attribute 1

Male

Any attribute columns passed through the service are returned.

Privacy Filter

To minimize the risk of re-identification (the ability to tie an email address directly to a RampID), the service includes the following processes:

  • Column Values: The process evaluates the combination of all the column values on a per row basis for unique values. If a particular combination of column values occurs 3 or fewer times, the rows containing those column values will not be matchable and will not be returned in the output table.

  • >5% of the table unmatchable: If, based on column value uniqueness, >5% of the file rows are unmatchable, the job will fail.

  • Number of Unique RampIDs: If fewer than 100 unique RampIDs would be returned, the job will fail.

  • Reswizzle full table: Upon completion, the full table will be reswizzled to return the rows RampID | attribute 1 | attribute 2 | attribute n in a different order than what was submitted in the input table.