Perform Identity Resolution Through ADX

LiveRamp’s Identity Resolution in the Amazon Data Exchange (ADX) allows you to resolve device identifiers or email addresses to RampIDs, LiveRamp’s persistent pseudonymous identifier for persons and households. Identity resolution gives you a more holistic view of your data at the individual or household level. A common use case is resolving device-based exposure logs from DSPs into RampIDs, which drives more accurate insights and analytics.

You can also input an individual-based RampID and get back any household-based RampID that might be associated with that individual.

You can access LiveRamp Identity Resolution within the AWS Marketplace, meaning identity resolution can be performed within AWS. For more information on LiveRamp Identity in ADX, see “LiveRamp Identity in the ADX Marketplace”.

This service leverages LiveRamp’s Identity Graph, connecting fragmented consumer touchpoints to a person-based or household-based view.

The following identifiers can be resolved:

  • Cookies

  • MAIDs (mobile device IDs)

  • CTV IDs (Connected TV Device IDs)

  • CIDs (custom identifiers)

  • Email addresses (SHA-256 hashed)

  • Person-based, maintained RampIDs (for resolution to household RampIDs)

Based on the type of identifier you’re resolving, you might receive one RampID per identifier or multiple RampIDs per identifier. Typically for cookie and mobile device ID resolution, one RampID is returned, given that the devices are not normally shared. Also, when resolving individual RampIDs to household RampIDs, only one household RampID is returned. However, for CTV identifiers it is common to receive multiple individual RampIDs per identifier.

When resolving hashed email addresses, you can choose to receive from 1 to 15 associated RampIDs, if available.

Overall Steps

Before performing identity resolution, you must perform the steps to enable LiveRamp Identity in the ADX Marketplace. For information on performing these steps, see “LiveRamp Identity in the ADX Marketplace”.

After you’ve enabled LiveRamp Identity in ADX, complete the following steps to perform identity resolution:

  1. Format the input data file and load it into your AWS S3 input location.

  2. Initiate identity resolution by calling the LiveRamp Workflows API endpoint.

  3. Initiate output file delivery by calling the LiveRamp Polling API endpoint.

After you initiate file delivery, LiveRamp delivers the resolved output file(s) to the specified S3 output location and associated usage metrics are reported to AWS for billing.

See the sections below for more information on performing these steps.

Format the Input Data File

See the sections below for information on formatting the input data file.

Input File Formatting Guidelines

Identity resolution input data files should be formatted as CSV files. When creating input data files, follow these additional guidelines:

  • Include a header row in the first line of every file. Files cannot be processed without headers.

  • Include only one of the following allowed identifier types per file:

    • Cookies

    • Mobile device IDs (MAIDs)

    • CTV IDs

    • CIDs (custom identifiers)

    • SHA-256 hashed email addresses

    • Individual maintained RampIDs

      Note

      If the input file contains individual RampIDs, those will be resolved to household RampIDs.

  • You can name your columns however you want, but every column name must be unique within a file.

  • The first column in the input file must be the column that contains the identifiers to be resolved.

  • When performing identity resolution on multiple files in one job, make sure the identifier column headers are the same in every file and that they match the value given for the “target_column” parameter in the call to initiate identity resolution.

  • Try not to include additional columns. Having extra columns slows down processing.

    Note

    For device or CID resolution, additional columns (such as attribute data columns) can be included in the input file, but only the input identifiers and RampIDs will be returned in the output file. For email address resolution, any additional columns will be returned in the output file, but the email addresses will be removed and the row order randomized.

  • Formatting device identifiers:

    • Cookies: Do not modify (for example, by changing casing) cookie values.

    • Mobile device IDs:

      • Mobile device IDs should be lowercased and hyphenated. For example: 1f4d256c-1f08-41f6-a108-bbe511de9497

      • Plaintext AAID and IDFA can be included together. LiveRamp can match off of both IDs at the same time as long as they are in plaintext.
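The guidelines above can be sketched in Python for a file of mobile device IDs. This is a minimal illustration, not LiveRamp tooling: the helper names `normalize_maid` and `write_input_csv` are hypothetical, and the `DEVICE_ID` header is only an example that would need to match the "target_column" value you send when starting the job.

```python
import csv
import io
import re

# Hyphenated 8-4-4-4-12 hexadecimal layout, lowercase only.
MAID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def normalize_maid(raw: str) -> str:
    """Lowercase a mobile device ID and confirm that it is hyphenated."""
    maid = raw.strip().lower()
    if not MAID_PATTERN.fullmatch(maid):
        raise ValueError(f"not a hyphenated device ID: {raw!r}")
    return maid


def write_input_csv(maids, target_column: str = "DEVICE_ID") -> str:
    """Return CSV text with a header row and the identifier as the first column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([target_column])  # files cannot be processed without headers
    for raw in maids:
        writer.writerow([normalize_maid(raw)])
    return buffer.getvalue()


csv_text = write_input_csv(["1F4D256C-1F08-41F6-A108-BBE511DE9497"])
print(csv_text)
```

Cookie values should not be passed through a helper like `normalize_maid`, since they must be left unmodified.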

File Format Example

See the table below for an example of how to format an input data file.

| Column | Example | Description |
| --- | --- | --- |
| Device identifier, CID, hashed email, or RampID | 1f4d256c-1f08-41f6-a108-bbe511de9497 | Can be one of the following identifiers: cookie, MAID, CTV ID, CID, or SHA-256 hashed email (for resolution to RampID), or maintained RampID (for resolution to household RampID). |
| Attribute 1 | Male | For email address resolution, you can include columns with attribute data. These columns will be returned in the output file. Any attribute columns included in an input file used for device or CID resolution will not be returned in the output file. |

Formatting Guidelines for Email Address Hashing

Follow these best practices for hashing email addresses:

  • Email addresses should be lowercased prior to hashing.

  • Use SHA-256 with UTF-8 as the character set, and lowercase the resulting hex-encoded string.
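As a rough illustration of the hashing guidelines, the sketch below uses only Python's standard library. It assumes the address is trimmed of surrounding whitespace and normalized to lowercase before hashing, which is the usual convention for hashed emails; the function name `hash_email` is hypothetical, and the casing step is a single line you can adjust if LiveRamp confirms different guidance for your account.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the lowercase hex SHA-256 digest of a normalized email address."""
    # Assumption: trim whitespace and lowercase before hashing.
    normalized = email.strip().lower()
    # UTF-8 character set; hexdigest() already returns lowercase hex.
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


digest = hash_email("  Jane.Doe@Example.com ")
print(digest)
```

Because normalization happens before hashing, `Jane.Doe@Example.com` and `jane.doe@example.com` produce the same 64-character value.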

Initiate Identity Resolution

Once your data files have been prepared and placed into your S3 bucket, initiate the identity resolution process. This is done by making a call to the LiveRamp Workflows ADX API that follows the format of the example curl command shown below.

Note

Only include the "match_limit" parameter for email address resolution, where you want to specify the maximum number of RampID results returned per input identifier (default is “1”).

Https Curl Call Examples

See below for the format of an https curl call:

curl --location --request POST \
'https://<data-exchange-url>/adx/job/start' \
--header 'Content-Type: application/json' \
--header 'x-amzn-dataexchange-data-set-id: <data-set-id>' \
--header 'x-amzn-dataexchange-revision-id: <revision-id>' \
--header 'x-amzn-dataexchange-asset-id: <asset-id>' \
--header 'x-amzn-dataexchange-http-method: POST' \
--data-raw '{
        "httpMethod": "POST",
        "input_s3": "<Input S3 bucket>",
        "file_format": "csv",
        "file_pattern": "<Regex pattern for input files>",
        "workflow_type": "<Resolution type>",
        "workflow_sub_type": "<Resolution sub type>",
        "target_column": "<Identifier column header>",
        "client_id": "<Client ID>",
        "client_secret": "<Client Secret>",
        "cross_region": "true",
        "match_limit": "<# of RampIDs returned>"
}'

See below for an example of what a populated https curl call might look like:

curl --location --request POST \
'https://<data-exchange-url>/adx/job/start' \
--header 'Content-Type: application/json' \
--header 'x-amzn-dataexchange-data-set-id: <data-set-id>' \
--header 'x-amzn-dataexchange-revision-id: <revision-id>' \
--header 'x-amzn-dataexchange-asset-id: <asset-id>' \
--header 'x-amzn-dataexchange-http-method: POST' \
--data-raw '{
        "httpMethod": "POST",
        "input_s3": "s3://my-input-bucket-name",
        "file_format": "csv",
        "file_pattern": "resolution_input_2.*[.]csv",
        "workflow_type": "resolution",
        "workflow_sub_type": "CTV",
        "target_column": "DEVICE_ID",
        "client_id": "my-client-id",
        "client_secret": "my-client-secret",
        "cross_region": "true"
}'

AWS CLI Call Examples

See below for the format of an AWS CLI call:

aws dataexchange send-api-asset \
  --data-set-id <data-set-id> \
  --revision-id <revision-id> \
  --asset-id <asset-id> \
  --request-headers 'x-api-key=XXXX-XXXX-XXXX-XXX-<client_id>' \
  --method POST \
  --path "/adx/job/start" \
  --body "{\"input_s3\": \"<Input S3 bucket>\", \"file_format\": \"csv\", \"file_pattern\": \"<Regex pattern for input files>\", \"workflow_type\": \"resolution\", \"workflow_sub_type\": \"<Resolution sub type>\", \"target_column\": \"<Identifier column header>\", \"client_id\": \"<Client ID>\", \"client_secret\": \"<Client Secret>\", \"cross_region\": \"true\" }"

See below for an example of what a populated AWS CLI call might look like:

aws dataexchange send-api-asset \
  --data-set-id <data-set-id> \
  --revision-id <revision-id> \
  --asset-id <asset-id> \
  --method POST \
  --request-headers 'x-api-key=XXXX-XXXX-XXXX-XXX-<client_id>' \
  --path "/adx/job/start" \
  --body "{\"input_s3\": \"s3://my-input-bucket-name\", \"file_format\": \"csv\", \"file_pattern\": \"resolution_input_2.*[.]csv\", \"workflow_type\": \"resolution\", \"workflow_sub_type\": \"email\", \"target_column\": \"device_id\", \"client_id\": \"my-client-id\", \"client_secret\": \"my-client-secret\", \"cross_region\": \"true\", \"match_limit\": \"1\" }"
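Hand-escaping the JSON passed to `--body` is error-prone, so it can help to generate the request body and the CLI command programmatically. The Python sketch below is one way to do that under stated assumptions: the helper names `build_start_body` and `build_cli_command` are hypothetical, the field names are taken from the calls above, and the "match_limit" checks (email only, 1 to 15) simply mirror the rules described on this page rather than any server-side validation.

```python
import json
import shlex


def build_start_body(input_s3, file_pattern, sub_type, target_column,
                     client_id, client_secret, cross_region=True,
                     match_limit=None):
    """Assemble the JSON body for the /adx/job/start call."""
    body = {
        "input_s3": input_s3,
        "file_format": "csv",
        "file_pattern": file_pattern,
        "workflow_type": "resolution",
        "workflow_sub_type": sub_type,
        "target_column": target_column,
        "client_id": client_id,
        "client_secret": client_secret,
        "cross_region": "true" if cross_region else "false",
    }
    if match_limit is not None:
        # match_limit is documented for email address resolution only.
        if sub_type.lower() != "email":
            raise ValueError("match_limit applies to email resolution only")
        if not 1 <= match_limit <= 15:
            raise ValueError("match_limit must be between 1 and 15")
        body["match_limit"] = str(match_limit)
    return body


def build_cli_command(body, data_set_id, revision_id, asset_id,
                      request_headers=None, path="/adx/job/start"):
    """Return a shell-safe `aws dataexchange send-api-asset` command string."""
    args = ["aws", "dataexchange", "send-api-asset",
            "--data-set-id", data_set_id,
            "--revision-id", revision_id,
            "--asset-id", asset_id,
            "--method", "POST",
            "--path", path]
    if request_headers:
        args += ["--request-headers", request_headers]
    args += ["--body", json.dumps(body)]
    return shlex.join(args)  # quotes every argument for the shell


body = build_start_body(
    input_s3="s3://my-input-bucket-name",
    file_pattern="resolution_input_2.*[.]csv",
    sub_type="email",
    target_column="device_id",
    client_id="my-client-id",
    client_secret="my-client-secret",
    match_limit=1,
)
command = build_cli_command(body, "<data-set-id>", "<revision-id>", "<asset-id>")
print(command)
```

Letting `json.dumps` and `shlex.join` handle the quoting avoids the unbalanced `\"` sequences that are easy to introduce when editing the body by hand.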

Example Responses

The following is an example of a response for a successful job submission:

{
   "Job ID": "9863C6588358503285051D4F0BC83_AWS_US_EAST_1",
   "Status": "ADX Start job submitted"
}

In addition to the response received for a successful job submission, you might get one of the following responses in the status parameter:

  • "ADX Start Job Lambda function failed to locate the S3 bucket region"

  • "ADX Start Job Lambda function failed to process the request for Job ID"

  • "ADX Start Job Lambda function failed to extract AWS Canonical ID from S3 bucket for Job ID"

  • "ADX API received an error response while authenticating for Job ID "

  • "ADX API failed to fetch an auth token for Job ID "
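A caller typically needs the Job ID from a successful submission and should stop on any other status. The following Python sketch shows one way to handle that; the function name `parse_start_response` is hypothetical, and it assumes that failure responses carry their message in the same "Status" field as the success example above.

```python
import json

SUBMITTED_STATUS = "ADX Start job submitted"


def parse_start_response(raw: str) -> str:
    """Return the Job ID if the job was submitted, otherwise raise."""
    payload = json.loads(raw)
    status = payload.get("Status", "")
    if status != SUBMITTED_STATUS:
        # Any other status (for example, an authentication failure) is an error.
        raise RuntimeError(f"job was not submitted: {status}")
    return payload["Job ID"]


example = """{
   "Job ID": "9863C6588358503285051D4F0BC83_AWS_US_EAST_1",
   "Status": "ADX Start job submitted"
}"""
job_id = parse_start_response(example)
print(job_id)
```

The returned value is what you would pass as "job_id" in the poll request.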

Initiate Output File Delivery

Once you’ve initiated the identity resolution process, you must make a poll job request to initiate the delivery of the output file to the output S3 bucket after processing is complete. This is done by making a call that follows the format of the example curl command shown below: 

Note

It is recommended that polling be done programmatically at recurring intervals until the processing is complete and the output file has been delivered.

Https Curl Call Examples

See below for the format of an https curl call:

curl --location --request POST \
'https://<data-exchange-url>/adx/job/poll' \
--header 'Content-Type: application/json' \
--header 'x-amzn-dataexchange-data-set-id: <data-set-id>' \
--header 'x-amzn-dataexchange-revision-id: <revision-id>' \
--header 'x-amzn-dataexchange-asset-id: <asset-id>' \
--header 'x-amzn-dataexchange-http-method: POST' \
--data-raw '{
        "httpMethod": "POST",
        "job_id": "<Job ID>",
        "output_s3": "<Output S3 bucket>",
        "file_format": "csv",
        "client_id": "<Client ID>",
        "client_secret": "<Client Secret>"
}'

See below for an example of what a populated https curl call might look like:

curl --location --request POST \
'https://<data-exchange-url>/adx/job/poll' \
--header 'Content-Type: application/json' \
--header 'x-amzn-dataexchange-data-set-id: <data-set-id>' \
--header 'x-amzn-dataexchange-revision-id: <revision-id>' \
--header 'x-amzn-dataexchange-asset-id: <asset-id>' \
--header 'x-amzn-dataexchange-http-method: POST' \
--data-raw '{
        "httpMethod": "POST",
        "job_id": "JOB_ID_123",
        "output_s3": "s3://<my-output-bucket-name>",
        "aws_key_id": "<AWS Key ID>",
        "aws_secret_key": "<AWS Secret Key>",
        "file_format": "csv",
        "client_id": "<my-client-id>",
        "client_secret": "<my-client-secret>"
}'

AWS CLI Call Examples

See below for the format of an AWS CLI call:

aws dataexchange send-api-asset \
 --data-set-id <data-set-id> \
 --revision-id <revision-id> \
 --asset-id <asset-id> \
 --method POST \
 --request-headers 'x-api-key=XXXX-XXXX-XXXX-XXX-<client_id>' \
 --path "/adx/job/poll" \
 --body "{\"job_id\": \"<Job ID>\", \"output_s3\": \"<Output S3 bucket>\", \"file_format\": \"csv\", \"client_id\": \"<Client ID>\", \"client_secret\": \"<Client Secret>\", \"cross_region\": \"true\" }"

See below for an example of what a populated AWS CLI call might look like:

aws dataexchange send-api-asset \
 --data-set-id <data-set-id> \
 --revision-id <revision-id> \
 --asset-id <asset-id> \
 --method POST \
 --request-headers 'x-api-key=XXXX-XXXX-XXXX-XXX-<client_id>' \
 --path "/adx/job/poll" \
 --body "{\"job_id\": \"JOB_ID_123\", \"output_s3\": \"s3://<my-output-bucket-name>\", \"file_format\": \"csv\", \"client_id\": \"<my-client-id>\", \"client_secret\": \"<my-client-secret>\", \"cross_region\": \"true\" }"

Example Responses

The following is an example of a response when processing is complete:

{
   "Job ID": "9863C6588358503285051D4F0BC83_AWS_US_EAST_1",
   "Status": "Output results uploaded to AWS S3 bucket"
}

In addition to the response received when processing is complete, you might get one of the following responses in the status parameter:

  • 'DONE': 'ADX Poll job started for delivering output results. Re-poll later for updated status'

  • 'DELIVERING': 'Upload to AWS S3 in progress. Re-poll later or wait for the delivery notification'

  • 'ERROR': 'Cannot poll job because of error. Please contact support'

  • 'ALERT': 'Cannot poll job because of delay. Please contact support'

  • 'INVALID': 'Cannot poll job because of invalid job id. Please validate input'

  • 'EXCEPTION': 'Cannot poll job because of an exception. Please contact support'

  • 'UNKNOWN': 'Cannot poll job because the start job workflow was not executed. Please contact support'

  • 'DEFAULT': 'Cannot poll job because of an unknown error. Please contact support'
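Since polling is meant to be repeated until delivery finishes, a small loop can decide when to retry and when to stop. The Python sketch below is an illustration only. It assumes the "Status" field contains the message text listed above, treats any message containing "Re-poll later" as retryable, and treats every other non-complete message as a failure; adjust this if the service returns other in-progress messages. `poll_once` is a hypothetical callable that you would implement to send the poll request and return the parsed JSON response, and the interval and attempt limits are arbitrary defaults.

```python
import time

COMPLETE_STATUS = "Output results uploaded to AWS S3 bucket"


def classify_status(status: str) -> str:
    """Map a poll status message to 'complete', 'retry', or 'failed'."""
    if status == COMPLETE_STATUS:
        return "complete"
    if "Re-poll later" in status:
        return "retry"
    return "failed"


def poll_until_delivered(poll_once, interval_seconds=300, max_attempts=48,
                         sleep=time.sleep):
    """Call poll_once() at a fixed interval until delivery completes or fails."""
    for attempt in range(max_attempts):
        response = poll_once()
        outcome = classify_status(response.get("Status", ""))
        if outcome == "complete":
            return response
        if outcome == "failed":
            raise RuntimeError(f"polling stopped: {response.get('Status')}")
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError("output was not delivered within the allotted attempts")
```

Passing `sleep` as a parameter keeps the loop easy to test without waiting, for example `poll_until_delivered(fake_poll, sleep=lambda seconds: None)`.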

API Parameters

See the tables below for a list of the API header parameters and request parameters.

Authorization Parameters

| Authorization Parameter | Data Type | Description |
| --- | --- | --- |
| AccessKey | string | IAM Access Key of the subscribed AWS account. |
| SecretKey | string | IAM Secret Key of the subscribed AWS account. |
| AWS Region | string | AWS region where the product was subscribed. |
| Service Name | string | "dataexchange" |
| Session Token | string | Session token of the subscribed AWS account. |

Header Parameters

| Header Parameter | Data Type | Description |
| --- | --- | --- |
| data-set-id | string | Your AWS-provided Data set ID. |
| revision-id | string | Your AWS-provided Revision ID. |
| asset-id | string | Your AWS-provided Asset ID. |
| aws-authorization | string | For information on finding the AWS-provided parameters, see this AWS article. |

Request Parameters

| Request Parameter | Description |
| --- | --- |
| client_id | Either an existing LiveRamp client ID (if you already have Identity API credentials) or a new one provided by LiveRamp. |
| client_secret | The password or secret for the LiveRamp client_id: either an existing one (if you already have Identity API credentials) or a new one provided by LiveRamp. |
| workflow_type | “resolution” for all identity resolution processes. |
| workflow_sub_type | The type of identifiers being resolved. Options include: Cookies; MAID; CTV; CID (provided to you by LiveRamp and specific to the mapping created); Email; HHLink (to resolve individual RampIDs into household RampIDs). |
| input_s3 | S3 directory for input files. |
| output_s3 | S3 directory for output files. |
| file_format | Specifies the format for input files. The accepted file format is CSV. |
| file_pattern | Regex pattern for input files. For example, the pattern ‘input_2.*[.]csv’ would result in the processing of the following files: input_20.csv, input_221.csv, and input_225.csv. |
| target_column | The column header name for the input field that contains the IDs to be resolved. For example: “DEVICE_ID”. |
| cross_region | “true” or “false”. If “true”, workloads are processed in the default region (us-east-1) if the target region is unavailable. If “false”, workloads are not processed in the default region if the target region is unavailable, and a status message to enable cross region is returned to the caller. |
| match_limit | For email resolution only, an integer between 1 and 15 that sets the maximum number of RampID results returned per input identifier (to return only the “best match”, returning 1 RampID is sufficient). The default is “1”. |
| job_id | For polling requests, enter the Job ID returned in the response for the call to initiate identity resolution. The Job ID consists of a unique ID plus your AWS region name. |
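Before submitting a job, it can be useful to check locally which file names a "file_pattern" value would select. The Python sketch below is an approximation: it uses Python's `re` module and assumes the pattern must match the whole file name, which may differ from the regex engine and matching rules the service actually applies. The function name `files_matching` and the extra file names are hypothetical.

```python
import re


def files_matching(pattern: str, filenames):
    """Return the file names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in filenames if compiled.fullmatch(name)]


candidates = [
    "input_20.csv",
    "input_221.csv",
    "input_225.csv",
    "input_10.csv",       # does not start with "input_2"
    "input_20.csv.bak",   # does not end with ".csv"
]
selected = files_matching("input_2.*[.]csv", candidates)
print(selected)  # ['input_20.csv', 'input_221.csv', 'input_225.csv']
```

Writing the dot as `[.]` keeps it a literal character, so it does not act as the "any character" wildcard.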

Identity Resolution Output

The output file(s) from the identity resolution process will be compressed and then written to the specified S3 bucket provided in the poll job request.

The file naming convention for the output file will be "<JOB_ID>_0_0_0.csv.gz"

The Job ID will be a unique ID plus your AWS region name.

Ex: 17697C67E98D4702BEB4ED7B3B0FA_AWS_US_EAST_1_0_0_0.csv.gz
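The naming convention can be expressed as a small Python helper, for example to know which object to look for in the output bucket. This is a sketch under assumptions: the function names are hypothetical, and the region parsing relies on the `_AWS_` separator seen in the example Job IDs on this page, which is an inference rather than a documented format.

```python
def expected_output_name(job_id: str) -> str:
    """Build the documented output file name for a Job ID."""
    return f"{job_id}_0_0_0.csv.gz"


def region_from_job_id(job_id: str) -> str:
    """Extract the AWS region name, assuming '<unique ID>_AWS_<REGION>'."""
    _, marker, region = job_id.partition("_AWS_")
    if not marker or not region:
        raise ValueError(f"unexpected Job ID format: {job_id!r}")
    return region.lower().replace("_", "-")


job = "17697C67E98D4702BEB4ED7B3B0FA_AWS_US_EAST_1"
print(expected_output_name(job))
print(region_from_job_id(job))
```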

Output File for Device Resolution

The output file for device resolution will follow the format shown in the table below.

| Column | Example | Description |
| --- | --- | --- |
| Device identifier or RampID | 1f4d256c-1f08-41f6-a108-bbe511de9497 | The original identifier included in the input file. |
| RampID | XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD | For input files containing device identifiers, the RampID associated with the device identifier. For input files containing individual RampIDs, the household RampIDs associated with those individual RampIDs. Note: If multiple RampIDs are associated with a device identifier, multiple lines will be created in the output file. |
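Because one identifier can appear on several output lines (common for CTV IDs), downstream code usually groups the rows by identifier. The Python sketch below reads a gzip-compressed output file from memory and builds that grouping. It assumes the output has a header row and that the first two columns are the original identifier and the RampID, in that order; the function name `group_rampids` and the sample values are hypothetical.

```python
import csv
import gzip
import io
from collections import defaultdict


def group_rampids(gz_bytes: bytes) -> dict:
    """Map each input identifier to the list of RampIDs returned for it."""
    grouped = defaultdict(list)
    with gzip.open(io.BytesIO(gz_bytes), mode="rt",
                   encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # assumption: the first line is a header row
        for row in reader:
            if len(row) >= 2:
                grouped[row[0]].append(row[1])
    return dict(grouped)


# Hypothetical sample standing in for a downloaded "<JOB_ID>_0_0_0.csv.gz" file.
sample = gzip.compress(
    b"DEVICE_ID,RampID\nctv-1,XY1\nctv-1,XY2\nctv-2,XY3\n"
)
print(group_rampids(sample))
```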

Output File for CID Resolution

The output file for CID resolution will follow the format shown in the table below.

| Column | Example | Description |
| --- | --- | --- |
| CID_ID | 93abc799-a0a5-40b5-80dd-d2ab61d4d072 | The original identifier included in the input file. |
| RampID | XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD | The resolved RampID in your domain encoding. |

Output File for Email Address Resolution

The output file for email address resolution will follow the format shown in the table below.

| Column | Example | Description |
| --- | --- | --- |
| RampID | XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD | The RampID associated with the email address. Note: If multiple RampIDs are associated with an email address, multiple lines will be created in the output file. |
| Attribute 1 | Male | The original attribute columns included in the input file. |

Privacy Filters

To minimize the risk of re-identification (the ability to tie an email address directly to a RampID), the service includes the following processes:

  • Column Values: The process evaluates the combination of all column values in each row for uniqueness. If a particular combination of column values occurs 3 or fewer times, the rows containing that combination will not be matchable and will not be returned in the output table.

  • More than 5% of the table unmatchable: If, based on column value uniqueness, more than 5% of the file rows are unmatchable, the job will fail.

  • Number of Unique RampIDs: If fewer than 100 unique RampIDs would be returned, the job will fail.

  • Reswizzle full file: Upon completion, the full file will be reswizzled (reordered) so that the rows (RampID | attribute 1 | attribute 2 | attribute n) are returned in a different order than they were submitted in the input file.
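The first two filters can be approximated locally before you submit an email resolution file, which may help avoid a failed job. The Python sketch below counts how often each combination of attribute values occurs and estimates the share of rows that would be unmatchable. It assumes the uniqueness check applies to the attribute columns only (not the hashed email column), and the function names are hypothetical. The third filter (fewer than 100 unique RampIDs) depends on the resolution results and cannot be checked in advance.

```python
from collections import Counter

MIN_COMBINATION_COUNT = 4     # combinations seen 3 or fewer times are unmatchable
MAX_UNMATCHABLE_SHARE = 0.05  # the job fails above 5% unmatchable rows


def unmatchable_share(attribute_rows) -> float:
    """Return the fraction of rows whose attribute combination is too rare."""
    rows = [tuple(row) for row in attribute_rows]
    if not rows:
        return 0.0
    counts = Counter(rows)
    unmatchable = sum(n for n in counts.values() if n < MIN_COMBINATION_COUNT)
    return unmatchable / len(rows)


def likely_to_pass(attribute_rows) -> bool:
    """Estimate whether the file stays at or below the 5% unmatchable limit."""
    return unmatchable_share(attribute_rows) <= MAX_UNMATCHABLE_SHARE


# 97 rows share one combination; 3 rows share a rare one (3% unmatchable).
rows = [("Male", "25-34")] * 97 + [("Female", "65+")] * 3
print(unmatchable_share(rows), likely_to_pass(rows))
```

If the estimate exceeds the limit, coarsening the attribute values (for example, wider age bands) is one way to reduce the number of rare combinations.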