Perform Identity Resolution Through ADX

LiveRamp’s Identity Resolution in AWS Data Exchange (ADX) allows you to resolve personally identifiable information (PII) or device identifiers to RampIDs, LiveRamp’s persistent pseudonymous identifiers for persons and households. Identity resolution gives you a more holistic view of your data at the individual or household level.

Note

This article contains information on performing identity resolution with LiveRamp’s Identity offerings through ADX standalone. If you plan to perform identity resolution using AWS Entity Resolution, see “Perform Identity Resolution Using AWS Entity Resolution”.
For more information about RampIDs, see "RampID Methodology".

You can also input an individual-based RampID and get back any household-based RampID that might be associated with that individual.

You can access LiveRamp Identity Resolution within the AWS Marketplace, meaning identity resolution can be performed within AWS. For more information on LiveRamp Identity in ADX, see “LiveRamp Identity in the ADX Marketplace”.

This service leverages LiveRamp’s Identity Graph, connecting fragmented consumer touchpoints to a person or household-based view.

The following identifiers can be resolved:

Names
Postal addresses
Email addresses
Phone numbers
Cookies
MAIDs (mobile device IDs)
CTV IDs (Connected TV Device IDs)
CIDs (custom identifiers)
Person-based, maintained RampIDs (for resolution to household RampIDs)

Based on the type of identifier you’re resolving, you might receive one RampID per identifier or multiple RampIDs per identifier:

For PII touchpoints (in PII or email resolution), you can choose to receive from 1 to 10 RampIDs (if available).
For cookie and mobile device ID resolution, typically one RampID is returned (but shared touchpoints with more than one RampID can exist).
For CTV identifiers, it is common to receive multiple individual RampIDs per identifier.
For existing CID syncs, the number of RampIDs will vary.
When resolving individual RampIDs to household RampIDs, only one household RampID is returned.
When resolving a universe dataset with deconfliction, multiple touchpoints can be used and RampIDs will be returned that are ranked most relevant based on how active the linkages are in the digital ecosystem and other factors. This can be particularly useful for large datasets where overconnected RampIDs can introduce noise into analytics. For more information, see the “Deconfliction Options” section below.

Deconfliction Options

Deconfliction refers to an optional identity resolution configuration that optimizes the fit of the LiveRamp graph to your definition of a customer, which is particularly beneficial for universe datasets.

Deconfliction attempts to reduce the number of “conflicting” RampIDs for each individual record in a universe dataset. A RampID is considered conflicting if the same RampID is linked to the record for another user (such as a family member) in your system. Deconfliction attempts to include only the RampIDs that are most relevant to an individual record and reduce the number of “shared” RampIDs.

To use deconfliction, you must provide a CID (custom ID) for each record. For the configuration to deliver the most value, provide your full universe so that deconfliction can be applied across the entire table. The output you receive will include the best fidelity RampIDs for each CID.

There are four deconfliction options you can choose from:

Standard: This configuration returns the RampIDs that are determined to be most relevant and removes CIDs that are determined to be duplicates (based on linking to the same RampID). This is the default option and covers most advertiser use cases.
Maximized first-party fidelity: This configuration returns the RampIDs that are determined to be most relevant but preserves additional CIDs even if there are RampIDs that indicate LiveRamp could consolidate them. This option is ideal for publishers and other data owners.
Household expansion: This configuration allows clients with a household-based CID to resolve and deconflict with individual RampIDs that represent the full household. This option is ideal for retailers with household-based loyalty programs, as an example.
Event integrity: This configuration preserves all CIDs, meaning identity conflicts are minimized but not eliminated. This option is ideal for CIDs that represent transaction data, as an example.

For more information on deconfliction, see “Using Deconfliction on a Universe Dataset”.

Overall Steps

Before you can perform identity resolution, you must perform the steps to enable LiveRamp Identity in the ADX Marketplace. For information on performing these steps, see “LiveRamp Identity in the ADX Marketplace”.

After you’ve enabled LiveRamp Identity in ADX, complete the following steps to perform identity resolution:

Note

To avoid errors, you might want to verify that your setup has been performed correctly before performing the operation. For more information, see the "Checklist to Verify Your Setup for LiveRamp Identity in ADX" section below.

Format the appropriate input data file(s) and load them into your AWS S3 input location.
Note
An input data file needs to be prepared for each standard identity resolution operation and can only contain one type of identifier. When resolving a full universe dataset with deconfliction, you can use multiple input data files and/or a file that includes multiple identifiers to be processed in one call. For more information, see the “File Format for Universe Dataset Resolution with Deconfliction” section below.
Initiate identity resolution by calling the LiveRamp Workflows API endpoint.
Initiate output file delivery by calling the LiveRamp Polling API endpoint.

After you initiate file delivery, LiveRamp delivers the resolved output file(s) to the specified S3 output location and associated usage metrics are reported to AWS for billing.

See the sections below for more information on performing these steps.

Checklist to Verify Your Setup for LiveRamp Identity in ADX

To avoid errors, use the checklists in the sections below to verify that all the necessary setup steps have been successfully performed before executing an operation.

AWS Region Alignment

Region in Contract: Confirm that the AWS region you provided to LiveRamp during contract execution is consistent with your actual AWS services.
AWS CLI Region Check: Run `aws configure get region` to verify the AWS region for the IAM user or profile you're using.

Note

The ADX offer will be made and accepted in US-East-2, and API calls will be made from US-East-2.
The region you provide for your contract is where your buckets need to be. As long as your buckets are in US-East-1, your job (the compute on LiveRamp's end) will run in US-East-1 and will not incur cross-region data transfer costs. For bucket regions other than US-East-1 and US-West-2, you need to set the cross_region parameter to "true" in your request, and the job will run in US-East-1.

IAM User and Permissioning

IAM User for ADX: Confirm that there is an IAM user configured specifically for ADX operations.
ADX Permissioning: Confirm that the IAM user has the required permissions for starting and polling jobs in ADX.
S3 Bucket Permissioning: Confirm that the IAM user has been granted read and write permissions for both the input and output S3 buckets.

S3 Bucket Setup

Input Bucket Configuration: Confirm that there is an S3 bucket exclusively dedicated for input files for LiveRamp processing.
Output Bucket Configuration: Confirm that there is a separate S3 bucket dedicated for output files from LiveRamp processing.
Bucket Policy Verification: Confirm that the bucket policies for the input and output buckets are aligned with LiveRamp's required permissions.
Bucket Accessibility Test: Execute `aws s3 ls s3://<input-bucket-name>` and `aws s3 ls s3://<output-bucket-name>` to verify IAM user access to the buckets.

Format the Input Data File

An input data file needs to be prepared for each identity resolution operation, other than for resolving a universe dataset. For universe dataset resolution, one or multiple data files can be used in the process. For more information, see the sections below.

Input File Formatting Guidelines

Identity resolution input data files should be formatted as CSV files. When creating input data files, follow these additional guidelines:

Include a header row in the first line of every file. Files cannot be processed without headers.
Unless you’re resolving a universe dataset with deconfliction, include only one input data file per operation and include only one of the following allowed identifier types per file:
- PII
  - Names
  - Postal addresses
  - Plaintext email addresses
  - Phone numbers
- Hashed email addresses (SHA256, MD5, or SHA1)
- Cookies
- Mobile device IDs (MAIDs)
- CTV IDs
- CIDs (custom identifiers)
- Individual maintained RampIDs
  Note
  If the input file contains individual RampIDs, the RampIDs will be resolved to household RampIDs.
If you’re resolving a universe dataset with deconfliction, you can include one or multiple files per job. You must include CIDs (custom identifiers) in each file and can include one or more of the following allowed identifier types per file:
- PII
  - Names
  - Postal addresses
  - Plaintext email addresses
  - Phone numbers
- Hashed email addresses (SHA256, MD5, or SHA1)
- Mobile device IDs (MAIDs)
You can name your columns however you want, but every column name must be unique in a table.
Column names must be alphanumeric (other than underscores) and start with a letter.
Do not use spaces in column names. Use underscores.
The first column(s) in the input file must be the column(s) that contain the identifiers to be resolved.
When performing identity resolution on multiple files in one job, make sure the identifier column headers are the same in every file and that they match the value given for the “target_column” parameter in the call to initiate identity resolution.
Try not to include additional columns. Having extra columns slows down processing.
Note
For device or CID resolution, or for workflows that involve deconfliction, additional columns (such as attribute data columns) can be included in the input file, but only the input identifiers and RampIDs will be returned in the output file. For PII or email resolution, any additional columns will be returned in the output file, but the identifiers will be removed and the row order randomized.
Formatting device identifiers:
- Cookies: Do not modify (for example, by changing casing) cookie values.
- Mobile device IDs:
  - Mobile device IDs should be downcased and hyphenated. For example: 1f4d256c-1f08-41f6-a108-bbe511de9497
  - AAID and IDFA mobile device IDs can be included together because LiveRamp can match on both plaintext mobile device ID types in the same file.
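
The column-naming guidelines above can be checked before a file is uploaded. The following is a minimal sketch, not a LiveRamp tool: the function name `check_header` is hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and uniqueness is assumed to be case-sensitive.

```python
import csv
import io
import re

# Assumption: "alphanumeric" means ASCII letters and digits; underscores are
# also allowed, and the name must start with a letter.
COLUMN_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def check_header(csv_text):
    """Return a list of problems found in the header row of a CSV input file."""
    header = next(csv.reader(io.StringIO(csv_text)), None)
    if not header:
        return ["file has no header row"]
    problems = []
    seen = set()
    for name in header:
        if not COLUMN_NAME.fullmatch(name):
            problems.append(f"invalid column name: {name!r}")
        if name in seen:
            problems.append(f"duplicate column name: {name!r}")
        seen.add(name)
    return problems


print(check_header("device_identifier,attribute_1\nabc,x\n"))
print(check_header("first name,1st,cid,cid\n"))
```

The check only looks at the first line, so it cannot tell whether that line is really a header or a data row; it catches empty files and badly named or repeated columns.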

File Format for PII Resolution

The standard PII resolution process passes the data through a privacy filter, which removes the PII and randomizes the row order of the table. Because of this, any attributes you need to keep associated with the identifier must be included in the input table. For more information, see the "Privacy Filter" section below.

Note

Utilizing hashed attributes requires a LiveRamp Data Ethics review and an attestation. We will also work with your team to confirm the separation of known and pseudonymous data prior to enabling permissions.

These column names cannot be used in the input file for PII resolution:

RampID
__lr_rank
__lr_filter_name

See the table below for a list of the suggested input file columns and descriptions for PII resolution.

Suggested Column Name	Example	Notes
`first_name`	John	You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as "Name").
`last_name`	Doe	You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as "Name").
`address_1`	123 Main St
`address_2`	Apt 1	You can include separate Address 1 and Address 2 columns or you can combine all street address information in one column (such as "Address").
`city`	Smalltown	When matching on address, City is optional.
`state`	CA	When matching on address, State is optional. If including State, must be a two-character, capitalized abbreviation ("CA", not "California" or "Ca").
`zip`	12345	Required when matching on addresses. Can be in 5-digit format or 9-digit format (ZIP+4).
`email`	john@email.com	Plaintext emails only. Only one email per input row is permitted. Other emails must be dropped or included in an additional row. If you include an additional row, repeat the values for the name fields for the best match rates. All emails must meet these requirements: (1) have characters before and after the "@" sign; (2) contain a period character ("."); (3) have characters after the period character. Examples of valid emails include: a@a.com, A@A.COM, email@account.com, EMAIL@ACCOUNT.COM, email@sub.domain.com, EMAIL@SUB.DOMAIN.COM.
`phone`	555-123-4567	Plaintext phone numbers only. Only one phone number per input row is permitted. Other phone numbers must be dropped or included in an additional row. If you include an additional row, repeat the values for the name fields for the best match rates. All phone numbers must meet these requirements: (1) can be more than 10 characters only if the leading numbers beyond 10 characters are “0” or “1”; (2) if no leading numbers are used, must be 10 characters long; (3) can contain hyphens ("-"), parentheses ("(" or ")"), plus signs ("+"), and periods ("."). Examples of valid phone numbers include: 8668533267, 866.853.3267, (866) 853-3267, +1 (866) 853-3267, +18668533267, 18668533267, 1111111118668533267, 08668533267. Examples of invalid phone numbers include: 987654321 (fewer than 10 characters), 98765432109 (more than 10 characters), 1234567890 (after removing the leading "1", fewer than 10 characters remain), 0987654321 (after removing the leading "0", fewer than 10 characters remain).
`attribute_1`	Gender	For PII resolution, you can include columns with attribute data. These columns will be returned in the output file (for more information, see the "Output File for PII Resolution" section below). If you specify that an attribute column should be hashed, it will appear in the output table with a prefix of "hashed_". The input table must not include a column with the same name as the name of the hashed column in the output table.
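
The email and phone requirements in the table above can be pre-checked before upload. The sketch below is one interpretation that agrees with every valid and invalid example listed in the table; it is not LiveRamp's own validation logic. The function names are hypothetical, the period is assumed to belong in the domain part of the email, and whitespace in phone numbers is assumed to be ignorable because the valid examples contain spaces.

```python
import re

# Punctuation the table allows in phone numbers. Whitespace is stripped as
# well, which is an assumption based on the valid examples containing spaces.
PHONE_SEPARATORS = re.compile(r"[\s\-().+]")


def is_valid_email(email):
    """Check for text before and after "@", a period, and text after the period."""
    local, at, domain = email.partition("@")
    if not (local and at and domain):
        return False
    # Assumption: the required period is the last one in the domain part.
    _, dot, after_dot = domain.rpartition(".")
    return bool(dot and after_dot)


def is_valid_phone(phone):
    """Strip allowed punctuation and leading 0s/1s, then require 10 digits."""
    digits = PHONE_SEPARATORS.sub("", phone)
    if not re.fullmatch(r"[0-9]+", digits):
        return False
    return len(digits.lstrip("01")) == 10


print(is_valid_phone("+1 (866) 853-3267"))
print(is_valid_phone("1234567890"))
print(is_valid_email("email@sub.domain.com"))
```

Stripping every leading "0" and "1" before counting digits is what makes 1234567890 invalid while 1111111118668533267 stays valid, matching the examples in the table.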

File Format for Email-Only Resolution

The standard email-only resolution process operates similarly to PII resolution. Any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.

Note

When resolving email data only, using email-only resolution can provide higher throughput compared to full PII resolution. Talk with your LiveRamp team to determine the best approach for your use case.
To perform identity resolution across additional PII touchpoints, see the “File Format for PII Resolution” section above.

See the table below for a list of the suggested input table columns and descriptions for email-only resolution.

Suggested Column Name	Example	Description
`hashed_email`	8c9775a5999b5f0088008c0b26d7fe8549d5c80b0047784996a26946abac0cef	SHA-256, MD5, and SHA-1 hashed emails accepted. Email addresses should be lowercased and UTF-8 encoded prior to hashing. After hashing, convert the resulting hash into lowercase hexadecimal representation.
`attribute_1`	Male	For email address resolution, you can include columns with attribute data. These columns will be returned in the output table (for more information, see the "Privacy Filter" section below).
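
The hashing steps described for `hashed_email` (lowercase, UTF-8 encode, hash, lowercase hexadecimal output) can be sketched with Python's standard `hashlib` module. The function name `hash_email` is hypothetical, and the sketch applies only the steps stated in the table; it does not trim whitespace or perform any other normalization.

```python
import hashlib

# The three hash types the table lists as accepted.
ACCEPTED_ALGORITHMS = ("sha256", "md5", "sha1")


def hash_email(email, algorithm="sha256"):
    """Lowercase the email, encode it as UTF-8, and return a lowercase hex digest."""
    if algorithm not in ACCEPTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    normalized = email.lower().encode("utf-8")
    # hexdigest() already returns lowercase hexadecimal characters.
    return hashlib.new(algorithm, normalized).hexdigest()


print(hash_email("John@Email.com"))
print(hash_email("John@Email.com", "md5"))
```

Because the address is lowercased before hashing, "John@Email.com" and "john@email.com" produce the same digest.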

File Format for Device ID Resolution

The device ID resolution operation can be used for the following purposes:

To translate device identifiers (cookies, MAIDs, and CTV IDs) into individual RampIDs
To translate individual RampIDs into their associated household RampIDs

See the tables below for a list of the suggested input file columns and descriptions for these device ID resolution options.

Note

Each device ID resolution input file should contain only one identifier column (either a device identifier or a maintained RampID).
You can include columns with attribute data, but these columns will not be returned in the output file.

See the table below for a list of the suggested input file columns and descriptions for translating device identifiers.

Suggested Column Name	Example	Description
`device_identifier`	1f4d256c-1f08-41f6-a108-bbe511de9497	Can be one of the following identifiers: Cookie MAID CTV ID
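
When the `device_identifier` column holds MAIDs, the formatting guideline (downcased and hyphenated) can be applied with Python's standard `uuid` module. This is a sketch that assumes the MAIDs are UUID-shaped, as in the example value above; the function name `normalize_maid` is hypothetical. It should not be applied to cookies, which must not be modified.

```python
import uuid


def normalize_maid(maid):
    """Return a UUID-shaped MAID in lowercase, hyphenated 8-4-4-4-12 form."""
    # uuid.UUID accepts the 32 hex digits with or without hyphens and raises
    # ValueError for anything else; str() renders lowercase with hyphens.
    return str(uuid.UUID(maid.strip()))


print(normalize_maid("1F4D256C-1F08-41F6-A108-BBE511DE9497"))
print(normalize_maid("1F4D256C1F0841F6A108BBE511DE9497"))
```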

See the table below for a list of the suggested input table columns and descriptions for translating individual RampIDs into their associated household RampIDs.

Suggested Column Name	Example	Description
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	The RampID for translation to a household RampID. Must be a maintained RampID (only maintained RampIDs have an associated household RampID).

File Format for CID Matching

This process enables the retrieval of an existing CID to RampID mapping, hosted by LiveRamp.

See the table below for a list of the suggested input file columns and descriptions for CID matching.

Note

You can include columns with attribute data, but these columns will not be returned in the output file.

Suggested Column Name	Example	Description
`cid`	b916clarib la1;blNj10gtQjQ3QUEwMTNEMTcaktboEc0g9022cxoiaklr20185	The CID for translation to a RampID

File Format for Universe Dataset Resolution with Deconfliction

This process allows for the resolution of a universe dataset with a deconfliction configuration, minimizing conflicts across the entire dataset. The output from this process is a deconflicted hashed CID to RampID mapping.

When resolving a universe dataset, you can use one input data file or multiple input data files. Each input data file can include one identifier type or multiple identifier types (including PII, hashed email, or MAIDs). The examples below can be used for the specific situations listed but you can create an input data file that uses any combination of these identifiers.

Note

You can include columns with attribute data, but these columns will not be returned in the output file.

The requirements for each identifier type (listed in the relevant sections above) apply here. For example, any PII-based records must include zip in order to match on address, and hashed emails must be lowercased and UTF-8 encoded prior to hashing.

See the table below for a list of the suggested input data file columns for a job that will contain both hashed emails and MAIDs with deconfliction.

Suggested Column Name	Example	Description
`cid`	g221lariab la8;blNj10gtQjQ3QUEwMTNEMTcaktboEc0g9022cxoiaklr91054	A custom identifier that represents an individual in the dataset. Data format is UTF-8 compliant alphanumeric string of up to 256 characters
`hashed_email`	8c9775a5999b5f0088008c0b26d7fe8549d5c80b0047784996a26946abac0cef	SHA-256, MD5, and SHA-1 hashed emails accepted. Email addresses should be lowercased and UTF-8 encoded prior to hashing. After hashing, convert the resulting hash into lowercase hexadecimal representation.
`device_identifier`	1f4d256c-1f08-41f6-a108-bbe511de9497	MAIDs are the only supported device identifier at this time

See the table below for a list of the suggested input data file columns for a job that will contain plaintext PII:

Suggested Column Name	Example	Notes
`cid`	g221lariab la8;blNj10gtQjQ3QUEwMTNEMTcaktboEc0g9022cxoiaklr91054	A custom identifier that represents an individual in the dataset. Data format is UTF-8 compliant alphanumeric string of up to 256 characters.
`first_name`	John	You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as "Name").
`last_name`	Doe	You can include separate First Name and Last Name columns or you can combine first name and last name in one column (such as "Name").
`address_1`	123 Main St
`address_2`	Apt 1	You can include separate Address 1 and Address 2 columns or you can combine all street address information in one column (such as "Address").
`city`	Smalltown	When matching on address, City is optional.
`state`	CA	When matching on address, State is optional. If including State, must be a two-character, capitalized abbreviation ("CA", not "California" or "Ca").
`zip`	12345	Required when matching on addresses. Can be in 5-digit format or 9-digit format (ZIP+4).
`email`	john@email.com	Plaintext emails only. Only one email per input row is permitted. Other emails must be dropped or included in an additional row. If you include an additional row, repeat the values for the name fields for the best match rates. All emails must meet these requirements: (1) have characters before and after the "@" sign; (2) contain a period character ("."); (3) have characters after the period character. Examples of valid emails include: a@a.com, A@A.COM, email@account.com, EMAIL@ACCOUNT.COM, email@sub.domain.com, EMAIL@SUB.DOMAIN.COM.
`phone`	555-123-4567	Plaintext phone numbers only. Only one phone number per input row is permitted. Other phone numbers must be dropped or included in an additional row. If you include an additional row, repeat the values for the name fields for the best match rates. All phone numbers must meet these requirements: (1) can be more than 10 characters only if the leading numbers beyond 10 characters are “0” or “1”; (2) if no leading numbers are used, must be 10 characters long; (3) can contain hyphens ("-"), parentheses ("(" or ")"), plus signs ("+"), and periods ("."). Examples of valid phone numbers include: 8668533267, 866.853.3267, (866) 853-3267, +1 (866) 853-3267, +18668533267, 18668533267, 1111111118668533267, 08668533267. Examples of invalid phone numbers include: 987654321 (fewer than 10 characters), 98765432109 (more than 10 characters), 1234567890 (after removing the leading "1", fewer than 10 characters remain), 0987654321 (after removing the leading "0", fewer than 10 characters remain).

Initiate Identity Resolution

Once your data files have been prepared and placed into your S3 bucket, initiate the identity resolution process. This is done by making a call with the AWS CLI to the LiveRamp Workflows ADX API that follows the format of the examples shown below.

Note

For information on the parameters to include in the call, see the “API Parameters” section below.
Only include the match_limit parameter for PII or email resolution (without deconfliction configured), where you want to specify the maximum number of RampID results returned per input identifier (default is “1”).
Only include the input_columns parameter for email, PII, or CID mapping (universe dataset) resolution. This is where you will specify which columns are the target_column(s) and any attribute columns you wish to pass through to the output table.
Only include the target_columns parameter for PII or CID mapping (universe dataset) resolution. Use this parameter to specify the target PII columns to use for identity resolution. When using the target_columns parameter, do not include the target_column parameter.
Note
For execution types other than PII or CID mapping (universe dataset) resolution, use the target_column parameter instead of target_columns.
For information on troubleshooting errors that might occur when performing calls, see "Troubleshoot Calls in ADX".

Once you've received a successful response, make a poll job request to initiate the delivery of the output file to the output S3 bucket (for more information, see the "Initiate Output File Delivery" section below).

AWS CLI Calls to Initiate Identity Resolution

See below for the format of an AWS CLI call to initiate identity resolution (other than when calling to initiate resolution of a universe dataset with multiple files where each file contains different identifiers):

Note

When making a call to initiate resolution of a universe dataset with multiple input files where each file contains different identifiers, see the example below this one.

aws dataexchange send-api-asset \
  --data-set-id <data-set-id> \
  --revision-id <revision-id> \
  --asset-id <asset-id> \
  --method POST \
  --region us-east-2 \
  --path "/adx/job/start" \
    --body '{
                "input_s3": "<Input S3 bucket>",
                "file_format": "csv",
                "file_pattern": "<Regex pattern for input files>[.]csv",
                "workflow_type": "resolution",
                "workflow_sub_type": "<Resolution sub type>",
                "target_column": "<Identifier column header>",
                "client_id": "<Client ID>",
                "client_secret": "<Client secret>",
                "input_columns": {<"Column name": "Column type">},
                "cross_region": "true"
            }'
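
Building the `--body` value programmatically avoids hand-editing JSON. The following is a minimal sketch that uses only the fields shown in the call format above; the function name `build_start_body` is hypothetical, and the sketch does not cover parameters such as match_limit that are described in the "API Parameters" section.

```python
import json


def build_start_body(input_s3, file_pattern, sub_type, client_id, client_secret,
                     target_column=None, target_columns=None, input_columns=None):
    """Return a JSON string for --body, using the fields from the call format."""
    # The notes above say to send target_column or target_columns, never both.
    if (target_column is None) == (target_columns is None):
        raise ValueError("provide exactly one of target_column or target_columns")
    body = {
        "input_s3": input_s3,
        "file_format": "csv",
        "file_pattern": file_pattern,
        "workflow_type": "resolution",
        "workflow_sub_type": sub_type,
        "client_id": client_id,
        "client_secret": client_secret,
        "cross_region": "true",
    }
    if target_column is not None:
        body["target_column"] = target_column
    else:
        body["target_columns"] = target_columns
    if input_columns is not None:
        body["input_columns"] = input_columns
    return json.dumps(body, indent=2)


# Placeholder values only; replace them with your own bucket and credentials.
email_body = build_start_body(
    input_s3="s3://my-input-bucket-name",
    file_pattern="resolution_input_2.*[.]csv",
    sub_type="EMAIL",
    client_id="my-client-id",
    client_secret="my-client-secret",
    target_column="hashed_email",
    input_columns={"hashed_email": "text", "gender": "text"},
)
print(email_body)
```

The printed string can be passed as the `--body` argument of `aws dataexchange send-api-asset`.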

See below for the format of an AWS CLI call to initiate the resolution of a universe dataset with multiple input files where each file contains different identifiers:

Note

When making a call other than to initiate resolution of a universe dataset with multiple input files where each file contains different identifiers, see the example above.

aws dataexchange send-api-asset \
  --data-set-id <data-set-id> \
  --revision-id <revision-id> \
  --asset-id <asset-id> \
  --method POST \
  --region us-east-2 \
  --path "/adx/job/start" \
  --body '{
            "workflow_type": "resolution",
            "workflow_sub_type": "cid_mapping",
            "client_id": "<Client ID>",
            "client_secret": "<Client secret>",
            "cross_region": "true",
            "config": {"deconflictionConfig": "<standard, maximized_1P_fidelity, hh_expansion, or event_integrity>"},
            "inputs": [
              {
                "input_s3": "<Input S3 bucket>",
                "file_format": "csv",
                "file_pattern": "<Regex pattern for input files>[.]csv",
                "input_columns": {<"Column name": "Column type">},
                "target_columns": {<"key": "Column name">}
              },
              {
                "input_s3": "<Input S3 bucket>",
                "file_format": "csv",
                "file_pattern": "<Regex pattern for input files>[.]csv",
                "input_columns": {<"Column name": "Column type">},
                "target_columns": {<"key": "Column name">}
              }
            ]
          }'
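
A nested body like the one above is easy to break by hand (for example, with a missing comma between "config" and "inputs"), so generating it can help. The sketch below follows the call format above and is not an official client: the function name `build_universe_body` is hypothetical, the "cid", "email", and "phone" keys in target_columns are taken from the populated examples in this article, and the per-file "cid" check reflects the requirement that every file include CIDs. Confirm the exact value format of deconflictionConfig in the "API Parameters" section.

```python
import json

# Option names taken from the deconflictionConfig placeholder in the call format.
DECONFLICTION_OPTIONS = ("standard", "maximized_1P_fidelity", "hh_expansion", "event_integrity")


def build_universe_body(inputs, deconfliction, client_id, client_secret):
    """Return a JSON string for a multi-file universe resolution request."""
    if deconfliction not in DECONFLICTION_OPTIONS:
        raise ValueError(f"unknown deconfliction option: {deconfliction!r}")
    for entry in inputs:
        # Assumption: the CID column is declared under the "cid" key.
        if "cid" not in entry["target_columns"]:
            raise ValueError(f"no cid column declared for {entry['file_pattern']}")
    body = {
        "workflow_type": "resolution",
        "workflow_sub_type": "cid_mapping",
        "client_id": client_id,
        "client_secret": client_secret,
        "cross_region": "true",
        "config": {"deconflictionConfig": deconfliction},
        "inputs": [dict(entry, file_format="csv") for entry in inputs],
    }
    return json.dumps(body, indent=2)


# Placeholder buckets, file patterns, and column names for illustration only.
universe_body = build_universe_body(
    inputs=[
        {"input_s3": "s3://my-input-bucket-name", "file_pattern": "emails[.]csv",
         "input_columns": {"CID": "text", "EMAIL": "text"},
         "target_columns": {"cid": "CID", "email": "EMAIL"}},
        {"input_s3": "s3://my-input-bucket-name", "file_pattern": "phones[.]csv",
         "input_columns": {"CID": "text", "PHONE": "text"},
         "target_columns": {"cid": "CID", "phone": "PHONE"}},
    ],
    deconfliction="standard",
    client_id="my-client-id",
    client_secret="my-client-secret",
)
print(universe_body)
```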

See below for examples of what a populated AWS CLI call to initiate identity resolution might look like.

Examples of AWS CLI Calls

See below for examples of AWS CLI calls.

PII Resolution Call Example

aws dataexchange send-api-asset \
    --data-set-id <data-set-id> \
    --revision-id <revision-id> \
    --asset-id <asset-id> \
    --method POST \
    --region us-east-2 \
    --path "/adx/job/start" \
    --body '{
                "input_s3": "s3://my-input-bucket-name",
                "file_format": "csv",
                "file_pattern": "pii_input[.]csv",
                "workflow_type": "resolution",
                "workflow_sub_type" : "PII",
                "target_columns": {
                                    "name": ["FIRSTNAME", "LASTNAME"],
                                    "address": ["ADDRESSLINE", "ADDRESSLINE2"],
                                    "city": "CITY",
                                    "state": "STATE",
                                    "zip": "ZIPCODE",
                                    "email": "EMAIL",
                                    "phone": "PHONE",
                                    "hashedAttributes": ["CID"]
                                  },
                "client_id": "my-client-id",
                "client_secret": "my-client-secret",
                "input_columns": {
                                    "FIRSTNAME": "text",
                                    "LASTNAME": "text",
                                    "ADDRESSLINE": "text",
                                    "ADDRESSLINE2": "text",
                                    "CITY": "text",
                                    "STATE": "text",
                                    "ZIPCODE": "text",
                                    "EMAIL": "text",
                                    "PHONE": "text",
                                    "CID": "text",
                                    "LIKES_DOGS": "text"
                                 },
                "cross_region": "true"
            }'

Email Resolution Call Example

aws dataexchange send-api-asset \
    --data-set-id <data-set-id> \
    --revision-id <revision-id> \
    --asset-id <asset-id> \
    --method POST \
    --region us-east-2 \
    --path "/adx/job/start" \
    --body '{
                "input_s3": "s3://my-input-bucket-name",
                "file_format": "csv",
                "file_pattern": "resolution_input_2.*[.]csv",
                "workflow_type": "resolution",
                "workflow_sub_type": "EMAIL",
                "target_column": "hashed_email",
                "client_id": "my-client-id",
                "client_secret": "my-client-secret",
                "input_columns": {
                                    "hashed_email": "text",
                                    "gender": "text"
                                 },
                "cross_region": "true"
            }'

CTV ID Resolution Call Example

aws dataexchange send-api-asset \
    --data-set-id <data-set-id> \
    --revision-id <revision-id> \
    --asset-id <asset-id> \
    --method POST \
    --region us-east-2 \
    --path "/adx/job/start" \
    --body '{
                "input_s3": "s3://my-input-bucket-name",
                "file_format": "csv",
                "file_pattern": "ctv_input[.]csv",
                "workflow_type": "resolution",
                "workflow_sub_type": "CTV",
                "target_column": "my_ctv_column",
                "client_id": "my-client-id",
                "client_secret": "my-client-secret",
                "cross_region": "true"
            }'

Universe Dataset Resolution Call Example (With Deconfliction)

aws dataexchange send-api-asset \
    --data-set-id <data-set-id> \
    --revision-id <revision-id> \
    --asset-id <asset-id> \
    --method POST \
    --region us-east-2 \
    --path "/adx/job/start" \
    --body '{
                "input_s3": "s3://my-input-bucket-name",
                "file_format": "csv",
                "file_pattern": "cid_mapping_input[.]csv",
                "workflow_type": "resolution",
                "workflow_sub_type": "CID_MAPPING",
                "inputs": [{
                    "target_columns": {
                                        "name": ["FIRSTNAME", "LASTNAME"],
                                        "address": ["ADDRESSLINE", "ADDRESSLINE2"],
                                        "city": "CITY",
                                        "state": "STATE",
                                        "zip": "ZIPCODE",
                                        "email": "EMAIL",
                                        "phone": "PHONE",
                                        "cid": "CID"
                                      },
                    "input_columns": {
                                        "FIRSTNAME": "text",
                                        "LASTNAME": "text",
                                        "ADDRESSLINE": "text",
                                        "ADDRESSLINE2": "text",
                                        "CITY": "text",
                                        "STATE": "text",
                                        "ZIPCODE": "text",
                                        "EMAIL": "text",
                                        "PHONE": "text",
                                        "LIKES_DOGS": "text",
                                        "CID": "text"
                                     }
                }],
                "client_id": "my-client-id",
                "client_secret": "my-client-secret",
                "cross_region": "true",
                "config": {"deconflictionConfig": ["STANDARD"]}
            }'

Example Responses for Calls to Initiate Resolution

The following is an example of a response for a successful job submission for a call to initiate identity resolution:

{
    "ResponseHeaders": {
        "Content-Type": "application/json",
        "Content-Length": "97",
        ...
    },
    "Body": "{\"Job ID\": \"E660EC80F3BF4473A120D3CAC890CADC_AWS_US_EAST_1\", \"Status\": \"ADX Start job submitted\"}"
}

Use the Job ID in the poll job request to initiate the delivery of the output file (for more information, see the "Initiate Output File Delivery" section below).
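
In the response shown above, "Body" is itself a JSON-encoded string, so the Job ID has to be extracted in two parsing steps. The following is a small sketch using Python's standard `json` module; the function name `extract_job_id` is hypothetical, and the sample response is trimmed to the fields shown in the example.

```python
import json


def extract_job_id(cli_output):
    """Parse the CLI response, then parse its Body string, and return the Job ID."""
    response = json.loads(cli_output)
    body = json.loads(response["Body"])  # Body is a JSON document inside a string
    return body["Job ID"]


# A sample response built from the example above (headers shortened).
sample_output = json.dumps({
    "ResponseHeaders": {"Content-Type": "application/json"},
    "Body": json.dumps({
        "Job ID": "E660EC80F3BF4473A120D3CAC890CADC_AWS_US_EAST_1",
        "Status": "ADX Start job submitted",
    }),
})
print(extract_job_id(sample_output))
```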

Note

For information on troubleshooting errors that might occur when performing calls, see "Troubleshoot Calls in ADX".

Initiate Output File Delivery

Once you’ve initiated the identity resolution process, you must make a poll job request to initiate the delivery of the output file to the output S3 bucket after processing is complete. One of the parameters you'll need to make that call is the Job ID that was included in the response to the call to initiate identity resolution.

Note

For information on the parameters to include in the call, see the “API Parameters” section below.
It is recommended that polling be done programmatically at recurring intervals until the processing is complete and the output file has been delivered.
For information on troubleshooting errors that might occur when performing calls, see "Troubleshoot Calls in ADX".
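
One way to poll programmatically is sketched below in Python. It is an illustration only: send_poll is a hypothetical callable that you would supply to issue the "/adx/job/poll" request (for example, by wrapping the AWS CLI call) and return the response as a dictionary. The interval and attempt limit are assumed values, not LiveRamp recommendations, and the delivered status text is taken from the example responses in this article.

```python
import json
import time
from typing import Callable

# Status text taken from the example responses in this article.
DELIVERED_STATUS = "Output results uploaded to AWS S3 bucket"


def poll_until_delivered(
    send_poll: Callable[[], dict],
    interval_seconds: float = 300.0,   # assumed interval, adjust as needed
    max_attempts: int = 48,            # assumed limit, adjust as needed
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send_poll() until the delivered status appears, then return it.

    send_poll is a hypothetical callable that issues the poll request and
    returns a dict whose "Body" value is a JSON-encoded string.
    """
    for attempt in range(1, max_attempts + 1):
        response = send_poll()
        status = json.loads(response["Body"]).get("Status", "")
        if DELIVERED_STATUS in status:
            return status
        if attempt < max_attempts:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError(f"Output not delivered after {max_attempts} polls")
```

The sketch treats every status other than the delivered one as "still in progress". If your calls can return error statuses, you would likely want to stop polling when one is received.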

AWS CLI Calls to Initiate Delivery

See below for the format of an AWS CLI call used to initiate output file delivery:

aws dataexchange send-api-asset \
    --data-set-id <data-set-id> \
    --revision-id <revision-id> \
    --asset-id <asset-id> \
    --method POST \
    --region us-east-2 \
    --path "/adx/job/poll" \
    --body '{
            "job_id": "<Job ID>",
            "output_s3": "<Output S3 bucket>",
            "file_format": "csv",
            "client_id": "<Client ID>",
            "client_secret": "<Client secret>",
            "cross_region": "true"
            }'

See below for an example of what a populated AWS CLI call used to initiate output file delivery might look like:

aws dataexchange send-api-asset \
    --data-set-id <data-set-id> \
    --revision-id <revision-id> \
    --asset-id <asset-id> \
    --method POST \
    --region us-east-2 \
    --path "/adx/job/poll" \
    --body '{
            "job_id": "JOB_ID_123",
            "output_s3": "s3://my-output-bucket",
            "file_format": "csv",
            "client_id": "my-client-id",
            "client_secret": "my-client-secret",
            "cross_region": "true"
            }'

Example Responses for Calls to Initiate Delivery

The following is an example of a response when processing is complete:

{
    "ResponseHeaders": {
        "Content-Type": "application/json",
        "Content-Length": "158",
        ...
    },
    "Body": "{\"Job ID\": \"E660EC80F3BF4473A120D3CAC890CADC_AWS_US_EAST_1\", \"Status\": \"ADX Poll job started for delivering output results. Re-poll later for updated status\"}"
}

In addition to the response received when processing is complete, you might get one of the following responses in the status parameter:

'Upload to AWS S3 in progress. Re-poll later or wait for the delivery notification'
'Output results uploaded to AWS S3 bucket'

Note

For information on troubleshooting errors that might occur when performing calls, see "Troubleshoot Calls in ADX".

API Parameters

See the tables below for a list of the API header parameters and request parameters.

Header Parameters

Header Parameter	Data Type	Description
data-set-id	string	Your AWS-provided Data set ID.
revision-id	string	Your AWS-provided Revision ID.
asset-id	string	Your AWS-provided Asset ID.

For information on finding the AWS-provided parameters, see this AWS article.

Request Parameters

Request Parameter	Description
client_id	Either an existing LiveRamp client ID (if you already have Identity API credentials) or a new one provided by LiveRamp
client_secret	The password/secret for the LiveRamp client_id: either an existing password/secret (if you already have Identity API credentials) or a new one provided by LiveRamp.
workflow_type	“resolution” for all identity resolution processes
workflow_sub_type	The type of identifiers being resolved. Options include: `PII`, `Email`, `Cookies`, `MAID`, `CTV`, `CID` (provided to you by LiveRamp and specific to an existing mapping), `HHLink` (to resolve individual RampIDs into household RampIDs), and `CID_MAPPING` (to resolve a universe dataset with deconfliction). Note: For all workflow subtypes except `CID_MAPPING`, each identifier type has to be separated into its own input data file and only one of the options above can be chosen for each operation. For the `CID_MAPPING` workflow subtype (for universe dataset resolution), mixed job types can be used within a single operation.
input_s3	S3 directory for input files.
output_s3	S3 directory for output files.
file_format	Specifies the format for input files. The accepted file format is "CSV".
file_pattern	Regex pattern for input files. For example, the pattern "input_2.*[.]csv" would result in the processing of the following files: input_20.csv, input_221.csv, input_225.csv.
target_column	The column header name for the input field that contains the IDs to be resolved. Ex: “DEVICE_ID”. Note: For PII resolution (with or without deconfliction), leave this parameter out.
input_columns	For PII or email workflows, this includes the target column and any attribute columns you want to pass through into the output table. For example: "input_columns": { "hashed_email": "text", "gender": "text", "last_car": "text" } Note: For PII resolution, these column names cannot be used in the input table: `RampID`, `__lr_rank`, `__lr_filter_name`.
target_columns	A subset of input_columns used in PII resolution jobs. These are the PII elements that will be resolved to create the output RampIDs. "target_columns": { "name": ["name"], "streetAddress": ["address"], "zipCode": "zip", "phone": "phone", "email": "email", "hashedAttributes": ["cid"] } Note Do not include the name of attribute columns you want to pass through unhashed in this parameter. Use the `input_columns` parameter to include those attribute columns.
inputs	For a deconflicted universe dataset run with multiple types of files, the `input_s3`, `file_format`, `file_pattern`, `input_columns` and `target_columns` for each type of file should be specified in the inputs array, as shown below. "inputs": [ { "input_s3": "<Input S3 bucket>", "file_format": "csv", "file_pattern": "<Regex pattern for input files>[.]csv", "input_columns": { <"Column name": "Column type"> }, "target_columns": {<"key": "Column name">} }, { "input_s3": "<Input S3 bucket>", "file_format": "csv", "file_pattern": "<Regex pattern for input files>[.]csv", "input_columns": { <"Column name": "Column type"> }, "target_columns": {<"key": "Column name">} } ]
deconfliction_config	Specify the "config" request parameter as an object containing the type of deconfliction you want to execute, for example {"deconflictionConfig": "STANDARD"} (for more information, see the "Deconfliction Options" section above). Accepted values include: `standard`: This configuration returns the RampIDs that are determined to be most relevant and removes CIDs that are determined to be duplicates (based on linking to the same RampID). This is the default option and covers most advertiser use cases. `maximized_1P_fidelity`: This configuration returns the RampIDs that are determined to be most relevant but preserves additional CIDs even if there are RampIDs that indicate LiveRamp could consolidate them. This option is ideal for publishers and other data owners. `hh_expansion`: This configuration allows clients with a household-based CID to resolve and deconflict with individual RampIDs that represent the full household. This option is ideal for retailers with household-based loyalty programs, as an example. `event_integrity`: This configuration preserves all CIDs, meaning identity conflicts are minimized but not eliminated. This option is ideal for CIDs that represent transaction data, as an example.
cross_region	“true” or “false”. If “true”, then workloads are processed in the default region (us-east-1) if the target region is unavailable. If “false”, then workloads are not processed in the default region if the target region is unavailable and a status message to enable cross region is returned to the caller.
match_limit	For PII or email resolution only (without deconfliction), specify an integer from 1 to 10 to set the maximum number of RampIDs returned per input identifier (to return only the “best match”, a value of 1 is sufficient). The default is “1”.
job_id	For polling requests, enter the Job ID returned in the response for the call to initiate identity resolution. The Job ID consists of a unique ID plus your AWS region name.
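
To preview which input files a given file_pattern value would select, you can test the pattern locally before submitting a job. The Python sketch below uses the example pattern from the table above. It assumes the pattern must match the entire file name (the table does not state how the service anchors the pattern), and the function name files_matching and the extra file names are hypothetical.

```python
import re


def files_matching(file_pattern: str, file_names: list[str]) -> list[str]:
    """Return the file names that the whole pattern matches.

    Assumption: the pattern has to match the full file name.
    """
    compiled = re.compile(file_pattern)
    return [name for name in file_names if compiled.fullmatch(name)]


# The first three names come from the table above; the last two are
# made-up names that the pattern should not select.
candidates = [
    "input_20.csv",
    "input_221.csv",
    "input_225.csv",
    "input_19.csv",
    "input_20.txt",
]

print(files_matching(r"input_2.*[.]csv", candidates))
# prints ['input_20.csv', 'input_221.csv', 'input_225.csv']
```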

View Identity Resolution Output

The output file(s) from the identity resolution process will be compressed and then written to the specified S3 bucket provided in the poll job request.

The file naming convention for the output file will be "<JOB_ID>_0_0_0.csv.gz"

The Job ID will be a unique ID plus your AWS region name.

Ex: 17697C67E98D4702BEB4ED7B3B0FA_AWS_US_EAST_1_0_0_0.csv.gz
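
The naming convention above can be used to locate and read an output file once it has been downloaded from the S3 bucket. The following Python sketch builds the expected file name from a Job ID and reads a local copy of the compressed file. It assumes the file is comma-delimited, UTF-8 encoded, and has a header row. Both function names are hypothetical.

```python
import csv
import gzip


def expected_output_name(job_id: str) -> str:
    """Build the output file name from the Job ID, per the convention above."""
    return f"{job_id}_0_0_0.csv.gz"


def read_output_rows(path: str) -> list[dict]:
    """Read a downloaded, gzip-compressed output file into a list of dicts.

    Assumes a header row, comma delimiters, and UTF-8 encoding.
    """
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


print(expected_output_name("17697C67E98D4702BEB4ED7B3B0FA_AWS_US_EAST_1"))
# prints 17697C67E98D4702BEB4ED7B3B0FA_AWS_US_EAST_1_0_0_0.csv.gz
```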

Output File for PII Resolution

The standard PII resolution process passes the input table through a privacy filter which removes the PII and reswizzles the table (in addition to other operations). Because of this, any attributes you need to keep associated with the identifier need to be included in the input table. For more information, see the "Privacy Filter" section below.

Identity resolution of PII provides supplemental match metadata for additional insight into customer data that can provide powerful signals for making decisions based on RampIDs.

For PII resolution, the output table includes the fields shown in the table below.

Column	Sample	Description
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	Returns the resolved RampID in your domain encoding.
`attribute_1`	Male	Any attribute columns passed through the service are returned.
`hashed_cid`	63889cfb9d3cbe05d1bd2be5cc9953fd	Any hashed attribute columns passed through the service are returned with their values MD5 hashed.
`__lr_rank`	1	Provides insight on the match cascade level associated with the identifiers. If no maintained RampID is found, this value will be "null".
`__lr_filter_name`	name_phone	Returns the filter name where the match occurred, which will be one of the following options: `name_address_zip` `name_email` `name_phone` `partial_name_email` `partial_name_phone` `strict_name` (name + zip) `email` `phone` `last_name_address` If no maintained RampID is found, this value will be "null".

Output File for Email Address Resolution

For email resolution, the output table includes the fields shown in the table below.

Column	Example	Description
`RampID`	XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD	The RampID associated with the email address. Note If multiple RampIDs are associated with an email address, multiple lines will be created in the output file.
`attribute_1`	Male	The original attribute columns included in the input file.

Output File for Device ID Resolution

The output file for device ID resolution will follow the format shown in the table below.

Column	Example	Description
`device_identifier` OR `RampID`	1f4d256c-1f08-41f6-a108-bbe511de9497	The original identifier included in the input file.
`RampID`	XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD	For input files containing device identifiers, the RampID associated with the device identifier. For input files containing individual RampIDs, the household RampIDs associated with those individual RampIDs. Note: If multiple RampIDs are associated with a device identifier, multiple lines will be created in the output file.

Output File for CID Matching

The output file for CID matching will follow the format shown in the table below.

Column	Example	Description
`cid`	93abc799-a0a5-40b5-80dd-d2ab61d4d072	The original identifier included in the input file.
`RampID`	XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD	The resolved RampID in your domain encoding.

Output File for Universe Dataset Resolution

The output file for universe dataset resolution will follow the format shown in the table below.

Column	Example	Description
`hashed_cid`	93abc799-a0a5-40b5-80dd-d2ab61d4d072	The CID passed into the process will be returned as an MD5-hashed CID.
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	Returns the resolved and deconflicted RampID in your domain encoding.

For household expansion deconfliction configurations, additional metadata is provided to indicate RampIDs returned that are part of the same household, but not necessarily the same person-based identifier from the PII input. The output table for this configuration type includes the fields as shown in the table below.

Column	Example	Description
`hashed_cid`	93abc799-a0a5-40b5-80dd-d2ab61d4d072	The CID passed into the process will be returned as an MD5-hashed CID.
`RampID`	XYT999wXyWPB1SgpMUKlpzA013UaLEz2lg0wFAr1PWK7FMhsd	Returns the resolved and deconflicted RampID in your domain encoding.
`is_appended`	TRUE	Returns the value “TRUE” for RampIDs appended because they belong to the same household but are not directly associated with the PII input. Returns the value “FALSE” if the RampID represents the individual associated with the PII input.

Privacy Filter

To minimize the risk of re-identification (the ability to tie PII directly to a RampID), the service includes the following processes when resolving PII identifiers (PII resolution or email-only resolution):

Column Values: The process evaluates each column value on a per-row basis for unique values. If any attribute value occurs 3 or fewer times, the rows containing that value will not be matchable and will not be returned in the output table.
Note
This check does not apply to hashed attributes.
>5% of the table unmatchable: If, based on column value uniqueness, more than 5% of the file rows are unmatchable, the job will fail.
Number of Unique RampIDs: If fewer than 100 unique RampIDs would be returned, the job will fail.
Reswizzle full table: Upon completion, the full table will be reswizzled so that the rows (RampID | attribute_1 | attribute_2 | attribute_n) are returned in a different order than they were submitted in the input table.
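
The first two checks in the list above can be approximated locally before you submit a file. The Python sketch below is an illustration based only on the rules as described in this article, not LiveRamp's implementation: it counts how often each attribute value occurs, flags rows containing a value that occurs 3 or fewer times, and reports whether more than 5% of rows would be flagged. The function name and the returned keys are hypothetical, and hashed attribute columns should be left out of attribute_columns because the check does not apply to them.

```python
from collections import Counter

MIN_OCCURRENCES = 4            # values seen 3 or fewer times are flagged
MAX_UNMATCHABLE_SHARE = 0.05   # more than 5% flagged rows suggests job failure


def precheck_attributes(rows: list[dict], attribute_columns: list[str]) -> dict:
    """Estimate how many rows the column-value check would make unmatchable.

    rows: input rows as dicts; attribute_columns: unhashed attribute columns.
    """
    # Count occurrences of each value, column by column.
    counts = {col: Counter(row[col] for row in rows) for col in attribute_columns}
    # A row is flagged if any of its attribute values is too rare.
    unmatchable = sum(
        1
        for row in rows
        if any(counts[col][row[col]] < MIN_OCCURRENCES for col in attribute_columns)
    )
    share = unmatchable / len(rows) if rows else 0.0
    return {
        "unmatchable_rows": unmatchable,
        "unmatchable_share": share,
        "likely_to_fail": share > MAX_UNMATCHABLE_SHARE,
    }
```

The check on the number of unique RampIDs cannot be estimated this way, because the RampIDs are only known after resolution.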

Note

When resolving a universe dataset with deconfliction, attributes are not preserved so the privacy filter is not applied. Any PII and hashed email input to a universe dataset with deconfliction still requires at least 100 unique input rows per identifier per file.
