Perform Translation Using AWS Entity Resolution
LiveRamp's Translation offering in AWS Data Exchange (ADX) allows for the translation of a RampID from one partner encoding to another using either maintained or derived RampIDs. This allows you to match persistent pseudonymous identifiers to one another and enables use of the data without sharing the sensitive underlying identifiers.
Note
This article contains information on performing translation with LiveRamp’s Identity offerings using AWS Entity Resolution. If you plan to perform translation through ADX standalone, see “Perform Translation Through ADX".
For more information about RampIDs, see "RampID Methodology".
Specifically, RampID translation enables:
Person-based analytics
Increased match rates in data collaboration
Measurement enablement across device types
You can access LiveRamp Identity Resolution using AWS Entity Resolution within the ADX Marketplace, meaning translation can be performed without leaving AWS. See "LiveRamp Identity Using AWS Entity Resolution" for more information.
Overall Steps
To execute a translation operation in AWS Entity Resolution, you perform the following overall steps:
You prepare input data tables with the data to translate.
If not already done, you upload your input data tables to Amazon S3 buckets.
You create AWS Glue tables from the input data tables in your S3 buckets.
You create a schema mapping in AWS Entity Resolution to define the input data you want to translate.
You create an ID mapping workflow in AWS Entity Resolution.
You run the ID mapping workflow.
You view the output.
See the sections below for more information on performing these tasks.
Format the Input Data Table
See the sections below for information on formatting the input data table. Once your tables have been formatted, they must be uploaded to Amazon S3 buckets (see the instructions from AWS here).
Input Table Formatting Guidelines
Input data tables for translation should be formatted as CSV files. When creating input data tables, follow these additional guidelines:
Include a header row in the first line of every table. Tables cannot be processed without headers.
You can name your columns however you want, but every column name must be unique in a table.
Column names must contain only alphanumeric characters and underscores, and must start with a letter.
Do not use spaces in column names. Use underscores.
The first column in the input table must be the column that contains the RampIDs to be translated.
When performing translation on multiple files in one job, make sure the identifier column headers are the same in every file and that they match the value given for the “target_column” parameter in the call to initiate translation.
Avoid including columns beyond those required for the translation operation. Extra columns slow down processing, and all attribute columns are dropped during translation.
The translation operation can process records containing blank fields.
Only one identity operation is permitted per table and only one target domain can be translated per job.
Format to Use for a Translation Table
See the table below for an example of how to format an input data table for translation.
Suggested Column Name | Example | Description |
---|---|---|
RAMPID | XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD | RampID (maintained or derived) for translation. |
UNIQUE_ID | 17 | A unique identifier that will be used to distinctly reference each row of your data. You can use your own pseudonymous identifier or a row ID. |
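As an illustration of the formatting and upload steps, the sketch below (Python, using the csv module and boto3) writes a compliant input table with the RampID column first and a header row, and then uploads it to an S3 bucket. The file name, bucket name, and object key are placeholder values, not part of the LiveRamp or AWS documentation; substitute your own.

```python
import csv
import boto3

# Rows to translate: the RampID column comes first, followed by a unique row ID,
# per the input table formatting guidelines above. Values are placeholders.
rows = [
    {"RAMPID": "XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD", "UNIQUE_ID": "17"},
    {"RAMPID": "XYT999aW5nIGlzIGp1c3QgYSBwbGFjZWhvbGRlciB2YWx1ZSBmb3IgdGhpcyBleGFtcGxl", "UNIQUE_ID": "18"},
]

# Write the CSV file with a header row in the first line.
with open("translation_input.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["RAMPID", "UNIQUE_ID"])
    writer.writeheader()
    writer.writerows(rows)

# Upload the file to the S3 location your AWS Glue table will point to.
# "my-translation-bucket" and the key prefix are placeholder values.
s3 = boto3.client("s3")
s3.upload_file(
    "translation_input.csv",
    "my-translation-bucket",
    "translation/input/translation_input.csv",
)
```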
Create AWS Glue Tables
AWS Entity Resolution reads from AWS Glue as the input. After you’ve created your input data tables and saved them to your Amazon S3 buckets, you need to create AWS Glue tables from those input data tables. For more information, see the instructions from AWS here.
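If you prefer the AWS SDK over the console, the following is a minimal sketch of registering the input data as a Glue table with boto3. It assumes a Glue database named "identity_db" already exists and that the CSV files live under the S3 prefix used in the earlier sketch; those names, along with the table name, are placeholders. The OpenCSVSerDe and the "skip.header.line.count" parameter tell Glue to parse comma-separated files that include a header row.

```python
import boto3

glue = boto3.client("glue")

# Register the S3 prefix holding the input CSV file(s) as a Glue table.
# Database, table, and S3 location names below are placeholders.
glue.create_table(
    DatabaseName="identity_db",
    TableInput={
        "Name": "translation_input",
        "Parameters": {"classification": "csv", "skip.header.line.count": "1"},
        "StorageDescriptor": {
            "Columns": [
                {"Name": "rampid", "Type": "string"},
                {"Name": "unique_id", "Type": "string"},
            ],
            "Location": "s3://my-translation-bucket/translation/input/",
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde",
                "Parameters": {"separatorChar": ","},
            },
        },
    },
)
```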
Create the Schema Mapping
Before you can run an ID mapping workflow to perform translation, you must first create a schema mapping so that AWS Entity Resolution understands which input fields you want to use. You can bring your own data schema, or blueprint, from an existing AWS Glue data input, or build a custom schema using an interactive user interface or JSON editor.
There are three ways to create a schema mapping in AWS Entity Resolution:
Import existing schema information
Manually define the input
Use a JSON editor to create, paste, or import a schema mapping
For information on creating a schema mapping, see the instructions from AWS here and follow the additional guidelines listed below.
Schema Mapping Guidelines
When creating the schema mapping, make sure to follow these guidelines:
When entering the target domain:
Enter a partner’s domain when translating from your native encoding to that partner’s domain.
Enter your domain when translating from a partner’s encoding to your native encoding.
Use whichever column contains unique IDs for the "Unique ID" field.
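A schema mapping can also be created programmatically. The sketch below is a rough illustration using boto3's entityresolution client; the schema name is a placeholder, and the attribute "type" values shown ("PROVIDER_ID" for the RampID column and "UNIQUE_ID" for the row identifier) are assumptions about how these two columns map onto AWS Entity Resolution attribute types, so confirm them against the AWS documentation for LiveRamp provider services before using them.

```python
import boto3

er = boto3.client("entityresolution")

# Map the two input columns to AWS Entity Resolution attribute types.
# The schema name is a placeholder; the attribute types are assumptions
# to verify against the AWS Entity Resolution documentation.
er.create_schema_mapping(
    schemaName="rampid_translation_schema",
    description="Schema mapping for RampID translation input",
    mappedInputFields=[
        {"fieldName": "rampid", "type": "PROVIDER_ID"},
        {"fieldName": "unique_id", "type": "UNIQUE_ID"},
    ],
)
```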
Create the ID Mapping Workflow
After you’ve created your input data table in AWS Glue and created a schema mapping for that table, you can create the ID mapping workflow to be used to perform the translation operation. For information, see the instructions from AWS here.
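For reference, the boto3 sketch below shows the general shape of an ID mapping workflow definition created through the SDK rather than the console. The Glue table ARN, S3 output path, IAM role ARN, LiveRamp provider service ARN, and the provider configuration (where settings such as the target domain are supplied) are all placeholders or assumptions; take the exact values from the LiveRamp listing and the AWS Entity Resolution documentation.

```python
import boto3

er = boto3.client("entityresolution")

# Sketch of an ID mapping workflow definition. All ARNs, paths, and the
# provider configuration below are placeholder values / assumptions.
er.create_id_mapping_workflow(
    workflowName="rampid_translation_workflow",
    inputSourceConfig=[
        {
            "inputSourceARN": "arn:aws:glue:us-east-1:123456789012:table/identity_db/translation_input",
            "schemaName": "rampid_translation_schema",
        }
    ],
    outputSourceConfig=[
        {"outputS3Path": "s3://my-translation-bucket/translation/output/"}
    ],
    idMappingTechniques={
        "idMappingType": "PROVIDER",
        "providerProperties": {
            # Placeholder ARN for the LiveRamp translation provider service;
            # copy the real ARN from the listing in AWS Entity Resolution.
            "providerServiceArn": "arn:aws:entityresolution:us-east-1:aws:providerservice/LiveRamp/...",
            # Provider-specific settings such as the target domain go here;
            # the exact keys come from the LiveRamp listing.
            "providerConfiguration": {},
        },
    },
    roleArn="arn:aws:iam::123456789012:role/entity-resolution-workflow-role",
)
```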
Run the ID Mapping Workflow
Once all setup steps are complete, follow the instructions from AWS here to run the ID mapping workflow and perform the translation operation.
On the Metrics tab, under Job history, you can view the following:
The Status of the ID mapping workflow job: In progress, Completed, Failed
The total records processed.
The duration of the job.
The Job ID.
After the ID mapping workflow job completes (status is “Completed”), you can go to the Data output tab and then select your Amazon S3 location to view the results.
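The same run-and-monitor flow is available through the SDK. The sketch below starts the workflow defined in the earlier placeholder examples and polls its status until the job reaches a terminal state; the status value names used here are assumptions to verify against the current boto3 documentation.

```python
import time
import boto3

er = boto3.client("entityresolution")

# Start the ID mapping workflow job. The workflow name is the placeholder
# used in the earlier sketches.
job = er.start_id_mapping_job(workflowName="rampid_translation_workflow")
job_id = job["jobId"]

# Poll until the job finishes. The terminal status names below are
# assumptions; check the SDK documentation for the exact values.
while True:
    status = er.get_id_mapping_job(
        workflowName="rampid_translation_workflow", jobId=job_id
    )["status"]
    if status in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(30)

print(f"Job {job_id} finished with status {status}")
```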
View Identity Output
The output file(s) from the translation process will be compressed and then written to the specified S3 bucket.
The file naming convention for the output file is "<JOB_ID>_0_0_0.csv.gz", where the Job ID is a unique ID plus your AWS Region name.
Example: 17697C67E98D4702BEB4ED7B3B0FA_AWS_US_EAST_1_0_0_0.csv.gz
The output file for translation will follow the format shown in the table below.
Column | Example | Description |
---|---|---|
RampID (original encoding) | XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD | Returns the original RampID included in the input table. |
Transcoded_identifier (RampIDs in target encoding) | XYT001k0MS00MDc1LUI4NjEtMjlCOUI0MUY3MENBCgNjVGQjE0MTMtRkFBMC00QzlELUJF | Returns the RampID translated into the target domain encoding. |
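As a quick way to inspect the results, the sketch below downloads one compressed output file from S3 and reads it with Python's gzip and csv modules. The bucket, key prefix, and file name are placeholders based on the naming convention described above.

```python
import csv
import gzip
import io

import boto3

s3 = boto3.client("s3")

# Download one compressed output file and read it as CSV.
# Bucket, prefix, and file name are placeholders following the
# "<JOB_ID>_0_0_0.csv.gz" convention described above.
obj = s3.get_object(
    Bucket="my-translation-bucket",
    Key="translation/output/17697C67E98D4702BEB4ED7B3B0FA_AWS_US_EAST_1_0_0_0.csv.gz",
)

with gzip.open(io.BytesIO(obj["Body"].read()), mode="rt", newline="") as f:
    for row in csv.reader(f):
        print(row)
```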
Edit an ID Mapping Workflow
To edit an ID mapping workflow, follow the instructions from AWS here.
Delete an ID Mapping Workflow
To delete an ID mapping workflow, follow the instructions from AWS here.
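Deletion is also available through the SDK. A minimal boto3 sketch, using the placeholder workflow name from the earlier examples:

```python
import boto3

er = boto3.client("entityresolution")

# Permanently delete the ID mapping workflow by name.
# The workflow name is the placeholder used in the earlier sketches.
er.delete_id_mapping_workflow(workflowName="rampid_translation_workflow")
```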