Perform Translation Using AWS Entity Resolution

LiveRamp's Translation in AWS Data Exchange (ADX) allows you to translate a RampID from one partner encoding to another, using either maintained or derived RampIDs. This lets you match persistent pseudonymous identifiers to one another and enables use of the data without sharing the sensitive underlying identifiers.

Note

  • This article contains information on performing translation with LiveRamp’s Identity offerings using AWS Entity Resolution. If you plan to perform translation through ADX standalone, see "Perform Translation Through ADX".

  • For more information about RampIDs, see "RampID Methodology".

Specifically, RampID translation enables:

  • Person-based analytics

  • Increased match rates in data collaboration

  • Measurement enablement across device types

You can access LiveRamp Identity Resolution using AWS Entity Resolution within the ADX Marketplace, meaning translation can be performed without leaving AWS. See "LiveRamp Identity Using AWS Entity Resolution" for more information.

Overall Steps

To execute a translation operation in AWS Entity Resolution, you perform the following overall steps:

  1. You prepare input data tables with the data to translate.

  2. If not already done, you upload your input data tables to Amazon S3 buckets.

  3. You create AWS Glue tables from the input data tables in your S3 buckets.

  4. You create a schema mapping in AWS Entity Resolution to define the input data you want to translate.

  5. You create an ID mapping workflow in AWS Entity Resolution.

  6. You run the ID mapping workflow.

  7. You view the output.

See the sections below for more information on performing these tasks.

Format the Input Data Table

See the sections below for information on formatting the input data table. Once your tables have been formatted, they must be uploaded to Amazon S3 buckets (see the instructions from AWS here).

Input Table Formatting Guidelines

Input data tables for translation should be formatted as CSV files. When creating input data tables, follow these additional guidelines:

  • Include a header row in the first line of every table. Tables cannot be processed without headers.

  • You can name your columns however you want, but every column name must be unique in a table.

  • Column names must contain only alphanumeric characters and underscores, and must start with a letter.

  • Do not use spaces in column names. Use underscores.

  • The first column in the input table must be the column that contains the RampIDs to be translated.

  • When performing translation on multiple files in one job, make sure the identifier column headers are the same in every file and that they match the value given for the “target_column” parameter in the call to initiate translation.

  • Avoid including columns that are not required for the translation operation. Extra columns slow down processing, and all attribute columns are dropped during translation.

  • The translation operation can process records containing blank fields.

  • Only one identity operation is permitted per table and only one target domain can be translated per job.

Format to Use for a Translation Table

See the table below for an example of how to format an input data table for translation.

Suggested Column Name | Example | Description
RAMPID | XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD | RampID (maintained or derived) for translation.
UNIQUE_ID | 17 | A unique identifier that will be used to distinctly reference each row of your data. You can use your own pseudonymous identifier or a row ID.
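
For reference, the following Python sketch shows one way to produce an input table that follows these guidelines and upload it to an S3 bucket. It is an illustrative sketch, not LiveRamp-provided code; the bucket name, object key, and RampID values are placeholders.

  import csv
  import boto3

  # Placeholder rows: (RampID to translate, unique row identifier).
  rows = [
      ("XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD", "1"),
  ]

  # Write the input table as a CSV file with the required header row.
  with open("translation_input.csv", "w", newline="") as f:
      writer = csv.writer(f)
      writer.writerow(["RAMPID", "UNIQUE_ID"])
      writer.writerows(rows)

  # Upload the formatted table to the S3 location the AWS Glue table will point to.
  boto3.client("s3").upload_file(
      "translation_input.csv",
      "my-translation-input-bucket",              # placeholder bucket
      "translation/input/translation_input.csv",  # placeholder key
  )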

Create AWS Glue Tables

AWS Entity Resolution reads from AWS Glue as the input. After you’ve created your input data tables and saved them to your Amazon S3 buckets, you need to create AWS Glue tables from those input data tables. For more information, see the instructions from AWS here.
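
A minimal boto3 sketch of this step is shown below, assuming the input CSV was uploaded to s3://my-translation-input-bucket/translation/input/. The database, table, and bucket names are placeholders, and you can equally use the Glue console or a Glue crawler instead.

  import boto3

  glue = boto3.client("glue")

  # Create a Glue database to hold the input table (skip if one already exists).
  glue.create_database(DatabaseInput={"Name": "rampid_translation"})

  # Define a Glue table over the CSV files in the S3 prefix.
  glue.create_table(
      DatabaseName="rampid_translation",
      TableInput={
          "Name": "translation_input",
          "StorageDescriptor": {
              "Columns": [
                  {"Name": "rampid", "Type": "string"},
                  {"Name": "unique_id", "Type": "string"},
              ],
              "Location": "s3://my-translation-input-bucket/translation/input/",
              "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
              "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
              "SerdeInfo": {
                  "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde",
                  "Parameters": {"separatorChar": ","},
              },
          },
          "Parameters": {"classification": "csv", "skip.header.line.count": "1"},
      },
  )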

Create the Schema Mapping

Before you can run an ID mapping workflow to perform translation, you must first create a schema mapping so that AWS Entity Resolution understands which input fields you want to use. You can bring your own data schema, or blueprint, from an existing AWS Glue data input, or build a custom schema using an interactive user interface or JSON editor.

There are three ways to create a schema mapping in AWS Entity Resolution:

  • Import existing schema information

  • Manually define the input

  • Use a JSON editor to create, paste, or import a schema mapping

For information on creating a schema mapping, see the instructions from AWS here and follow the additional guidelines listed below.

Schema Mapping Guidelines

When creating the schema mapping, make sure to follow these guidelines:

  • When entering the target domain:

    • Enter a partner’s domain when translating from your native encoding to that partner’s domain.

    • Enter your domain when translating from a partner’s encoding to your native encoding.

  • Use whichever column contains unique IDs for the "Unique ID" field.
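
If you prefer to create the schema mapping programmatically, the boto3 sketch below illustrates the call, using the column names from the example input table above. The schema name is a placeholder, and the attribute "type" values are assumptions; confirm the types expected for a LiveRamp ID mapping schema in the AWS Entity Resolution console or API reference.

  import boto3

  er = boto3.client("entityresolution")

  er.create_schema_mapping(
      schemaName="rampid_translation_schema",  # placeholder name
      description="Schema mapping for RampID translation input",
      mappedInputFields=[
          # Column containing the RampIDs to translate (the first column of the table).
          # The "type" value here is an assumption; verify it in the console.
          {"fieldName": "rampid", "type": "PROVIDER_ID"},
          # Column containing the unique row identifier.
          {"fieldName": "unique_id", "type": "UNIQUE_ID"},
      ],
  )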

Create the ID Mapping Workflow

After you’ve created your input data table in AWS Glue and created a schema mapping for that table, you can create the ID mapping workflow that will be used to perform the translation operation. For information, see the instructions from AWS here.
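
The boto3 sketch below shows the general shape of the call; it is not a complete, working configuration. The ARNs, names, and S3 paths are placeholders, and the LiveRamp provider service ARN and provider configuration values (such as the target domain for translation) should be taken from the LiveRamp listing in the AWS Entity Resolution console rather than from this example.

  import boto3

  er = boto3.client("entityresolution")

  er.create_id_mapping_workflow(
      workflowName="rampid-translation-workflow",  # placeholder name
      inputSourceConfig=[
          {
              # ARN of the Glue table created earlier, plus its schema mapping.
              "inputSourceARN": "arn:aws:glue:us-east-1:111122223333:table/rampid_translation/translation_input",
              "schemaName": "rampid_translation_schema",
          }
      ],
      outputSourceConfig=[
          {"outputS3Path": "s3://my-translation-output-bucket/translation/output/"}
      ],
      idMappingTechniques={
          "idMappingType": "PROVIDER",
          "providerProperties": {
              # Placeholder values: copy the LiveRamp provider service ARN and the
              # required provider configuration (for example, the target domain)
              # from the console.
              "providerServiceArn": "<LiveRamp provider service ARN>",
              "providerConfiguration": {},
          },
      },
      # IAM role that AWS Entity Resolution assumes to read the input and write the output.
      roleArn="arn:aws:iam::111122223333:role/entity-resolution-workflow-role",
  )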

Run the ID Mapping Workflow

Once all setup steps are complete, follow the instructions from AWS here to run the ID mapping workflow and perform the translation operation.
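
You can also start the job programmatically. The sketch below assumes the placeholder workflow name used earlier in this article.

  import boto3

  er = boto3.client("entityresolution")

  # Start an ID mapping job for the workflow and keep the job ID for status checks.
  job = er.start_id_mapping_job(workflowName="rampid-translation-workflow")
  print("Started job:", job["jobId"])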

On the Metrics tab, under Job history, you can view the following:

  • The Status of the ID mapping workflow job: In progress, Completed, Failed

  • The total records processed.

  • The duration of the job.

  • The Job ID.

After the ID mapping workflow job completes (status is “Completed”), you can go to the Data output tab and then select your Amazon S3 location to view the results.
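
If you are monitoring the job from code rather than the console, a polling sketch along these lines can be used; the workflow name and job ID are the placeholders from the earlier examples, and the exact status strings are defined by the AWS Entity Resolution API.

  import time
  import boto3

  er = boto3.client("entityresolution")
  workflow_name = "rampid-translation-workflow"
  job_id = "<jobId returned by start_id_mapping_job>"

  # Poll the job until it reaches a terminal status, then print the job metrics
  # (for example, the total records processed).
  while True:
      job = er.get_id_mapping_job(workflowName=workflow_name, jobId=job_id)
      status = job["status"]
      if status in ("SUCCEEDED", "FAILED"):
          break
      time.sleep(60)

  print("Final status:", status)
  print("Metrics:", job.get("metrics"))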

View Identity Output

The output file(s) from the translation process will be compressed and then written to the specified S3 bucket.

The file naming convention for the output file will be "<JOB_ID>_0_0_0.csv.gz".

The Job ID will be a unique ID plus your AWS region name.

Example: 17697C67E98D4702BEB4ED7B3B0FA_AWS_US_EAST_1_0_0_0.csv.gz

The output file for translation will follow the format shown in the table below.

Column | Example | Description
RampID (original encoding) | XYT999RkQ3MEY1RUYtNUIyMi00QjJGLUFDNjgtQjQ3QUEwMTNEMTA1CgMjVBMkNEMTktRD | Returns the original RampID included in the input table.
Transcoded_identifier (RampIDs in target encoding) | XYT001k0MS00MDc1LUI4NjEtMjlCOUI0MUY3MENBCgNjVGQjE0MTMtRkFBMC00QzlELUJF | Returns the translated RampID in the target domain encoding.
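
To inspect the results from code, the sketch below downloads the compressed output file from S3 and reads it as CSV. The bucket, key prefix, and job ID are placeholders that follow the naming convention described above.

  import csv
  import gzip
  import io
  import boto3

  s3 = boto3.client("s3")

  # Download the compressed output file written by the ID mapping workflow job.
  obj = s3.get_object(
      Bucket="my-translation-output-bucket",
      Key="translation/output/<JOB_ID>_0_0_0.csv.gz",  # placeholder key
  )

  # Decompress in memory and read the rows, which pair each original RampID
  # with its translated (transcoded) form.
  with gzip.open(io.BytesIO(obj["Body"].read()), mode="rt", newline="") as f:
      for row in csv.DictReader(f):
          print(row)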

Edit an ID Mapping Workflow

To edit an ID mapping workflow, follow the instructions from AWS here.

Delete an ID Mapping Workflow

To delete an ID mapping workflow, follow the instructions from AWS here.
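
These operations can also be performed programmatically. As a minimal sketch, the boto3 call below deletes a workflow by name (the name is the placeholder used earlier); update_id_mapping_workflow can be used in a similar way to modify an existing workflow.

  import boto3

  er = boto3.client("entityresolution")

  # Permanently delete the ID mapping workflow (output files already written
  # to S3 are not affected).
  er.delete_id_mapping_workflow(workflowName="rampid-translation-workflow")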