Running LiveRamp’s Local Encoder In an AWS Environment

Local Encoder enables you to generate securely encoded RampIDs for your consumer data files within your own cloud environment and then utilize that data for onboarding or addressability use cases, depending on your needs. In this way, your consumer data is never exposed to an external network, while still enabling full use of LiveRamp solutions. The encoded RampIDs produced by the application cannot be decoded back to the original consumer identifiers.

Local Encoder can be run on any infrastructure that supports running Docker images. The Local Encoder Docker image is currently distributed via the Amazon Elastic Container Registry (ECR).

For more information on Local Encoder, including information on security, use cases, data you can send, and output options, see "LiveRamp Local Encoder".

For information on running Local Encoder using AWS, see the sections below.

Overall Steps

Running the Local Encoder in an AWS environment involves the following overall steps:

  1. You provide LiveRamp with your PGP public key or Keybase username.

  2. LiveRamp provides you with credentials, a Terraform file to create the resources in AWS, and a Kubernetes archive file.

  3. You set up the infrastructure and configure the desired parameters.

  4. You deploy a Kubernetes manifest.

  5. You upload your data files to the appropriate input location.

  6. Local Encoder performs the following operations:

    1. The data is normalized and hygiene is performed.

    2. The identifiers in the data are converted into derived RampIDs.

    3. If appropriate, the derived RampIDs for each record are encoded into secure RampID packets or identity envelopes.

    4. The input identifiers are removed and replaced with the appropriate RampID output type (RampIDs, RampID packets, or identity envelopes).

  7. For Onboarding use cases the following steps are performed:

    1. The output containing RampID packets is delivered to LiveRamp.

    2. LiveRamp decodes the RampID packets into their individual derived RampIDs.

    3. LiveRamp matches those derived RampIDs to their associated maintained RampIDs.

    4. LiveRamp creates the appropriate fields and segments from the segment data in your LiveRamp platform (such as Connect or Safe Haven).

  8. For Addressability use cases the following steps are performed:

    1. The output containing RampIDs or identity envelopes is output to the destination of your choice.

    2. You leverage the output to build a mapping table of your customer IDs to RampIDs or identity envelopes.

Prerequisites

Running the Local Encoder in an AWS environment requires that you have the following prerequisites:

Note

You will use the AWS command line tool to issue commands at your system's command line to perform Amazon ECR and other AWS tasks. We recommend that you have the latest version of the AWS CLI installed. For information about installing the AWS CLI or upgrading it to the latest version, see Installing the AWS Command Line Interface.

Set Up the Infrastructure

Use the sample Terraform configuration file that LiveRamp provided to set up the infrastructure if you don’t have an existing cluster.

  1. Unpack the archive file (terraform.zip).

  2. Modify the file “locals.tf”:

    • If desired, use the cluster_name parameter to update the input bucket name.

    • Use the region parameter to define your AWS Region (see Amazon’s AWS region availability for more information).

    • Use the path and version parameters to provide Docker image details.

    • If you have specific subnet definitions, use the vpc parameter to provide VPC details (otherwise, you do not need to change the defaults).

    Sample “locals.tf” file:

    locals {
      cluster_name = "local-encoder-eks-${random_string.this.result}"
    
      region = "[customer AWS region]"
    
      image = {
        path = "461694764112.dkr.ecr.eu-central-1.amazonaws.com/vault-app"
        version = "latest"
      }
    
      vpc = {
        cidr = "10.0.0.0/16"
        private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
        public_subnets  = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
      }
    
    }
    
    resource "random_string" "this" {
      length  = 8
      special = false
    }
  3. Use your IAM credentials to authenticate the Terraform AWS provider by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables:

    export AWS_ACCESS_KEY_ID="<your_aws_access_key_id>"
    export AWS_SECRET_ACCESS_KEY="<your_aws_secret_access_key>"

    Note

    If you don't have access to IAM user credentials, use another authentication method as described in the AWS provider documentation.

  4. Initialize Terraform and plan the changes by running the following commands:

    terraform init
    terraform plan

    You’ll see a list of resources that will be created:

    • VPC

    • Security group

    • EKS cluster

    • Storage bucket

  5. Apply the changes by running the following command:

    terraform apply

    This step can take up to 10 minutes depending on whether a new cluster is being created.

    Sample output:

    Configuration output:
    cluster_endpoint = "https://<AWS access key ID>.gr7.ap-southeast-2.eks.amazonaws.com"
    cluster_id = "local-encoder-eks-<random>"
    cluster_name = "local-encoder-eks-<random>"
    cluster_security_group_id = "<cluster security group ID>"
    region = "ap-southeast-2"

    Output parameters:

    • cluster_endpoint: The cluster endpoint, which includes your AWS access key ID.

    • cluster_id: The cluster ID, which includes the endpoint name you set and a randomly generated string value.

    • cluster_name: The cluster name, which includes the endpoint name you set and a randomly generated string value.

    • cluster_security_group_id: The cluster security group ID, which is generated from the “locals.tf” file.

    • region: The AWS region you set.

  6. Once the output is displayed, connect your new or existing cluster to your kubeconfig by running the following command:

    aws eks --region $(terraform output -raw region) update-kubeconfig \
        --name $(terraform output -raw cluster_name)

From this point onwards, you can use kubectl to manage the cluster and deploy Kubernetes configurations. For example, run the following command to verify connectivity to the cluster:

kubectl cluster-info

Deploy a Kubernetes Manifest

Use the sample yaml configuration that LiveRamp provided to deploy the Local Encoder to the target cluster.

  1. Unpack the Kubernetes archive file.

  2. Access the credentials that LiveRamp provided to use them in subsequent steps:

    • Local Encoder account ID

    • AWS IAM Access Key ID

    • AWS IAM Secret Access Key (will need to be decrypted)

  3. Set the configuration file “configmap.yaml” using the provided credentials. See the “Optional Configuration Parameters” section below for more information on additional parameters.

    apiVersion: v1
    data:
     LR_VAULT_ACCOUNT_TYPE: awsiam
     LR_VAULT_ACCOUNT_ID: '<Local Encoder account ID>'
     LR_VAULT_LR_AWS_ACCESS_KEY_ID: '<AWS IAM Access Key ID>'
     LR_VAULT_INPUT: 's3://aws-au-chp-vaultapp-input'
     LR_VAULT_OUTPUT: 's3lr://com-liveramp-chp-vaultapp-output-prod'
    
     AWS_ACCESS_KEY: '<YOUR_AWS_ACCESS_KEY>'
     AWS_SECRET_ACCESS_KEY: '<YOUR_AWS_SECRET_ACCESS_KEY>'
     AWS_REGION: '<REGION>'
    kind: ConfigMap
    metadata:
     name: 'local-encoder-config'

    Note

    • LR_VAULT_OUTPUT is the LiveRamp destination bucket.

    • To receive either RampIDs or identity envelopes, you will need to add an additional line. For more information, see the "Configure the Output Type" section below.

  4. Update the “secret.yaml” file using the provided credentials:

    Note

    The secret access key will need to be base64 encoded before storing.

    apiVersion: v1
    data:
     LR_VAULT_LR_AWS_SECRET_ACCESS_KEY: '<AWS IAM Secret Access Key>'
    kind: Secret
    metadata:
     name: 'local-encoder-secret'
    type: Opaque
  5. Once the values are updated, run the following command to deploy:

    kubectl apply -k .
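As noted in step 4, the secret access key must be base64 encoded before it is placed in “secret.yaml”. A minimal sketch using the standard base64 tool; the key shown is a placeholder, not a real value:

```shell
# Base64-encode the decrypted IAM secret access key for "secret.yaml".
# printf (rather than echo) avoids encoding a trailing newline.
# "examplesecretkey" is a placeholder; use the key LiveRamp provided.
encoded=$(printf '%s' 'examplesecretkey' | base64)
echo "$encoded"
```

Paste the resulting string as the value of LR_VAULT_LR_AWS_SECRET_ACCESS_KEY.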

Complete Configuration Activities

See the sections below for information on completing any desired additional configuration activities. For more information on configuration parameters, see the "Configuration Parameters" section below.

Configure the Output Type

The Local Encoder application offers multiple output formats:

  • RampID packets: Used for Onboarding by brands and marketers, these RampID packets can be delivered to LiveRamp, where they can be transformed into RampIDs and used to generate fields and segments in your LiveRamp application. This is the default output type.

  • RampIDs: Used for addressability by publishers and platforms who want to create a RampID mapping.

  • Identity envelopes: Used for addressability by publishers and platforms who want to get RampIDs into the bidstream to safely engage with the programmatic advertising ecosystem.

For more information on the output types available, see "Output Options".

Note

  • If you plan to receive RampIDs or identity envelopes, contact your LiveRamp representative for approval and account configuration changes.

  • Identity envelope output is only available with Version 1.6 of Local Encoder.

After you are approved for RampID or identity envelope output, add the appropriate line listed below to your Kubernetes ConfigMap (when deploying a Kubernetes manifest):

Note

To receive RampID packets, you do not need to make any changes to the configuration.

  • For RampID output, add the following line to the Kubernetes ConfigMaps:

    LR_VAULT_PACKET_TYPE: unencoded
  • For identity envelope output, add the following line to the Kubernetes ConfigMaps:

    LR_VAULT_ENVELOPES_FLOW: 'true'

Note

When editing the YAML file, any formatting issues will prevent the configuration file from working properly. We recommend running the file through a YAML validator.
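One formatting issue that a quick local check can catch: YAML does not allow tab characters for indentation. A minimal sketch that complements, but does not replace, a full YAML validator:

```shell
# YAML indentation must use spaces, never tabs; a stray tab is a common
# cause of an invalid configuration file. Returns success if tabs are found.
has_tabs() {
  grep -q "$(printf '\t')" "$1"
}

# Example usage before deploying:
#   has_tabs configmap.yaml && echo "configmap.yaml contains tabs; fix before deploying"
```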

Utilize Encryption

Optional encryption is available with Local Encoder (version 1.5 and greater). This functionality encrypts each row of data before it is sent to LiveRamp for processing.

Note

Adding encryption increases the processing time by approximately 20%, depending on the size of the file and the number of records. LiveRamp recommends limiting file size to 15GB.

To utilize encryption, add the following line to the configuration file, just before the LR_VAULT_OUTPUT parameter: LR_VAULT_PUBLIC_KEY_ENCRYPTION: 'true'.
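As a sketch, the ConfigMap from the deployment section would then contain the following (surrounding lines abbreviated; values are the placeholders used earlier):

```yaml
 LR_VAULT_LR_AWS_ACCESS_KEY_ID: '<AWS IAM Access Key ID>'
 LR_VAULT_INPUT: 's3://aws-au-chp-vaultapp-input'
 LR_VAULT_PUBLIC_KEY_ENCRYPTION: 'true'
 LR_VAULT_OUTPUT: 's3lr://com-liveramp-chp-vaultapp-output-prod'
```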

Format the File

Input files must include identifier fields and (for Onboarding use cases where you're receiving RampID packets) can also include segment data fields if desired.

Before uploading a file to the input location, make sure to format the data according to these guidelines:

  • Include a header row in the first line of every file consistent with the contents of the file. Files cannot be processed without headers.

  • If you want to maintain the ability to sort the output file, you must include a column containing row IDs (“RID”) as the first column of the file.

    Note

    The row identifier column is only required to maintain sort order and should not contain any customer personally-identifiable data.

  • Make sure that the only identifiers included are the allowed identifier touchpoints listed below.

  • If you’re sending data for consumers in multiple countries or if you’re including phone numbers, you must include the appropriate country code column (depending on the method used) to identify the country of each record. For more information, see the “Optional Configuration Parameters” section below.

  • Include a maximum of 500 segment data fields in a single file (for Onboarding use cases where you're receiving RampID packets).

  • Segment data field types can be in the form of a string, numeral, enum, etc.

  • The application supports three file formats: CSV, PSV, and TSV.

  • Files must be rectangular (have the same number of columns for every row).

  • If any values contain the file’s delimiter character (for example, a comma in a .csv file), make sure that your values are contained within quotes.

  • The recommended maximum file size is 20GB.
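Several of the guidelines above (a header row is present, the file is rectangular) can be checked locally before upload. A minimal awk sketch; note that it does not account for quoted values that contain the delimiter:

```shell
# Verify that a delimited file has a header row and is rectangular:
# every data row must have the same number of columns as the header.
# Prints each offending row and returns a nonzero status if any is found.
check_rectangular() {
  awk -F"$2" 'NR == 1 { cols = NF; next }
    NF != cols { printf "row %d has %d columns, expected %d\n", NR, NF, cols; bad = 1 }
    END { exit bad }' "$1"
}

# Example usage for a CSV file:
#   check_rectangular input.csv ','
```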

Allowed Identifier Touchpoints

You can include any of the following allowed identifier touchpoints for translation to RampIDs in both Onboarding and Addressability use cases:

  • Plaintext email address (maximum of three per record)

  • SHA-256 hashed email address (maximum of three per record)

  • Plaintext mobile phone number (maximum of two per record)

  • SHA-256 hashed mobile phone number (maximum of two per record)

  • Plaintext landline phone number (maximum of one per record)

  • SHA-256 hashed landline phone number (maximum of one per record)

Additional Allowed Touchpoints for Onboarding Use Cases

For Onboarding use cases (where you're receiving RampID packets), the following additional identifier touchpoints are also allowed for translation to RampIDs:

  • Name and postcode, which consists of first name, last name, and postcode (maximum of one per record)

  • AAID (maximum of one per record)

  • IDFA (maximum of one per record)

  • IMEI (maximum of one per record)

Example Header

See the header shown below for an example of what the header might look like when sending data in a pipe-separated (PSV) file for an Onboarding use case, where segment data fields are included:

RID|EMAIL1|EMAIL2|EMAIL3|SHA256EMAIL1|SHA256EMAIL2|SHA256EMAIL3|MOBILE1|MOBILE2|SHA256MOBILE1|SHA256MOBILE2|LANDLINE1|SHA256LANDLINE1|FIRSTNAME|LASTNAME|POSTCODE|AAID|IDFA|IMEI|ATTRIBUTE_1|...|ATTRIBUTE_N

Replace ATTRIBUTE_1 … N in the example header with the name of your CRM attributes.
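For the hashed columns (SHA256EMAIL1, SHA256MOBILE1, etc.), a hashed identifier is typically produced by normalizing the plaintext value and then applying SHA-256. The normalization shown here (stripping whitespace and lowercasing) is a common convention but an assumption; confirm the exact rules against LiveRamp's hashing guidance:

```shell
# Normalize an email address (strip whitespace, lowercase) and produce its
# SHA-256 hash. The normalization steps are assumed conventions, not
# LiveRamp-confirmed requirements.
hash_email() {
  printf '%s' "$1" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]' | sha256sum | cut -d' ' -f1
}

# Example usage:
#   hash_email 'Jane.Doe@Example.com'
```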

Example Output Files

For more information on the format of the output file, see "Output Examples".

Upload the File to the Input Bucket

Uploading a file to your Local Encoder service's input bucket kicks off the encoding operation. To upload your file, run a command similar to the example below (this example uses an AWS S3 bucket, but the input location could also be a local directory):

aws s3 cp [your_file].csv s3://com-liveramp-vault-[your-vpc-id]-input

Caution

To successfully process a file, the input bucket cannot contain more than 10 files. Before uploading a new file to the input bucket, check that the bucket will not have more than 10 files once the new file has been uploaded.
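The check described in the caution can be scripted. A minimal sketch; the bucket listing command in the comment is illustrative (its line count may include prefixes), so verify it against your bucket layout:

```shell
# Guard for the 10-file limit: returns success only when the bucket can
# accept one more file without exceeding 10 files total.
# Obtain the current count with a listing command such as:
#   count=$(aws s3 ls s3://com-liveramp-vault-[your-vpc-id]-input/ | wc -l)
can_upload() {
  [ "$1" -lt 10 ]
}

# Example usage:
#   can_upload "$count" && aws s3 cp [your_file].csv s3://com-liveramp-vault-[your-vpc-id]-input
```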

Once the file has been processed, you’ll get a confirmation message that includes the number of records processed.

  • For RampID packet output, all consumer identifier data in a row is transformed into derived RampIDs, packaged into one data structure and encrypted again, yielding a RampID packet.

  • For RampID output, all consumer identifier data in a row is transformed into derived RampIDs in the form of a JSON string in a “RampID” column.

  • For identity envelope output, all consumer identifier data is transformed into derived RampIDs. A selection logic is applied, then the RampID is additionally obfuscated and encrypted into an identity envelope. Only one identity envelope is returned per row of data. A timestamp column is appended to the end of each row. This column gives the expiration date and time for the identity envelope in Unix format (timezone UTC).

For more information on the format of the output file, see "Output Examples".

Configuration Parameters

See the sections below for information on the required and optional parameters to use, depending on the deployment method being used.

Required Configuration Parameters

| Parameter Name | Parameter for Local Configuration | Parameter for Command Line or Configmaps.yaml | Example Value(s) | Notes |
| --- | --- | --- | --- | --- |
| AWS user ID | account_id | LR_VAULT_ACCOUNT_ID | AID… | Provided by LiveRamp |
| Account type | account_type | LR_VAULT_ACCOUNT_TYPE | awsiam | |
| LiveRamp AWS account ID | | AWS_LR_ACCOUNT_ID | 461694764112 | Provided by LiveRamp |
| AWS IAM access key ID | lr_access_key_id | LR_VAULT_LR_AWS_ACCESS_KEY_ID | AKI… | Provided by LiveRamp |
| AWS IAM secret access key | lr_secret_access_key | LR_VAULT_LR_AWS_SECRET_ACCESS_KEY | | LiveRamp provides the secret encrypted with the customer key |
| Input file location | input | LR_VAULT_INPUT | s3://input-bucket, gs://input-bucket, /tmp/input-folder | For AWS S3 buckets, the prefix is “s3://”. For GCS buckets, the prefix is “gs://”. |
| Output file location | output | LR_VAULT_OUTPUT | s3lr://bucket-name, gs://bucket-name, s3://bucket-name, /tmp/output-folder | For an S3 bucket that belongs to a LiveRamp account, the prefix is “s3lr://”. For AWS S3 buckets, the prefix is “s3://”. For GCS buckets, the prefix is “gs://”. |
| AWS region for LR resources | lr_region | LR_VAULT_LR_AWS_REGION | eu-central-1 | LiveRamp’s AWS Region |
| Origin of the data being processed | locale | LR_VAULT_LOCALE | us | Two-letter country code representing the origin of the data being processed (for example, Australia = “au”, Great Britain = “GB”). Not case sensitive. |

Optional Configuration Parameters

| Parameter Name | Parameter for Local Configuration | Parameter for Command Line or Configmaps.yaml | Example Value(s) | Notes |
| --- | --- | --- | --- | --- |
| Customer Profile | profile | LR_VAULT_PROFILE | prod, dev | Default is "prod". |
| Filename | filename_pattern | LR_VAULT_FILENAME_PATTERN | ^test.* | The regex used to determine which files in the input folder or bucket should be processed (for example, “^test.*” would match a file named “test.csv”). The app processes files from the folder/bucket with filenames that match the regex. |
| Country Header | country_code_column | LR_VAULT_COUNTRY_CODE_COLUMN | COUNTRY_CODE (default value), {{Customer provided}} (optional) | The header name for the column containing country codes for each row. The values in the country code column determine the hygiene applied to phone numbers in the file. Must match the value in the input file. |
| Public Key Encryption | public_key_encryption | LR_VAULT_PUBLIC_KEY_ENCRYPTION | true | Include this parameter and set it to “true” to encrypt each row of data in the files before they’re sent to LiveRamp for processing. |
| Header Mapping | header_mapping | LR_VAULT_HEADER_MAPPING | newvalue=defaultvalue ({{customer determined}}) | A list of key=value pairs used to replace the default headers for the identifier columns in the file. For example, if the email columns have the headers “primary_email” and “alt_email”, set the header mapping to “primary_email=email1,alt_email=email2”. |
| Error Log | customer_logging_enabled | LR_VAULT_CUSTOMER_LOGGING_ENABLED | | Include this parameter and set it to “true” to generate an error log file showing any processing errors (such as illegal characters or incorrect header values) and the row number where each error occurred. |
| Error Log Location | -v <localFolder>:/var/logs/vault-app | -v <localFolder>:/var/logs/vault-app | | Include this parameter to set the delivery location for the error log file. |
| Mode | mode | LR_VAULT_MODE | default, task | The default value is "default" for long-running file processing. Set to "task" to enable single-file processing (the application shuts down after processing a single file; not available for Kubernetes setups). |
| Packet Type | packet_type | LR_VAULT_PACKET_TYPE | unencoded | Include this parameter and set it to "unencoded" to receive RampIDs rather than RampID packets. Include this parameter only when you are receiving RampIDs. |
| Envelope Output | envelopes_flow | LR_VAULT_ENVELOPES_FLOW | true | Include this parameter and set it to "true" to have the output RampIDs packaged into identity envelopes. Include this parameter only when you are receiving identity envelopes. |
| Test Mode | dry_run | LR_VAULT_DRY_RUN | true, false | The default value is “false”. Set to “true” to run the app in dry run mode. Only encrypted packets are output in dry run mode. |
| Customer AWS Access Key | N/A | AWS_ACCESS_KEY | | Only for customers using an S3 bucket as the input source. The access key for the client’s AWS account. |
| Customer AWS Secret Access Key | N/A | AWS_SECRET_ACCESS_KEY | | Only for customers using an S3 bucket as the input source. The secret access key for the client’s AWS account. |
| Customer AWS Region | N/A | AWS_REGION | | Only for customers using an S3 bucket as the input source. The AWS region in which the bucket resides. |
| Customer AWS Region | N/A | AWS_DEFAULT_REGION | | Only for customers using an S3 bucket as the input source. The AWS region in which the bucket resides. |
| Customer GCS Bucket Credentials | N/A | GOOGLE_APPLICATION_CREDENTIALS | | Only for customers using a GCS bucket as the input/output source. The path to your Google credentials JSON file. |
| Customer GCS Project | gcp_project_name | LR_VAULT_GCP_PROJECT_NAME | | Only for customers using a GCS bucket as the input/output source. The name of your GCP project. Added if the default profile name can't be found. |