Running LiveRamp’s Local Encoder In an AWS Environment
Local Encoder enables you to generate securely-encoded RampIDs for your consumer data files within your own cloud environment and then utilize that data for Activation or addressability use cases, depending on your needs. In this way, your consumer data is never exposed to an external network, while still enabling full use of the LiveRamp solutions. The encoded RampIDs produced by the application cannot be decoded back to the original consumer identifiers.
Local Encoder can be run on any infrastructure that supports running Docker images. The Local Encoder Docker image is currently distributed via the Amazon Elastic Container Registry (ECR).
For more information on Local Encoder, including information on security, use cases, data you can send, and output options, see "LiveRamp Local Encoder".
For information on running Local Encoder using AWS, see the sections below.
Overall Steps
Running the Local Encoder in an AWS environment involves the following overall steps:
You provide LiveRamp with your PGP public key.
LiveRamp provides you with credentials.
You set up the infrastructure and configure the desired parameters.
You deploy a Kubernetes manifest.
You upload your data files to the appropriate input location.
Local Encoder performs the following operations:
The data is normalized and hygiene is performed.
The identifiers in the data are converted into derived RampIDs.
If appropriate, the derived RampIDs for each record are encoded into secure RampID packets or identity envelopes.
The input identifiers are removed and replaced with the appropriate RampID output type (RampIDs, RampID packets, or identity envelopes).
For Activation use cases, the following steps are performed:
The output containing RampID packets is delivered to LiveRamp.
LiveRamp decodes the RampID packets into their individual derived RampIDs.
LiveRamp matches those derived RampIDs to their associated maintained RampIDs.
LiveRamp creates the appropriate fields and segments from the segment data in your LiveRamp platform (such as Connect or Safe Haven).
For Addressability use cases, the following steps are performed:
The output containing RampIDs or identity envelopes is output to the destination of your choice.
You leverage the output to build a mapping table of your customer IDs to RampIDs or identity envelopes.
Prerequisites
Running the Local Encoder in an AWS environment requires the following prerequisites:
A Docker installation
AWS Command Line Interface (CLI)
AWS kubectl CLI
A Terraform installation
A PGP public key
Note
You will use the AWS command line tool to issue commands at your system's command line to perform Amazon ECR and other AWS tasks. We recommend that you have the latest version of the AWS CLI installed. For information about installing the AWS CLI or upgrading it to the latest version, see Installing the AWS Command Line Interface.
Set Up the Infrastructure
If you don’t have an existing cluster, use the sample Terraform configuration file that LiveRamp provided to set up the infrastructure.
Unpack the archive file (terraform.zip).
Modify the file “locals.tf”:
If desired, use the cluster_name parameter to update the input bucket name.
Use the region parameter to define your AWS Region (see Amazon’s AWS region availability for more information).
Use the path and version parameters to provide Docker image details.
If you have specific subnet definitions, use the vpc parameter to provide VPC details (otherwise, you do not need to change the defaults).
Sample “locals.tf” file:
locals {
  cluster_name = "local-encoder-eks-${random_string.this.result}"
  region       = "[customer AWS region]"
  image = {
    path    = "461694764112.dkr.ecr.eu-central-1.amazonaws.com/vault-app"
    version = "latest"
  }
  vpc = {
    cidr            = "10.0.0.0/16"
    private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
    public_subnets  = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
  }
}

resource "random_string" "this" {
  length  = 8
  special = false
}
Use your IAM credentials to authenticate the Terraform AWS provider by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables:
export AWS_ACCESS_KEY_ID="<your_aws_access_key_id>"
export AWS_SECRET_ACCESS_KEY="<your_aws_secret_access_key>"
Note
If you don't have access to IAM user credentials, use another authentication method as described in the AWS provider documentation.
Initialize Terraform and plan the changes by running the following commands:
terraform init
terraform plan
You’ll see a list of resources that will be created:
VPC
Security group
EKS cluster
Storage bucket
Apply the changes by running the following command:
terraform apply
This step can take up to 10 minutes depending on whether a new cluster is being created.
Sample output:
Configuration output:
cluster_endpoint = "https://<AWS access key ID>.gr7.ap-southeast-2.eks.amazonaws.com"
cluster_id = "local-encoder-eks-<random>"
cluster_name = "local-encoder-eks-<random>"
cluster_security_group_id = "<cluster security group ID>"
region = "ap-southeast-2"
Output parameters:
cluster_endpoint: The cluster endpoint, which includes your AWS access key ID.
cluster_id: The cluster ID, which includes the endpoint name you set and a randomly generated string value.
cluster_name: The cluster name, which includes the endpoint name you set and a randomly generated string value.
cluster_security_group_id: The cluster security group ID, which is generated from the “locals.tf” file.
region: The AWS region you set.
Once the output is displayed, connect your new or existing cluster to your kubeconfig by running the following command:
aws eks --region $(terraform output -raw region) update-kubeconfig \
  --name $(terraform output -raw cluster_name)
From this point onward, you can use kubectl to manage the cluster and deploy Kubernetes configurations. For example, run the following command to display the cluster information:
kubectl cluster-info
Deploy a Kubernetes Manifest
Use the sample YAML configuration that LiveRamp provided to deploy the Local Encoder to the target cluster.
Unpack the Kubernetes archive file.
Access the credentials that LiveRamp provided to use them in subsequent steps:
Local Encoder account ID
AWS IAM Access Key ID
AWS IAM Secret Access Key (will need to be decrypted)
Set the configuration file “configmap.yaml” using the provided credentials. See the “Optional Configuration Parameters” section below for more information on additional parameters.
apiVersion: v1
data:
  LR_VAULT_ACCOUNT_TYPE: awsiam
  LR_VAULT_ACCOUNT_ID: '<Local Encoder account ID>'
  LR_VAULT_LR_AWS_ACCESS_KEY_ID: '<AWS IAM Access Key ID>'
  LR_VAULT_INPUT: 's3://aws-au-chp-vaultapp-input'
  LR_VAULT_OUTPUT: 's3lr://com-liveramp-chp-vaultapp-output-prod'
  AWS_ACCESS_KEY: '<YOUR_AWS_ACCESS_KEY>'
  AWS_SECRET_ACCESS_KEY: '<YOUR_AWS_SECRET_ACCESS_KEY>'
  AWS_REGION: '<REGION>'
kind: ConfigMap
metadata:
  name: 'local-encoder-config'
Note
LR_VAULT_OUTPUT is the LiveRamp destination bucket. To receive either RampIDs or identity envelopes, you will need to add an additional line. For more information, see the "Configure the Output Type" section below.
Update the “secret.yaml” file using the provided credentials:
Note
The secret access key will need to be base64 encoded before storing.
apiVersion: v1
data:
  LR_VAULT_LR_AWS_SECRET_ACCESS_KEY: '<AWS IAM Secret Access Key>'
kind: Secret
metadata:
  name: 'local-encoder-secret'
type: Opaque
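The base64 encoding step mentioned in the note can be sketched in Python as shown below. This is a minimal illustration, not part of the LiveRamp tooling: the function name `encode_secret_value` and the placeholder key are hypothetical, and the sketch assumes you have already decrypted the secret access key that LiveRamp provided.

```python
import base64


def encode_secret_value(secret: str) -> str:
    """Return the base64 encoding of a secret for the `data` field of a Kubernetes Secret.

    Surrounding whitespace is stripped first (an assumption of this sketch), because a
    trailing newline picked up from a file or copy/paste would otherwise be encoded too.
    """
    cleaned = secret.strip()
    return base64.b64encode(cleaned.encode("utf-8")).decode("ascii")


# Placeholder value for illustration only; never hard-code a real key.
encoded = encode_secret_value("example-secret-key\n")
print(encoded)
```

Paste the printed value into “secret.yaml” in place of the secret access key placeholder.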
Once the values are updated, run the following command to deploy:
kubectl apply -k .
Complete Configuration Activities
See the sections below for information on completing any desired additional configuration activities. For more information on configuration parameters, see the "Configuration Parameters" section below.
Configure the Output Type
The Local Encoder application offers multiple output formats:
RampID packets: Used for Activation by brands and marketers, these RampID packets can be delivered to LiveRamp, where they can be transformed into RampIDs and used to generate fields and segments in your LiveRamp application. This is the default output type.
RampIDs: Used for addressability by publishers and platforms who want to create a RampID mapping.
Identity envelopes: Used for addressability by publishers and platforms who want to get RampIDs into the bidstream to safely engage with the programmatic advertising ecosystem.
For more information on the output types available, see "Output Options and Examples".
Note
If you plan to receive RampIDs or identity envelopes, contact your LiveRamp representative for approval and account configuration changes.
Identity envelope output is only available with Version 1.6 of Local Encoder.
After you are approved for RampID or identity envelope output, add the appropriate variable listed below to your Kubernetes ConfigMap (the “configmap.yaml” file used when deploying the Kubernetes manifest):
Note
To receive RampID packets, you do not need to make any changes to the configuration.
For RampID output, add the following line to the Kubernetes ConfigMaps:
LR_VAULT_PACKET_TYPE: unencoded
For identity envelope output, add the following line to the Kubernetes ConfigMaps:
LR_VAULT_ENVELOPES_FLOW: 'true'
Note
Formatting errors in the YAML file will prevent the configuration from working properly. We recommend that you run the file through a YAML validator after editing it.
Utilize Encryption
Optional encryption is available with Local Encoder (version 1.5 and greater). This functionality encrypts each row of data before it is sent to LiveRamp for processing.
Note
Adding encryption increases the processing time by approximately 20%, depending on the size of the file and the number of records. LiveRamp recommends limiting file size to 15GB.
To utilize encryption, add the following line to the configuration file, just before the LR_VAULT_OUTPUT parameter:
LR_VAULT_PUBLIC_KEY_ENCRYPTION: 'true'
Format the File
Input files must include identifier fields and (for Activation use cases where you're receiving RampID packets) can also include segment data fields if desired.
Before uploading a file to the input location, make sure to format the data according to these guidelines:
Include a header row in the first line of every file consistent with the contents of the file. Files cannot be processed without headers.
If you want to maintain the ability to sort the output file, or if you're utilizing one of the deconfliction options, you must include a column containing row IDs (“RID”) as the first column of the file.
Note
The row identifier column is only required to maintain sort order and should not contain any customer personally-identifiable data.
Make sure that the only identifiers included are the allowed identifier touchpoints listed below.
If you’re sending data for consumers in multiple countries or if you’re including phone numbers, you must include the appropriate country code column (depending on the method used) to identify the country of each record. For more information, see the “Optional Configuration Parameters” section below.
Include a maximum of 500 segment data fields in a single file (for Activation use cases where you're receiving RampID packets).
Segment data field types can be in the form of a string, numeral, enum, etc.
The application supports three file formats: CSV, PSV, and TSV.
Make sure to name your files in a format that includes info on the file part, such as "filename-part-0001-of-0200".
Note
You must make file names unique, as the application will not process a file that has the same name as a previous file (unless you restart the application to clear the memory).
Files must be rectangular (have the same number of columns for every row).
If any values contain the file’s delimiter character (for example, a comma in a .csv file), make sure that your values are contained within quotes.
The recommended maximum file size is 20GB.
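Some of the guidelines above can be checked before upload. The following Python sketch is one possible pre-flight check, not a LiveRamp utility: the function name `check_delimited_text` is hypothetical, and it only verifies that a first (header) row exists with non-empty column names and that every row has the same number of columns. It cannot tell whether the first row is truly a header, and it does not validate identifier values.

```python
import csv
import io
from typing import List


def check_delimited_text(text: str, delimiter: str = ",") -> List[str]:
    """Return a list of formatting problems found in delimited text (empty if none)."""
    # csv.reader honors quoted values, so a delimiter inside quotes is not a new column.
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["file is empty: a header row is required"]

    problems = []
    header = rows[0]
    if any(not name.strip() for name in header):
        problems.append("header row contains an empty column name")

    # Rows are numbered from 1, so the first data row is row 2.
    for row_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_number}: {len(row)} columns, expected {len(header)}"
            )
    return problems


good = 'RID,EMAIL1,NOTE\n1,jane@example.com,"a, b"\n'
bad = "RID,EMAIL1\n1,jane@example.com,extra\n"
print(check_delimited_text(good))
print(check_delimited_text(bad))
```

For PSV or TSV files, pass `delimiter="|"` or `delimiter="\t"`.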
Allowed Identifier Touchpoints
You can include any of the following allowed identifier touchpoints for translation to RampIDs in both Activation and Addressability use cases:
Plaintext email address (maximum of three per record)
SHA-256 hashed email address (maximum of three per record)
Plaintext mobile phone number (maximum of two per record)
SHA-256 hashed mobile phone number (maximum of two per record)
Plaintext landline phone number (maximum of one per record)
SHA-256 hashed landline phone number (maximum of one per record)
Additional Allowed Touchpoints for Activation Use Cases
For Activation use cases (where you're receiving RampID packets), the following additional identifier touchpoints are also allowed for translation to RampIDs:
Name and postcode, which consists of first name, last name, and postcode (maximum of one per record) (European countries only)
AAID (maximum of one per record)
IDFA (maximum of one per record)
IMEI (maximum of one per record)
Example Header
The header below is an example of what the header might look like when sending data in a pipe-separated file (PSV) for an Activation use case, where segment data fields are included:
RID|EMAIL1|EMAIL2|EMAIL3|SHA256EMAIL1|SHA256EMAIL2|SHA256EMAIL3|MOBILE1|MOBILE2|SHA256MOBILE1|SHA256MOBILE2|LANDLINE1|SHA256LANDLINE1|FIRSTNAME|LASTNAME|POSTCODE|AAID|IDFA|IMEI|ATTRIBUTE_1|...|ATTRIBUTE_N
Replace ATTRIBUTE_1 … N in the example header with the names of your CRM attributes.
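As an illustration of the header and quoting guidelines, the sketch below writes a small pipe-separated file with Python's standard `csv` module. The RID and EMAIL1 column names come from the example header; the LOYALTY_TIER segment field, the sample rows, and the `write_psv` helper are hypothetical and exist only for this example.

```python
import csv
import io

# RID and EMAIL1 come from the example header; LOYALTY_TIER is a hypothetical segment field.
HEADER = ["RID", "EMAIL1", "LOYALTY_TIER"]


def write_psv(rows, stream):
    """Write the header row followed by data rows as pipe-separated values."""
    # QUOTE_MINIMAL wraps a value in quotes only when it contains the delimiter,
    # a quote character, or a line break.
    writer = csv.writer(
        stream, delimiter="|", quoting=csv.QUOTE_MINIMAL, lineterminator="\n"
    )
    writer.writerow(HEADER)
    for row in rows:
        if len(row) != len(HEADER):
            raise ValueError(f"expected {len(HEADER)} columns, got {len(row)}")
        writer.writerow(row)


buffer = io.StringIO()
write_psv(
    [
        ["1", "jane@example.com", "gold"],
        ["2", "sam@example.com", "silver|trial"],  # contains the delimiter
    ],
    buffer,
)
print(buffer.getvalue())
```

The second data row is written as `2|sam@example.com|"silver|trial"`, so the pipe inside the value is not read as a column break.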
Example Output Files
For more information on the output options and the format of the output files, see "Output Options and Examples".
Upload the File to the Input Bucket
Uploading a file to your Local Encoder service's input bucket kicks off the encoding operation. To upload your file, run a command similar to the example below (this example uses an AWS S3 bucket, but the input location could also be a local directory):
aws s3 cp [your_file].csv s3://com-liveramp-vault-[your-vpc-id]-input
Once the file has been processed, you’ll get a confirmation message that includes the number of records processed.
For RampID packet output, all consumer identifier data in a row is transformed into derived RampIDs, packaged into one data structure and encrypted again, yielding a RampID packet.
For RampID output, all consumer identifier data in a row is transformed into derived RampIDs in the form of a JSON string in a “RampID” column.
For identity envelope output, all consumer identifier data is transformed into derived RampIDs. A selection logic is applied, then the RampID is additionally obfuscated and encrypted into an identity envelope. Only one identity envelope is returned per row of data. A timestamp column is appended to the end of each row. This column gives the expiration date and time for the identity envelope in Unix format (timezone UTC).
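The expiration timestamp in identity envelope output can be turned into a readable UTC date with a few lines of Python. This sketch assumes the value is expressed in seconds since the Unix epoch (not milliseconds); confirm this against your own output files. The function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def envelope_expiry(unix_seconds: str) -> datetime:
    """Convert a Unix timestamp (assumed to be in seconds) to a UTC datetime."""
    return datetime.fromtimestamp(int(unix_seconds), tz=timezone.utc)


def is_expired(unix_seconds: str, now: Optional[datetime] = None) -> bool:
    """Return True if the expiration time is at or before `now` (defaults to the current time)."""
    current = now if now is not None else datetime.now(timezone.utc)
    return envelope_expiry(unix_seconds) <= current


# Illustrative timestamp value, not taken from real output.
expiry = envelope_expiry("1700000000")
print(expiry.isoformat())
```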
For more information on output options and the format of the output files, see "Output Options and Examples".
Configuration Parameters
See the sections below for information on the required and optional parameters to use, depending on the deployment method being used.
Required Configuration Parameters
Parameter Name | Parameter for Local Configuration | Parameter for Command Line or Configmaps.yaml | Example Value(s) | Notes |
---|---|---|---|---|
AWS user ID | account_id | LR_VAULT_ACCOUNT_ID | AID.…. | Provided by LiveRamp |
Account type | account_type | LR_VAULT_ACCOUNT_TYPE | awsiam | |
LiveRamp AWS account ID | | AWS_LR_ACCOUNT_ID | 461694764112 | Provided by LiveRamp |
AWS IAM access key ID | lr_access_key_id | LR_VAULT_LR_AWS_ACCESS_KEY_ID | AKI...... | Provided by LiveRamp |
AWS IAM secret access key | lr_secret_access_key | LR_VAULT_LR_AWS_SECRET_ACCESS_KEY | | LiveRamp provides the secret encrypted with customer key |
Input File Location | input | LR_VAULT_INPUT | | |
Output File Location | output | LR_VAULT_OUTPUT | | |
AWS region for LR resources | lr_region | LR_VAULT_LR_AWS_REGION | eu-central-1 | LiveRamp’s AWS Region |
Origin of the data being processed | locale | LR_VAULT_LOCALE | us | Two letter country code representing the origin of the data being processed (for example, Australia = “au”, Great Britain = “GB”). Not case sensitive. |
Optional Configuration Parameters
Parameter Name | Parameter for Local Configuration | Parameter for Command Line or Configmaps.yaml | Example Value(s) | Notes |
---|---|---|---|---|
Customer Profile | profile | LR_VAULT_PROFILE | | Default is "prod" |
Filename | filename_pattern | LR_VAULT_FILENAME_PATTERN | | The regex to use to determine which files in the input folder or bucket should be processed (for example, entering “^test.*” would include a file named “test.csv”). The app will process files from the folder/bucket with filenames that match the regex. |
Country Header | country_code_column | LR_VAULT_COUNTRY_CODE_COLUMN | | |
Public Key Encryption | public_key_encryption | LR_VAULT_PUBLIC_KEY_ENCRYPTION | true | Include this parameter and set to “true” to encrypt each row of data in the files before they’re sent to LiveRamp for processing. |
Header Mapping | header_mapping | LR_VAULT_HEADER_MAPPING | {{customer determined}} newvalue=defaultvalue, | A list of key=value pairs which can be used to replace the default headers for the identifier columns in the file. For example, if the email columns have the headers “primary_email” and “alt_email”, then the header mapping should be set to “primary_email=email1,alt_email=email2”. |
Error Log | customer_logging_enabled | LR_VAULT_CUSTOMER_LOGGING_ENABLED | | Include this parameter and set to “true” to enable the generation of an error log file showing any processing errors (such as illegal characters or incorrect header values) and the row number for where the error occurred. |
Error Log Location | -v <localFolder>:/var/logs/vault-app | -v <localFolder>:/var/logs/vault-app | | Include this parameter to set the delivery location for the error log file. |
Mode | mode | LR_VAULT_MODE | | The default value is "default" for long-running file processing. Set to "task" to enable single file processing (the application will shut down after processing a single file; not available for Kubernetes setup). |
Packet Type | packet_type | LR_VAULT_PACKET_TYPE | unencoded | Include this parameter and set to "unencoded" to receive RampIDs rather than RampID packets. Include this parameter only when you are receiving RampIDs. |
Envelope Output | envelopes_flow | LR_VAULT_ENVELOPES_FLOW | true | Include this parameter and set to "true" to have the output RampIDs packaged into identity envelopes. Include this parameter only when you are receiving identity envelopes. |
Test Mode | dry_run | LR_VAULT_DRY_RUN | | The default value is “false”. Set to “true” to run the app in dry run mode. Only outputs encrypted packets in dry run mode. |
Customer AWS Access Key | N/A | AWS_ACCESS_KEY | | Only for customers using S3 bucket as input source. Access key for client’s AWS. |
Customer AWS Secret Access Key | N/A | AWS_SECRET_ACCESS_KEY | | Only for customers using S3 bucket as input source. Secret access key for client’s AWS. |
Customer AWS Region | N/A | AWS_REGION | | Only for customers using S3 bucket as input source. AWS region in which the bucket is residing. |
Customer AWS Region | N/A | AWS_DEFAULT_REGION | | Only for customers using S3 bucket as input source. AWS region in which the bucket is residing. |
Customer GCS Bucket Credentials | N/A | GOOGLE_APPLICATION_CREDENTIALS | | Only for customers using GCS bucket as input/output source. Path to your Google Credentials JSON file. |
Customer GCS Project | gcp_project_name | LR_VAULT_GCP_PROJECT_NAME | | Only for customers using GCS bucket as input/output source. The name of your GCP project. Added if the default profile name can't be found. |
Java Tool Options | N/A | JAVA_TOOL_OPTIONS | --env JAVA_TOOL_OPTIONS="-XX:+DisableAttachMechanism -Dcom.sun.management.jmxremote -XX:ActiveProcessorCount=3" | |
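To preview how values for the Filename and Header Mapping parameters might behave before deploying, you can experiment with a short Python sketch like the one below. It is illustrative only: both function names are hypothetical, Python's `re` module may differ from the regex engine the application uses, and whether the application matches the whole filename or only part of it is an assumption to verify. The sample values come from the Notes column of the table.

```python
import re
from typing import Dict, Iterable, List


def matching_files(pattern: str, filenames: Iterable[str]) -> List[str]:
    """Return the filenames in which the regex pattern finds a match."""
    regex = re.compile(pattern)
    return [name for name in filenames if regex.search(name)]


def parse_header_mapping(mapping: str) -> Dict[str, str]:
    """Parse "custom=default,custom2=default2" into a dict of custom header -> default header."""
    pairs = {}
    for item in mapping.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        custom, separator, default = item.partition("=")
        if not separator or not custom.strip() or not default.strip():
            raise ValueError(f"invalid header mapping entry: {item!r}")
        pairs[custom.strip()] = default.strip()
    return pairs


print(matching_files("^test.*", ["test.csv", "prod.csv", "my_test.csv"]))
print(parse_header_mapping("primary_email=email1,alt_email=email2"))
```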