Formatting File Data
Once you’ve determined what data you’ll be including, review the information in the sections below (including the sections pertaining to your file type) to make sure that the data are formatted in a way that LiveRamp can accept.
Make sure to review the following guidelines for the LiveRamp workflow that you're using (in addition to our data guidelines in "Types of Data That Can Be Included in Files" and "LiveRamp Data Restrictions"):
The Activation workflow, which involves creating fields and segments from your data so that you can distribute those fields and segments to your desired destination platforms for activation. See "Activation Workflow Overview" for more information.
The Measurement Enablement workflow, which involves replacing the input identifiers (such as PII, cookies, mobile device IDs, CIDs, etc.) in the files you upload with their associated RampIDs and returning the files to the location you specify. See "Measurement Enablement Workflow" for more information.
Once you’ve made sure that all of the data are formatted correctly, finalize the file for uploading.
Note
See "Uploading Data" for an overview of the file creation and formatting process.
For both column-based files and key-value files, follow the guidelines listed below (in addition to the appropriate guidelines for formatting column-based files and for formatting key-value files).
Use Quotation Marks When the Data Contains a Delimiter
Data containing punctuation characters are at risk of delimiter collision, and thus data bleed, when any delimiter (such as a comma, pipe, or semicolon) also appears as part of the data values. This can cause LiveRamp to interpret data in a particular row as belonging to the wrong field.
To avoid this, enclose each value in a column-based file, and each key and value in a key-value file, in quotation marks ("), following the guidelines listed below:
Equals signs (in key-value files) and delimiters (such as commas or pipes) should be outside the quotation marks.
Make sure that all quotation marks are closed so that there is an even number of quotation mark characters per data row.
If using quotation marks, the best practice (though not a requirement) is to enclose all values in quotation marks rather than only those at risk of delimiter collision.
Note
It is not necessary to enclose empty or null fields with quotation marks.
If a value contains a quotation mark character, that value should be enclosed in quotation marks and each quotation mark character in the value should be properly escaped by putting a quotation mark character right before it, as shown in the examples below:
LCD TV,50"
becomes "LCD TV,50"""
"early-bird" special
becomes """early-bird"" special"
5'8"
becomes "5'8"""
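The quoting and escaping rules above can be sketched with Python's standard csv module. This is a minimal illustration, not a LiveRamp tool: the helper name quote_row is hypothetical, and the sketch assumes a column-based file in which every value is quoted.

```python
import csv
import io


def quote_row(values, delimiter=","):
    """Return one delimited line with every value enclosed in quotation marks."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_ALL,  # enclose every value, not only the risky ones
        doublequote=True,       # escape a " inside a value by doubling it
        lineterminator="\n",
    )
    writer.writerow(values)
    return buffer.getvalue()[:-1]  # drop the trailing line terminator


print(quote_row(['LCD TV,50"']))            # "LCD TV,50"""
print(quote_row(['"early-bird" special']))  # """early-bird"" special"
print(quote_row(["5'8\"", "blue"]))         # "5'8""","blue"
```

The delimiter stays outside the quotation marks, and each row ends up with an even number of quotation mark characters.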
Put All Segment Data Related to a Given Identifier on a Single Row
For example, if you have three different segments, put them all on one row, either as three columns or three key-value pairs, rather than listing the same identifier on three rows with one segment per row. The latter approach will make file processing take significantly longer.
Caution
Multi-valued data are an exception. If you have a column where there are multiple values for the same header, each column value should be on a separate row, along with the identifier. For more information, see "Multi-Value Fields".
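If your source data has one segment per row, the consolidation described above might look like the sketch below. The function name, the "id" key, and the sample field names are hypothetical, and the sketch assumes single-value fields only (a later value for the same field overwrites an earlier one).

```python
def consolidate_rows(rows):
    """Merge (identifier, field, value) triples into one record per identifier."""
    merged = {}
    for identifier, field, value in rows:
        # Create the record on first sight of the identifier, then add the field.
        merged.setdefault(identifier, {"id": identifier})[field] = value
    return list(merged.values())


one_segment_per_row = [
    ("123", "Gender", "Female"),
    ("123", "Age_Range", "25-34"),
    ("456", "Gender", "Male"),
]
for record in consolidate_rows(one_segment_per_row):
    print(record)
```

Multi-value fields should not be consolidated this way, because each of their values belongs on its own row.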
Do Not Use Placeholders for Empty Values
If a given field entry for a particular row of data has no value, leave it blank. Do not use a placeholder such as "NULL" or "N/A."
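A minimal sketch of blanking out placeholders before writing a row is shown below. The placeholder set is an assumption based on the two examples above; extend it to match whatever your export actually produces.

```python
# Assumed placeholder spellings; compared case-insensitively.
PLACEHOLDERS = {"null", "n/a"}


def blank_placeholders(row):
    """Replace placeholder strings with empty values, leaving other values as-is."""
    return ["" if value.strip().lower() in PLACEHOLDERS else value for value in row]


print(blank_placeholders(["123", "NULL", "Dog", "N/A"]))
# ['123', '', 'Dog', '']
```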
Format Multi-Value Fields Correctly
When sending in files with multi-value fields, make sure that those field values are formatted correctly by sending separate rows for each value a consumer has in a particular field.
Note
When sending files with multi-value fields to a particular audience, make sure to include that information when creating a support case before uploading the first file so those fields get ingested correctly.
For example, to make a consumer with a customer ID of "123" a member of the segments "Pets_Owned=Dog", "Pets_Owned=Cat", and "Pets_Owned=Hamster", format a column-based file like the example below:
Customer_ID | Pets_Owned |
---|---|
123 | Dog |
123 | Cat |
123 | Hamster |
For a key-value file, format the file like the example below:
123,Pets_Owned=Dog
123,Pets_Owned=Cat
123,Pets_Owned=Hamster
For more information, see "Multi-Value Fields".
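Generating one row per value can be sketched as follows. Both helper names are hypothetical, and the comma delimiter in the key-value helper simply mirrors the example above.

```python
def multi_value_rows(customer_id, values):
    """Column-based layout: one [identifier, value] row per value."""
    return [[customer_id, value] for value in values]


def multi_value_key_value_lines(customer_id, key, values):
    """Key-value layout: one 'identifier,key=value' line per value."""
    return [f"{customer_id},{key}={value}" for value in values]


pets = ["Dog", "Cat", "Hamster"]
print(multi_value_rows("123", pets))
print("\n".join(multi_value_key_value_lines("123", "Pets_Owned", pets)))
```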
Use UTF-8 Encoding
UTF-8 is widely adopted and ensures maximum compatibility across different systems. ASCII encoding is also accepted because ASCII is a subset of UTF-8. When saving your file, make sure to save it with UTF-8 encoding.
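One way to check a file's bytes before uploading, and to re-encode them if needed, is sketched below. The function names are hypothetical, and to_utf8 assumes you already know the file's current encoding (it cannot be reliably guessed from the bytes alone).

```python
def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (plain ASCII also passes)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


latin1_bytes = "café".encode("latin-1")
print(is_utf8(latin1_bytes))                      # False
print(is_utf8(to_utf8(latin1_bytes, "latin-1")))  # True
```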
Formatting Dates
You can include segment data fields that contain calendar date values (such as date of most recent purchase) in the files you upload.
If you want to use a date field to create derived segments, use one of our preferred date formats: YYYYMM or YYYYMMDD. Do not include forward slashes (/) or hyphens (-).
Note
Privacy issues: Keep in mind that LiveRamp will not distribute any data that could easily identify a group of fewer than 25 individuals (or fewer than two individuals for measurement files). Using the YYYYMM format can help avoid this situation.
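Converting an existing date value to one of the preferred formats might look like the sketch below. The function name and the default input format are assumptions; pass whatever format your source data uses.

```python
from datetime import datetime


def format_date(text, input_format="%m/%d/%Y", month_only=False):
    """Convert a date string to YYYYMMDD, or to YYYYMM when month_only is True."""
    parsed = datetime.strptime(text, input_format)
    return parsed.strftime("%Y%m" if month_only else "%Y%m%d")


print(format_date("03/07/2024"))                               # 20240307
print(format_date("2024-03-07", "%Y-%m-%d", month_only=True))  # 202403
```

The month-only form gives coarser values, which is the property the privacy note above relies on.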
Formatting Segment List Data
Segment list files are not commonly used. LiveRamp accepts non-rectangular segment list-based data, usually as files with a single identifier field (typically a device-based identifier, such as a cookie ID, or PII-based data tied only to a single email address or phone number).
The identifier should be the first field of each row:
<identifier1>,seg1,seg2
<identifier2>,seg2
Separate the identifier and each segment with one of the allowed delimiters.
Segments should be unique per row. That is, do not include the same segment multiple times on a single row.
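A segment list row that follows these rules could be built as in the sketch below. The function name and the sample identifier are hypothetical.

```python
def segment_list_row(identifier, segments, delimiter=","):
    """Build one row: identifier first, then each segment at most once."""
    unique_segments = list(dict.fromkeys(segments))  # drop repeats, keep order
    return delimiter.join([identifier, *unique_segments])


print(segment_list_row("abc123", ["seg1", "seg2", "seg1"]))  # abc123,seg1,seg2
```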
LiveRamp has developed recommended file limits to minimize delays and ensure maximum performance. When preparing your files, make sure to take into account the recommended file limits listed below.
Caution
Depending on your contract, you might be required to adhere to these limits.
Item | Limit |
---|---|
Rows | Minimum of 25 rows per file. Maximum of 500,000,000 rows per file (100,000,000 for UK data files and 30,000,000 for French data files). |
Segment data fields (segment categories, such as "Gender") | Maximum of 400 per file. Can contain letters, numbers, and underscores. Note: This does not include identifier data fields. |
Distinct field/value pairs for segment data (such as "Gender=Female") | Maximum of 5,000 total per file |
Character limits for field labels (headers or keys) and field values | |
Character limits for each line or row | Maximum of 10,000 characters per line or row (1,000 for EU data files) |
File size (uncompressed) | Maximum 50 GB |
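A pre-upload check against some of these limits can be sketched as follows. The function name and message wording are hypothetical. The defaults use the general limits rather than the UK, French, or EU ones, and the sketch assumes that the row count excludes the header row, which the table above does not state.

```python
def check_limits(header, rows, identifier_fields, *,
                 min_rows=25, max_rows=500_000_000,
                 max_segment_fields=400, max_pairs=5_000):
    """Return one message for each recommended limit that the file breaks."""
    problems = []

    # Identifier fields do not count toward the segment data field limit.
    segment_fields = [name for name in header if name not in identifier_fields]
    if len(segment_fields) > max_segment_fields:
        problems.append(
            f"file has {len(segment_fields)} segment data fields; "
            f"maximum is {max_segment_fields}"
        )

    row_count = 0
    pairs = set()
    for row in rows:
        row_count += 1
        for name, value in zip(header, row):
            if name not in identifier_fields and value != "":
                pairs.add((name, value))  # for example ("Gender", "Female")

    if row_count < min_rows:
        problems.append(f"file has {row_count} rows; minimum is {min_rows}")
    if row_count > max_rows:
        problems.append(f"file has {row_count} rows; maximum is {max_rows}")
    if len(pairs) > max_pairs:
        problems.append(
            f"file has {len(pairs)} distinct field/value pairs; "
            f"maximum is {max_pairs}"
        )
    return problems


print(check_limits(["Customer_ID", "Gender"],
                   [["1", "Female"], ["2", "Male"]],
                   {"Customer_ID"}))
```

The per-line character limit and the file size limit are not covered here because they depend on the raw file rather than on parsed rows.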
The file types listed below can be accepted, as long as they use one of the allowed file delimiters and meet all other file formatting requirements:
Comma-separated values files (.csv)
Tab-separated values files (.tsv)
Pipe (|)-separated values files (.psv)
Semicolon-separated values files (.scsv)
Text files (.txt)
When creating files for upload, remember that files must be provided in a flat, text-based format. As a general rule of thumb, if the file contents cannot be previewed within a command-line terminal or simple text editor (like WordPad or TextEdit), we cannot accept it.
Files can be column-based with delimiters or in key-value format (except for EU data files, which can only be sent as column-based files). Files can also be compressed, which will result in a different file extension.
Caution
No Excel or Word files: Non-text-based file types, such as Microsoft Excel (.xls or .xlsx) or Word (.doc or .docx) files, cannot be accepted.
For Mac Excel files: If exporting data from Microsoft Excel for Mac, choose the "Windows Comma Separated (.csv)" option (do not use the MS-DOS or Macintosh CSV versions, or any non-UTF-8 option). Make sure that extra columns haven't been added during the export process before uploading.
EU data cannot be sent in TXT files.
Use one of the delimiters listed below to separate values when creating a file for uploading:
Commas (comma-separated values files, or .csv files)
Tabs (tab-separated values files, or .tsv files)
Pipes (|) (pipe-separated values files, or .psv files)
Semicolons (semicolon-separated values files, or .scsv files)
Make sure to follow all other guidelines for allowed file types and for formatting file data.
Note
Enclose values in quotation marks when the data contains a delimiter: Data containing punctuation characters are at risk of delimiter collision, and thus data bleed, when any delimiter (such as a comma, pipe, or semicolon) also appears as part of the data values. This can cause LiveRamp to interpret data in a particular row as belonging to the wrong field. To prevent this, enclose values in quotation marks ("). See "Formatting Guidelines for All Files" for more information.
Avoiding File Ingestion Failure
There are many possible considerations to keep in mind when creating files for upload, but the cause of file ingestion failure or delay often falls into one of the categories listed below.
To be able to automatically ingest files that you upload for processing (which speeds up and streamlines the ingestion process for your files), we create an automation configuration for that particular audience. Often we will use a file from another of your audiences, or the first file you upload to your new audience, as a "seed" or "template" to create this configuration.
Once this configuration has been enabled, it's very important that subsequent files that are uploaded for that audience stay consistent with the "seed" or "template" configuration. If you upload a file that is not consistent (such as a file with different headers or a different audience key), the file will be paused in the ingestion process until the file is manually mapped to the audience.
For more information on the automation configuration, see "The Ingestion Automation Process for File Uploads".
Note
If you need to upload a file that does not match the initial automation configuration, create a support case so we can make sure the file is ingested correctly.
To avoid ingestion issues, be sure to keep the same file formatting as the original seed file, especially including the following:
The same identifier fields (including having the same audience key field(s)) as the original file
The same field names (headers in a column-based file and keys in a key-value file) for identifier fields and for any previously-included segment data fields as the original file
Note
If you upload a file with new segment data fields that weren't in the original file, those fields should upload without an issue. However, you might want to monitor the file upload on the Files page in LiveRamp Connect to make sure there are no problems. If the processing pauses or fails, use the appropriate Troubleshooting File Upload Issue quick case to have the Support team investigate the issue.
For examples of properly formatted files (including downloadable files that can be used as templates), as well as files containing common errors, see "File Formatting Examples."
Header Issues
Header issues are the most common cause of file upload failure. The most common header issues are:
A header for an identifier column (including an identifier column that is being used as the audience key) that was in the original file uploaded for that audience is missing, or is named differently (including being capitalized differently).
Caution
When including segment data columns that were in previously-uploaded files, make sure that the headers for those columns also stay consistent. If a header for a segment data column changes, a new field will be created instead of having the data from that column update the existing field.
If you are using the "full refresh" option to update the audience, keep in mind that any previously-uploaded segment data columns need to be included in the file if you want to keep that data in the associated fields. Otherwise, if a previously-uploaded segment data column is not included, the associated field will have all the members removed.
The file contains two or more headers that are exactly the same (sometimes extra headers and columns are added when saving from Excel).
Make sure that:
Every column-based file you upload has a header row.
The headers are unique, and the headers for identifier columns (including identifier columns that are being used as the audience key) and for any previously-included segment data columns match the headers in the original file uploaded for that audience.
See "Formatting Column-Based Files" for more information.
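The header checks described above can be sketched as a comparison against the headers of the original file. The function name and message wording are hypothetical. Here original_header stands for the identifier headers plus any previously-included segment data headers, and the comparison is case-sensitive because a differently capitalized header counts as a different header.

```python
def header_problems(header, original_header):
    """List duplicate, missing, and differently capitalized headers."""
    problems = []
    seen = set()
    for name in header:
        if name in seen:
            problems.append(f"duplicate header: {name}")
        seen.add(name)

    for name in original_header:
        if name in seen:
            continue
        # Look for a header that differs from the expected one only by case.
        lookalikes = [h for h in header if h.lower() == name.lower()]
        if lookalikes:
            problems.append(
                f"header capitalized differently: {lookalikes[0]} (expected {name})"
            )
        else:
            problems.append(f"missing header: {name}")
    return problems


print(header_problems(["email", "Gender", "Gender"], ["Email", "Gender"]))
```

New segment data headers that were not in the original file are not reported, since those fields should upload without an issue.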
Audience Key Issues
Keep in mind that every file uploaded to a particular audience in an Activation workflow must use the same audience key as the original file uploaded for that audience so that we can effectively deduplicate and consolidate all rows within that audience.
The most common audience key issues are:
The audience key is different from the one used for the first file uploaded for that audience.
An identifier field that made up part of the original audience key is not included in the file, such as a missing “city” field when the audience key is name and postal address.
The header for an audience key column is different from the header used in the original file.
The audience key has a low fill rate. For best results, and to avoid having ingestion processing pause or fail, make sure the audience key has a fill rate of as close to 100% as possible.
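The fill rate of an audience key column can be estimated before uploading, as in the sketch below. The function name is hypothetical, and the sketch assumes the audience key is a single column. It treats whitespace-only values as empty.

```python
def fill_rate(rows, key_index):
    """Return the fraction of rows with a non-empty value in the key column."""
    total = 0
    filled = 0
    for row in rows:
        total += 1
        if key_index < len(row) and row[key_index].strip():
            filled += 1
    return filled / total if total else 0.0


rows = [["a@example.com"], [""], ["b@example.com"], ["c@example.com"]]
print(fill_rate(rows, 0))  # 0.75
```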
Identifier Issues
An identifier is formatted incorrectly, such as including both the first and last name in the same field instead of having separate “first name” and “last name” fields.
See "Formatting Identifiers" for more information.
Delimiter Issues
There is an extra delimiter (such as a comma) in a row, making it appear as though the row has more fields than it should. When this happens, the values in that row cannot be matched to the correct fields during parsing. Watch out for this, especially when creating CSV files from Excel files.
The data contains the delimiter but is not enclosed in quotation marks. Data containing punctuation characters are at risk of delimiter collision and thus data bleed, where the specified delimiter (such as a comma) also appears as part of the data values. This can cause LiveRamp to interpret data in a particular row as belonging to the wrong field. See "Formatting Guidelines for All Files" for more information.
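Both delimiter issues show up as rows whose field count differs from the header's. A minimal check using Python's standard csv module is sketched below; the function name is hypothetical, and the sketch assumes a column-based file with a header row.

```python
import csv
import io


def rows_with_wrong_field_count(text, delimiter=","):
    """Return the line numbers of rows whose field count differs from the header's."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to check
    bad_lines = []
    for row in reader:
        if len(row) != len(header):
            bad_lines.append(reader.line_num)
    return bad_lines


sample = 'id,city\n1,"Portland, OR"\n2,Portland, OR\n'
print(rows_with_wrong_field_count(sample))  # [3]
```

In the sample, line 2 parses correctly because the value is quoted, while the unquoted comma on line 3 produces an extra field.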
Raw Field Issues
A "raw field" is a field in a given file that has over 250 distinct values. Most files uploaded to LiveRamp should not contain raw fields unless you have previously coordinated this with your LiveRamp account manager.
The values in raw fields cannot be separated out and managed individually in Connect. Many platforms also cannot accept raw fields.
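Fields that would count as raw fields can be found by counting distinct values per column, as sketched below. The function name is hypothetical. Identifier columns are skipped on the assumption that the 250-value threshold is meant for segment data fields, and empty values are not counted as a distinct value.

```python
def find_raw_fields(header, rows, identifier_fields=(), threshold=250):
    """Return the names of fields with more than `threshold` distinct values."""
    distinct = {name: set() for name in header if name not in identifier_fields}
    for row in rows:
        for name, value in zip(header, row):
            if name in distinct and value != "":
                distinct[name].add(value)
    return [name for name, values in distinct.items() if len(values) > threshold]


header = ["Customer_ID", "Postal_Code", "Gender"]
rows = [[str(i), f"{10000 + i}", "Female"] for i in range(300)]
print(find_raw_fields(header, rows, identifier_fields={"Customer_ID"}))
# ['Postal_Code']
```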
New Identifier Field Issues
After the upload of the first file in an audience, subsequent files need to use the same file formatting. All originally-included fields should be included with the exact same headers. If you include an additional identifier field (such as a new email address column in a column-based file) without having LiveRamp first adjust the mapping to include that new field, the file will not process until that mapping has been adjusted.
To avoid delays or processing failures, if you need to add a new identifier field to a subsequent file, first create a support case so that we can adjust the mapping before you upload that file.
Multi-Value Field Issues
The default format for fields in a file for ingestion (columns in a column-based file) is for each field to be a "single-value" field. A single-value field is a field where each consumer record can only be a member of one segment in that field. For example, a file might have a column for the field "Favorite_Pet", where each consumer record can only have one value associated with them (such as "Favorite_Pet=Dog" or "Favorite_Pet=Cat", but not both).
But for some fields you might want to allow each consumer record to have multiple values for that field. For example, you might want to create a field in Connect called "Pets_Owned" where each consumer record might be a member of multiple segments, such as "Pets_Owned=Dog", "Pets_Owned=Cat", and "Pets_Owned=Hamster". This type of field is called a "multi-value field".
For consumers that have multiple values for a particular multi-value field, make sure to create separate rows for each value. For more information on formatting multi-value fields, see "Formatting Guidelines for All Files".
Before you upload the first file to a new audience where the file will contain one or more multi-value fields, use the Set Up Audience for First File Upload (Activation) quick case to create a support case to make sure the audience is configured correctly to process those multi-value fields.
Once the ingestion automation process has been set up for the initial file, make sure to create another support case if you change any of the single-value fields to multi-value fields so that the file ingests properly.
For more information on multi-value fields, see "Multi-Value Fields".