Data Connection FAQs
See the FAQs below for common data connection questions.
Why does partitioning matter?
Partitioning optimizes the dataset before you reach the query stage. It improves query performance because data processing during question runs occurs only on the relevant filtered data. For more information, see “Data Connection Partitioning”.
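As a rough illustration of why this helps, the sketch below groups a handful of rows by a partition column and then answers a date-filtered query by reading only the matching partition. It is a simplified in-memory model, not how LiveRamp Clean Room processes data; the function names and sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date


def partition_by(rows, column):
    """Group rows into partitions keyed by the value of `column`."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[column]].append(row)
    return dict(partitions)


def read_partitions(partitions, wanted_keys):
    """Read only the partitions whose key was requested by the filter."""
    scanned = []
    for key in wanted_keys:
        scanned.extend(partitions.get(key, []))
    return scanned


# Hypothetical sample data, for illustration only.
rows = [
    {"event_date": date(2024, 1, 1), "brand": "A"},
    {"event_date": date(2024, 1, 1), "brand": "B"},
    {"event_date": date(2024, 1, 2), "brand": "A"},
    {"event_date": date(2024, 1, 3), "brand": "C"},
]

partitions = partition_by(rows, "event_date")

# A query filtered to one date touches one partition, not the full dataset.
matches = read_partitions(partitions, [date(2024, 1, 2)])
scanned_count = len(matches)
print(f"scanned {scanned_count} of {len(rows)} rows")
```

Without partitions, the same filter would have to inspect every row to find the one that matches.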
What are the best practices for partitioning?
Data partitioning (dividing a large dataset into smaller, more manageable subsets) is recommended because it improves query performance and shortens processing times. When you indicate partition columns for your data connections, data processing during question runs occurs only on the relevant filtered data, which reduces query cost and execution time. Best practices include:
Partition at the source: When configuring your data connection to LiveRamp Clean Room, define partition columns.
Consider the collaboration context: Make sure that the partition columns make sense for the types of questions that a dataset is likely to be used for. For example:
If you anticipate questions that analyze data over time, partition the dataset by a date field (e.g., event_date or impression_date). This allows queries that filter by date ranges to scan only relevant partitions, reducing processing time and costs.
If the main use case is to analyze data by brand or product, partition the dataset by a brand or product_id column. This ensures that queries filtering by brand or product access only the necessary subset of the data.
Verify column data types: Partitioning supports date, string, integer, and timestamp field types. Complex types (such as arrays, maps, or structs) are not allowed.
Cloud-specific formatting: For cloud storage sources like S3, GCS, and Azure, structure your buckets and file paths in a partitioning format based on the partition column. For BigQuery and Snowflake, make sure columns are indicated as partition keys in your source tables.
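The type check and path layout described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the Hive-style column=value folder layout is an assumption (the FAQ only says paths should follow a partitioning format based on the partition column), and the bucket name, file name, and helper functions are hypothetical. Check “Data Connection Partitioning” for the exact format your source requires.

```python
from datetime import date, datetime

# Python stand-ins for the supported field types listed above:
# date, timestamp (datetime), string, and integer.
SUPPORTED_TYPES = (date, datetime, str, int)


def is_supported_partition_value(value):
    """Return True if the value's type is usable as a partition key."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool):
        return False
    return isinstance(value, SUPPORTED_TYPES)


def partition_path(prefix, column, value, filename):
    """Build a path with a column=value folder (assumed Hive-style layout)."""
    if not is_supported_partition_value(value):
        raise TypeError(
            f"unsupported partition type for {column}: {type(value).__name__}"
        )
    # Dates are rendered in ISO 8601 form; other values use their string form.
    text = value.isoformat() if isinstance(value, date) else str(value)
    return f"{prefix.rstrip('/')}/{column}={text}/{filename}"


# Hypothetical bucket and file names, for illustration only.
path = partition_path(
    "s3://example-bucket/impressions/",
    "event_date",
    date(2024, 1, 15),
    "part-0000.parquet",
)
print(path)
```

Complex values such as lists or dictionaries fail the type check, mirroring the rule that arrays, maps, and structs are not allowed as partition columns.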
For more information, see “Data Connection Partitioning”.