
Determining the Applications to Use

Use the following diagram to help determine the best applications to use to process your query.



This diagram provides guidance for choosing tools to process complex queries. However, whether a job involves a "large" amount of data also depends on the query or script. For example, if your job fails or stalls because of memory issues when using BigQuery options and you are processing fewer than 100 million records, check the efficiency of your code. Iterative table lookups or multiple "for" loops can cause interactive processing to fail. When a job requires extensive data manipulation, cross-table joins or lookups, or advanced libraries, PySpark is typically the better choice. If memory issues persist, create a Safe Haven case in the LiveRamp Community portal.
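As an illustration of the efficiency check mentioned above, the following minimal sketch contrasts an iterative lookup (a nested "for" loop) with a join-style lookup that indexes the lookup table once. It uses plain Python rather than BigQuery or PySpark, and the data, field names, and function names are hypothetical, chosen only for this example. It also assumes each `segment_id` appears only once in the lookup table.

```python
# Hypothetical data: the field names and values are illustrative only.
records = [
    {"user_id": 1, "segment_id": "a"},
    {"user_id": 2, "segment_id": "b"},
    {"user_id": 3, "segment_id": "a"},
]
segments = [
    {"segment_id": "a", "label": "Auto intenders"},
    {"segment_id": "b", "label": "Frequent travelers"},
]


def label_with_nested_loops(records, segments):
    """Iterative lookup: rescans the lookup table for every record."""
    labeled = []
    for record in records:
        for segment in segments:
            if segment["segment_id"] == record["segment_id"]:
                labeled.append({**record, "label": segment["label"]})
                break  # stop at the first match
    return labeled


def label_with_single_pass(records, segments):
    """Join-style lookup: index the lookup table once, then scan records once."""
    label_by_id = {s["segment_id"]: s["label"] for s in segments}
    return [
        {**r, "label": label_by_id[r["segment_id"]]}
        for r in records
        if r["segment_id"] in label_by_id
    ]


slow_result = label_with_nested_loops(records, segments)
fast_result = label_with_single_pass(records, segments)
```

Both functions return the same rows, but the nested-loop version does work proportional to the number of records multiplied by the size of the lookup table, while the indexed version does work proportional to their sum. The same idea applies when you replace row-by-row lookups with a single join in SQL or PySpark.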

Interactive vs. Non-interactive

Some applications allow you to run code in the following ways:

  • Interactive: You can run the code for each step of the overall process independently and see the results for that step.

  • Non-interactive: The entire set of code runs at once, and you wait for all of it to finish before seeing the results. Non-interactive processing is recommended if your code requires advanced scripting or library use.
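The difference between the two modes can be sketched with a small Python example. This is only an illustration under assumed names: `load_rows`, `total_by_region`, `top_region`, and `run_job` are hypothetical helpers, not part of any application described here.

```python
def load_rows():
    """Stand-in for reading a table; the rows are made-up sample data."""
    return [
        {"region": "east", "amount": 10},
        {"region": "west", "amount": 5},
        {"region": "east", "amount": 7},
    ]


def total_by_region(rows):
    """Sum the amount column for each region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals


def top_region(totals):
    """Return the region with the largest total."""
    return max(totals, key=totals.get)


# Interactive style: run each step on its own and inspect its result
# before moving on, as you would cell by cell in a notebook.
rows = load_rows()
print(len(rows))
totals = total_by_region(rows)
print(totals)


# Non-interactive style: the whole job runs from start to finish,
# and only the final result is reported.
def run_job():
    return top_region(total_by_region(load_rows()))


print(run_job())
```

In the interactive style, a problem in an early step is visible as soon as that step runs. In the non-interactive style, intermediate results are not shown unless the script writes them out explicitly.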

Use the following diagram to help determine the best applications to use depending on whether you want to work in an interactive or non-interactive way.