Skip to main content

Configure a BigQuery Data Connection for a BigQuery Clean Room

If you have data in Google BigQuery and want to be able to use that data in questions in LiveRamp Clean Room, you can create a BigQuery data connection.

A BigQuery data connection for a BigQuery (native-pattern) clean room can be used in the following clean room types:

  • BigQuery

Note

To configure a BigQuery data connection to be used in either a Hybrid clean room or a Confidential Computing clean room, see "Configure a BigQuery Data Connection for a Hybrid Clean Room".

After you’ve created the data connection and Clean Room has validated the connection by connecting to the data in your cloud account, you will then need to map the fields before the data connection is ready to use. This is where you specify which fields can be queryable across any clean rooms, which fields contain identifiers to be used in matching, and any columns by which you wish to partition the dataset for questions.

After fields have been mapped, you’re ready to provision the resulting dataset to your desired clean rooms. Within each clean room, you’ll be able to set dataset analysis rules, exclude or include columns, filter for specific values, and set permission levels.

To configure a BigQuery data connection to be used in a BigQuery (native-pattern) clean room, see the instructions below.

Overall Steps

Perform the following overall steps in Google Cloud Platform to configure a BigQuery data connection for a BigQuery clean room:

Once the above steps have been performed in Google Cloud Platform, perform the following overall steps in LiveRamp Clean Room:

For information on performing these steps, see the sections below.

Prerequisites

Before proceeding, note the following:

  • LiveRamp Clean Room recommends configuring your BigQuery instance in the US-multi-region for U.S. customers and the EU-multi-region for EU customers. This configuration enhances flexibility, especially when collaborating with partners across different regions.

  • There are three service accounts involved in the process of configuring the required GCP and BigQuery permissions to orchestrate BigQuery Clean Rooms in LiveRamp Clean Room:

    • Owner service account

    • Partner service account (follow the same steps as the owner service account)

    • LiveRamp Clean Room service account

  • The steps for the owner service account and partner service account are the same. If you are a clean room partner, follow the same steps below where the owner service account is referenced.

Note

We recommend that you use partition columns in your source BigQuery tables for the best query performance. See below for configuration details.

Perform Steps in Google Cloud Platform

Perform the following overall steps in Google Cloud Platform to configure a BigQuery data connection for a BigQuery clean room.

Create a GCP Project

  1. Create a GCP Project associated with the owner service account including the "Service Account" and "Service Account Key".

  2. Note the "Project ID" and "Credential JSON". You'll need these for the "Configure BigQuery Credentials in LiveRamp Clean Room" section below.

  3. Ensure that the created project is associated with a billing account.

Create Authorized Views

For LiveRamp Clean Room to read data from Google BigQuery, authorized views must be created, which can only be done by users who have access to the data itself.

Note

LiveRamp recommends permitting LiveRamp to create the authorized views rather than letting the customer create them in order to mitigate manual maintenance requirements by the customer when changes are made to the data over time. If a customer creates authorized views and makes any changes to the underlying data (for example, columns added or removed), this will result in Clean Room run failures, which must be mediated by the customer. LiveRamp is unable to support troubleshooting and resolution in this case. If LiveRamp creates the views, all maintenance is managed by LiveRamp.

If you do want to create authorized views rather than letting LiveRamp create them, reach out to LiveRamp Community portal for more information.

Configure Permissions

Note

Keep in mind the following when configuring permissions:

  • The LiveRamp Clean Room service account has access to the source data.

  • LiveRamp manages the authorized views. If there are changes to the source data, they are managed by LiveRamp.

  • LiveRamp Clean Room features, such as row-level filtering and adding custom logic when configuring datasets in a clean room, will be available to the clean room owner.

  • LiveRamp simplifies authorized views and removes related ambiguity and complexity.

To configure permissions, do the following:

  • Create custom roles where indicated below and configure the following permissions for the owner and LiveRamp Clean Room service accounts:

Service Account

Roles/Permissions

Permission Level

Purpose

Owner

BigQueryMetaDataViewer

Table

Enables data connections to reach metadata.

Owner

Create a custom role named "LiveRamp Clean Room Authorized View Role" with the following permissions:

  • bigquery.datasets.create

  • bigquery.datasets.get

  • bigquery.datasets.update

  • bigquery.tables.create

  • bigquery.tables.getData

Project

Used to create a shared dataset and create authorized views in the shared dataset.

Owner

Analytics Hub Admin

Project

Enables LiveRamp Clean Room to create Exchange and Listing.

Note

The following Custom Roles should also be created and permissions configured. However, they cannot be applied to the LiveRamp Clean Room service account until the service account has been generated (see the "Generate a Service Account and Understand Billing Considerations" section below). When the service account has been generated, you can apply the following roles to it.

LiveRamp Clean Room

Create a Custom Role named "LiveRamp Clean Room BigQuery Job" with the following permissions:

  • bigquery.jobs.create

Recommended for feature completeness:

  • bigquery.jobs.delete

  • bigquery.jobs.update

  • bigquery.readsessions.create

  • bigquery.readsessions.getData

Project

Used to orchestrate the job. If not provided, LiveRamp Clean Room will pass through the compute costs

LiveRamp Clean Room

If you have an Organization IAM policy restriction (IP or domain), create a Custom Role named "LiveRamp Clean Room Analytics Hub Subscriber" with the following permissions:

  • analyticshub.listings.subscribe

Project

Used to subscribe to the listings ephemerally

Enable Project-Level API

Note

This applies to both the owner and partner at the project level, as APIs cannot be enabled at the dataset level. This is a Google-defined process for working in GCP BigQuery Clean Rooms.

  • Enable the "Analytics Hub API" at the project level. For more information on managing your APIs, see "Enabled APIs & Services".

Now that you've configured the required permissions in BigQuery, you're ready to configure your BigQuery credentials.

Perform Steps in LiveRamp Clean Room

Once the above steps have been performed in Google Cloud Platform, perform the overall steps in the sections below in LiveRamp Clean Room.

Note

if your cloud security limits access to only approved IP addresses, talk to your LiveRamp representative before creating the data connection to coordinate any necessary allowlisting of LiveRamp IP addresses.

Configure BigQuery Credentials in LiveRamp Clean Room

To configure BigQuery credentials in LiveRamp Clean Room, do the following:

  1. From the LiveRamp Clean Room navigation pane, select Data ManagementCredentials.

  2. Click Add Credential

    add_credential.png
  3. Complete the following fields:

    • Name: Enter the credential name.

    • Credentials Type: Select Google Service Account from the list.

    • Project ID: Enter the project ID (see the "Create a GCP Project" section above)

    • Credential JSON: Enter the credential JSON (see the "Create a GCP Project" section above)

  4. Click Save Credential.

    bq_add_cred.png

Generate a Service Account and Understand Billing Considerations

The LiveRamp Clean Room service account is used to run queries on the selected GCP billing project. This is done at the clean room level.

Each time a new shared clean room is created, a new service account is automatically created. The service account must be added to the GCP billing project and the proper roles applied from the "Create Authorized Views" section above.

Procedure. To create a new clean room:
  1. From the LiveRamp Clean Room navigation pane, select Clean RoomsClean Rooms.

  2. Click New Clean Room

    cr_add_new.png
  3. Select Google Cloud BigQuery and click Start.

    bq_cr_type.png
  4. In Configuration , enter the following: a name for the clean room, a description, and start and end dates (optional).

    • Clean Room Name: Enter a unique name for the clean room

    • Description: Enter the description of the clean room.

    • Start Date: and End Date: Enter the start and end dates for the clean room. The end date is optional.

    • Region: Select the region for the clean room from the list.

    Note

    If an end date is configured, when the clean room reaches its end date, it becomes a "Read Only" clean room. No new question runs or datasets can be configured and new partners cannot be added. To reactivate an expired clean room, update its end date to a new date in the future.

  5. Click Next Step.

    cr_create_config.png
  6. On Data Control Parameters, specify the following additional layers of privacy protection to your query outputs, which are typically based on the privacy tolerance of your organization (to learn more, see "Privacy-Preserving Techniques and Clean Room Results"):

    1. Data Decibel: Configure the level of noise to prevent identifying an individual or consumer. Noise is configured at a clean room level and applied to specific measures that will be determined during the question creation process.

    2. Crowd Size: Configure the k-minimum value or the threshold that must be achieved by a specific measure in the query for it to display results. If the number of users is less than this value, the row of data in the query result is redacted.

  7. Click Next Step.

    cr_create_privparams.png
  8. Confirm your clean room configuration settings and click Save.

Procedure. To access the service account:
  1. Next to the newly provisioned clean room, click Edit.

    bq_edit_cr.png
  2. Skip to the third and final step titled Settings. The service account will appear next to the Service Account line item.

    Apply the service account roles, created in the "Create Authorized Views" section above, to the service account.

    bq_find_sa.png

Now that you've set up the required permissions and linked your Google Service Account credentials, you are ready to configure a data connection.

Create the Data Connection

After you've added the credentials to LiveRamp Clean Room, create the data connection:

Note

if your cloud security limits access to only approved IP addresses, talk to your LiveRamp representative before creating the data connection to coordinate any necessary allowlisting of LiveRamp IP addresses.

  1. From the LiveRamp Clean Room navigation pane, select Data ManagementData Connections.

  2. From the Data Connections page, click New Data Connection.

    data_cxn_new.png
  3. From the New Data Connection screen, select "Google Cloud Big Query".

    Screenshot 2024-03-28 at 12.07.55.png
  4. Select the credentials created in the previous procedure from the list.

    Screenshot 2024-03-28 at 12.10.43.png
  5. Complete the following fields in the Set up Data Connection section:

    Screenshot 2024-06-17 at 11.43.12.png
    • To use partitioning on the dataset associated with the data connection, slide the Uses Partitioning toggle to the right.

    • Category: Enter a category of your choice.

    • Dataset Type: Select Generic.

  6. Complete the following tasks and fields in the Data Location and Schema section:

    • To use partitioning on the dataset associated with the data connection, slide the Uses Partitioning toggle to the right.

      Note

      If the dataset uses partitioning, the dataset can be divided into subsets so that data processing occurs only on relevant data during question runs, which results in faster processing times. When using partitioning, a temporary dataset is required to be entered below.

    • Project Id: Enter the Google Project ID.

    • Source Dataset: Enter the BigQuery source dataset.

    • Source Table: Enter the BigQuery source table name

      Note

      LiveRamp Clean Room does not support BigQuery views, only BigQuery tables.

    • Temporary Dataset: If you enabled partitioning above, enter the name of the temporary empty dataset you created previously to use when partitioning columns.

      Note

      Once you've specified a temporary dataset to use when partitioning, be sure not to remove the temporary dataset or change the dataset name.

  7. Review the data connection details and click Save Data Connection.

    Note

    All configured data connections can be seen on the Data Connections page.

  8. If you haven't already, upload your data files to your specified location.

When a connection is initially configured, it will show "Verifying Access" as the configuration status. Once the connection is confirmed and the status has changed to "Mapping Required", map the table's fields.

data_cxn_waiting_for_mapping.png

You will receive file processing notifications via email.

Map the Fields

Once the above steps have been performed in Google Cloud Platform, perform the overall steps in the sections below in LiveRamp Clean Room.

Note

Before mapping the fields, we recommend confirming any expectations your partners might have for field types for any specific fields that will be used in questions.

  1. From the row for the newly created data connection, click the More Options menu (the three dots) and then click Edit Mapping.

    data_cxn_mapping.png

    The Map Fields screen opens, and the file column names auto-populate.

    data_cxn_mapping2.png
  2. For any columns that you do not want to be queryable, slide the Include toggle to the left.

    Note

    Ignore the field delimiter fields because this was defined in a previous step.

  3. Click Next.

    The Add Metadata screen opens.

    image-20240612-162557.png
  4. For any column that contains PII data, slide the PII toggle to the right.

  5. Select the data type for each column.

  6. For any partition columns, slide the Allow Partitions toggle to the right.

  7. If a column contains PII, slide the User Identifiers toggle to the right and then select the user identifier that defines the PII data.

  8. Click Save.

Your data connection configuration is now complete and the status changes to "Completed".

You can now provision the resulting dataset to your desired BIgQuery clean rooms.