Configure a BigQuery Data Connection for a Hybrid Clean Room

If you have data in Google BigQuery and want to be able to use that data in questions in LiveRamp Clean Room, you can create a BigQuery data connection.

A BigQuery data connection for a Hybrid clean room can be used in the following clean room types:

Hybrid
Confidential Computing

Note

To configure a BigQuery data connection to be used in a Google Cloud BigQuery (native-pattern) clean room, see "Configure a BigQuery Data Connection for a BigQuery Clean Room".

After you’ve created the data connection and Clean Room has validated the connection by connecting to the data in your cloud account, you will then need to map the fields before the data connection is ready to use. This is where you specify which fields can be queryable across any clean rooms, which fields contain identifiers to be used in matching, and any columns by which you wish to partition the dataset for questions.

After fields have been mapped, you’re ready to provision the resulting dataset to your desired clean rooms. Within each clean room, you’ll be able to set dataset analysis rules, exclude or include columns, filter for specific values, and set permission levels.

To configure a BigQuery data connection to be used in either a Hybrid clean room or a Confidential Computing clean room, see the instructions below.

Overall Steps

Perform the following overall steps in Google Cloud Platform to configure a BigQuery data connection to be used in a Hybrid clean room or a Confidential Computing clean room:

Once the above steps have been performed in Google Cloud Platform, perform the following overall steps in LiveRamp Clean Room:

For information on performing these steps, see the sections below.

Perform Steps in Google Cloud Platform

Perform the steps in the sections below in Google Cloud Platform to configure a BigQuery data connection for a hybrid clean room.

Create a Google Service Account

Perform the steps in the sections below in Google Cloud Platform to configure a BigQuery data connection for a hybrid clean room.

From GCP's main menu, select IAM & Admin → Service Accounts.
Click CREATE SERVICE ACCOUNT. Save the service account email because you will need it in later steps.
Enter a name for the service account.
Create a Private Key of type JSON for the newly created Service Account and save the JSON Key File.
Note
This credential JSON will be used in LiveRamp Clean Room to set up your credentials and data connections.

Determine Partition Columns

LiveRamp Clean Room allows you to define column(s) for partitioning data during question runs. This is recommended, as it reduces query cost and time to execute. Based on the tables you plan to connect, determine which columns are partition columns and then perform the following steps in Google Cloud Platform:

Note

If you do not need to define any partition columns, skip this procedure.

From the Google Cloud Platform main menu, select. Service Accounts → IAM & Admin → Roles.
Create a custom role with the following permissions at the Project level, depending on which data connection type you'll be using:
- Google Cloud BigQuery data connection type:
  - bigquery.jobs.create
  - bigquery.jobs.delete
  - bigquery.jobs.update
  - bigquery.readsessions.create
  - bigquery.readsessions.getData
  - roles/bigquery.readSessionUser
- Google Cloud Authorized View data connection type:
  - bigquery.jobs.create
  - bigquery.readsessions.create
  - bigquery.readsessions.getData
  - bigquery.tables.create
  - bigquery.tables.delete
  - bigquery.tables.get
  - bigquery.tables.getData
  - bigquery.tables.update
  - bigquery.tables.updateData
In the same Google Project, create an empty dataset by following Google's instructions here.
Note
The empty dataset should exist in the same region as the source dataset.
Assign the newly-created custom role to the empty dataset or use the role created for all datasets by assigning it at the Project level as in the next step. For more information, see Google's instructions here.
To define partition columns, apply the following roles and permissions to all relevant datasets/tables:
- roles/bigquery.dataViewer
- bigquery.tables.create
- bigquery.tables.updateData
- bigquery.tables.update
- bigquery.tables.delete

Add BigQuery Read Session User Permission

To add BigQuery Read Session User Permission to the service account in Google Cloud Platform at the Project level:

From the Google Cloud Platform main menu, select. Service Accounts → IAM & Admin → Manage Resources.
Select the desired Google Project.
In the right pane, under the Permissions tab click Add Principal.
Enter the email of the newly created Service Principal and assign the role "BigQuery Read Session User".
Click SAVE.

Add BigQuery Data View Permission

To add BigQuery Data View Permission to the service account in Google Cloud Platform at the BigQuery Table level:

From the Google Cloud Platform main menu, select BigQuery View.
Select the BigQuery table and click Share.
In the right pane, click Add Principal.
Enter the email of the new created Service Principal and assign the role "BigQuery Data Viewer".
Note
if you're configuring a View or Authorized View, make sure the provided Service Account has access to both the View/Authorized View and the source datasets which are used to materialize them.

Perform Steps in LiveRamp Clean Room

Once the above steps have been performed in Google Cloud Platform, perform the overall steps in the sections below in LiveRamp Clean Room.

Note

if your cloud security limits access to only approved IP addresses, talk to your LiveRamp representative before creating the data connection to coordinate any necessary allowlisting of LiveRamp IP addresses.

Add the Credentials

To add credentials:

From the navigation menu, select Clean Room → Credentials to open the Credentials page.
Click Create Credential.
Enter a descriptive name for the credential.
For the Credentials Type, select "Google Service Account".
For the Project ID, enter the project ID.
Enter the Credential JSON.
Click Save Credential.

Create the Data Connection

After you've added the credentials to LiveRamp Clean Room, create the data connection:

Note

From the navigation menu, select Clean Room → Data Connections to open the Data Connections page.
From the Data Connections page, click Create Data Connection.
From the New Data Connection screen, select either Google Cloud BigQuery or Google Cloud Authorized View.
Select the credentials created in the previous procedure from the list.
If you selected the "Google Cloud BigQuery" data connection type, complete the following fields in the Set up Data Connection section:
- Name: Enter a name for the data connection (this will be the name for the dataset that you'll provision to clean rooms).
- Category: Enter a category of your choice.
- Dataset Type: Select Generic.
If you selected the "Google Cloud BigQuery" data connection type, complete the following tasks and fields in the Data Location and Schema section:
- To use partitioning on the dataset associated with the data connection, slide the Use Partitioning toggle to the right.
  Note
  If the dataset uses partitioning, the dataset can be divided into subsets so that data processing occurs only on relevant data during question runs, which results in faster processing times. When using partitioning, a temporary dataset is required to be entered below.
- Project Id: Enter the Google Project ID.
- Source Dataset: Enter the BigQuery source dataset.
- Source Table: Enter the name of the BigQuery source table, view, or materialized view.
- Temporary Dataset: Enter the name of the temporary empty dataset you created in the "Determine Partition Columns" section above to use when partitioning columns.
  Note
  Temporary Dataset names can only include letters, numbers, and underscores (and cannot contain spaces or special characters).
  Once you've specified a temporary dataset to use when partitioning, be sure not to remove the temporary dataset or change the dataset name.
If you selected the "Google Cloud Authorized View" data connection type, complete the following fields:
- Name: Enter a name for the data connection (this will be the name for the dataset that you'll provision to clean rooms).
- Category: Enter a category of your choice.
- Dataset Type: Select Generic.
- Authorized View: Enter the name of the authorized view.
- Dataset: Enter the the name of the BigQuery source dataset.
- Project Id: Enter the Google Project ID.
Review the data connection details and click Save Data Connection.
Note
All configured data connections can be seen on the Data Connections page.
If you haven't already, upload your data files to your specified location.

When a connection is initially configured, it will show "Verifying Access" as the configuration status. Once the connection is confirmed and the status has changed to "Mapping Required", map the table's fields.

You will receive file processing notifications via email.

Map the Fields

Once the above steps have been performed in Google Cloud Platform, perform the overall steps in the sections below in LiveRamp Clean Room.

Note

Before mapping the fields, we recommend confirming any expectations your partners might have for field types for any specific fields that will be used in questions.

From the row for the newly created data connection, click the More Options menu (the three dots) and then click Edit Mapping.
The Map Fields screen opens, and the file column names auto-populate.
For any columns that you do not want to be queryable, slide the Include toggle to the left.
Note
Ignore the field delimiter fields because this was defined in a previous step.
Click Next.
The Add Metadata screen opens.
For any column that contains PII data, slide the PII toggle to the right.
Note
If you data contains a column with RampIDs, do not slide the PII toggle for that column. Mark the RampID column as a User Identifier and select "RampID" as the identifier type. If the data contains a RampID column, no other columns can be enabled as PII.
Select the data type (field type) for each column (for more information on supported field types, see "Field Types for Data Connections").
If a column contains PII, slide the User Identifiers toggle to the right and then select the user identifier that defines the PII data.
Note
When you select "Raw Email" as the user identifier for an email column, those email addresses will be automatically SHA256 hashed.
For any partition columns that were defined in the "Determine Partition Columns" section above, slide the Allow Partitions toggle to the right.
Click Save.

Your data connection configuration is now complete and the status changes to "Completed".

You can now provision the resulting dataset to your desired Hybrid or Confidential Computing clean rooms.

In this section:

Configure a BigQuery Data Connection for a Hybrid Clean Room

Note

Overall Steps

Perform Steps in Google Cloud Platform

Create a Google Service Account

Note

Determine Partition Columns

Note

Note

Add BigQuery Read Session User Permission

Add BigQuery Data View Permission

Note

Perform Steps in LiveRamp Clean Room

Note

Add the Credentials

Create the Data Connection

Note

Note

Note

Note

Map the Fields

Note

Note

Note

Note

Search results