Configure a BigQuery Data Connection for a Hybrid Clean Room
LiveRamp Clean Room’s application layer enables companies to securely connect distributed datasets with full control and flexibility while protecting the privacy of consumers and the rights of data owners.
To configure a BigQuery data connection for a hybrid clean room, see the instructions below.
Overall Steps
Perform the following overall steps in Google Cloud Platform to configure a BigQuery data connection for a hybrid clean room:
Create a Google service account and a JSON private key.
Determine partition columns and create a custom role.
Add the BigQuery Read Session User permission.
Add the BigQuery Data Viewer permission.
Once the above steps have been performed in Google Cloud Platform, perform the following overall steps in LiveRamp Clean Room:
Add the credentials.
Create the data connection.
Map the fields.
For information on performing these steps, see the sections below.
Perform Steps in Google Cloud Platform
Perform the steps in the sections below in Google Cloud Platform to configure a BigQuery data connection for a hybrid clean room.
Create a Google Service Account
To create a Google service account:
From the Google Cloud Platform main menu, select IAM & Admin → Service Accounts.
Click Create Service Account.
Enter a name for the service account.
Save the service account email because you will need it in later steps.
Create a Private Key of type JSON for the newly created Service Account and save the JSON Key File.
Note
This credential JSON will be used in LiveRamp Clean Room to set up your credentials and data connections.
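Before pasting the key into LiveRamp Clean Room, it can help to sanity-check the downloaded file. The sketch below (the helper name and file path are illustrative, not part of any LiveRamp or Google API) confirms the standard service account key fields are present and returns the service account email you were asked to save:

```python
import json

# Fields that appear in every Google service account JSON key file.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email", "token_uri"}

def check_key_file(path: str) -> str:
    """Validate a service account JSON key file and return its client_email."""
    with open(path) as f:
        key = json.load(f)
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"key file is missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"not a service account key (type={key['type']!r})")
    return key["client_email"]
```

If the check raises, re-download the key rather than hand-editing the file.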
Determine Partition Columns
LiveRamp Clean Room allows you to define column(s) for partitioning data during question runs. This is recommended, as it reduces query cost and time to execute. Based on the tables you plan to connect, determine which columns are partition columns and then perform the following steps in Google Cloud Platform:
Note
If you do not need to define any partition columns, skip this procedure.
From the Google Cloud Platform main menu, select IAM & Admin → Roles.
Create a custom role with the following permissions at the Project level, depending on which data connection type you'll be using:
Google Cloud BigQuery data connection type:
bigquery.jobs.create
bigquery.jobs.delete
bigquery.jobs.update
bigquery.readsessions.create
bigquery.readsessions.getData
roles/bigquery.readSessionUser
Google Cloud Authorized View data connection type:
bigquery.jobs.create
bigquery.readsessions.create
bigquery.readsessions.getData
bigquery.tables.create
bigquery.tables.delete
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.update
bigquery.tables.updateData
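If you prefer to script role creation, the permission lists above can be assembled into a gcloud iam roles create command. This is a sketch; the project ID and role IDs are placeholders, and note that roles/bigquery.readSessionUser in the first list is a predefined role rather than a permission, so it is granted separately (see the "Add BigQuery Read Session User Permission" section below):

```python
# Permission sets copied from the lists above, keyed by data connection type.
PERMISSIONS = {
    "Google Cloud BigQuery": [
        "bigquery.jobs.create",
        "bigquery.jobs.delete",
        "bigquery.jobs.update",
        "bigquery.readsessions.create",
        "bigquery.readsessions.getData",
    ],
    "Google Cloud Authorized View": [
        "bigquery.jobs.create",
        "bigquery.readsessions.create",
        "bigquery.readsessions.getData",
        "bigquery.tables.create",
        "bigquery.tables.delete",
        "bigquery.tables.get",
        "bigquery.tables.getData",
        "bigquery.tables.update",
        "bigquery.tables.updateData",
    ],
}

def create_role_cmd(project_id: str, role_id: str, connection_type: str) -> str:
    """Assemble the gcloud command that creates the custom role at the Project level."""
    perms = ",".join(PERMISSIONS[connection_type])
    return (
        f"gcloud iam roles create {role_id} "
        f"--project={project_id} --permissions={perms}"
    )

print(create_role_cmd("my-project", "cleanRoomBigQuery", "Google Cloud BigQuery"))
```

Run the printed command with an account that has permission to administer IAM roles in the project.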
In the same Google Project, create an empty dataset by following Google's instructions for creating datasets.
Note
The empty dataset should exist in the same region as the source dataset.
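This step can also be scripted with the google-cloud-bigquery client library, which makes it easy to keep the regions aligned. A sketch that assumes application default credentials are configured; the project and dataset names are placeholders:

```python
# Create the temporary empty dataset in the same region as the source dataset.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
source = client.get_dataset("my-project.source_dataset")

temp = bigquery.Dataset("my-project.lr_temp_dataset")
temp.location = source.location  # keep regions aligned, per the note above
client.create_dataset(temp, exists_ok=True)
```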
Assign the newly created custom role to the empty dataset, or assign it at the Project level (as in the next step) so that it applies to all datasets. For more information, see Google's instructions on controlling access to datasets.
To define partition columns, apply the following roles and permissions to all relevant datasets/tables:
roles/bigquery.dataViewer
bigquery.tables.create
bigquery.tables.updateData
bigquery.tables.update
bigquery.tables.delete
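For the roles/bigquery.dataViewer portion of the grants above, one scripted option is BigQuery dataset access entries, where a READER entry is the dataset-ACL equivalent of roles/bigquery.dataViewer (the table permissions come from the custom role created earlier). A sketch assuming application default credentials and placeholder names:

```python
# Grant the service account read access on a relevant dataset.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
dataset = client.get_dataset("my-project.source_dataset")

entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",  # dataset-ACL equivalent of roles/bigquery.dataViewer
        entity_type="userByEmail",
        entity_id="clean-room-sa@my-project.iam.gserviceaccount.com",
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```

Repeat for each dataset that the questions will query.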
Add BigQuery Read Session User Permission
To add BigQuery Read Session User Permission to the service account in Google Cloud Platform at the Project level:
From the Google Cloud Platform main menu, select IAM & Admin → Manage Resources.
Select the desired Google Project.
In the right pane, under the Permissions tab, click Grant Access.
Enter the email of the newly created service account and assign the role "BigQuery Read Session User".
Click Save.
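The same project-level grant can be applied from the command line. This helper only assembles the gcloud command (the project ID and service account email are placeholders):

```python
# Assemble the gcloud command that grants BigQuery Read Session User to the
# service account at the project level.
def grant_read_session_cmd(project_id: str, sa_email: str) -> str:
    return (
        f"gcloud projects add-iam-policy-binding {project_id} "
        f"--member=serviceAccount:{sa_email} "
        f"--role=roles/bigquery.readSessionUser"
    )

print(grant_read_session_cmd(
    "my-project", "clean-room-sa@my-project.iam.gserviceaccount.com"))
```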
Add BigQuery Data View Permission
To add BigQuery Data View Permission to the service account in Google Cloud Platform at the BigQuery Table level:
From the Google Cloud Platform main menu, select BigQuery.
Select the BigQuery table and click Share.
In the right pane, click Add Principal.
Enter the email of the newly created service account and assign the role "BigQuery Data Viewer".
Note
If you're configuring a View or Authorized View, make sure the provided Service Account has access to both the View/Authorized View and the source datasets used to materialize it.
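Table-level access can also be granted programmatically through the table's IAM policy with the google-cloud-bigquery client library. A sketch assuming application default credentials; the project, dataset, table, and service account names are placeholders:

```python
# Grant BigQuery Data Viewer on a single table to the service account.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
table_id = "my-project.source_dataset.source_table"

policy = client.get_iam_policy(table_id)
policy.bindings.append({
    "role": "roles/bigquery.dataViewer",
    "members": {"serviceAccount:clean-room-sa@my-project.iam.gserviceaccount.com"},
})
client.set_iam_policy(table_id, policy)
```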
Perform Steps in LiveRamp Clean Room
Once the above steps have been performed in Google Cloud Platform, perform the overall steps in the sections below in LiveRamp Clean Room.
Add the Credentials
To add credentials:
From the LiveRamp Clean Room navigation pane, select Data Management → Credentials.
Click the button to add a new credential.
Enter a descriptive name for the credential.
For the Credentials Type, select "Google Service Account".
For the Project ID, enter the project ID.
Enter the Credential JSON you stored in the "Create a Google Service Account" procedure above.
Save the credential.
Create the Data Connection
To create the data connection:
From the LiveRamp Clean Room navigation pane, select Data Management → Data Connections.
From the Data Connections page, click New Data Connection.
From the New Data Connection screen, select either Google Cloud BigQuery or Google Cloud Authorized View.
Select the credentials created in the previous procedure from the list.
If you selected the "Google Cloud BigQuery" data connection type, complete the following fields:
Name: Enter a name of your choice.
Category: Enter a category of your choice.
Dataset Type: Select Generic.
Project Id: Enter the Google Project ID.
Source Dataset: Enter the BigQuery source dataset.
Source Table: Enter the BigQuery table name.
Temporary Dataset: Enter the name of the temporary empty dataset you created in the "Determine Partition Columns" section above to use when partitioning columns.
Note
Once you've specified a temporary dataset to use when partitioning, be sure not to remove the temporary dataset or change the dataset name.
If you selected the "Google Cloud Authorized View" data connection type, complete the following fields:
Name: Enter a name of your choice.
Category: Enter a category of your choice.
Dataset Type: Select Generic.
Authorized View: Enter the name of the authorized view.
Dataset: Enter the BigQuery source dataset.
Project Id: Enter the Google Project ID.
Review the data connection details and save the data connection.
Note
All configured data connections can be seen on the Data Connections page.
Upload your data files to your specified location.
When a connection is initially configured, it will show "Verifying Access" as the configuration status. Once the connection is confirmed and the status has changed to "Mapping Required", map the table's fields.
You will receive file processing notifications via email.
Map the Fields
Once the connection is confirmed and the status has changed to "Mapping Required", map the table's fields and add metadata:
From the row for the newly-created data connection, click the More Options menu (the three dots) and then click Map Fields.
The Map Fields screen opens and the file column names auto-populate.
For any columns that you do not want to be queryable, slide the Include toggle to the left.
If needed, update any column labels.
Note
Ignore the field delimiter fields because this was defined in a previous step.
Continue to the next screen.
The Add Metadata screen opens.
For any column that contains PII data, slide the PII toggle to the right.
Select the data type for each column.
If a column contains PII, slide the User Identifiers toggle to the right and then select the user identifier that defines the PII data.
For any partition columns that were defined in the "Determine Partition Columns" section above, slide the Allow Partitions toggle to the right.
Save the field mapping.
Your data connection configuration is now complete and the status changes to "Completed".