Configure an Iceberg Catalog Data Connection

If you have data tables in Apache Iceberg format, you can configure a connection to the data from LiveRamp Clean Room so that you can use that data in either a Hybrid or Confidential Computing clean room.

Note

  • This connection type currently uses AWS Glue Catalog as the mechanism for connecting to the Iceberg tables. Other catalog types will be available in the future.

  • For information on how LiveRamp Clean Room interprets the data types from Glue Catalog, see “Glue Catalog”.

To configure an Iceberg Catalog data connection, see the instructions below.

Overall Steps

After making sure all prerequisites are in place, perform the following overall steps to configure an Iceberg Catalog data connection in LiveRamp Clean Room:

  1. Add the credentials.

  2. Create the data connection.

  3. Map the fields.

For information on performing these steps, see the sections below.

Prerequisites

The Iceberg table must be cataloged in the AWS Glue Catalog.

The following information is needed to configure your Iceberg Catalog data connection in LiveRamp Clean Room:

  • AWS Access Key ID

  • AWS Secret Access Key

  • AWS User ARN

  • AWS Region

  • Database Name

  • Table Name

  • Catalog Name

  • Catalog ID
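
Before collecting these values, it can help to confirm that the table is registered in the Glue Catalog as an Iceberg table. The following is a minimal sketch, not part of LiveRamp's tooling. It assumes the boto3 package is installed and that your AWS credentials are allowed to call Glue's GetTable operation. The helper names (is_iceberg_table, fetch_glue_table) are hypothetical. The check relies on the table_type table parameter, which Iceberg's Glue integration sets to ICEBERG.

```python
from typing import Any, Dict


def is_iceberg_table(table: Dict[str, Any]) -> bool:
    """Return True if a Glue table definition is marked as an Iceberg table."""
    params = table.get("Parameters") or {}
    return str(params.get("table_type", "")).upper() == "ICEBERG"


def fetch_glue_table(region: str, catalog_id: str, database: str, name: str) -> Dict[str, Any]:
    """Fetch a table definition from the Glue Data Catalog (requires boto3)."""
    import boto3  # imported lazily so is_iceberg_table works without boto3

    glue = boto3.client("glue", region_name=region)
    response = glue.get_table(CatalogId=catalog_id, DatabaseName=database, Name=name)
    return response["Table"]
```

For example, is_iceberg_table(fetch_glue_table("us-east-1", "123456789012", "my_database", "my_table")) would be expected to return True for a table that is ready to connect. All four arguments here are placeholder values.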

Add the Credentials

To add credentials:

  1. From the LiveRamp Clean Room navigation pane, select Data Management → Credentials.

  2. Click Add Credential.

  3. Enter a descriptive name for the credential.

  4. For the Credentials Type, select "AWS IAM User Credentials".

  5. Enter the following parameters associated with your AWS configuration:

    • AWS Access Key ID

    • AWS Secret Access Key

    • AWS User ARN

    • AWS Region

  6. Click Save Credential.
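
A quick format check on the four values before pasting them into the form can catch copy-and-paste mistakes. The sketch below is illustrative only: the function name check_credential_fields is hypothetical, and the patterns are assumptions based on common AWS formats (an access key ID of 16 to 128 word characters, an IAM user ARN containing a 12-digit account ID, and a region code such as us-east-1). It does not contact AWS or confirm that the credentials are valid.

```python
import re
from typing import List

# Assumed patterns based on common AWS formats; adjust if your values differ.
ACCESS_KEY_ID_RE = re.compile(r"\w{16,128}")
USER_ARN_RE = re.compile(r"arn:aws[a-z-]*:iam::\d{12}:user/[\w+=,.@/-]+")
REGION_RE = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")


def check_credential_fields(access_key_id: str, secret_access_key: str,
                            user_arn: str, region: str) -> List[str]:
    """Return a list of format problems; an empty list means nothing looked wrong."""
    problems = []
    if not ACCESS_KEY_ID_RE.fullmatch(access_key_id):
        problems.append("AWS Access Key ID does not look like an access key ID")
    if not secret_access_key or secret_access_key != secret_access_key.strip():
        problems.append("AWS Secret Access Key is empty or has surrounding whitespace")
    if not USER_ARN_RE.fullmatch(user_arn):
        problems.append("AWS User ARN is not an IAM user ARN")
    if not REGION_RE.fullmatch(region):
        problems.append("AWS Region does not look like a region code such as us-east-1")
    return problems
```

Calling the function with well-formed placeholder values returns an empty list. Each malformed value adds one message describing the field to recheck.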

Create the Data Connection

To create the data connection:

  1. From the LiveRamp Clean Room navigation pane, select Data Management → Data Connections.

  2. From the Data Connections page, click New Data Connection.

  3. From the New Data Connection screen, select "Iceberg Catalog".

  4. Select the credentials created in the previous procedure from the list.

    Note

    You can also create credentials here by clicking New Credentials and following the instructions in the "Add the Credentials" section above.

  5. Configure the data connection:

    • Name: Enter a name of your choice.

    • Category: Enter a category of your choice.

    • Dataset Type: Select Generic.

    • Catalog Type: Select GLUE.

    • Database Name: Enter the name of the database that contains your data.

    • Table Name: Enter the name of the Apache Iceberg table.

    • Catalog Name: Enter the name of the AWS account that contains the Iceberg table.

    • Catalog ID: Enter the ID of the AWS account that contains the Iceberg table.

  6. Review the data connection details and click Save Data Connection.

    Note

    All configured data connections can be seen on the Data Connections page.
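
If you prepare the connection values ahead of time, for example to share them with a teammate for review, a small record with basic checks can be useful. The sketch below is an assumption-laden illustration and not a LiveRamp API: the class name IcebergConnectionSettings and its field names are hypothetical and simply mirror the form fields above. The checks reflect the steps above (Dataset Type "Generic", Catalog Type "GLUE") and the fact that AWS account IDs are 12 digits.

```python
import re
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class IcebergConnectionSettings:
    """Hypothetical record mirroring the data connection form fields."""
    name: str
    category: str
    database_name: str
    table_name: str
    catalog_name: str
    catalog_id: str
    dataset_type: str = "Generic"
    catalog_type: str = "GLUE"

    def problems(self) -> List[str]:
        """Return a list of issues to fix before entering the values in the form."""
        found = [f"{f.name} is empty" for f in fields(self)
                 if not getattr(self, f.name).strip()]
        if not re.fullmatch(r"\d{12}", self.catalog_id):
            found.append("catalog_id should be a 12-digit AWS account ID")
        if self.catalog_type != "GLUE":
            found.append("catalog_type must be GLUE")
        if self.dataset_type != "Generic":
            found.append("dataset_type should be Generic")
        return found
```

An empty list from problems() means the values are at least well formed. It does not confirm that the database or table exists.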

When a connection is first configured, its configuration status shows "Verifying Access". Once access is confirmed, the status changes to "Mapping Required" and you can map the table's fields.

You will receive file processing notifications via email.

Map the Fields

Once the connection is confirmed and the status has changed to "Mapping Required", map the table's fields and add metadata:

  1. From the row for the newly created data connection, click the More Options menu (the three dots) and then click Edit Mapping.

    The Map Fields screen opens and the table's column names auto-populate.

  2. For any columns that you do not want to be queryable, slide the Include toggle to the left.

  3. Click Next.

    The Add Metadata screen opens.

  4. For any column that contains PII data, slide the PII toggle to the right.

  5. Select the data type for each column.

  6. For columns that you want to partition, slide the Allow Partitions toggle to the right.

  7. If a column contains PII, slide the User Identifiers toggle to the right and then select the user identifier that defines the PII data.

  8. Click Save.
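
For tables with many columns, it can help to write down the intended mapping before working through the screens. The sketch below is one possible way to do that. It is not a LiveRamp API: ColumnMapping and mapping_problems are hypothetical names, the fields mirror the toggles described in the steps above, and the column names, data types, and user identifier values are placeholders. It also assumes that columns with Include turned off need no further settings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnMapping:
    """Hypothetical per-column plan mirroring the mapping toggles."""
    column: str
    data_type: str
    include: bool = True             # Include toggle
    pii: bool = False                # PII toggle
    allow_partitions: bool = False   # Allow Partitions toggle
    user_identifier: Optional[str] = None  # chosen when User Identifiers is on


def mapping_problems(columns: List[ColumnMapping]) -> List[str]:
    """Return issues in a mapping plan; an empty list means the plan is consistent."""
    problems = []
    for c in columns:
        if not c.include:
            continue  # assumption: excluded columns need no further settings
        if not c.data_type:
            problems.append(f"{c.column}: no data type selected")
        if c.pii and not c.user_identifier:
            problems.append(f"{c.column}: PII column needs a user identifier")
    return problems
```

A plan that marks a column as PII without naming a user identifier produces a message for that column, which corresponds to the step that asks you to select the user identifier for PII columns.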

Your data connection configuration is now complete and the status changes to "Completed".