Analytics Environment Resources
In the Analytics Environment, there are two types of data storage resources available:
Cloud data warehouse databases (BigQuery datasets)
Cloud object storage buckets (GCS buckets)
BigQuery Datasets
Your BigQuery instance comes with predefined datasets (within the Resources area on the left) that can be used to store different types of data, depending on your user persona (a brief usage sketch follows the table below):
Note
In your BigQuery instance, the names for these datasets start with your account name and end with the suffixes shown below (for example, "<organization_name>_wh").
Suffix | Access Type | Description | Data Scientist Access | Data Analyst Access |
---|---|---|---|---|
_ai | Read, write | Holds input and output tables used in machine learning (ML) models (analytics insights) | Yes | No |
_pub | Read, write | Holds tables that need to be published to Tableau or marked for granting permissions | Yes | No |
_rdm | Read-only | For tenants with retail data, holds sales transaction data (retail data model) | Yes | No |
_rp | Read-only | Holds tables to use for creating Tableau reports | No | Yes |
_sg | Read, write | Holds segment tables to send to Customer Profiles for Data Collaboration or directly to a destination | Yes | No |
_sub | Read-only | Holds permissioned views from a primary tenant to a partner tenant (subtenant) | Yes | No |
_wh | Read-only | Holds warehouse tables that store your ingested data | Yes | No |
_work | Read, write | Holds tables as a workspace for use by your organization (not shared with other tenants). By default, there is no expiration policy for tables in this dataset. | Yes | No |
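As a minimal sketch of how these datasets are addressed in practice, the snippet below reads from the read-only _wh dataset and writes a result table to the read-write _work dataset with the BigQuery Python client. The project ID, dataset names, and the output table name are placeholders, not values from your tenant.

```python
from google.cloud import bigquery

# Placeholder values; substitute your tenant's project and dataset names.
PROJECT = "my-gcp-project"
WAREHOUSE = "organization_name_wh"    # read-only: ingested data
WORKSPACE = "organization_name_work"  # read, write: your workspace

client = bigquery.Client(project=PROJECT)

# Read from the warehouse dataset and materialize the result as a
# workspace table (the output table name is hypothetical).
sql = f"""
CREATE OR REPLACE TABLE `{PROJECT}.{WORKSPACE}.segments_snapshot` AS
SELECT segment_name, user_id
FROM `{PROJECT}.{WAREHOUSE}.customer_profiles_segments`
"""
client.query(sql).result()  # blocks until the job completes
```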
Customer Profiles Segment Data
The following tables in the _wh (warehouse) dataset store your Customer Profiles segment data:
customer_profiles_segments: Contains the segment data (members) sent to Analytics Environment from Customer Profiles for Data Collaboration. Use this table to perform campaign measurement and develop insights (a sample query follows its column table below).
customer_profiles_segment_reference: Contains metadata for distributed segments that are automatically sent to Analytics Environment, including the segment name, counts, distribution destinations, and timestamps (a sample query follows its column table below).
customer_profiles_segments table:
Column Name | Data Type | Description |
---|---|---|
distributed_timestamp | Timestamp | The distribution date, which depends on the distribution mode. For information about these modes, see "Distribute to Analytics Environment." |
segment_name | String | The name of the segment |
test_flag | Boolean | Indicates whether a RampID is in the segment's test group (True) or its control group (False) |
user_id | String | The RampID associated with the record in the segment |
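For example, because test_flag marks each RampID as test or control, a measurement query can compare group sizes per segment. This is a hedged sketch: the project and dataset names are placeholders for your tenant's values.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # placeholder project

# Compare test-group and control-group sizes per segment
# ("organization_name_wh" is a placeholder dataset name).
sql = """
SELECT
  segment_name,
  COUNTIF(test_flag) AS test_count,
  COUNTIF(NOT test_flag) AS control_count
FROM `my-gcp-project.organization_name_wh.customer_profiles_segments`
GROUP BY segment_name
"""
for row in client.query(sql).result():
    print(row.segment_name, row.test_count, row.control_count)
```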
customer_profiles_segment_reference table:
Column Name | Data Type | Description |
---|---|---|
alwayson_enabled | Boolean | |
append_mode | Integer | |
count | Integer | The count of RampIDs in the segment |
destination_name | String | The destination name. If there are multiple destinations in the distribution, multiple records are created. |
distributed_timestamp | Timestamp | The distribution date |
segment_id | String | The unique ID of the segment |
segment_name | String | The name of the segment |
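Similarly, a hedged sketch of inspecting the distribution metadata above, with placeholder project and dataset names (count is backtick-quoted in the query because it is also a function name):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # placeholder project

# List the 20 most recent distribution records
# (placeholder project and dataset names).
sql = """
SELECT segment_name, destination_name, `count`, distributed_timestamp
FROM `my-gcp-project.organization_name_wh.customer_profiles_segment_reference`
ORDER BY distributed_timestamp DESC
LIMIT 20
"""
for row in client.query(sql).result():
    print(row.segment_name, row.destination_name, row.distributed_timestamp)
```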
Destination Tables
If your Analytics Environment tenant has been configured to support delivering simple segment tables directly to a destination platform, the following tables are available in your _wh (warehouse) dataset to store data about your destination accounts and distributions:
da_reference: This table includes your allow list of destination accounts that have been implemented in Customer Profiles. For information, see The Destinations Page.
syndicate_destination_result: This table stores the status of any distributions to a destination platform. If it does not yet exist, it is created once your first distribution directly from Analytics Environment completes. You can query this table to check the status of a distribution (see the sample query after the tables below).
da_reference table:
Column Name | Data Type | Description |
---|---|---|
da_id | Integer | The unique ID of the destination account. Example: 3557 |
da_name | String | The value of the destination account's "Destination Name". Example: TTD |
syndicate_destination_result table:
Column Name | Data Type | Description |
---|---|---|
da_id | Integer | The unique ID of the destination account. Example: 3557 |
da_name | String | The value of the destination account's "Destination Name". Example: TTD |
end_time | Timestamp | The end date set for the distribution. Example: 2023-12-07 17:31:00 |
start_time | Timestamp | The start date set for the distribution. Example: 2023-09-07 15:26:21 |
status | String | The status of the distribution, which can be either SUCCESS or FAIL |
table_name | String | The name of the simple segment table that was sent to the destination account. Example: cereal_campaign_converter |
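As a minimal sketch of the status check mentioned above (the project and dataset names are placeholders; the segment table name is the example from the table above):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # placeholder project

# Check the status of distributions for one simple segment table
# (placeholder project and dataset names).
sql = """
SELECT table_name, da_name, status, start_time, end_time
FROM `my-gcp-project.organization_name_wh.syndicate_destination_result`
WHERE table_name = 'cereal_campaign_converter'
ORDER BY start_time DESC
"""
for row in client.query(sql).result():
    print(row.table_name, row.da_name, row.status)
```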
GCS Buckets
In addition to BigQuery datasets, data scientists can use the following GCS buckets for storing data and code:
_CODEREPO (read, write): Store code artifacts for submitting a non-interactive job to the Analytics Environment Spark cluster (see the upload sketch after this list).
_WORK (read, write): Store files processed by the Spark cluster (a sandbox).
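As an illustrative sketch using the google-cloud-storage client, a data scientist might stage a PySpark job script in the _CODEREPO bucket before submitting it as a non-interactive job. The project ID, bucket name, and file names are placeholders, not your tenant's actual values.

```python
from google.cloud import storage

client = storage.Client(project="my-gcp-project")  # placeholder project

# Placeholder bucket name; your tenant's bucket name ends in _CODEREPO.
bucket = client.bucket("organization_name_coderepo")

# Upload a PySpark job script (hypothetical file name) so it can be
# submitted as a non-interactive job to the Spark cluster.
blob = bucket.blob("jobs/my_spark_job.py")
blob.upload_from_filename("my_spark_job.py")
print(f"Uploaded to gs://{bucket.name}/{blob.name}")
```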
Use Logs Explorer in the Google Cloud Console
Users with the "LSH Data Scientist" persona can use the Logs Explorer in the Google Cloud console to view log details for their Google Cloud Platform (GCP) resources, including:
BigQuery
Google Cloud Storage (GCS) buckets
Cloud Dataproc Clusters
Note
Currently, you can only view the logs. Some features, such as creating metrics and creating alerts, are not yet available.
From the Analytics Environment virtual machine desktop landing page, click the Google BigQuery tile.
Tip
If your account has access to more than one tenant, choose the tile for the desired tenant account.
From the BigQuery console, open the Project drop-down list and select the appropriate organization and project.
Search for "logs explorer" and then click Logs Explorer.
Tip
Pin this menu for future reference.
Filter Logs
You can filter the logs by using the Logs Explorer Query tab. For example, you could troubleshoot a failed BigQuery job, as in the sketch below.
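As a minimal sketch, the filter below is one common way to surface BigQuery errors from the logs; the project ID is a placeholder. The same filter string can be pasted into the Logs Explorer Query tab, or run programmatically with the google-cloud-logging client:

```python
from google.cloud import logging

client = logging.Client(project="my-gcp-project")  # placeholder project

# Filter for BigQuery log entries at ERROR severity or above; this same
# filter string can be pasted into the Logs Explorer Query tab.
log_filter = 'resource.type="bigquery_resource" severity>=ERROR'

for entry in client.list_entries(filter_=log_filter,
                                 order_by=logging.DESCENDING,
                                 max_results=10):
    print(entry.timestamp, entry.severity, entry.payload)
```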
View Dashboard List
You can view the list of log dashboards provided by Google to gain further insights.