Applications Available
The Analytics Environment virtual machine desktop provides applications for performing a variety of tasks with your Safe Haven data. Which applications you can use depends on the persona you are granted.
Applications Available Based on User Personas
If you have the LSH Admin, LSH Data Scientist, or LSH Data Analyst persona, you can access the Analytics Environment. However, the applications you can use within the Analytics Environment depend on the persona you are granted.
Note
Due to Analytics Environment data access restrictions, a user cannot be granted an LSH Data Scientist persona and either an LSH Data Analyst persona or an LSH Admin persona for the same tenant.
Applications | LSH Admin | LSH Data Scientist | LSH Data Analyst |
---|---|---|---|
Tableau Desktop | | | |
Tableau Server | | | |
Jupyter | | | |
Dataproc | | | |
BigQuery - Non-aggregated data | | | |
BigQuery - Aggregated data | | | |
BigQuery - Permissioned data | | | |
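The persona restriction in the note above can be expressed as a small check. This is only an illustrative sketch: the persona names come from the text, but the function name `personas_are_compatible` is hypothetical, and the check encodes only the one restriction stated in the note, not any other rules the platform may enforce.

```python
# Personas that, per the note above, cannot be combined with
# LSH Data Scientist for the same tenant.
EXCLUSIVE_WITH_DATA_SCIENTIST = {"LSH Data Analyst", "LSH Admin"}


def personas_are_compatible(personas):
    """Return True if the personas do not violate the stated restriction.

    Hypothetical helper: it only checks the single rule from the note.
    """
    granted = set(personas)
    if "LSH Data Scientist" not in granted:
        return True
    # Data Scientist is present, so neither excluded persona may be.
    return not (granted & EXCLUSIVE_WITH_DATA_SCIENTIST)


print(personas_are_compatible(["LSH Data Scientist"]))               # True
print(personas_are_compatible(["LSH Data Scientist", "LSH Admin"]))  # False
```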
All Applications
The Analytics Environment virtual machine desktop provides the following applications for performing a variety of tasks with your Safe Haven data:
Google Cloud Platform (GCP) console, which provides access to the following applications:
BigQuery: A data warehouse that enables fast SQL queries
Depending on the tool you use, your code runs on one of the following data processing engines:
BigQuery engine (a fully managed, big data SQL engine) when using the BigQuery console or the BigQuery Python API
Dataproc Spark cluster (managed Hadoop servers) when using PySpark or Python
Dataproc Console: Uses Spark Jobs Submit to submit PySpark and Python jobs to a distributed Hadoop cluster
Other GCP applications are locked down and cannot be used in your Safe Haven instance.
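As a rough illustration of the BigQuery Python API mentioned above, the sketch below runs a SQL statement on the BigQuery engine. It assumes the `google-cloud-bigquery` client library is available in your environment and that default credentials and a default project are configured. The table name is a placeholder, and the helper names (`build_row_count_query`, `run_query`, `main`) are hypothetical.

```python
def build_row_count_query(table):
    """Build a SQL statement that counts the rows in one table."""
    # `table` is a fully qualified name such as "project.dataset.table".
    return f"SELECT COUNT(*) AS row_count FROM `{table}`"


def run_query(client, sql):
    """Run SQL on the BigQuery engine and return the rows as dictionaries."""
    # client.query() starts the query job; .result() waits for it to finish.
    return [dict(row) for row in client.query(sql).result()]


def main():
    # Imported here so the helpers above can be loaded without the library.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials and project
    # Placeholder table name; replace it with a table you can access.
    sql = build_row_count_query("my-project.my_dataset.my_table")
    for row in run_query(client, sql):
        print(row)
```

Calling `main()` from a Jupyter cell or a script sends the query to BigQuery; the two helpers can be reused with any SQL statement.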
Job Management: Features that enable you to create, schedule, and manage PySpark and Python jobs and receive email notifications about their status.
Jupyter: Allows you to manipulate your Safe Haven data interactively using Python or PySpark.
You can use JupyterLab's GitLab extension to provide version control for your code in JupyterLab.
Coderepo: From the Jupyter terminal, you can use the command-line interface to perform coderepo operations. For information, see "Use the Command-Line Interface".
Tableau Desktop: Users with the LSH Data Analyst persona can use Tableau Desktop to build reports and visualizations.
Users with the LSH Admin persona or the LSH Data Scientist persona do not have access to Tableau Desktop.
Tableau Server: Users who are granted the LSH Data Analyst w publishing persona can edit and publish Tableau reports. Users with the following personas can edit reports but cannot save them:
LSH Admin
LSH Campaign Planner w report editing
LSH Data Analyst w/o publishing
LSH Data Scientist
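The Tableau Server rules above can be summarized as a lookup from persona to allowed actions. This is a sketch for illustration only: the persona names are taken from the text, the function name `tableau_server_actions` is hypothetical, and returning an empty set for any persona not listed is an assumption, since the text does not describe those personas.

```python
# Persona that can both edit and publish Tableau reports.
CAN_PUBLISH = {"LSH Data Analyst w publishing"}

# Personas that can edit reports but cannot save them.
CAN_EDIT_ONLY = {
    "LSH Admin",
    "LSH Campaign Planner w report editing",
    "LSH Data Analyst w/o publishing",
    "LSH Data Scientist",
}


def tableau_server_actions(persona):
    """Return the set of Tableau Server actions described for a persona."""
    if persona in CAN_PUBLISH:
        return {"edit", "publish"}
    if persona in CAN_EDIT_ONLY:
        return {"edit"}
    # Assumption: personas not listed in the text get no actions here.
    return set()


print(sorted(tableau_server_actions("LSH Data Analyst w publishing")))  # ['edit', 'publish']
print(sorted(tableau_server_actions("LSH Data Scientist")))             # ['edit']
```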
Interactive vs. Non-interactive
Some applications allow you to run code in the following ways:
Interactive: You can run the code for each step of the overall process independently and see the results for that step.
Non-interactive: The entire set of code runs at once, and you must wait for all of it to finish before seeing the results. Non-interactive processing is recommended if your code requires advanced scripting or library use.
Tools for querying and analyzing data in an interactive way:
BigQuery (from within the GCP console)
Jupyter Notebooks
Tools for submitting non-interactive jobs:
Dataproc jobs submit (from within the GCP console)
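The difference between the two modes can be sketched in plain Python. The example below is only an illustration with made-up sample values and hypothetical function names; it does not use BigQuery, Jupyter, or Dataproc. In the interactive style, each step is run and inspected on its own, as you might do cell by cell in a notebook. In the non-interactive style, a single entry point runs every step and only the final result is seen, as with a submitted job.

```python
def load_rows():
    """Step 1: stand-in for reading data (hard-coded sample values)."""
    return [
        {"region": "east", "sales": 10},
        {"region": "west", "sales": 5},
        {"region": "east", "sales": 7},
    ]


def total_by_region(rows):
    """Step 2: sum the sales values for each region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["sales"]
    return totals


def top_region(totals):
    """Step 3: return the region with the largest total."""
    return max(totals, key=totals.get)


def run_all():
    """Non-interactive style: all steps run before any result is shown."""
    return top_region(total_by_region(load_rows()))


# Interactive style: run one step at a time and inspect each result.
rows = load_rows()
print(len(rows))     # 3
totals = total_by_region(rows)
print(totals)        # {'east': 17, 'west': 5}

# Non-interactive style: one call, one final result.
print(run_all())     # east
```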