
Applications Available

The Analytics Environment virtual machine desktop provides the following applications for performing a variety of tasks with your Safe Haven data:

  • Google Cloud Platform (GCP) console, which provides access to the following applications:

    • BigQuery: A data warehouse that enables fast SQL queries

      Code that works with BigQuery data runs on one of the following data processing engines, depending on the tool you use:

      • BigQuery engine (a fully managed, big data SQL engine) when using the BigQuery console or the BigQuery Python API

      • Dataproc Spark cluster (managed Hadoop servers) when using PySpark or Python

    • Dataproc Console: Uses Spark Jobs Submit to submit PySpark and Python jobs to a distributed Hadoop cluster

    Other GCP applications are locked down and cannot be used in your Safe Haven instance.

  • Job Management: Features that enable you to create, schedule, and manage PySpark and Python jobs and receive email notifications about their status.

  • Jupyter: Allows you to manipulate your Safe Haven data interactively using Python or PySpark. You can use JupyterLab's GitLab extension to provide version control for your code in JupyterLab.

  • Tableau Desktop:

    • Users with the LSH Data Analyst persona can access Tableau Desktop to build reports and visualizations.

    • Users with the LSH Admin persona or the LSH Data Scientist persona do not have access to Tableau Desktop.

  • Tableau Server: Users who are granted the LSH Data Analyst w publishing persona can edit and publish Tableau reports. Users with the following personas can edit reports but cannot save them:

    • LSH Admin

    • LSH Campaign Planner w report editing

    • LSH Data Analyst w/o publishing

    • LSH Data Scientist
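
As a rough illustration of the BigQuery Python API route listed under BigQuery above, the sketch below builds a SQL string and hands it to the BigQuery engine. It assumes the `google-cloud-bigquery` client library is installed and that default credentials are configured in your environment; neither is confirmed by this page. The table name and the `segment` column are hypothetical placeholders.

```python
def build_count_query(table: str, limit: int = 10) -> str:
    """Return a standard-SQL query that counts rows per value of `segment`.

    `segment` is a hypothetical column name used only for illustration.
    """
    return (
        "SELECT segment, COUNT(*) AS n "
        f"FROM `{table}` "
        "GROUP BY segment ORDER BY n DESC "
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str):
    """Run `sql` on the BigQuery engine and return the rows as dictionaries."""
    # Imported lazily so build_count_query works without the client library.
    from google.cloud import bigquery

    # Assumption: the environment supplies a default project and credentials.
    client = bigquery.Client()
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    # Hypothetical table name; replace it with a table you can access.
    print(build_count_query("my_project.my_dataset.my_table", limit=5))
```

Calling `run_query(build_count_query(...))` from a Jupyter cell would return the result rows for inspection.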

Interactive vs. Non-interactive

Some applications allow you to run code in the following ways:

  • Interactive: You can run the code for each step of the overall process independently and see the results for that step.

  • Non-interactive: The entire set of code runs as a single job, and you see the results only after all of the code has finished. Non-interactive processing is recommended if your code requires advanced scripting or library use.
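
The difference between the two modes can be sketched in plain Python. The three step functions and the sample rows below are hypothetical and stand in for whatever your own analysis does. In an interactive tool you would run each step on its own and inspect the result, while a non-interactive job wraps the same steps in a single entry point.

```python
# Hypothetical three-step pipeline, for illustration only.
def load_rows():
    """Step 1: return a few made-up rows instead of reading real data."""
    return [
        {"segment": "a", "spend": 10.0},
        {"segment": "b", "spend": 5.5},
        {"segment": "a", "spend": 4.5},
    ]


def total_by_segment(rows):
    """Step 2: sum spend per segment."""
    totals = {}
    for row in rows:
        totals[row["segment"]] = totals.get(row["segment"], 0.0) + row["spend"]
    return totals


def format_report(totals):
    """Step 3: render one text line per segment, sorted by segment name."""
    return [f"{segment}: {amount:.2f}" for segment, amount in sorted(totals.items())]


# Interactive style: run one step, look at its result, then decide what to do next.
rows = load_rows()
totals = total_by_segment(rows)


# Non-interactive style: one entry point runs every step; output appears at the end.
def main():
    return format_report(total_by_segment(load_rows()))


if __name__ == "__main__":
    for line in main():
        print(line)
```

Keeping the steps as separate functions lets the same code serve both modes: you can call them one at a time while exploring, then submit the file with its `main()` entry point as a job.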

Tools for querying and analyzing data in an interactive way:

Tools for submitting non-interactive jobs:
