Spark UI for Batch Pipelines
- Last Updated: Jan 18, 2021
Overview
The Spark framework includes a web console, called the Spark UI, for monitoring and inspecting the execution of batch pipeline jobs. It is active for every Spark job in the Running state and can be accessed directly from within the platform. This article explains how to access the Spark UI and describes some of its basic functions.
Using Spark UI
The Spark UI starts automatically when a batch pipeline job is initiated. It can be accessed from the HERE platform portal or from a terminal.
The easiest way to reach the Spark UI is through the HERE platform portal. Figure 1 shows a typical display for a running batch pipeline. On the right side of the screen there are two links: See Log and Open Spark UI.
Alternatively, the OLP CLI command pipeline job show can be used to get the pipeline UI URL for the Spark UI of a batch pipeline.
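As a quick sketch, the lookup from a terminal might look like the following; the IDs are placeholders, and the exact subcommand path and argument names can vary by OLP CLI version, so consult the CLI help for the authoritative syntax.
```
# Show details for a batch pipeline job; the output includes the
# pipeline UI URL that opens the Spark UI. Both IDs are placeholders,
# and the command form is an assumption based on this article.
olp pipeline job show <pipeline-id> <job-id>
```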
Note: Logging in
If you have not already logged into the Spark UI in your current session, you will be requested to do so using the normal platform sign-in dialog. Use your normal platform credentials. Cookies are used to maintain your current session information.
The Open Spark UI link in the platform portal or the pipeline UI URL obtained from the CLI opens the Spark UI in your browser, as illustrated in Figure 2.
Five tabs provide access to different categories of information (a minimal job sketch showing what drives each tab follows the list):
- Jobs -- This tab shows the job with its stages and tasks and their current state.
- Stages -- This tab shows details of a selected stage.
- Storage -- This tab shows which datasets (RDDs and DataFrames) the job has persisted in memory or on disk.
- Environment -- This tab shows run-time configuration information for the job.
- Executors -- This tab shows information about executor status and resource allocations.
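For orientation, here is a minimal sketch of a Spark batch job (the object and application names are illustrative, not platform-specific) whose activity surfaces in these tabs: the persisted dataset appears under Storage, each action appears as a job under Jobs with its stages under Stages, and the session configuration is visible under Environment and Executors.
```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object SparkUiDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-ui-demo") // shown at the top of the Spark UI
      .getOrCreate()

    // A persisted dataset is listed on the Storage tab while it is cached.
    val numbers = spark.range(0L, 1000000L).persist(StorageLevel.MEMORY_ONLY)

    // Each action triggers a Spark job (Jobs tab); its stages and tasks
    // can be drilled into from the Stages tab.
    val total = numbers.selectExpr("sum(id)").first().getLong(0)
    println(s"sum of ids = $total")

    spark.stop()
  }
}
```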
Clicking the linked pipeline job description opens the details page shown in Figure 3.
The Executors tab also provides useful detail on executor status and resource usage, as shown in Figure 4.
This version of the Spark UI has been modified for compatibility with the platform, so some functions available in a native Spark environment are not available here. When consulting other Spark UI documentation, you may see discussions of features that are absent from this version; those features are not compatible with the platform.