
Pipelines API - Developer Guide

Pipeline patterns

HERE platform pipelines are designed to accommodate specific usage patterns. The available patterns are illustrated below, starting with the simplest pattern and progressing to more complex use cases. Additional information is provided throughout the Developer Guide.

General pattern

This is the general pattern for using a pipeline.

Figure: Basic pipeline pattern

Note

Data sources and sinks
A given catalog layer can serve as a source or a sink, but never both at once. The layer types that can be used depend on the type of pipeline. For example, a stream layer cannot be used with a batch pipeline.

Multiple inputs

A pipeline can have multiple inputs, but only one output.

Figure: Multiple inputs pattern
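
For illustration, a pipeline-config.conf along the following lines binds two input catalogs and the single output catalog to the pipeline. This is only a sketch; the catalog identifiers and HRNs shown are hypothetical placeholders.

    # pipeline-config.conf (sketch) - hypothetical identifiers and HRNs
    pipeline.config {

      # The single output catalog of the pipeline
      output-catalog { hrn = "hrn:here:data::olp-here:example-output-catalog" }

      # Any number of input catalogs, each bound to an identifier used in code
      input-catalogs {
        sensor-data { hrn = "hrn:here:data::olp-here:example-sensor-catalog" }
        map-data    { hrn = "hrn:here:data::olp-here:example-map-catalog" }
      }
    }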

Stream processing pattern

You can use a pipeline to process continuous data streams using the Apache Flink framework and stream layers.

Figure: Stream processing pattern
  • The input and output catalogs are defined in the pipeline-config.conf file.
  • The layers to read and write are selected in the pipeline code.

Note

You may use the same catalog for a stream pipeline's input and output as long as separate layers are used for the data source and data sink.
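
The following Scala sketch illustrates this split using the plain Apache Flink streaming API. The stream-layer source and sink connectors provided by the HERE Data Client Library are only indicated in comments, and all layer names and data are hypothetical.

    import org.apache.flink.streaming.api.scala._

    // Minimal Flink sketch of the stream processing pattern.
    // The input/output *catalogs* come from pipeline-config.conf;
    // the *layers* named below are chosen in code (hypothetical names).
    object StreamPatternSketch {
      def main(args: Array[String]): Unit = {
        val inputLayer  = "raw-events"        // hypothetical stream layer (source)
        val outputLayer = "processed-events"  // hypothetical stream layer (sink)

        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // In a real pipeline, the source would be a stream-layer connector for
        // `inputLayer`; a fixed in-memory stream stands in for it here.
        val events: DataStream[String] = env.fromElements("a", "b", "c")

        // Placeholder transformation.
        val processed = events.map(_.toUpperCase)

        // In a real pipeline, the sink would write to `outputLayer` of the
        // output catalog; printing keeps the sketch self-contained.
        processed.print()

        env.execute(s"stream-pattern: $inputLayer -> $outputLayer")
      }
    }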

Batch processing pattern

This is a typical batch processing pattern using the Apache Spark framework and versioned layers.

Figure: Batch processing pattern
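
The following Scala sketch outlines the batch shape of this pattern with plain Apache Spark. The reads from and writes to versioned layers that the HERE Data Client Library would perform are only indicated in comments; the data and names are hypothetical.

    import org.apache.spark.sql.SparkSession

    // Minimal Spark sketch of the batch processing pattern (hypothetical data).
    object BatchPatternSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("batch-pattern-sketch").getOrCreate()
        import spark.implicits._

        // In a real pipeline this DataFrame would be read from a versioned layer
        // of an input catalog; a small in-memory dataset stands in for that read.
        val input = Seq(("tile-1", 10), ("tile-1", 5), ("tile-2", 20))
          .toDF("partition", "value")

        // Placeholder transformation: one result row per partition.
        val output = input.groupBy("partition").sum("value")

        // A real pipeline would publish `output` to a versioned layer of the
        // single output catalog; showing it keeps the sketch self-contained.
        output.show()

        spark.stop()
      }
    }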

Volatile pattern

This is a typical pattern using volatile layers.

Figure: Volatile pattern

Index pattern

These are typical patterns using index layers.

Figure: Index layer sink 1

Figure: Index layer sink 2

Index layer limits of use:

  Pipeline   Source   Sink
  Batch      Yes      Yes
  Stream     No       Yes

Advanced patterns

A more advanced pattern uses a catalog's volatile layer as a source of reference data. In this case, the output catalog uses a stream layer.

Figure: Advanced pattern

Info

A pipeline writing to the stream layer here typically applies a windowing function.
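
As an illustration, the following Flink sketch enriches incoming events with reference data (a plain in-memory map stands in for a read from the volatile layer) and applies a tumbling window before the results would be written to the output layer. All names and data are hypothetical.

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
    import org.apache.flink.streaming.api.windowing.time.Time

    // Flink sketch of the advanced pattern: enrich events with reference data
    // and aggregate them over a window before writing to the output layer.
    object AdvancedPatternSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Stands in for reference data read from a volatile layer (hypothetical).
        val referenceData = Map("s1" -> "Berlin", "s2" -> "Chicago")

        // Stands in for events arriving via a stream layer: (sensorId, speed).
        val events: DataStream[(String, Double)] =
          env.fromElements(("s1", 48.0), ("s2", 91.0), ("s1", 52.0))

        val enriched = events.map { case (sensorId, speed) =>
          (referenceData.getOrElse(sensorId, "unknown"), speed)
        }

        // Windowing function: maximum speed per city over 1-minute tumbling windows.
        val windowed = enriched
          .keyBy(_._1)
          .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
          .reduce((a, b) => (a._1, math.max(a._2, b._2)))

        // A real pipeline would write `windowed` to the output stream layer.
        windowed.print()
        env.execute("advanced-pattern-sketch")
      }
    }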

If, however, the output catalog is only interested in a "data snapshot," a volatile layer is used instead.

Figure: Volatile layer pattern

Alternatively, you can use the output catalog's versioned layer, perhaps to aggregate data over a window of time. This approach can also be useful for archiving data, with or without additional processing, or for historical analysis in a Notebook.

Figure: Versioned layer pattern
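
As a sketch of the time-window aggregation idea, the following Spark snippet groups hypothetical timestamped events into hourly buckets; a real pipeline would publish the result to a versioned layer of the output catalog.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{avg, window}

    // Spark sketch: aggregate timestamped events into hourly buckets, standing
    // in for a batch pipeline that writes aggregates to a versioned layer.
    object VersionedAggregationSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("versioned-aggregation-sketch").getOrCreate()
        import spark.implicits._

        // Hypothetical input; a real pipeline would read it from an input catalog.
        val events = Seq(
          ("2024-01-01 10:05:00", 48.0),
          ("2024-01-01 10:40:00", 52.0),
          ("2024-01-01 11:10:00", 91.0)
        ).toDF("event_time", "speed")
          .withColumn("event_time", $"event_time".cast("timestamp"))

        // Aggregate over one-hour windows of event time.
        val hourly = events
          .groupBy(window($"event_time", "1 hour"))
          .agg(avg($"speed").as("avg_speed"))

        // A real pipeline would publish `hourly` to a versioned layer for
        // archiving or later analysis in a Notebook.
        hourly.show(truncate = false)

        spark.stop()
      }
    }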

Or, you can use the output catalog's index layer, perhaps to organize historical data by event time.

Figure: Index Layer Pattern 1

Another pattern combines input data from a versioned data set with data from an index layer.

Figure: Index Layer Pattern 2

For examples of pipeline implementation, see HERE Workspace Examples for Java and Scala Developers.
For detailed step-by-step instructions on configuring and running pipelines, see Developer tutorials.
