This module contains the Layer classes used to access data in HERE platform catalogs.

class here.platform.layer.HexbinClustering(clustering_type: str = 'hexbin', absolute_resolution: int | None = None, resolution: int | None = None, relative_resolution: int | None = None, property: str | None = None, pointmode: bool | None = None)[source]#

Bases: object

This class defines the attributes of the hexbin clustering algorithm.

absolute_resolution: int | None = None#
clustering_type: str = 'hexbin'#
pointmode: bool | None = None#
property: str | None = None#
relative_resolution: int | None = None#
resolution: int | None = None#
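
A minimal sketch of how such a clustering configuration might be passed to an interactive map query; the Platform and Catalog calls mirror the rest of this SDK, and the catalog HRN and layer ID are placeholders.

from here.platform import Platform
from here.platform.layer import HexbinClustering

platform = Platform()
catalog = platform.get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
imap_layer = catalog.get_layer("my-interactive-map-layer")               # placeholder layer ID

# Cluster features into hexbins at a fixed resolution, aggregating the "count" property.
clustering = HexbinClustering(absolute_resolution=8, property="count", pointmode=False)
fc = imap_layer.get_features_in_bounding_box(
    bounds=(13.0, 52.3, 13.6, 52.7),  # West, South, East, North
    clustering=clustering,
)
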
class here.platform.layer.IndexLayer(layer_id: str, catalog: Catalog)[source]#

Bases: Layer

This class provides access to data stored in index layers.

blob_exists(data_handle: str, billing_tag: str | None = None) bool[source]#

Check if a blob exists for the requested data handle.

Parameters:
  • data_handle – The data handle identifies a specific blob so that you can get that blob’s contents.

  • billing_tag – A string which is used for grouping billing records.

Returns:

a boolean indicating if the handle exists.

delete_blob(data_handle: str, billing_tag: str | None = None)[source]#

Delete blob (raw bytes) for given layer ID and data-handle from storage.

Parameters:
  • data_handle – The data handle identifies a specific blob so that you can get that blob’s contents.

  • billing_tag – A string which is used for grouping billing records.

Returns:

a boolean flag, True on successful deletion

delete_partitions(query: str)[source]#

Delete the partitions that match the query in an index layer.

The query must be in RSQL format, see also: jirutka/rsql-parser.

Parameters:

query – A string representing a RSQL query.

Returns:

True when the delete of the partitions succeeds.

Raises:

ValueError – when the delete of the partitions fails.
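
For illustration, a hedged sketch of such a delete; the ingestionTime index field is a placeholder for a field actually defined in the layer's index configuration, and the HRN and layer ID are placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
index_layer = catalog.get_layer("my-index-layer")                          # placeholder layer ID

# Delete all partitions whose (hypothetical) ingestionTime index field is below a given timestamp.
index_layer.delete_partitions(query="ingestionTime=lt=1633046400000")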

get_blob(data_handle: str, range_header: str | None = None, billing_tag: str | None = None, stream: bool = False, chunk_size: int = 102400) bytes | Iterator[bytes][source]#

Get blob (raw bytes) for given layer ID and data-handle from storage.

Parameters:
  • data_handle – The data handle identifies a specific blob so that you can get that blob’s contents.

  • range_header – an optional Range parameter to resume download of a large response

  • billing_tag – A string which is used for grouping billing records.

  • stream – whether to stream data.

  • chunk_size – the size to request each iteration when streaming data.

Returns:

a blob response as bytes or iterator of bytes if stream is True

get_partitions_metadata(query: str, adapter: Adapter | None = None, part: str | None = None, billing_tag: str | None = None, **kwargs) Iterator[IndexPartition] | pd.DataFrame[source]#

Get list of all partitions matching the query.

The query must be in RSQL format, see also: jirutka/rsql-parser.

Parameters:
  • query – the RSQL query

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • part – Indicates which part of the layer shall be queried.

  • billing_tag – A string which is used for grouping billing records.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of IndexPartition objects, or adapter-specific

get_parts(num_requested_parts: int = 1, billing_tag: str | None = None) dict[source]#

Return a list of Part IDs representing the layer parts that can be used to limit the scope of a query operation. This allows running parallel queries over multiple parts. The user provides the desired number of parts and the service returns a list of Part IDs. Note that in some cases the requested number of parts would make each part too small; in this case the service might return fewer parts than requested.

Parameters:
  • num_requested_parts – Indicates requested number of layer parts.

  • billing_tag – A string which is used for grouping billing records.

Returns:

dict of parts as per num_requested_parts.

put_blob(path_or_data: str | bytes | Path, publication: Publication | None = None, partition_id: str | None = None, data_handle: str | None = None, part_size: int = 50, fields: Dict[str, str | int | bool] = {}, additional_metadata: Dict[str, str] = {}, timestamp: int | None = None) Partition[source]#

Upload a blob to the durable blob service.

Parameters:
  • path_or_data – content to be uploaded, it must match the layer content type, if set.

  • publication – the publication this operation is part of

  • partition_id – partition identifier the blob relates to.

  • data_handle – data handle to use for the blob, if already available; if not available, an appropriate one is generated and returned.

  • part_size – An int representing the part size in MB when uploading in multiple parts; the minimum value is 5 MB and the maximum is 50 MB.

  • fields – A dict representing the fields of the index record for the data being uploaded; index layer only.

  • additional_metadata – A dict of additional metadata about the data being uploaded; index layer only.

  • timestamp – timestamp, in milliseconds since Unix epoch (1970-01-01T00:00:00 UTC)

Returns:

partition object referencing the uploaded data

read_partitions(query: str, decode: bool = True, adapter: Adapter | None = None, part: str | None = None, stream: bool = False, chunk_size: int = 102400, **kwargs) Iterator[Tuple[IndexPartition, bytes]] | Iterator[Tuple[IndexPartition, Iterator[bytes]]] | Iterator[Tuple[IndexPartition, Any]] | pd.DataFrame[source]#

Read all partition data matching the query.

The query must be in RSQL format, see also: jirutka/rsql-parser.

Parameters:
  • query – the RSQL query

  • decode – whether to decode the data through an adapter or return raw bytes

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • part – Indicates which part of the layer shall be queried.

  • stream – whether to stream data. This implies decode=False.

  • chunk_size – the size to request each iteration when streaming data.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of IndexPartition objects, each with its raw data when decode=False; adapter-specific otherwise

Raises:
  • ValueError – in case decoding is requested but the adapter does not support the content type of the layer requested, or invalid parameters

  • LayerConfigurationException – in case decoding is requested but the layer doesn’t have any content type configured
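
A hedged sketch of reading raw partition data with an RSQL query; the hour index field, HRN, and layer ID are placeholders, and decode=False keeps each payload as raw bytes.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
index_layer = catalog.get_layer("my-index-layer")                          # placeholder layer ID

# Iterate over the partitions whose (hypothetical) hour index field equals 42, as raw bytes.
for partition, data in index_layer.read_partitions(query="hour==42", decode=False):
    print(partition, len(data))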

set_partitions_metadata(update: Iterable[IndexPartition] | None = None, delete: Iterable[str] | None = None)[source]#

Update the metadata of the layer as part of a publication by publishing updated partitions and/or deleting partitions.

Parameters:
  • update – the complete partitions to update.

  • delete – the data handles to delete.

write_single_partition(data: str | Path | bytes | pd.DataFrame, timestamp: int | None = None, fields: Dict[str, str | int | bool] = {}, additional_metadata: Dict[str, str] = {}, part_size: int = 50, encode: bool = True, adapter: Adapter | None = None, **kwargs)[source]#

Upload content to the layer and publish the related partition metadata.

Parameters:
  • data – data to upload to the layer and derive metadata from.

  • timestamp – timestamp, in milliseconds since Unix epoch (1970-01-01T00:00:00 UTC)

  • fields – a dict representing the fields of index record for data being uploaded

  • additional_metadata – a dict of additional metadata about data being uploaded

  • part_size – An int representing the part size in MB when uploading in multiple parts; the minimum value is 5 MB and the maximum is 50 MB.

  • encode – whether to encode the data through an adapter or store raw bytes

  • adapter – the Adapter to transform the input data. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

class here.platform.layer.InteractiveMapLayer(layer_id: str, catalog: Catalog)[source]#

Bases: Layer

This class provides access to data stored in Interactive Map layers.

delete_feature(feature_id: str) None[source]#

Delete feature from the layer.

Parameters:

feature_id – A feature_id to be deleted.

delete_features(feature_ids: List[str] | pd.Series, **kwargs) None[source]#

Delete features from layer.

Parameters:
  • feature_ids – A list of feature_ids to be deleted, or adapter-specific

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

get_feature(feature_id: str, selection: List[str] | None = None, force_2d: bool = False) Feature[source]#

Return GeoJSON feature for the provided feature_id.

Parameters:
  • feature_id – Feature id which is to be fetched.

  • selection – A list of properties; only these properties will be present in the returned feature.

  • force_2d – If set to True then features in the response will have only X and Y components, else all x,y,z coordinates will be returned.

Returns:

Feature object.

get_features(feature_ids: List[str], selection: List[str] | None = None, force_2d: bool = False, **kwargs) FeatureCollection | gpd.GeoDataFrame[source]#

Return GeoJSON FeatureCollection for the provided feature_ids.

Parameters:
  • feature_ids – A list of feature identifiers to fetch.

  • selection – A list of properties; only these properties will be present in the returned features.

  • force_2d – If set to True then features in the response will have only X and Y components, else all x,y,z coordinates will be returned.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Raises:

ValueError – If feature_ids is empty list.

Returns:

FeatureCollection object, or adapter-specific
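
A small sketch, assuming the layer contains features with the placeholder identifiers shown below; the HRN and layer ID are also placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
imap_layer = catalog.get_layer("my-interactive-map-layer")                 # placeholder layer ID

# Fetch a single feature (2D only) and a collection of features restricted to the "name" property.
feature = imap_layer.get_feature(feature_id="feature-1", force_2d=True)
fc = imap_layer.get_features(feature_ids=["feature-1", "feature-2"], selection=["name"])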

get_features_in_bounding_box(bounds: Tuple[float, float, float, float], clip: bool = False, limit: int = 30000, params: Dict[str, str | list | tuple] | None = None, selection: List[str] | None = None, skip_cache: bool = False, clustering: HexbinClustering | QuadbinClustering | None = None, force_2d: bool = False, **kwargs) FeatureCollection | gpd.GeoDataFrame[source]#

Return the features which are inside a bounding box stipulated by bounds parameter.

Parameters:
  • bounds – A tuple of four numbers representing the West, South, East and North margins, respectively, of the bounding box.

  • clip – A Boolean indicating if the result should be clipped (default: False)

  • limit – A maximum number of features to return in the result. Default is 30000. Hard limit is 100000.

  • params

    A dict representing additional filters on the features to be searched.

    Properties prefixed with ‘p.’ are used to access values in the stored feature which are under the ‘properties’ property.

    • params={"p.name": "foo"} returns all features with a value of property p.name equal to foo.

    Properties prefixed with ‘f.’ are used to access values which are added by default in the stored feature. The possible values are: ‘f.id’, ‘f.createdAt’ and ‘f.updatedAt’.

    • params={"f.createdAt": 1634} returns all features with a value of property f.createdAt equal to 1634.

    The query can also be written by using the long operators: “=gte”, “=lte”, “=gt”, “=lt” and “=cs”.

    • params={"p.count=gte": 10} returns all features with a value of property p.count greater than or equal to 10.

    • params={"p.count=lte": 10} returns all features with a value of property p.count less than or equal to 10.

    • params={"p.count=gt": 10} returns all features with a value of property p.count greater than 10.

    • params={"p.count=lt": 10} returns all features with a value of property p.count less than 10.

    • params={"p.name=cs": "bar"} returns all features with a value of property p.name which contains bar.

  • selection – A list of properties; only these properties will be present in the returned features.

  • skip_cache – If set to True the response is not returned from cache. Default is False.

  • clustering – An object of either HexbinClustering or QuadbinClustering.

  • force_2d – If set to True then features in the response will have only X and Y components, else all x,y,z coordinates will be returned.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

FeatureCollection object, or adapter-specific

iter_features(chunk_size: int = 30000, selection: List[str] | None = None, skip_cache: bool = False, force_2d: bool = False) Iterator[Feature][source]#

Return all the features in the layer as a generator.

Parameters:
  • chunk_size – A number of features to return in single iteration.

  • selection – A list of properties; only these properties will be present in the returned features.

  • skip_cache – If set to True the response is not returned from cache. Default is False.

  • force_2d – If set to True then features in the response will have only X and Y components, else all x,y,z coordinates will be returned.

Yields:

A Feature object

search_features(limit: int = 30000, params: Dict[str, str | list | tuple] | None = None, selection: List[str] | None = None, skip_cache: bool = False, force_2d: bool = False, **kwargs) FeatureCollection | gpd.GeoDataFrame[source]#

Search for features in the layer based on the properties.

Parameters:
  • limit – A maximum number of features to return in the result. Default is 30000. Hard limit is 100000.

  • params

    A dict representing additional filters on the features to be searched.

    Properties prefixed with ‘p.’ are used to access values in the stored feature which are under the ‘properties’ property.

    • params={"p.name": "foo"} returns all features with a value of property p.name equal to foo.

    Properties prefixed with ‘f.’ are used to access values which are added by default in the stored feature. The possible values are: ‘f.id’, ‘f.createdAt’ and ‘f.updatedAt’.

    • params={"f.createdAt": 1634} returns all features with a value of property f.createdAt equal to 1634.

    The query can also be written by using the long operators: “=gte”, “=lte”, “=gt”, “=lt” and “=cs”.

    • params={"p.count=gte": 10} returns all features with a value of property p.count greater than or equal to 10.

    • params={"p.count=lte": 10} returns all features with a value of property p.count less than or equal to 10.

    • params={"p.count=gt": 10} returns all features with a value of property p.count greater than 10.

    • params={"p.count=lt": 10} returns all features with a value of property p.count less than 10.

    • params={"p.name=cs": "bar"} returns all features with a value of property p.name which contains bar.

  • selection – A list of properties; only these properties will be present in the returned features.

  • skip_cache – If set to True the response is not returned from cache. Default is False.

  • force_2d – If set to True then features in the response will have only X and Y components, else all x,y,z coordinates will be returned.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

FeatureCollection object, or adapter-specific
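
A hedged sketch of a property search; the property names, HRN, and layer ID are placeholders for values actually stored on the features.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
imap_layer = catalog.get_layer("my-interactive-map-layer")                 # placeholder layer ID

# Features whose p.country equals "DE" and whose p.count is greater than or equal to 10.
fc = imap_layer.search_features(
    params={"p.country": "DE", "p.count=gte": 10},
    limit=100,
)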

Return the features which are inside the specified radius.

Parameters:
  • lng – The longitude in WGS 84 decimal degrees (-180 to +180) of the center point.

  • lat – The latitude in WGS 84 decimal degrees (-90 to +90) of the center point.

  • radius – Radius in meters which defines the search area around the center point.

  • limit – The maximum number of features in the response. Default is 30000. Hard limit is 100000.

  • params

    A dict representing additional filters on the features to be searched.

    Properties prefixed with ‘p.’ are used to access values in the stored feature which are under the ‘properties’ property.

    • params={"p.name": "foo"} returns all features with a value of property p.name equal to foo.

    Properties prefixed with ‘f.’ are used to access values which are added by default in the stored feature. The possible values are: ‘f.id’, ‘f.createdAt’ and ‘f.updatedAt’.

    • params={"f.createdAt": 1634} returns all features with a value of property f.createdAt equal to 1634.

    The query can also be written by using the long operators: “=gte”, “=lte”, “=gt”, “=lt” and “=cs”.

    • params={"p.count=gte": 10} returns all features with a value of property p.count greater than or equal to 10.

    • params={"p.count=lte": 10} returns all features with a value of property p.count less than or equal to 10.

    • params={"p.count=gt": 10} returns all features with a value of property p.count greater than 10.

    • params={"p.count=lt": 10} returns all features with a value of property p.count less than 10.

    • params={"p.name=cs": "bar"} returns all features with a value of property p.name which contains bar.

  • selection – A list of properties; only these properties will be present in the returned features.

  • skip_cache – If set to True the response is not returned from cache. Default is False.

  • force_2d – If set to True then features in the response will have only X and Y components, else all x,y,z coordinates will be returned.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

FeatureCollection object, or adapter-specific

spatial_search_geometry(geometry: Feature | Geometry | dict | Any, radius: int | None = None, limit: int = 30000, params: Dict[str, str | list | tuple] | None = None, selection: List[str] | None = None, skip_cache: bool = False, force_2d: bool = False, **kwargs) FeatureCollection | gpd.GeoDataFrame[source]#

Return the features which are inside the specified radius and geometry.

The origin point is calculated based on the provided geometry.

Parameters:
  • geometry – Geometry which will be used in intersection. It supports GeoJSON Feature, GeoJSON Geometry, or __geo_interface__.

  • radius – Radius in meters which defines the search area around the provided geometry.

  • limit – The maximum number of features in the response. Default is 30000. Hard limit is 100000.

  • params

    A dict representing additional filters on the features to be searched.

    Properties prefixed with ‘p.’ are used to access values in the stored feature which are under the ‘properties’ property.

    • params={"p.name": "foo"} returns all features with a value of property p.name equal to foo.

    Properties prefixed with ‘f.’ are used to access values which are added by default in the stored feature. The possible values are: ‘f.id’, ‘f.createdAt’ and ‘f.updatedAt’.

    • params={"f.createdAt": 1634} returns all features with a value of property f.createdAt equal to 1634.

    The query can also be written by using the long operators: “=gte”, “=lte”, “=gt”, “=lt” and “=cs”.

    • params={"p.count=gte": 10} returns all features with a value of property p.count greater than or equal to 10.

    • params={"p.count=lte": 10} returns all features with a value of property p.count less than or equal to 10.

    • params={"p.count=gt": 10} returns all features with a value of property p.count greater than 10.

    • params={"p.count=lt": 10} returns all features with a value of property p.count less than 10.

    • params={"p.name=cs": "bar"} returns all features with a value of property p.name which contains bar.

  • selection – A list of properties; only these properties will be present in the returned features.

  • skip_cache – If set to True the response is not returned from cache. Default is False.

  • force_2d – If set to True then features in the response will have only X and Y components, else all x,y,z coordinates will be returned.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

FeatureCollection object, or adapter-specific
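
A minimal sketch using a plain GeoJSON geometry dict as the search area; the coordinates, the 500 m radius, the HRN, and the layer ID are arbitrary placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
imap_layer = catalog.get_layer("my-interactive-map-layer")                 # placeholder layer ID

# Search within a polygon, extended by a 500 m radius.
aoi = {
    "type": "Polygon",
    "coordinates": [[[13.0, 52.3], [13.6, 52.3], [13.6, 52.7], [13.0, 52.7], [13.0, 52.3]]],
}
fc = imap_layer.spatial_search_geometry(geometry=aoi, radius=500)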

property statistics: dict#

The statistical information of the layer.

subscribe(subscription_name: str, description: str, destination_catalog_hrn: str, destination_layer_id: str, interactive_map_subscription_type: InteractiveMapSubscriptionType) InteractiveMapSubscription[source]#

Subscribe to a stream layer from this layer’s catalog HRN. The source layer is the current layer and the source catalog is the catalog to which this layer belongs.

Parameters:
  • subscription_name – Name of the subscription.

  • description – Description of the subscription.

  • destination_catalog_hrn – Catalog HRN of the destination Catalog.

  • destination_layer_id – Layer Id of the destination Stream Layer.

  • interactive_map_subscription_type – InteractiveMapSubscriptionType containing the type of subscription.

Raises:
  • KeyError – in case statusToken is missing in the response of createSubscription.

  • ValueError – in case the created subscription status is NOT Active after multiple retries, up to the max retry time.

Returns:

InteractiveMapSubscription object containing details of the created subscription.

update_feature(feature_id: str, data: Feature | dict) None[source]#

Update the GeoJSON feature in the Layer.

Parameters:
  • feature_id – A feature_id to be updated.

  • data – A GeoJSON Feature object to update.

update_features(data: FeatureCollection | dict | gpd.GeoDataFrame, **kwargs) None[source]#

Update multiple features provided as FeatureCollection object.

Parameters:
  • data – A FeatureCollection, dict, or adapter-specific

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

write_feature(feature_id: str, data: Feature | dict) None[source]#

Write GeoJSON feature to Layer.

Parameters:
  • feature_id – Identifier for the feature.

  • data – GeoJSON feature which is written to layer.

write_features(features: FeatureCollection | dict | Iterator[Feature] | List[Feature] | gpd.GeoDataFrame | None = None, from_file: str | Path | None = None, feature_count: int = 2000, **kwargs) None[source]#

Write GeoJSON FeatureCollection to layer.

As the API has a limitation on the size of features, the features are divided into groups, and each group has a number of features based on feature_count.

Parameters:
  • features – Features represented by FeatureCollection, Dict, Iterator, list of features, or adapter-specific

  • from_file – Path of GeoJSON file.

  • feature_count – An int representing a number of features to upload at a time.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

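A small sketch of writing features; the feature id, properties, file path, HRN, and layer ID are placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
imap_layer = catalog.get_layer("my-interactive-map-layer")                 # placeholder layer ID

# Write a single GeoJSON feature...
imap_layer.write_feature(
    feature_id="feature-1",
    data={
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [13.4, 52.5]},
        "properties": {"name": "foo"},
    },
)

# ...or upload a whole GeoJSON file in chunks of 1000 features.
imap_layer.write_features(from_file="features.geojson", feature_count=1000)
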
class here.platform.layer.KafkaTokenProvider(stream_layer: StreamLayer)[source]#

Bases: AbstractTokenProvider

This class provides token to Kafka consumer and producer.

token()[source]#

Return the token for the Kafka consumer.

Returns:

token for the consumer

class here.platform.layer.Layer(layer_id: str, catalog: Catalog)[source]#

Bases: object

This base class provides access to data stored in catalog layers.

Instances can read their schemas for data stored in protobuf format, all available partition IDs as well as the raw data blobs inside such partitions. You have to use the Schema class to access the decoded protobuf data.

property configuration: LayerConfiguration#

The configuration of the layer

get_details() Dict[str, Any][source]#

Get layer details from the platform.

Returns:

a dictionary with the layer details

get_schema() Schema | None[source]#

Return the schema of the layer, if available.

This allows for parsing the partition data. It only works for layers which define a protobuf schema.

Returns:

a Schema instance

has_schema() bool[source]#

Check whether the layer has a schema defined. This does not obtain and register the schema.

Returns:

whether the layer has a schema
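
A minimal sketch of checking for a schema before decoding; the HRN and layer ID are placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
layer = catalog.get_layer("my-versioned-layer")                            # placeholder layer ID

# Only layers that define a protobuf schema can be decoded through it.
if layer.has_schema():
    schema = layer.get_schema()
    print(layer.configuration.content_type)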

is_index() bool[source]#

Check if this is an index layer.

Returns:

True if this is an index layer otherwise False

is_interactivemap() bool[source]#

Check if this is an interactive map layer.

Returns:

True if this is an interactive map layer otherwise False

is_objectstore() bool[source]#

Check if this is an objectstore layer.

Returns:

True if this is an objectstore layer otherwise False

is_stream() bool[source]#

Check if this is a stream layer.

Returns:

True if this is a stream layer otherwise False

is_versioned() bool[source]#

Check if this is a versioned layer.

Returns:

True if this is a versioned layer otherwise False

is_volatile() bool[source]#

Check if this is a volatile layer.

Returns:

True if this is a volatile layer otherwise False

open_in_portal()[source]#

Open the layer page on the HERE platform portal.

class here.platform.layer.LayerConfiguration(json_dict: Dict[str, Any])[source]#

Bases: JsonDictDocument

The configuration of a layer, including its most significant properties

property billing_tag: List[str] | None#

List of billing tags used for grouping billing records together for the layer

property content_encoding: str | None#

The content transfer encoding used to transfer blobs, typically gzip or empty

property content_type: str | None#

The MIME type of the blobs stored in the layer, e.g. application/x-protobuf.

property coverage: Coverage | None#

The geographic area that this layer covers

property created: datetime#

Timestamp, in ISO 8601 format, when the layer was initially created

property description: str | None#

A longer description of the layer

property hrn: str | None#

The HERE Resource Name (HRN) of the layer

property id: str | None#

The ID of the layer

property name: str#

The name of the layer

property partitioning: Partitioning | None#

Describes the way in which data is partitioned within the layer

property properties: Dict[str, Any] | None#

Additional properties depending on the layer type: a dict of layer-specific properties, or None if no extra properties are set.

property schema: Dict[str, str] | None#

Describes an HRN for the layer schema. Can be updated by the user for any kind of layer: a dict describing the schema, or None.

property summary: str | None#

The summary of the layer

property tags: List[str]#

List of user-defined tags applied to the layer

property type: LayerType#

The type of the layer.

property volume: Volume | None#

Describes the volume to be used for storing the layer’s data content

class here.platform.layer.LayerType(value)[source]#

Bases: Enum

LayerType enum defines the different layer types supported.

Supported types: versioned, index, volatile, stream, interactivemap, objectstore.

The string representation is lowercase, to match with strings used in the platform APIs.

INDEX = 2#
INTERACTIVEMAP = 5#
OBJECTSTORE = 6#
STREAM = 4#
UNKNOWN = 0#
VERSIONED = 1#
VOLATILE = 3#
classmethod from_str(s: str) LayerType[source]#

Create a LayerType from a string; this is case-insensitive.

class here.platform.layer.ObjectMetadata(key: str, last_modified: str | None, size: int | None, object_type: ObjectType, content_type: str | None, content_encoding: str | None)[source]#

Bases: object

Metadata and details of an object stored in an ObjectStoreLayer.

This includes, among others, object type and size, HTTP content type and last modified date.

content_encoding: str | None#
content_type: str | None#
key: str#
last_modified: str | None#
object_type: ObjectType#
size: int | None#
class here.platform.layer.ObjectStoreLayer(layer_id: str, catalog: Catalog)[source]#

Bases: Layer

This class provides access to data stored in object store layers.

MAX_UPLOAD_PART_SIZE = 96#
MIN_UPLOAD_PART_SIZE = 5#
copy_object(key: str, copy_from: str, replace: bool = False)[source]#

Copy an object within the object store layer, using the given source key to copy from.

Parameters:
  • key – key for the object to be created.

  • copy_from – key for the object to copy from.

  • replace – if True, replace the object while copying if the destination already exists. This replace is not atomic: if the delete succeeds and the subsequent put of the object fails, the object is gone.

Raises:

ValueError – in case the given key and copy_from are the same, or the destination already exists and replace=False.

delete_all_objects(parent_key: str = '/', strict: bool = False)[source]#

Delete all objects associated with the given parent key from the object store layer.

Parameters:
  • parent_key – parent key for the object to delete

  • strict – when True, raise a PlatformException if the object doesn’t exist, when False, no exception is raised

Raises:

PlatformException – if the platform responds with an HTTP error

delete_object(key: str, strict: bool = False)[source]#

Delete an object from the object store layer.

Parameters:
  • key – key for the object to delete

  • strict – when True, raise a PlatformException if the object doesn’t exist, when False, no exception is raised

Raises:

PlatformException – if the platform responds with an HTTP error

get_object_metadata(key: str) ObjectMetadata[source]#

Get the metadata of the object with the given key.

Parameters:

key – key of the object to fetch metadata of

Returns:

object metadata of the given object

get_objects_metadata(parent: str | None = None, limit: int = 1000, deep: bool = False) Iterator[ObjectMetadata][source]#

Iterate over the metadata of the objects stored in the layer.

Parameters:
  • parent – a string that tells what “directory” should be the root for the returned content. When not set, the root is assumed

  • deep – if True, returns also metadata from the subdirectories

  • limit – number of results to return per request call: a larger value performs larger but less frequent requests to the service, a smaller value performs shorter but more frequent requests to the service. The overall content retrieved is independent of this value. To limit the amount of metadata returned, simply filter the iterator or consume the iterator up to the number of elements wanted.

Returns:

an iterator of ObjectMetadata

is_directory(key: str) bool[source]#

Check if given key is a directory.

Parameters:

key – key of the object.

Returns:

True if the given key is a directory.

iter_keys(parent: str | None = None, deep: bool = False, limit: int = 1000) Iterator[str][source]#

Iterate over the keys of the objects stored in the layer.

Parameters:
  • parent – a string that tells what “directory” should be the root for the returned content. When not set, the root is assumed

  • deep – if True, returns also keys from the subdirectories

  • limit – number of results to return per request call: a larger value performs larger but less frequent requests to the service, a smaller value performs shorter but more frequent requests to the service. The overall content retrieved is independent of this value. To limit the amount of keys returned, simply filter the iterator or consume the iterator up to the number of elements wanted.

Returns:

an iterator of object keys
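
A small sketch of iterating keys, assuming an object store layer and a placeholder "images/" prefix; the HRN and layer ID are also placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
obj_layer = catalog.get_layer("my-objectstore-layer")                      # placeholder layer ID

# Walk all keys below the "images/" prefix, including subdirectories.
for key in obj_layer.iter_keys(parent="images/", deep=True):
    print(key)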

key_exists(key: str) bool[source]#

Check if the layer contains an object with the given key.

Parameters:

key – the object key to check

Returns:

True if the layer contains an object with the given key

list_keys(parent: str | None = None, deep: bool = False) List[str][source]#

List the keys of the objects stored in the layer.

Parameters:
  • parent – a string that tells what “directory” should be the root for the returned content. When not set, the root is assumed

  • deep – if True, returns also keys from the subdirectories

Returns:

a list of object keys

read_object(key: str, include_metadata: bool = False, stream: bool = False, chunk_size: int = 102400) bytes | Iterator[bytes] | Tuple[bytes | Iterator[bytes], ObjectMetadata][source]#

Read and return the content of an object.

Optionally, also return the corresponding object metadata.

Parameters:
  • key – key for the object to read

  • include_metadata – whether to also return the object metadata

  • stream – whether to stream data

  • chunk_size – the size to request each iteration when streaming data

Returns:

the content of the object and, if requested, also its metadata

set_max_upload_part_size(size: int)[source]#

Sets the maximum size of uploaded parts in MB (megabytes).

Parameters:

size – max. size of uploaded parts.

write_object(key: str, path_or_data: str | Path | bytes, content_type: str = 'application/octet-stream', overwrite: bool = True, upload_part_size: int | None = None, content_encoding: str | None = None)[source]#

Write an object to the object store layer. If file/bytes size is larger than max. upload part size then the blob will be written in multiple parts.

This function adds a new object or overwrites an existing object.

Parameters:
  • key – key for the object to write.

  • path_or_data – data to be written.

  • content_type – the standard MIME type describing the format of the data.

  • overwrite – if True, this method will overwrite the object if the key exists; if False, it will raise an error if the key exists.

  • upload_part_size – optional size of upload parts; if not specified the class’ default is used

  • content_encoding – Content-encoding of the object. This header is optional. For more information, see https://tools.ietf.org/html/rfc2616#section-14.11

Returns:

None

Raises:

ValueError – in case the file does not exist or upload_part_size is out of range.
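
A hedged sketch of a write/read round trip; the key, payload, HRN, and layer ID are placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
obj_layer = catalog.get_layer("my-objectstore-layer")                      # placeholder layer ID

# Write a small JSON payload, then read it back together with its metadata.
obj_layer.write_object(
    key="reports/summary.json",
    path_or_data=b'{"status": "ok"}',
    content_type="application/json",
)
data, metadata = obj_layer.read_object(key="reports/summary.json", include_metadata=True)
print(metadata.size, metadata.content_type)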

class here.platform.layer.ObjectType(value)[source]#

Bases: Enum

ObjectType defines the different types of object stored in an ObjectStoreLayer.

DIRECTORY = 'commonPrefix'#
OBJECT = 'object'#
class here.platform.layer.QuadbinClustering(clustering_type: str = 'quadbin', no_buffer: bool = False, relative_resolution: int | None = None, resolution: int | None = None, countmode: str | None = None)[source]#

Bases: object

This class defines the attributes of the quadbin clustering algorithm.

clustering_type: str = 'quadbin'#
countmode: str | None = None#
no_buffer: bool = False#
relative_resolution: int | None = None#
resolution: int | None = None#
class here.platform.layer.StreamIngestion(json_dict: Dict[str, Any])[source]#

Bases: JsonDictDocument

Response of the stream layer to confirm successful data ingestion.

property message_ids: List[str]#

The identifiers assigned to each SDII message ingested.

property message_list_id: str#

The identifier assigned to the ingested SDII message list.

class here.platform.layer.StreamLayer(layer_id: str, catalog: Catalog)[source]#

Bases: Layer

This class provides access to data stored in stream layers.

append_stream_metadata(partitions: Iterable[StreamPartition] | pd.DataFrame, publication: Publication | None = None, adapter: Adapter | None = None, **kwargs) None[source]#

Append new partition metadata to the stream layer directly as messages to the stream.

Parameters:
  • publication – the publication this operation is part of

  • partitions – the partitions to append as messages, or adapter-specific

  • adapter – the Adapter to transform the input. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

blob_exists(data_handle: str, billing_tag: str | None = None) bool[source]#

Check if a blob exists for the requested data handle.

Parameters:
  • data_handle – The data handle identifies a specific blob so that you can get that blob’s contents.

  • billing_tag – A string which is used for grouping billing records.

Returns:

a boolean indicating if the handle exists.

get_kafka_topic() str[source]#

Return the Kafka topic for the stream layer.

Returns:

topic

get_stream_metadata(subscription: StreamSubscription, commit_offsets: bool = True, adapter: Adapter | None = None, **kwargs) Iterator[StreamPartition] | pd.DataFrame[source]#

Consume metadata for a subscription.

It does not download blobs, use read_stream for that.

The function consumes and returns messages for the stream subscription. The number of messages retrieved depends on a variety of factors, and it is not possible to assume that all the available messages are returned in a single invocation: users should invoke this function multiple times to read all the content present in the stream, for as long as data keeps being returned, if that is what is wanted.

When no more data is returned, users can reasonably assume the end of stream is reached. However, when operating with a distributed, asynchronous messaging system like the one employed in this case, producers can append new messages at any point in time and there may be a delay between the moment when data is produced and the moment when data is available for consumption.

While no message is lost, the end of stream can’t always be detected reliably.

Parameters:
  • subscription – the subscription from where to consume the data

  • commit_offsets – automatically commit offsets so next read starts at the end of the last consumed message.

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of StreamPartition objects, or adapter-specific

Raises:

ValueError – in case the subscription is invalid

kafka_consumer(group_id: str | None = None, **kwargs) KafkaConsumer[source]#

Instantiate and return a new KafkaConsumer pre-configured to operate with the layer.

Parameters:
  • group_id – the consumer group id

  • kwargs – Kafka consumer properties.

Returns:

Kafka consumer

kafka_producer(**kwargs) KafkaProducer[source]#

Instantiate and return a new KafkaProducer pre-configured to operate with the layer.

Parameters:

kwargs – Kafka producer properties.

Returns:

Kafka producer

put_blob(path_or_data: str | bytes | Path, partition_id: str | None = None, data_handle: str | None = None, inline_stream_data_limit: int | None = 1048576) Partition[source]#

Upload a blob to the durable blob service.

Parameters:
  • path_or_data – content to be uploaded, it must match the layer content type, if set.

  • partition_id – partition identifier the blob relates to.

  • data_handle – data handle to use for the blob, if already available; if not available, an appropriate one is generated and returned.

  • inline_stream_data_limit – threshold data size in bytes that decides whether the inline stream data field is populated: if the data size is less than inline_stream_data_limit, the data is added to the StreamPartition.data field; otherwise the blob is uploaded and its data handle is added to the StreamPartition.data_handle field.

Returns:

partition object referencing the uploaded data

read_stream(subscription: StreamSubscription, commit_offsets: bool = True, decode: bool = True, adapter: Adapter | None = None, stream: bool = False, chunk_size: int = 102400, **kwargs) Iterator[Tuple[StreamPartition, bytes]] | Iterator[Tuple[StreamPartition, Iterator[bytes]]] | Iterator[Tuple[StreamPartition, Any]] | pd.DataFrame[source]#

Consume data for this subscription. Download and decode the blobs.

The function consumes and returns messages for the stream subscription. The number of messages retrieved depends on a variety of factors, and it is not possible to assume that all the available messages are returned in a single invocation: users should invoke this function multiple times to read all the content present in the stream, for as long as data keeps being returned, if that is what is wanted.

When no more data is returned, users can reasonably assume the end of stream is reached. However, when operating with a distributed, asynchronous messaging system like the one employed in this case, producers can append new messages at any point in time and there may be a delay between the moment when data is produced and the moment when data is available for consumption.

While no message is lost, the end of stream can’t always be detected reliably.

Parameters:
  • subscription – the subscription from where to consume the data

  • commit_offsets – automatically commit offset so next read starts at the end of the last message.

  • decode – whether to decode the data through an adapter or return raw bytes

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • stream – whether to stream data. This implies decode=False.

  • chunk_size – the size to request each iteration when streaming data.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of StreamPartition objects, each with its raw data when decode=False; adapter-specific otherwise

Raises:
  • ValueError – in case the subscription is invalid

  • ValueError – in case decoding is requested but the adapter does not support the content type of the layer requested, or invalid parameters

  • LayerConfigurationException – in case decoding is requested but the layer doesn’t have any content type configured

subscribe(mode: Mode = Mode.SERIAL, consumer_id: str | None = None, kafka_consumer_properties: dict | None = None, group_id: str | None = None, auto_offset_reset: str | None = None, subscription_id: str | None = None) StreamSubscription[source]#

Enable message consumption for this layer.

Parameters:
  • mode – The subscription mode for this subscription. By default, the value is serial.

  • consumer_id – The Id to use to identify this consumer. It must be unique within the consumer group. If you do not provide one, the system will generate one.

  • kafka_consumer_properties – Properties to configure the kafka consumer on the service

  • group_id – set the consumer group id

  • auto_offset_reset

    Seek to a predefined location in the stream:

    • earliest: automatically reset the offset to the earliest offset.

    • latest: automatically reset the offset to the latest offset.

    • none: the service will return an error if no previous offset is found for the consumer’s group.

  • subscription_id – subscription id returned from a previous call to subscribe(). This allows a previously created subscription (e.g. saved to persistent storage between application runs) to be restored.

For other available Kafka consumer settings, see https://kafka.apache.org/documentation/#consumerconfigs.

Returns:

a new subscription to the stream layer
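
A hedged sketch of a consume loop; it reads one batch of messages as raw bytes and always releases the subscription. The HRN and layer ID are placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
stream_layer = catalog.get_layer("my-stream-layer")                        # placeholder layer ID

subscription = stream_layer.subscribe()  # serial subscription with default settings
try:
    # One invocation returns one batch; call read_stream again for more data.
    for partition, data in stream_layer.read_stream(subscription, decode=False):
        print(partition, len(data))
finally:
    subscription.unsubscribe()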

write_stream(data: Iterable[Tuple[str | int, str | Path | bytes] | Tuple[str | int, str | Path | bytes, int | None]] | Mapping[str | int, str | Path | bytes] | Iterable[Tuple[str | int, Any] | Tuple[str | int, Any, int | None]] | Mapping[str | int, Any] | pd.DataFrame, timestamp: int | None = None, encode: bool = True, inline_data_limit: int = 819200, adapter: Adapter | None = None, **kwargs)[source]#

Write new content to the layer and push the related partition metadata to the stream as part of a publication.

Parameters:
  • data – data to upload to the layer and derive metadata from: a sequence of elements, each either (id, data) or (id, data, timestamp). Timestamp is optional and in milliseconds since Unix epoch (1970-01-01T00:00:00 UTC)

  • encode – whether to encode the data or upload raw bytes

  • timestamp – optional timestamp for all the messages, if none is specified in data: in milliseconds since Unix epoch (1970-01-01T00:00:00 UTC)

  • inline_data_limit – threshold data size in bytes that decides whether the inline stream data field is populated: if the data size is less than inline_data_limit, the data is added to the StreamPartition.data field; otherwise the blob is uploaded and its data handle is added to the StreamPartition.data_handle field. Default is 819200 bytes.

  • adapter – the Adapter to transform the input. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports
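
A minimal sketch of writing raw messages; the partition ids, payloads, HRN, and layer ID are placeholders, and the optional per-message timestamp is in milliseconds since the Unix epoch.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
stream_layer = catalog.get_layer("my-stream-layer")                        # placeholder layer ID

# Two messages: (id, data) and (id, data, timestamp); encode=False uploads the bytes as-is.
stream_layer.write_stream(
    data=[
        ("partition-1", b"payload-1"),
        ("partition-2", b"payload-2", 1633046400000),
    ],
    encode=False,
)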

class here.platform.layer.StreamSubscription(layer: StreamLayer, sub_id: str, sub_mode: Mode, node_base_url: str)[source]#

Bases: object

Represent a subscription to consume data from a stream layer.

The subscription must be closed by unsubscribing to free resources on the service.

class Mode(value)[source]#

Bases: Enum

Mode of a stream subscription.

PARALLEL = 'parallel'#
SERIAL = 'serial'#
commit_offsets(offsets: Dict[int, int])[source]#

Commit specified offsets once read is done.

Parameters:

offsets – Dict of offsets, {<Partition ID>: <Offset Number>, <Partition ID>: <Offset Number>}. The partition id is the Kafka partition id.

seek_to_offsets(offsets: Dict[int, int])[source]#

Seek to stream offsets for a stream layer subscription. It will start reading data from the specified offsets.

Parameters:

offsets – Dict of offsets, {<Partition ID>: <Offset Number>, <Partition ID>: <Offset Number>}. The partition id is the Kafka partition id.

unsubscribe(strict: bool = False)[source]#

Disable message consumption for this layer.

After unsubscribing, you need to subscribe to the stream layer again to be able to resume the data consumption.

Parameters:

strict – True to require that the subscription exists, False to allow it to have already been cancelled.

class here.platform.layer.VersionedLayer(layer_id: str, catalog: Catalog)[source]#

Bases: Layer

This class provides access to data stored in versioned layers.

blob_exists(data_handle: str, billing_tag: str | None = None) bool[source]#

Check if a blob exists for the requested data handle.

Parameters:
  • data_handle – The data handle identifies a specific blob so that you can get that blob’s contents.

  • billing_tag – A string which is used for grouping billing records.

Returns:

a boolean indicating if the handle exists.

get_age_map(data_level: str) VersionedLayerStatisticsMap[source]#

Retrieve layer age map.

Parameters:

data_level – One of the Data Levels configured for this layer.

By default, assets generated at the deepest data level are returned. Note that assets returned for data levels greater than 11 represent data at data level 11.

Returns:

VersionedLayerStatisticsMap object containing properties data, image.

get_blob(data_handle: str, range_header: str | None = None, billing_tag: str | None = None, stream: bool = False, chunk_size: int = 102400) bytes | Iterator[bytes][source]#

Get blob (raw bytes) for given layer ID and data-handle from storage.

Parameters:
  • data_handle – The data handle identifies a specific blob so that you can get that blob’s contents.

  • stream – whether to stream data.

  • chunk_size – the size to request each iteration when streaming data.

  • range_header – an optional Range parameter to resume download of a large response

  • billing_tag – A string which is used for grouping billing records.

Returns:

a blob response as bytes or iterator of bytes if stream is True

get_partition_changes(since_version: int | None = None, version: int | None = None, part: str | None = None, additional_fields: List[str] | None = ['dataSize', 'checksum', 'compressedDataSize', 'crc'], adapter: Adapter | None = None, **kwargs) Iterator[VersionedPartition] | pd.DataFrame[source]#

Get the list of partition changes for the catalog between the given versions.

Parameters:
  • since_version – version from which partitions need to be tracked.

  • version – the catalog version. If not specified, the latest catalog version will be used

  • part – indicates which part of the layer shall be queried. If not specified, return all the partitions. It cannot be specified together with partition_ids

  • additional_fields – Additional metadata fields dataSize, checksum, compressedDataSize, crc. By default considers all.

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of VersionedPartition objects, or adapter-specific

get_partitions_metadata(partition_ids: List[int | str] | None = None, version: int | None = None, part: str | None = None, additional_fields: List[str] | None = ['dataSize', 'checksum', 'compressedDataSize', 'crc'], adapter: Adapter | None = None, **kwargs) Iterator[VersionedPartition] | pd.DataFrame[source]#

Get list of all partition objects for the catalog with the given version.

Parameters:
  • partition_ids – The list of partition IDs. If not specified, all partitions are returned

  • version – the catalog version. If not specified, the latest catalog version will be used

  • part – indicates which part of the layer shall be queried. If not specified, return all the partitions. It cannot be specified together with partition_ids

  • additional_fields – Additional metadata fields dataSize, checksum, compressedDataSize, crc. By default considers all.

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of VersionedPartition objects, or adapter-specific

Raises:

ValueError – in case of invalid parameter combination

get_size_map(data_level: str) VersionedLayerStatisticsMap[source]#

Retrieve layer size map.

Parameters:

data_level – One of the Data Levels configured for this layer.

By default, assets generated at the deepest data level are returned. Note that assets returned for data levels greater than 11 represent data at data level 11.

Returns:

VersionedLayerStatisticsMap object containing properties data, image.

get_statistics() VersionedLayerStatistics[source]#

Retrieve layer statistics.

Returns:

VersionedLayerStatistics object containing layer statistics.

get_tile_map(data_level: str) VersionedLayerStatisticsMap[source]#

Retrieve layer tile map.

Parameters:

data_level – One of the Data Levels configured for this layer.

By default, assets generated at the deepest data level are returned. Note that assets returned for data levels greater than 11 represent data at data level 11.

Returns:

VersionedLayerStatisticsMap object containing properties data, image.

put_blob(path_or_data: str | bytes | Path, publication: Publication | None = None, partition_id: str | None = None, data_handle: str | None = None) Partition[source]#

Upload a blob to the durable blob service.

Parameters:
  • path_or_data – content to be uploaded, it must match the layer content type, if set.

  • publication – the publication this operation is part of

  • partition_id – partition identifier the blob relates to.

  • data_handle – data handle to use for the blob, if already available; if not available, an appropriate one is generated and returned.

Returns:

partition object referencing the uploaded data

read_partitions(partition_ids: List[int | str] | None = None, version: int | None = None, part: str | None = None, decode: bool = True, adapter: Adapter | None = None, stream: bool = False, chunk_size: int = 102400, **kwargs) Iterator[Tuple[VersionedPartition, bytes]] | Iterator[Tuple[VersionedPartition, Iterator[bytes]]] | Iterator[Tuple[VersionedPartition, Any]] | pd.DataFrame[source]#

Read partition data from a layer.

Parameters:
  • partition_ids – The list of partition IDs. If not specified, all partitions are read.

  • version – the catalog version. If not specified, the latest catalog version will be used.

  • part – indicates which part of the layer shall be queried. If not specified, return all the partitions. It cannot be specified together with partition_ids

  • decode – whether to decode the data through an adapter or return raw bytes

  • stream – whether to stream data. This implies decode=False.

  • chunk_size – the size to request each iteration when streaming data.

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of VersionedPartition objects, each with its raw data when decode=False; adapter-specific otherwise

Raises:
  • ValueError – in case decoding is requested but the adapter does not support the content type of the layer requested, or invalid parameters

  • LayerConfigurationException – in case decoding is requested but the layer doesn’t have any content type configured
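
A small sketch of reading raw partition blobs from the latest catalog version; the partition IDs, HRN, and layer ID are placeholders.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
versioned_layer = catalog.get_layer("my-versioned-layer")                  # placeholder layer ID

# Read two specific partitions of the latest version as raw bytes.
for partition, data in versioned_layer.read_partitions(
    partition_ids=[377893751, 377893752], decode=False
):
    print(partition, len(data))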

set_partitions_metadata(publication: Publication, update: None | Iterable[VersionedPartition] | pd.DataFrame = None, delete: None | Iterable[str | int] | pd.Series = None, adapter: Adapter | None = None, **kwargs)[source]#

Update the metadata of the layer as part of a publication by publishing updated partitions and/or deleting partitions.

Parameters:
  • publication – the publication this operation is part of

  • update – the complete partitions to update, if any, or adapter-specific

  • delete – the partition ids to delete, if any, or adapter-specific

  • adapter – the Adapter to transform the input. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

write_partitions(publication: Publication, data: Iterable[Tuple[str | int, str | Path | bytes]] | Mapping[str | int, str | Path | bytes] | Iterable[Tuple[str | int, Any]] | Mapping[str | int, Any] | pd.DataFrame, encode: bool = True, adapter: Adapter | None = None, **kwargs)[source]#

Upload content to the layer and publish the related partition metadata as part of a publication.

Parameters:
  • publication – the publication this operation is part of

  • data – data to upload to the versioned layer, or adapter-specific

  • encode – whether to encode the data or upload raw bytes.

  • adapter – the Adapter to transform the input. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports
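
A hedged sketch of publishing raw partition data; how the Publication is opened and finalized depends on the Catalog and Publication APIs, so the new_publication() and submit() calls below are hypothetical placeholders to be checked against their own documentation, and the HRN, layer ID, and partition id are placeholders as well.

from here.platform import Platform

catalog = Platform().get_catalog(hrn="hrn:here:data::example:my-catalog")  # placeholder HRN
versioned_layer = catalog.get_layer("my-versioned-layer")                  # placeholder layer ID

publication = catalog.new_publication(layer_ids=["my-versioned-layer"])    # hypothetical helper
versioned_layer.write_partitions(
    publication=publication,
    data={377893751: b"raw-tile-bytes"},  # partition id -> raw payload
    encode=False,
)
publication.submit()                                                        # hypothetical finalizer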

class here.platform.layer.VersionedLayerStatistics(json_dict: Dict[str, Any])[source]#

Bases: JsonDictDocument

Response of the versioned layer containing layer statistics.

property level_summary: Dict[int, VersionedLevelSummary]#

The summary of each level.

class here.platform.layer.VersionedLayerStatisticsMap(data: bytes)[source]#

Bases: object

Response of the versioned layer containing the layer bitmap (bytes) data.

property data: bytes#

The raw data bytes.

property image#

The representation of the raw bytes as an IPython.display.Image. Note: IPython toolkit needs to be installed.

Returns:

IPython.display.Image

Raises:

RuntimeError – in case IPython toolkit is not installed

class here.platform.layer.VersionedLevelSummary(json_dict: Dict[str, Any])[source]#

Bases: JsonDictDocument

Response of the versioned layer containing level summary.

property bounding_box: dict#

The bounding box of the level.

property max_partition_size: int#

The maximum partition size of the level.

property min_partition_size: int#

The minimum partition size of the level.

property processed_timestamp: int#

The processed timestamp of the level.

property size: int#

The size in bytes of the level.

property total_partitions: int#

The total number of partitions in the level.

property version: int#

The version of the level.

class here.platform.layer.VolatileLayer(layer_id: str, catalog: Catalog)[source]#

Bases: Layer

This class provides access to data stored in volatile layers.

blob_exists(data_handle: str, billing_tag: str | None = None) bool[source]#

Check if a blob exists for the requested data handle.

Parameters:
  • data_handle – The data handle identifies a specific blob so that you can get that blob’s contents.

  • billing_tag – A string which is used for grouping billing records.

Returns:

a boolean indicating if the handle exists.

delete_partitions(publication: Publication, partitions: Iterable[VolatilePartition])[source]#

Delete content from selected partitions of the layer.

Parameters:
  • publication – the publication this operation is part of

  • partitions – identifiers of the volatile partitions to delete

get_blob(data_handle: str, billing_tag: str | None = None, stream: bool = False, chunk_size: int = 102400) bytes | Iterator[bytes][source]#

Get blob (raw bytes) for given layer ID and data-handle from storage.

Parameters:
  • data_handle – The data handle identifies a specific blob so that you can get that blob’s contents.

  • billing_tag – A string which is used for grouping billing records.

  • stream – whether to stream data

  • chunk_size – the size to request each iteration when streaming data.

Returns:

a blob response as bytes or iterator of bytes if stream is True

get_partitions_metadata(partition_ids: List[int | str] | None = None, additional_fields: List[str] | None = ['dataSize', 'checksum', 'compressedDataSize', 'crc'], adapter: Adapter | None = None, **kwargs) Iterator[VolatilePartition] | pd.DataFrame[source]#

Get list of all partition objects for the catalog.

Parameters:
  • partition_ids – The list of partition IDs. If not specified, all partitions are read.

  • additional_fields – Additional metadata fields dataSize, checksum, compressedDataSize, crc. By default considers all.

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of VolatilePartition objects, or adapter-specific

put_blob(path_or_data: str | bytes | Path, publication: Publication | None = None, partition_id: str | None = None, data_handle: str | None = None) Partition[source]#

Upload a blob to the volatile blob service.

Parameters:
  • path_or_data – content to be uploaded, it must match the layer content type, if set.

  • publication – the publication this operation is part of

  • partition_id – partition identifier the blob relates to

  • data_handle – data handle to use for the blob, if already available; if not available, an appropriate one is generated and returned.

Returns:

partition object referencing the uploaded data

read_partitions(partition_ids: List[int | str] | None = None, decode: bool = True, adapter: Adapter | None = None, stream: bool = False, chunk_size: int = 102400, **kwargs) Iterator[Tuple[VolatilePartition, bytes]] | Iterator[Tuple[VolatilePartition, Iterator[bytes]]] | Iterator[Tuple[VolatilePartition, Any]] | pd.DataFrame[source]#

Read partition data from a layer.

Parameters:
  • partition_ids – The list of partition IDs. If not specified, all partitions are read.

  • decode – whether to decode the data through an adapter or return raw bytes

  • stream – whether to stream data. This implies decode=False.

  • adapter – the Adapter to transform and return the result. None to use the default adapter of the catalog.

  • chunk_size – the size to request each iteration when streaming data.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

Returns:

Iterator of VolatilePartition objects, each with its raw data when decode=False; adapter-specific otherwise

Raises:
  • ValueError – in case decoding is requested but the adapter does not support the content type of the layer requested, or invalid parameters

  • LayerConfigurationException – in case decoding is requested but the layer doesn’t have any content type configured

set_partitions_metadata(publication: Publication, update: None | Iterable[VolatilePartition] | pd.DataFrame = None, delete: None | Iterable[str | int] | pd.Series = None, adapter: Adapter | None = None, **kwargs) None[source]#

Update the metadata of the layer as part of a publication by publishing updated partitions and/or deleting partitions.

Parameters:
  • publication – the publication this operation is part of

  • update – the complete partitions to update, if any, or adapter-specific

  • delete – the partition ids to delete, if any, or adapter-specific

  • adapter – the Adapter to transform the input. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports

write_partitions(publication: Publication, data: Iterable[Tuple[str | int, str | Path | bytes]] | Mapping[str | int, str | Path | bytes] | Iterable[Tuple[str | int, Any]] | Mapping[str | int, Any] | pd.DataFrame, encode: bool = True, adapter: Adapter | None = None, **kwargs) None[source]#

Upload content to the layer and publish the related partition metadata as part of a publication.

Parameters:
  • publication – the publication this operation is part of.

  • data – data to upload to the volatile layer, or adapter-specific

  • encode – whether to encode the data or upload raw bytes.

  • adapter – the Adapter to transform the input. None to use the default adapter of the catalog.

  • kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports