here.geopandas_adapter.geopandas_adapter module — HERE Data SDK for Python documentation
HERE Platform Python SDK, GeoPandas adapter access package
- class here.geopandas_adapter.geopandas_adapter.GeoPandasAdapter(partition_column: str = 'partition_id', timestamp_column: str = 'partition_timestamp', including_default_value_fields: bool = True, preserving_proto_field_name: bool = True)[source]#
Bases:
Adapter
This adapter transforms data from and to pd.DataFrame and gpd.GeoDataFrame, when geometry information such as longitude and latitude is involved.
An adapter controls the encoding and decoding process of platform data. It transforms data from and to adapter-specific data structures and supports reading, writing, encoding and decoding a variety of MIME content types.
For the list of MIME content types supported when reading and writing a layer with the read_* and write_* functions of Layer and its subclasses, please see the documentation of GeoPandasDecoder and GeoPandasEncoder.
All operations involving content pass through an adapter when the parameters encode or decode are True, their default value. These are parameters of the read_* and write_* functions. If a content type is not supported, or if reading or writing raw content is preferred, pass False to skip encoding or decoding and deal with raw bytes instead.
- property content_adapter: ContentAdapter#
The adapter specialized for content.
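A minimal sketch of plugging the adapter into the SDK. The Platform entry point, the adapter= keyword and the read_partitions() call are assumptions based on typical HERE Data SDK for Python usage, shown only to illustrate where the adapter sits; consult the SDK documentation for the exact API.
```python
from here.geopandas_adapter.geopandas_adapter import GeoPandasAdapter
from here.platform import Platform  # assumed import path

adapter = GeoPandasAdapter(partition_column="partition_id")
platform = Platform(adapter=adapter)  # assumed keyword argument

catalog = platform.get_catalog("hrn:here:data::realm:example")  # hypothetical HRN
layer = catalog.get_layer("example-layer")  # hypothetical layer id

# With decode=True (the default), the adapter turns raw blobs into a
# pd.DataFrame or gpd.GeoDataFrame; pass decode=False to get raw bytes.
df = layer.read_partitions(["p1"])  # assumed read_* function name
```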
- from_feature_ids(feature_ids: Iterator[str], **kwargs) Series [source]#
Adapt a sequence of feature identifiers to a Series.
- Parameters:
feature_ids – sequence of feature identifiers
kwargs – additional parameters are passed unchanged to pd.Series(). For additional information, please see: https://pandas.pydata.org/docs/reference/api/pandas.Series.html
- Returns:
a Series with the feature identifiers
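A short example of this method; the name= keyword is one of the kwargs forwarded unchanged to pd.Series().
```python
from here.geopandas_adapter.geopandas_adapter import GeoPandasAdapter

adapter = GeoPandasAdapter()

# Feature identifiers become the values of the resulting pandas Series.
ids = adapter.from_feature_ids(iter(["f-1", "f-2", "f-3"]), name="feature_id")
print(ids.tolist())  # ['f-1', 'f-2', 'f-3']
```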
- from_geo_features(features: Iterator[Feature], **kwargs) GeoDataFrame [source]#
Adapt a sequence of geographic features to a GeoDataFrame.
- Parameters:
features – sequence of geographic features
kwargs – additional parameters are passed unchanged to gpd.GeoDataFrame.from_features(). For additional information, please see: https://geopandas.org/docs/reference/api/geopandas.GeoDataFrame.from_features.html
- Returns:
a new gpd.GeoDataFrame containing the features
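A hedged sketch of this method. Passing plain GeoJSON-like dicts in place of the SDK's Feature objects is an assumption: from_geo_features forwards to gpd.GeoDataFrame.from_features(), which accepts such mappings, and crs= is a documented from_features keyword.
```python
from here.geopandas_adapter.geopandas_adapter import GeoPandasAdapter

adapter = GeoPandasAdapter()

features = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [13.4, 52.5]},
        "properties": {"name": "Berlin"},
    },
]
# kwargs such as crs= are forwarded to gpd.GeoDataFrame.from_features().
gdf = adapter.from_geo_features(iter(features), crs="EPSG:4326")
print(gdf)
```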
- from_index_data(partitions_data: Iterator[Tuple[IndexPartition, bytes]], content_type: str, schema: Schema | None, **kwargs) DataFrame [source]#
Adapt index partition metadata and data to a pd.DataFrame.
- Parameters:
partitions_data – sequence of partition metadata and data from an index layer
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see
GeoPandasDecoder
- Returns:
partition data as pd.DataFrame or gpd.GeoDataFrame
- from_index_metadata(partitions: Iterator[IndexPartition], **kwargs) DataFrame [source]#
Adapt index partition metadata to a pd.DataFrame.
- Parameters:
partitions – sequence of partition metadata from an index layer
kwargs – unused
- Returns:
partition metadata as pd.DataFrame
- from_stream_data(partitions_data: Iterator[Tuple[StreamPartition, bytes]], content_type: str, schema: Schema | None, **kwargs) DataFrame [source]#
Adapt stream partition metadata and data to a pd.DataFrame.
- Parameters:
partitions_data – sequence of partition metadata and data from a stream layer
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see
GeoPandasDecoder
- Returns:
stream message data as pd.DataFrame or gpd.GeoDataFrame
- from_stream_metadata(partitions: Iterator[StreamPartition], **kwargs) DataFrame [source]#
Adapt stream partition metadata to a pd.DataFrame.
- Parameters:
partitions – sequence of partition metadata from a stream layer
kwargs – unused
- Returns:
partition metadata as pd.DataFrame
- from_versioned_data(partitions_data: Iterator[Tuple[VersionedPartition, bytes]], content_type: str, schema: Schema | None, **kwargs) DataFrame [source]#
Adapt versioned partition metadata and data to a pd.DataFrame.
- Parameters:
partitions_data – sequence of partition metadata and data from a versioned layer
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see
GeoPandasDecoder
- Returns:
partition data as pd.DataFrame or gpd.GeoDataFrame
- from_versioned_metadata(partitions: Iterator[VersionedPartition], **kwargs) DataFrame [source]#
Adapt versioned partition metadata to a pd.DataFrame.
- Parameters:
partitions – sequence of partition metadata from a versioned layer
kwargs – unused
- Returns:
partition metadata as pd.DataFrame
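A hedged sketch of adapting versioned metadata. VersionedPartition objects normally come from a platform layer; get_partitions_metadata() is an assumed method name used here only to show the adaptation step.
```python
partitions = layer.get_partitions_metadata()  # assumed API on a VersionedLayer
meta_df = adapter.from_versioned_metadata(partitions)

# The partition id column is presumably named after the adapter's
# partition_column, 'partition_id' by default (assumption based on the
# constructor parameter).
print(meta_df.columns)
```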
- from_volatile_data(partitions_data: Iterator[Tuple[VolatilePartition, bytes]], content_type: str, schema: Schema | None, **kwargs) DataFrame [source]#
Adapt volatile partition metadata and data to a pd.DataFrame.
- Parameters:
partitions_data – sequence of partition metadata and data from a volatile layer
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see
GeoPandasDecoder
- Returns:
partition data as pd.DataFrame or gpd.GeoDataFrame
- from_volatile_metadata(partitions: Iterator[VolatilePartition], **kwargs) DataFrame [source]#
Adapt volatile partition metadata to a pd.DataFrame.
- Parameters:
partitions – sequence of partition metadata from a volatile layer
kwargs – unused
- Returns:
partition metadata as pd.DataFrame
- to_feature_ids(data: Series, **kwargs) Iterator[str] [source]#
Adapt data from a Series to a sequence of feature identifiers.
Values are converted to str; NA values are discarded.
- Parameters:
data – a Series containing feature identifiers
kwargs – unused
- Returns:
sequence of feature identifiers
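A short example illustrating the str conversion and NA handling described above; the exact output is the expected behavior per the docstring.
```python
import pandas as pd
from here.geopandas_adapter.geopandas_adapter import GeoPandasAdapter

adapter = GeoPandasAdapter()

# None is dropped as an NA value; 3 is converted to the string "3".
ids = pd.Series(["a", None, 3])
print(list(adapter.to_feature_ids(ids)))  # expected: ['a', '3']
```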
- to_geo_features(data: GeoDataFrame, **kwargs) Iterator[Feature] [source]#
Adapt data in a GeoDataFrame to a sequence of geographic features.
- Parameters:
data – the gpd.GeoDataFrame to adapt
kwargs – additional parameters are passed unchanged to gpd.GeoDataFrame.iterfeatures(). For additional information, please see: https://geopandas.org/docs/reference/api/geopandas.GeoDataFrame.iterfeatures.html
- Returns:
sequence of geographic features from the GeoDataFrame
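A short example of this method; na="drop" is a documented gpd.GeoDataFrame.iterfeatures() option forwarded through kwargs.
```python
import geopandas as gpd
from shapely.geometry import Point
from here.geopandas_adapter.geopandas_adapter import GeoPandasAdapter

adapter = GeoPandasAdapter()

gdf = gpd.GeoDataFrame(
    {"name": ["Berlin"]},
    geometry=[Point(13.4, 52.5)],
    crs="EPSG:4326",
)
# Each row is emitted as one GeoJSON-like feature.
for feature in adapter.to_geo_features(gdf, na="drop"):
    print(feature)
```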
- to_index_single_data(data: DataFrame, content_type: str, schema: Schema | None, **kwargs) bytes [source]#
Adapt a DataFrame to be stored in an index layer.
- Parameters:
data – data in the form of DataFrame
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see
GeoPandasEncoder
- Returns:
data encoded for an index layer
- Raises:
ValueError – in case the content type is not supported by the adapter
- to_stream_data(layer: StreamLayer, data, content_type: str, schema: Schema | None, timestamp: int | None, **kwargs) Iterator[Tuple[str | int, bytes, int | None]] [source]#
Adapt data from the target format to stream partition metadata and data.
- Parameters:
layer – the layer all the metadata and data belong to
data – adapter-specific, the data to adapt
content_type – the MIME content type of the layer
schema – optional Schema of the layer
timestamp – optional timestamp for all the messages, if none is specified in data: in milliseconds since the Unix epoch (1970-01-01T00:00:00 UTC)
kwargs – adapter-specific; please consult the documentation of the specific adapter for the parameters and types it supports
- Yield:
partition id, data and timestamp for the stream layer
- Raises:
ValueError – in case required columns are missing
- to_stream_metadata(layer: StreamLayer, partitions: DataFrame, **kwargs) Iterator[StreamPartition] [source]#
Adapt what to publish from the target format to stream partition metadata.
- Parameters:
layer – the layer all the metadata and data belong to
partitions – the pd.DataFrame of partition metadata to append
kwargs – unused
- Yield:
the StreamPartition objects that are adapted
- to_versioned_data(layer: VersionedLayer, data: pd.DataFrame, content_type: str, schema: Schema | None, **kwargs) Iterator[Tuple[str | int, bytes]] [source]#
Adapt data from a sequence of partition ids and data to versioned partition ids and data.
- Parameters:
layer – the layer all the metadata and data belong to
data – data as pd.DataFrame or gpd.GeoDataFrame
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see
GeoPandasEncoder
- Returns:
sequence of partition ids and data for the versioned layer
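A hedged sketch of preparing versioned data for writing. It assumes the partition id is read from the adapter's partition_column ('partition_id' by default, per the constructor) and that layer is a VersionedLayer obtained from a catalog, not constructed here.
```python
import pandas as pd

# Rows sharing a partition id are encoded together into one blob.
df = pd.DataFrame({
    "partition_id": ["p1", "p1", "p2"],
    "value": [1, 2, 3],
})
for partition_id, blob in adapter.to_versioned_data(
        layer, df, content_type="text/csv", schema=None):
    print(partition_id, len(blob))
```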
- to_versioned_metadata(layer: VersionedLayer, partitions_update: DataFrame | None, partitions_delete: Series | None, **kwargs) Tuple[Iterator[VersionedPartition], Iterator[str | int]] [source]#
Adapt a pd.DataFrame of metadata and a pd.Series of keys to versioned partition metadata and partition ids to update and delete.
- Parameters:
layer – the layer all the metadata and data belong to
partitions_update – the pd.DataFrame of partition metadata to update, if any
partitions_delete – the pd.Series of partition ids to delete, if any
kwargs – unused
- Returns:
a tuple of iterators: the first with the VersionedPartition objects to update, the second with the partition ids to delete
- to_volatile_data(layer: VolatileLayer, data: pd.DataFrame, content_type: str, schema: Schema | None, **kwargs) Iterator[Tuple[str | int, bytes]] [source]#
Adapt data from a sequence of partition ids and data to volatile partition ids and data.
- Parameters:
layer – the layer all the metadata and data belong to
data – data as pd.DataFrame or gpd.GeoDataFrame
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see
GeoPandasEncoder
- Returns:
sequence of partition ids and data for the volatile layer
- to_volatile_metadata(layer: VolatileLayer, partitions_update: DataFrame | None, partitions_delete: Series | None, **kwargs) Tuple[Iterator[VolatilePartition], Iterator[str | int]] [source]#
Adapt a pd.DataFrame of metadata and a pd.Series of keys to volatile partition metadata and partition ids to update and delete.
- Parameters:
layer – the layer all the metadata and data belong to
partitions_update – the pd.DataFrame of partition metadata to update, if any
partitions_delete – the pd.Series of partition ids to delete, if any
kwargs – unused
- Returns:
a tuple of iterators: the first with the VolatilePartition objects to update, the second with the partition ids to delete
- class here.geopandas_adapter.geopandas_adapter.GeoPandasContentAdapter(partition_column: str)[source]#
Bases:
ContentAdapter
Specialization of the GeoPandasAdapter to map tabular-like content from content bindings to GeoDataFrame or DataFrame.
- from_objects(fields: type, data: Iterator[object], single_element: bool = False, index_partition: None | str | Callable[[object], Partition] = None, index_id: None | str | Callable[[object], Identifier] = None, index_ref: None | str | Callable[[object], Ref | Iterable[Ref]] = None) DataFrame | GeoDataFrame [source]#
Adapt content from a structured representation to a pandas DataFrame or a geopandas GeoDataFrame.
It can optionally perform indexing of objects, based on their partition, identifier and set of references to other objects. Indexing is specified by naming the field of the object that contains the value to index, or by passing a function that calculates that value from the object.
- Parameters:
fields – the fields to extract, as specified by a dataclass. Field names are looked up among the attributes of each object via getattr(). When missing, None or an equivalent is used. Each field has a type that describes its semantics: it is used to adapt the value to the most appropriate representation for the output format. TypeError is raised in case this is not possible.
data – the objects to adapt to the target format. Fields not mentioned in fields are discarded. Expected but missing fields and identifiers are considered None. Field values may be of any type compatible with the type declared for the field. Partition ids don’t have to be unique, but they have to be contiguous: all the objects with a given partition identifier must be returned in sequence. Object identifiers, when present, must be unique across the whole content.
single_element – the data contains exactly one element; the content adapter can use this information to optimize or return a specialized representation
index_partition – index the content by partition, using the field specified
index_id – index the content by object identifier, using the field specified
index_ref – index the content by references, using the field specified. Each object can contain zero, one or more references, and references can be shared among multiple objects.
- Returns:
objects in a dataframe, indexed as requested
- Raises:
ValueError – if the fields are not described by a dataclass
KeyError – in case a partition id, object id or reference is needed but not present
TypeError – in case a partition or object id is not of type int or string. Also raised in case field values are not of the type declared for the field, or if they can’t be converted to it.
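A hedged sketch of from_objects. The Road class, its attribute names and the RoadFields dataclass are all hypothetical; the content adapter is obtained via the content_adapter property documented above rather than constructed directly.
```python
from dataclasses import dataclass
from here.geopandas_adapter.geopandas_adapter import GeoPandasAdapter

# Hypothetical source objects; field values are read via getattr().
class Road:
    def __init__(self, road_id, speed_limit, partition):
        self.road_id = road_id
        self.speed_limit = speed_limit
        self.partition = partition

# The dataclass only describes which fields to extract and their types.
@dataclass
class RoadFields:
    road_id: str
    speed_limit: int

roads = [Road("r1", 50, "p1"), Road("r2", 80, "p1")]

content_adapter = GeoPandasAdapter().content_adapter
df = content_adapter.from_objects(
    fields=RoadFields,
    data=iter(roads),
    index_partition="partition",  # index by the 'partition' attribute
    index_id="road_id",           # index by the 'road_id' attribute
)
```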
- class here.geopandas_adapter.geopandas_adapter.GeoPandasDecoder(including_default_value_fields: bool = True, preserving_proto_field_name: bool = True)[source]#
Bases:
Decoder
Implementation of a Decoder to work with pd.DataFrame and gpd.GeoDataFrame.
- decode_blob(data: bytes, content_type: str, schema: Schema | None = None, **kwargs)[source]#
Decode one single blob of data.
- Parameters:
data – the encoded data
content_type – the MIME content type to be decoded
schema – the schema, if the content type requires one
kwargs – additional, content-type-specific parameters for the decoder:
For Protobuf (application/protobuf or application/x-protobuf):
- record_path: the name of a schema field that is decoded and transformed to a DataFrame. It can reference nested fields by concatenating the field names with a dot (.). When referencing a single Protobuf sub-message, that message is decoded into one single dataframe row. When referencing repeated Protobuf messages, each repeated message is decoded in its own row, resulting in multiple rows per partition. Fields that are not Protobuf messages, or repeated fields containing single values (ints, strings, …), are not supported because it is not possible to transform them to a dataframe. If not specified, the whole blob is decoded as a single message. Messages are decoded, normalized (see max_level) and passed to pd.DataFrame.from_records() together with the rest of kwargs: this turns each field of the normalized messages into a column of the resulting dataframe.
- record_prefix: if True, prefix the column names with the record_path. If a non-empty string, that string is used as the prefix. A dot (.) is used as the separator.
- max_level: normalize each record of the decoded Protobuf message up to the specified maximum level in depth. 0 disables normalization.
- geometry_col: name of a column containing geometries that is converted to a geopandas GeoSeries, resulting in a GeoDataFrame returned in place of a pandas DataFrame. For the supported formats, please see the documentation of here.geopandas_adapter.geo_utils.to_geometry. Geometry fields and sub-fields are excluded from normalization. If not specified, a pandas DataFrame is returned and geometry is not interpreted.
- geometry_crs: the CRS to set in the GeoDataFrame, when applicable.
- The rest of the parameters are passed unchanged to pd.DataFrame.from_records() for further customizations. For additional information, please see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.from_records.html
For Parquet (application/x-parquet):
- engine: an optional parameter for the type of engine used to parse the parquet data; allowed values are [auto, fastparquet, pyarrow]. If 'auto', the behavior is to try 'pyarrow', falling back to 'fastparquet' if ArrowNotImplementedError is raised.
- The rest of the parameters are passed unchanged to pd.read_parquet() for further customizations. For additional information, please see: https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html
For CSV (text/csv):
- sep: delimiter or column separator to use.
- header: row number(s) to use as the column names, and the start of the data. Default behavior is to infer the column names: if no names are passed, the behavior is identical to header=0 and column names are inferred from the first line of the file; if column names are passed explicitly, then the behavior is identical to header=None. Explicitly pass header=0 to replace existing names. The header can be a list of integers that specify row locations for a multi-index on the columns, e.g. [0,1,3]. Intervening rows that are not specified are skipped (e.g. 2 in this example is skipped). Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.
- names: list of column names to use. If the file contains a header row, then you should explicitly pass header=0 to override the column names. Duplicates in this list are not allowed.
- index_col: column(s) to use as the row labels of the DataFrame, either given as string name or column index. If a sequence of int/str is given, a MultiIndex is used. Note: index_col=False can be used to force pandas to not use the first column as the index, e.g. when you have a malformed file with delimiters at the end of each line.
For JSON (application/json):
- orient: indication of expected JSON string format. The set of possible orients is:
  - 'split': dict like {index -> [index], columns -> [columns], data -> [values]}
  - 'records': list like [{column -> value}, … , {column -> value}]
  - 'index': dict like {index -> {column -> value}}
  - 'columns': dict like {column -> {index -> value}}
  - 'values': just the values array
  - 'table': dict like {'schema': {schema}, 'data': {data}}
- lines: set to True to read the file as a json object per line.
- nrows: the number of lines from the line-delimited json file to read. This can only be passed if lines=True. If None, all the rows are returned.
For GeoJSON (application/geo+json or application/vnd.geo+json): No additional parameters available.
- Returns:
the decoded blob; its type corresponds to the type declared in the property supported_content_types for the content type
- Raises:
UnsupportedContentTypeDecodeException – in case the content type is not decodable
ValueError – if the schema is mandatory for the content type but missing
DecodeException – in case the blob can’t be properly decoded
SchemaException – in case the schema can’t be used to decode the content
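A short example of decoding a CSV blob, the simplest documented case: text/csv requires no schema, and sep is forwarded as described above.
```python
from here.geopandas_adapter.geopandas_adapter import GeoPandasDecoder

decoder = GeoPandasDecoder()

# Each CSV row becomes one row of the resulting pd.DataFrame.
blob = b"id,lon,lat\n1,13.4,52.5\n2,2.35,48.85\n"
df = decoder.decode_blob(blob, content_type="text/csv", sep=",")
print(df)
```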
- property supported_content_types: Dict[str, type | Tuple[type, ...]]#
- Returns:
the dictionary of MIME content types supported when decoding single blobs with the decode_blob function of this decoder, each with the type of the decoded data
- class here.geopandas_adapter.geopandas_adapter.GeoPandasEncoder[source]#
Bases:
Encoder
Implementation of an Encoder to work with pd.DataFrame and gpd.GeoDataFrame.
- encode_blob(data, content_type: str, schema: Schema | None = None, **kwargs) bytes [source]#
Encode one single blob of data.
- Parameters:
data – the data to be encoded; its type corresponds to the type declared in the property supported_content_types for the content type
content_type – the MIME content type to be encoded
schema – the schema, if the content type requires one
kwargs – additional, content-type-specific parameters for the encoder:
For Parquet (application/x-parquet):
- engine: an optional parameter for the type of engine used to write the parquet data; allowed values are [auto, fastparquet, pyarrow]. If 'auto', the behavior is to try 'pyarrow', falling back to 'fastparquet' if ArrowNotImplementedError is raised.
- The rest of the parameters are passed unchanged to pd.DataFrame.to_parquet() for further customizations. For additional information, please see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_parquet.html
For CSV (text/csv): For parameters and general info, please see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
For JSON (application/json):
- orient: indication of the JSON string format to produce. The set of possible orients is:
  - 'split': dict like {index -> [index], columns -> [columns], data -> [values]}
  - 'records': list like [{column -> value}, … , {column -> value}]
  - 'index': dict like {index -> {column -> value}}
  - 'columns': dict like {column -> {index -> value}}
  - 'values': just the values array
  - 'table': dict like {'schema': {schema}, 'data': {data}}
- lines: if orient is 'records', write out line-delimited json format.
- For additional parameters and general info, please see: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html
For GeoJSON (application/geo+json or application/vnd.geo+json): For parameters and general info, please see: https://geopandas.org/docs/reference/api/geopandas.GeoDataFrame.to_json.html#geopandas.GeoDataFrame.to_json
For Protobuf (application/protobuf or application/x-protobuf): For parameters and general info, please see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_records.html
- Returns:
the encoded data
- Raises:
UnsupportedContentTypeEncodeException – in case the content type is not encodable
ValueError – if the schema is mandatory for the content type but missing
EncodeException – in case the blob can’t be properly encoded
SchemaException – in case the schema can’t be used to encode the content
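A short example of encoding a DataFrame as line-delimited JSON; orient and lines are forwarded to pd.DataFrame.to_json() as documented above.
```python
import pandas as pd
from here.geopandas_adapter.geopandas_adapter import GeoPandasEncoder

encoder = GeoPandasEncoder()

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
# orient="records" with lines=True yields one JSON object per row.
blob = encoder.encode_blob(df, content_type="application/json",
                           orient="records", lines=True)
print(blob)
```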
- property supported_content_types: Dict[str, type | Tuple[type, ...]]#
- Returns:
the dictionary of MIME content types supported when encoding single blobs with the encode_blob function of this encoder, each with the type of the encoded data