Utilities to manipulate the structure of pandas DataFrame and Series

here.geopandas_adapter.utils.dataframe.prefix_columns(data: DataFrame, prefix: str, columns: List[str] | None = None) → DataFrame

Rename all or selected columns by adding a prefix to their names. The prefix is prepended to each column name, using a dot (.) as the separator.

For example, when applied with prefix my_attr, the function renames columns a, b and c to my_attr.a, my_attr.b and my_attr.c.

Parameters:
  • data – the input dataframe

  • prefix – the prefix to prepend to the selected columns

  • columns – names of the columns to nest under the prefix. If not specified, the operation is applied to all the columns of the input dataframe.

Returns:

a dataframe with the selected columns nested under a prefix

Raises:

ValueError – in case the prefix is invalid
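
The renaming rule described above can be sketched with plain pandas. This is an illustration only, not the library's implementation: prefix_columns_sketch is a hypothetical helper, and treating an empty prefix as invalid is an assumption, since the documentation does not state what makes a prefix invalid.

```python
import pandas as pd


def prefix_columns_sketch(data, prefix, columns=None):
    """Hypothetical stand-in that mimics the documented renaming rule."""
    # Assumption: an empty prefix is the "invalid" case
    if not prefix:
        raise ValueError("prefix must be a non-empty string")
    # Apply to every column unless a subset is given
    selected = list(data.columns) if columns is None else columns
    mapping = {name: f"{prefix}.{name}" for name in selected}
    # rename returns a new DataFrame and leaves the input untouched
    return data.rename(columns=mapping)


df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})
all_prefixed = prefix_columns_sketch(df, "my_attr")
some_prefixed = prefix_columns_sketch(df, "my_attr", columns=["a"])

print(list(all_prefixed.columns))   # ['my_attr.a', 'my_attr.b', 'my_attr.c']
print(list(some_prefixed.columns))  # ['my_attr.a', 'b', 'c']
```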

here.geopandas_adapter.utils.dataframe.replace_column(dataframe: DataFrame, column: str, new_columns: DataFrame) → DataFrame

Replace one column of a DataFrame with all the columns of another DataFrame.

The selected column is removed and the new columns are inserted in its place. Indices are aligned, but only rows already present in the input dataframe are retained: rows present in the new columns but not in the original dataframe are discarded. The input dataframe is not altered.

Parameters:
  • dataframe – the input dataframe

  • column – the name of the column of the input dataframe to replace

  • new_columns – the DataFrame to replace the selected column with

Returns:

a new DataFrame with the column replaced
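
The alignment behavior described above resembles a left join on the index. A minimal pandas-only sketch follows, assuming a unique index and a unique column name; replace_column_sketch is a hypothetical helper and not the library's implementation.

```python
import pandas as pd


def replace_column_sketch(dataframe, column, new_columns):
    """Hypothetical stand-in: swap one column for the columns of another frame."""
    # Position of the column to replace (an int when the name is unique)
    position = dataframe.columns.get_loc(column)
    left = dataframe.iloc[:, :position]
    right = dataframe.iloc[:, position + 1:]
    # Keep only the rows of the input frame; rows missing from
    # new_columns become NaN, extra rows are dropped
    aligned = new_columns.reindex(dataframe.index)
    # concat builds a new DataFrame, so the input is not modified
    return pd.concat([left, aligned, right], axis=1)


df = pd.DataFrame(
    {"id": [1, 2], "payload": ["p", "q"], "flag": [True, False]},
    index=["r1", "r2"],
)
extra = pd.DataFrame({"x": [10, 30], "y": [11, 31]}, index=["r1", "r3"])

result = replace_column_sketch(df, "payload", extra)
print(list(result.columns))  # ['id', 'x', 'y', 'flag']
print(list(result.index))    # ['r1', 'r2']
```

Row r3 exists only in the new columns and is therefore dropped, while row r2 has no match and receives missing values in x and y.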

here.geopandas_adapter.utils.dataframe.unpack(series: Series, keep_prefix: bool = True, max_level: int = -1) → DataFrame

Unpack a Series that contains dictionaries into a DataFrame that has one column for each field of the dictionaries found in the series.

In the process, nested dictionaries are also unpacked: if a field of a dictionary contains a nested dictionary, its fields are added as columns as well, with names composed of the field name and the nested field name, separated by a dot (.).

Unpacking proceeds recursively for as long as nested dictionaries are found, or until a maximum level is reached. For more details about the recursive unpacking process, please see here.platform.utils.collection_utils.flatten_iterator.

The resulting columns, their order, and their types depend on the data. The index of the series is retained. None, pd.NA and other values that are not dictionaries, including lists and scalar values, are discarded.

Parameters:
  • series – the series containing dictionaries to unpack

  • keep_prefix – keep the name of the series as prefix for the unpacked columns

  • max_level – the maximum level of the recursive unpacking. 0 performs no recursive unpacking, 1 unpacks only the first nested dictionaries, 2 only the first and second nested dictionaries, and so on. A negative number represents no limit.

Returns:

a DataFrame, with one column for each of the fields of the dictionaries
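
The recursive unpacking and the meaning of max_level can be illustrated with a small pandas-only sketch. flatten_sketch and unpack_sketch are hypothetical helpers, not the library's code, and the sketch assumes that rows holding non-dictionary values are kept in the index with missing values in every column.

```python
import pandas as pd


def flatten_sketch(d, max_level=-1, sep=".", _level=0):
    """Hypothetical helper: flatten nested dicts, joining keys with sep."""
    out = {}
    for key, value in d.items():
        # A negative max_level means no limit on the nesting depth
        can_descend = max_level < 0 or _level < max_level
        if isinstance(value, dict) and can_descend:
            nested = flatten_sketch(value, max_level, sep, _level + 1)
            for sub_key, sub_value in nested.items():
                out[f"{key}{sep}{sub_key}"] = sub_value
        else:
            out[key] = value
    return out


def unpack_sketch(series, keep_prefix=True, max_level=-1):
    """Hypothetical stand-in for the documented unpacking behavior."""
    # Only dictionaries are unpacked; None, lists and scalars are skipped
    dicts = series[series.map(lambda v: isinstance(v, dict))]
    rows = [flatten_sketch(d, max_level) for d in dicts]
    frame = pd.DataFrame(rows, index=dicts.index)
    if keep_prefix and series.name is not None:
        frame = frame.add_prefix(f"{series.name}.")
    # Assumption: the full original index is restored, with NaN rows
    return frame.reindex(series.index)


s = pd.Series(
    [{"a": 1, "b": {"c": 2, "d": {"e": 3}}}, None, {"a": 4}],
    index=["x", "y", "z"],
    name="props",
)

full = unpack_sketch(s)
one_level = unpack_sketch(s, keep_prefix=False, max_level=1)
top_only = unpack_sketch(s, keep_prefix=False, max_level=0)

print(list(full.columns))       # ['props.a', 'props.b.c', 'props.b.d.e']
print(list(one_level.columns))  # ['a', 'b.c', 'b.d']
print(list(top_only.columns))   # ['a', 'b']
```

With max_level=1 the column b.d still holds the dictionary {"e": 3}, and with max_level=0 the column b holds the whole nested dictionary.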

here.geopandas_adapter.utils.dataframe.unpack_columns(dataframe: DataFrame, columns: str | List[str], keep_columns: bool = False, keep_prefix: bool = True, max_level: int = -1) → DataFrame

Unpack one or more columns of a DataFrame, replacing them with columns extracted from the fields of the dictionaries they contain.

Similar to how the pandas explode function unrolls a list into multiple rows of a DataFrame, this function unpacks columns containing dict values into constructs that are easier to manipulate with pandas.

The input dataframe is not altered and its index is retained.

Unpacking is recursive, optionally down to a certain maximum level. See unpack of a single Series for details of the unpacking algorithm.

Parameters:
  • dataframe – the DataFrame to unpack

  • columns – one or more names of columns to unpack

  • keep_columns – keep the original unpacked columns in the resulting dataframe, retaining any single values but discarding dictionaries that have been unpacked

  • keep_prefix – keep the name of the columns as prefix for the unpacked columns

  • max_level – the maximum level of the recursive unpacking. 0 performs no recursive unpacking, 1 unpacks only the first nested dictionaries, 2 only the first and second nested dictionaries, and so on. A negative number represents no limit. A single max_level is used for all the columns. To unpack columns, each with a different maximum unpacking level, call this function more than once.

Returns:

the input dataframe with the selected columns unpacked.
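
For comparison, pandas itself provides json_normalize, which flattens lists of dictionaries in a similar way and whose max_level parameter appears to follow a comparable convention. The sketch below approximates the effect of unpacking one column with keep_prefix=True; it is not the library's implementation, it assumes every cell of the column is a dictionary, and the column names (id, attrs) are invented for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [10, 20],
        "attrs": [
            {"name": "A", "geo": {"lat": 1.0, "lon": 2.0}},
            {"name": "B", "geo": {"lat": 3.0, "lon": 4.0}},
        ],
    },
    index=["r1", "r2"],
)

records = df["attrs"].tolist()

# Flatten every nesting level; nested keys are joined with "."
flat = pd.json_normalize(records, sep=".")
flat.index = df.index               # json_normalize returns a fresh RangeIndex
flat = flat.add_prefix("attrs.")    # mimic keep_prefix=True

# Drop the packed column and attach the unpacked ones; df is not modified
result = pd.concat([df.drop(columns="attrs"), flat], axis=1)
print(sorted(result.columns))
# ['attrs.geo.lat', 'attrs.geo.lon', 'attrs.name', 'id']

# max_level=0 leaves nested dictionaries untouched
shallow = pd.json_normalize(records, max_level=0)
print(sorted(shallow.columns))      # ['geo', 'name']
```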

here.geopandas_adapter.utils.dataframe.unprefix_columns(data: DataFrame, prefix: str) → DataFrame

Rename all the columns that start with a prefix by removing it. The . separator after the prefix is checked and removed as well.

For example, when applied with prefix my_attr, the function renames columns my_attr.a, my_attr.b and my_attr.c to a, b and c.

Parameters:
  • data – the input dataframe

  • prefix – the prefix to remove from the columns

Returns:

a dataframe with the given prefix removed from the column names

Raises:

ValueError – in case the prefix is invalid
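
The inverse renaming can be sketched in the same way with plain pandas. unprefix_columns_sketch is a hypothetical helper, not the library's implementation; treating an empty prefix as invalid and leaving unrelated columns untouched are assumptions.

```python
import pandas as pd


def unprefix_columns_sketch(data, prefix):
    """Hypothetical stand-in that mimics the documented renaming rule."""
    # Assumption: an empty prefix is the "invalid" case
    if not prefix:
        raise ValueError("prefix must be a non-empty string")
    head = f"{prefix}."
    # Only names where the prefix is followed by the "." separator match
    mapping = {
        name: name[len(head):]
        for name in data.columns
        if isinstance(name, str) and name.startswith(head)
    }
    return data.rename(columns=mapping)


df = pd.DataFrame(
    {"my_attr.a": [1], "my_attr.b": [2], "my_attrx": [3], "other": [4]}
)
stripped = unprefix_columns_sketch(df, "my_attr")
print(list(stripped.columns))  # ['a', 'b', 'my_attrx', 'other']
```

The column my_attrx is left unchanged because the prefix is not followed by the separator.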