spinedb_api.dataframes

This module offers a Pandas interoperability layer to spinedb_api.

Warning

This is highly experimental API.

The main access points here are to_dataframe() which converts a parameter value item to a dataframe, and add_or_update_from() which creates and updates data in a database mapping based on dataframe input. Additionally, fetch_as_dataframe() grants direct-query access to parameter values. It is somewhat more involved to use than the other functions but may be faster since it relies on database queries only bypassing the complex caching in database mapping. It might be useful if you do not need to write back to the database.

The dataframes returned by the to_dataframe() and fetch_as_dataframe() functions have the following structure:

Dataframe columns left-to-right

Column

Content

entity_class_name

class names

<dimension 1>

“leaf” entity names (bynames)

more entity bynames

parameter_definition_name

parameter names

alternative_name

alternative names

<index 1>

parameter indices, e.g. time stamps

more parameter indices

value

“leaf” values

For example, say we want to get unit_capacity time series for a relationship between power_plant_a and electricity_node and plot it. This is straightforward with to_dataframe():

import matplotlib.pyplot as plt
from spinedb_api import DatabaseMapping
from spinedb_api.dataframes import to_dataframe

with DatabaseMapping(url) as db_map:
    value_item = db_map.get_parameter_value_item(
        entity_class_name="unit__to_node",
        entity_byname=("power_plant_a", "electricity_node"),
        parameter_definition_name="unit_capacity",
        alternative_name="Base",
    )
    df = to_dataframe(value_item)

figure, axes = plt.subplots()
df.plot(x="t", y="value", ax=axes)
plt.show()

Note

The ignore_year and repeat attributes are stored in the attrs attribute of the dataframe if it contains time series.

The dataframe can be used to add new values or update existing values in a database, proven that the target entities, parameter definitions and alternatives exist already. For example, using df from above:

with DatabaseMapping(target_url) as db_map:
    add_or_update_from(df, db_map)
    db_map.commit_session("Added unit_capacity value.")

Warning

Time series are currently added/updated as spinedb_api.parameter_value.Map values rather than as spinedb_api.parameter_value.TimeSeriesFixedResolution, because add_or_update_from() does not implement dataframe -> time series transformation yet.

To use fetch_as_dataframe() instead of to_dataframe(), FetchedMaps needs to be instantiated and a special SQLAlchemy query prepared:

with DatabaseMapping(url) as db_map:
    maps = FetchedMaps.fetch(db_map)
    query = parameter_value_sq(db_map)
    final_query = (
        db_map.query(query)
            .filter(query.c.entity_class_name == "node")
            .filter(query.c.parameter_definition_name=="state_coeff")
            .filter(query.c.alternative_name=="Base")
    ).subquery()
    df = fetch_as_dataframe(db_map, final_query, maps)

Classes

FetchedMaps

A 'cache' class that holds information required to build a dataframe with fetch_as_dataframe().

Functions

to_dataframe(→ pandas.DataFrame)

Converts parameter value item to dataframe.

add_or_update_from(→ None)

Adds or updates parameter value items from dataframe.

parameter_value_sq(→ sqlalchemy.sql.Subquery)

Returns basic parameter value subquery required by fetch_as_dataframe().

fetch_as_dataframe(→ pandas.DataFrame)

Fetches parameter values from database returning them as dataframe.

Module Contents

spinedb_api.dataframes.to_dataframe(item: spinedb_api.db_mapping_base.PublicItem) pandas.DataFrame[source]

Converts parameter value item to dataframe.

spinedb_api.dataframes.add_or_update_from(dataframe: pandas.DataFrame, db_map: spinedb_api.DatabaseMapping) None[source]

Adds or updates parameter value items from dataframe.

The dataframe is expected to contain at least entity_class_name, parameter_definition_name, alternative_name and value columns, and a column for each 0-dimensional entity from which entity bynames can be composed. Any additional column is considered as value index.

The database mapping must contain the target entity, parameter definition and alternative before this operation. This helps in finding typos in the dataframe.

class spinedb_api.dataframes.FetchedMaps(list_value_map: IdToListValueMap, entity_class_name_map: IdToNameMap, entity_name_and_class_map: IdToNameAndClassMap, entity_element_map: IdToIdListMap, entity_dimension_map: IdToIdMap)[source]

A ‘cache’ class that holds information required to build a dataframe with fetch_as_dataframe().

classmethod fetch(db_map: spinedb_api.DatabaseMapping) FetchedMaps[source]

Instantiates a FetchedMaps with data queried from a database.

spinedb_api.dataframes.parameter_value_sq(db_map: spinedb_api.DatabaseMapping) sqlalchemy.sql.Subquery[source]

Returns basic parameter value subquery required by fetch_as_dataframe().

spinedb_api.dataframes.fetch_as_dataframe(db_map: spinedb_api.DatabaseMapping, value_sq: sqlalchemy.sql.Subquery, fetched_maps: FetchedMaps) pandas.DataFrame[source]

Fetches parameter values from database returning them as dataframe.