GoodData Pandas Documentation

GoodData Pandas contains a thin layer that utilizes GoodData Python SDK and allows you to conveniently create pandas series and data frames from the computations done against semantic model in your GoodData.CN workspace.

Installation

Requirements

  • Python 3.7 or newer

  • GoodData.CN installation; either running on your cloud infrastructure or the free Community Edition running on your workstation

Installation

Run the following command to install the gooddata-pandas package on your system:

pip install gooddata-pandas

Examples

Here are a couple of introductory examples how to create indexed and not-indexed series and data frames:

Series

from gooddata_pandas import GoodPandas

# GoodData.CN host in the form of uri eg. "http://localhost:3000"
host = "http://localhost:3000"
# GoodData.CN user token
token = "some_user_token"
# initialize the adapter to work on top of GD.CN host and use the provided authentication token
gp = GoodPandas(host, token)

workspace_id = "demo"
series = gp.series(workspace_id)

# create indexed series
indexed_series = series.indexed(index_by="label/label_id", data_by="fact/measure_id")

# create non-indexed series containing just the values of measure sliced by elements of the label
non_indexed = series.not_indexed(data_by="fact/measure_id", granularity="label/label_id")

Data Frames

from gooddata_pandas import GoodPandas

# GoodData.CN host in the form of uri eg. "http://localhost:3000"
host = "http://localhost:3000"
# GoodData.CN user token
token = "some_user_token"
# initialize the adapter to work on top of GD.CN host and use the provided authentication token
gp = GoodPandas(host, token)

workspace_id = "demo"
frames = gp.data_frames(workspace_id)

# create indexed data frame
indexed_df = frames.indexed(
    index_by="label/label_id",
    columns=dict(
        first_label='label/first_label_id',
        second_label='label/second_label_id',
        first_metric='metric/first_metric_id',
        second_metric='fact/fact_id'
    )
)

# create data frame with hierarchical index
indexed_df = frames.indexed(
    index_by=dict(first_label='label/first_label_id', second_label='label/second_label_id'),
    columns=dict(first_metric='metric/first_metric_id', second_metric='fact/fact_id')
)

# create non-indexed data frame
non_indexed_df = frames.not_indexed(
    columns=dict(
        first_label='label/first_label_id',
        second_label='label/second_label_id',
        first_metric='metric/first_metric_id',
        second_metric='fact/fact_id'
    )
)

# creates data frame based on the contents of the insight. if the insight contains labels and
#  measures, the data frame will contain index or hierarchical index.
insight_df = frames.for_insight('insight_id')

# creates data frame based on the content of the items dict. if the dict contains both labels
# and measures, the frame will contain index or hierarchical index.
df = frames.for_items(
    items=dict(
        first_label='label/first_label_id',
        second_label='label/second_label_id',
        first_metric='metric/first_metric_id',
        second_metric='fact/fact_id'
    )
)

API

gooddata_pandas

gooddata_sdk

The gooddata-sdk package aims to provide clean and convenient Python APIs to interact with GoodData.CN.