GoodData Pandas Documentation¶
GoodData Pandas contains a thin layer that utilizes GoodData Python SDK and allows you to conveniently create pandas series and data frames from the computations done against semantic model in your GoodData.CN workspace.
Getting Started¶
Requirements¶
GoodData.CN installation; either running on your cloud infrastructure or the free Community Edition running on your workstation
Python 3.7 or newer
Installation¶
Run the following command to install the gooddata-pandas
package on your system:
pip install gooddata-pandas
Examples¶
Here are a couple of introductory examples how to create indexed and not-indexed series and data frames:
Series¶
from gooddata_pandas import GoodPandas
# GoodData.CN host in the form of uri eg. "http://localhost:3000"
host = "http://localhost:3000"
# GoodData.CN user token
token = "some_user_token"
# initialize the adapter to work on top of GD.CN host and use the provided authentication token
gp = GoodPandas(host, token)
workspace_id = "demo"
series = gp.series(workspace_id)
# create indexed series
indexed_series = series.indexed(index_by="label/label_id", data_by="fact/measure_id")
# create non-indexed series containing just the values of measure sliced by elements of the label
non_indexed = series.not_indexed(data_by="fact/measure_id", granularity="label/label_id")
Data Frames¶
from gooddata_pandas import GoodPandas
# GoodData.CN host in the form of uri eg. "http://localhost:3000"
host = "http://localhost:3000"
# GoodData.CN user token
token = "some_user_token"
# initialize the adapter to work on top of GD.CN host and use the provided authentication token
gp = GoodPandas(host, token)
workspace_id = "demo"
frames = gp.data_frames(workspace_id)
# create indexed data frame
indexed_df = frames.indexed(
index_by="label/label_id",
columns=dict(
first_label='label/first_label_id',
second_label='label/second_label_id',
first_metric='metric/first_metric_id',
second_metric='fact/fact_id'
)
)
# create data frame with hierarchical index
indexed_df = frames.indexed(
index_by=dict(first_label='label/first_label_id', second_label='label/second_label_id'),
columns=dict(first_metric='metric/first_metric_id', second_metric='fact/fact_id')
)
# create non-indexed data frame
non_indexed_df = frames.not_indexed(
columns=dict(
first_label='label/first_label_id',
second_label='label/second_label_id',
first_metric='metric/first_metric_id',
second_metric='fact/fact_id'
)
)
# creates data frame based on the contents of the insight. if the insight contains labels and measures, the data
# frame will contain index or hierarchical index.
insight_df = frames.for_insight('insight_id')
# creates data frame based on the content of the items dict. if the dict contains both labels and measures, the
# frame will contain index or hierarchical index.
df = frames.for_items(
items=dict(
first_label='label/first_label_id',
second_label='label/second_label_id',
first_metric='metric/first_metric_id',
second_metric='fact/fact_id'
)
)
API Documentation¶
Check out the API section for further information.