gooddata_pandas.dataframe.DataFrameFactory

class gooddata_pandas.dataframe.DataFrameFactory(sdk: GoodDataSdk, workspace_id: str)

Bases: object

Factory to create pandas.DataFrame instances.

Methods:
  • indexed(self, index_by: IndexDef, columns: ColumnsDef, filter_by: Optional[Union[Filter, list[Filter]]] = None) -> pandas.DataFrame

  • not_indexed(self, columns: ColumnsDef, filter_by: Optional[Union[Filter, list[Filter]]] = None) -> pandas.DataFrame

  • for_items(self, items: ColumnsDef, filter_by: Optional[Union[Filter, list[Filter]]] = None, auto_index: bool = True) -> pandas.DataFrame

  • for_insight(self, insight_id: str, auto_index: bool = True) -> pandas.DataFrame

  • result_cache_metadata_for_exec_result_id(self, result_id: str) -> ResultCacheMetadata

  • for_exec_def(self, exec_def: ExecutionDefinition, label_overrides: Optional[LabelOverrides] = None, result_size_dimensions_limits: ResultSizeDimensions = (), result_size_bytes_limit: Optional[int] = None, page_size: int = _DEFAULT_PAGE_SIZE) -> Tuple[pandas.DataFrame, DataFrameMetadata]

  • for_exec_result_id(self, result_id: str, label_overrides: Optional[LabelOverrides] = None, result_cache_metadata: Optional[ResultCacheMetadata] = None, result_size_dimensions_limits: ResultSizeDimensions = (), result_size_bytes_limit: Optional[int] = None, use_local_ids_in_headers: bool = False, page_size: int = _DEFAULT_PAGE_SIZE) -> Tuple[pandas.DataFrame, DataFrameMetadata]

__init__(sdk: GoodDataSdk, workspace_id: str) -> None
Args:

sdk (GoodDataSdk): GoodData SDK instance.

workspace_id (str): Workspace identifier.

Methods

__init__(sdk, workspace_id)

Initializes the factory with a GoodData SDK instance and a workspace identifier.

for_exec_def(exec_def[, label_overrides, ...])

Creates a data frame using an execution definition.

for_exec_result_id(result_id[, ...])

Retrieves a DataFrame and DataFrame metadata for a given execution result identifier.

for_insight(insight_id[, auto_index])

Creates a data frame with columns based on the content of the insight with the provided identifier.

for_items(items[, filter_by, auto_index])

Creates a data frame for named items.

indexed(index_by, columns[, filter_by])

Creates a data frame indexed by values of the label.

not_indexed(columns[, filter_by])

Creates a data frame with columns created from metrics and/or labels.

result_cache_metadata_for_exec_result_id(...)

Retrieves result cache metadata for the given result_id.

for_exec_def(exec_def: ExecutionDefinition, label_overrides: Optional[Dict[str, Dict[str, Dict[str, str]]]] = None, result_size_dimensions_limits: Tuple[Optional[int], ...] = (), result_size_bytes_limit: Optional[int] = None, page_size: int = 100) -> Tuple[DataFrame, DataFrameMetadata]

Creates a data frame using an execution definition.

Each dimension may be sliced by multiple labels. The factory creates a MultiIndex for both the data frame’s row index and its columns.

Example of label_overrides structure:

{
    "labels": {
        "local_attribute_id": {
            "title": "My new attribute label"
        },...
    },
    "metrics": {
        "local_metric_id": {
            "title": "My new metric label"
        },...
    }
}
Args:

exec_def (ExecutionDefinition): Execution definition.

label_overrides (Optional[LabelOverrides]): Label overrides for metrics and attributes.

result_size_dimensions_limits (ResultSizeDimensions): A tuple containing the maximum size of each result dimension.

result_size_bytes_limit (Optional[int]): Maximum size of the result in bytes.

page_size (int): Number of records per page.

Returns:

Tuple[pandas.DataFrame, DataFrameMetadata]: Tuple holding DataFrame and DataFrame metadata.
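The label_overrides structure shown above can also be assembled programmatically. The following is a minimal sketch in plain Python; build_label_overrides is a hypothetical helper, not part of gooddata_pandas, and the local identifiers are the placeholders from the example above.

```python
from typing import Dict

# Same nested shape as the label_overrides type in the signature above.
LabelOverrides = Dict[str, Dict[str, Dict[str, str]]]


def build_label_overrides(labels: Dict[str, str], metrics: Dict[str, str]) -> LabelOverrides:
    """Hypothetical helper: map local identifiers to replacement titles."""
    return {
        "labels": {local_id: {"title": title} for local_id, title in labels.items()},
        "metrics": {local_id: {"title": title} for local_id, title in metrics.items()},
    }


overrides = build_label_overrides(
    labels={"local_attribute_id": "My new attribute label"},
    metrics={"local_metric_id": "My new metric label"},
)
```

The resulting dictionary would then be passed as label_overrides=overrides to for_exec_def or for_exec_result_id.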

for_exec_result_id(result_id: str, label_overrides: Optional[Dict[str, Dict[str, Dict[str, str]]]] = None, result_cache_metadata: Optional[ResultCacheMetadata] = None, result_size_dimensions_limits: Tuple[Optional[int], ...] = (), result_size_bytes_limit: Optional[int] = None, use_local_ids_in_headers: bool = False, page_size: int = 100) -> Tuple[DataFrame, DataFrameMetadata]

Retrieves a DataFrame and DataFrame metadata for a given execution result identifier.

Example of label_overrides structure:

{
    "labels": {
        "local_attribute_id": {
            "title": "My new attribute label"
        },...
    },
    "metrics": {
        "local_metric_id": {
            "title": "My new metric label"
        },...
    }
}
Args:

result_id (str): Execution result identifier.

label_overrides (Optional[LabelOverrides]): Label overrides for metrics and attributes.

result_cache_metadata (Optional[ResultCacheMetadata]): Cache metadata for the execution result.

result_size_dimensions_limits (ResultSizeDimensions): A tuple containing the maximum size of each result dimension.

result_size_bytes_limit (Optional[int]): Maximum size of the result in bytes.

use_local_ids_in_headers (bool): Use local identifiers in headers.

page_size (int): Number of records per page.

Returns:

Tuple[pandas.DataFrame, DataFrameMetadata]: Tuple holding DataFrame and DataFrame metadata.
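As a rough illustration of how result_cache_metadata fits in, the sketch below retrieves the cache metadata once and hands it to for_exec_result_id. It assumes that frames is a DataFrameFactory (any object with the same two methods works) and that supplying the metadata spares the method from retrieving it again, which this page does not state. frame_for_result is a hypothetical wrapper.

```python
from typing import Any, Tuple


def frame_for_result(frames: Any, result_id: str, page_size: int = 100) -> Tuple[Any, Any]:
    """Hypothetical wrapper: fetch cache metadata once, then build the frame."""
    # `frames` is assumed to behave like DataFrameFactory.
    cache_metadata = frames.result_cache_metadata_for_exec_result_id(result_id)
    # Returns the (DataFrame, DataFrameMetadata) tuple described above.
    return frames.for_exec_result_id(
        result_id=result_id,
        result_cache_metadata=cache_metadata,
        page_size=page_size,
    )
```

Keeping the metadata in a local variable also makes it available for inspection before the data frame is built.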

for_insight(insight_id: str, auto_index: bool = True) -> DataFrame

Creates a data frame with columns based on the content of the insight with the provided identifier.

Args:

insight_id (str): Insight identifier.

auto_index (bool): Default True. Enables creation of a DataFrame with an index that depends on the contents of the insight.

Returns:

pandas.DataFrame: A DataFrame instance.

for_items(items: Dict[str, Union[Attribute, Metric, ObjId, str]], filter_by: Optional[Union[Filter, list[gooddata_sdk.compute.model.base.Filter]]] = None, auto_index: bool = True) -> DataFrame

Creates a data frame for named items. This convenience method creates a DataFrame with or without an index, depending on the items that you pass.

Args:

items (ColumnsDef): Dictionary mapping item name to its definition.

filter_by (Optional[Union[Filter, list[Filter]]]): Optional filters to apply during computation on the server.

auto_index (bool): Default True. Enables creation of a DataFrame with an index that depends on the contents of the items.

Returns:

pandas.DataFrame: A DataFrame instance.
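A call to for_items might look like the sketch below. The item names and the "type/id" strings (label/campaign_name, fact/price, metric/revenue) are illustrative assumptions about a workspace, not values from this page, and load_campaign_items is a hypothetical wrapper. frames is assumed to be a DataFrameFactory, for instance one obtained through GoodPandas(host, token).data_frames(workspace_id).

```python
from typing import Any, Dict


def load_campaign_items(frames: Any, auto_index: bool = True) -> Any:
    """Hypothetical wrapper around DataFrameFactory.for_items."""
    # Item name -> definition; the string identifiers are assumed examples.
    items: Dict[str, str] = {
        "campaign_name": "label/campaign_name",
        "price_sum": "fact/price",
        "revenue": "metric/revenue",
    }
    # With auto_index=True the factory decides whether to index the frame
    # based on the items; auto_index=False disables that behaviour.
    return frames.for_items(items=items, auto_index=auto_index)
```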

indexed(index_by: Union[Attribute, ObjId, str, Dict[str, Union[Attribute, ObjId, str]]], columns: Dict[str, Union[Attribute, Metric, ObjId, str]], filter_by: Optional[Union[Filter, list[gooddata_sdk.compute.model.base.Filter]]] = None) -> DataFrame

Creates a data frame indexed by values of the label. The data frame columns will be created from either metrics or other label values.

Note that, depending on the composition of the labels, the DataFrame’s index may or may not be unique.

Args:

index_by (IndexDef): One or more labels to index by.

columns (ColumnsDef): Dictionary mapping column name to its definition.

filter_by (Optional[Union[Filter, list[Filter]]]): Optional filters to apply during computation on the server.

Returns:

pandas.DataFrame: A DataFrame instance.
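Because the index of the returned frame is not guaranteed to be unique, it can be worth checking before relying on label-based lookups. The sketch below assumes frames is a DataFrameFactory and uses made-up identifiers; revenue_by_campaign is a hypothetical wrapper. The uniqueness check relies on the standard pandas Index.is_unique attribute.

```python
from typing import Any, Tuple


def revenue_by_campaign(frames: Any) -> Tuple[Any, bool]:
    """Hypothetical wrapper: build an indexed frame and report index uniqueness."""
    df = frames.indexed(
        # Index by one label; the identifier is an assumed example.
        index_by={"campaign_name": "label/campaign_name"},
        # Columns come from a fact and a metric (also assumed examples).
        columns={"price_sum": "fact/price", "revenue": "metric/revenue"},
    )
    # pandas exposes index uniqueness as the boolean Index.is_unique.
    return df, bool(df.index.is_unique)
```

When the second element is False, a lookup such as df.loc[key] can return several rows for a single key.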

not_indexed(columns: Dict[str, Union[Attribute, Metric, ObjId, str]], filter_by: Optional[Union[Filter, list[gooddata_sdk.compute.model.base.Filter]]] = None) -> DataFrame

Creates a data frame with columns created from metrics and/or labels.

Args:

columns (ColumnsDef): Dictionary mapping column name to its definition.

filter_by (Optional[Union[Filter, list[Filter]]]): Optional filters to apply during computation on the server.

Returns:

pandas.DataFrame: A DataFrame instance.

result_cache_metadata_for_exec_result_id(result_id: str) -> ResultCacheMetadata

Retrieves result cache metadata for the given result_id.

Args:

result_id (str): ID of execution result to retrieve the metadata for.

Returns:

ResultCacheMetadata: Corresponding result cache metadata.