GDataframe

class GDataframe(regs=None, meta=None)[source]

Class holding the result of a materialization of a GMQLDataset. It is composed of two data structures:

  • A table with the region data
  • A table with the metadata corresponding to the regions
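
As a rough illustration of the two-table layout, the sketch below builds a region table and a metadata table with plain Pandas. The column names, the `id_sample` index name and the sample values are assumptions made for this example only; they are not taken from the library.

```python
import pandas as pd

# Hypothetical region table: one row per region, indexed by the sample it belongs to.
regs = pd.DataFrame(
    {
        "chr": ["chr1", "chr1", "chr2"],
        "start": [100, 500, 200],
        "stop": [200, 650, 300],
        "score": [0.5, 0.9, 0.1],
    },
    index=pd.Index(["s1", "s1", "s2"], name="id_sample"),
)

# Hypothetical metadata table: one row per sample, sharing the same index name.
meta = pd.DataFrame(
    {"cell": ["HeLa", "K562"], "antibody": ["CTCF", "POLR2A"]},
    index=pd.Index(["s1", "s2"], name="id_sample"),
)

# Each region row can be traced back to a metadata row through the shared index.
regions_per_sample = regs.groupby(level="id_sample").size()
```

Two tables of this kind correspond to the `regs` and `meta` arguments in the class signature above.
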
to_dataset_files(local_path=None, remote_path=None)[source]

Save the GDataframe to a local or remote location

Parameters:
  • local_path – path to the local folder in which the data must be saved
  • remote_path – name of the remote dataset to use for these data
Returns:

None

to_GMQLDataset(local_path=None, remote_path=None)[source]

Converts the GDataframe into a GMQLDataset for later local or remote computation

Returns: a GMQLDataset
project_meta(attributes)[source]

Projects the specified metadata attributes to new region fields

Parameters: attributes – a list of metadata attributes
Returns: a new GDataframe with additional region fields
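
The idea of projecting metadata onto regions can be sketched with plain Pandas as a left join on a shared sample index. This is only an illustration under assumed table layouts, not the library's implementation; `project_meta_sketch`, the `id_sample` index and the column names are hypothetical.

```python
import pandas as pd

def project_meta_sketch(regs, meta, attributes):
    """Return a copy of `regs` with the chosen metadata columns appended.

    Hypothetical helper: assumes both tables are indexed by the same sample id.
    """
    # Left join on the index, so every region row receives its sample's metadata.
    return regs.join(meta[attributes], how="left")

# Hypothetical region and metadata tables.
regs = pd.DataFrame(
    {"chr": ["chr1", "chr1", "chr2"], "start": [100, 500, 200]},
    index=pd.Index(["s1", "s1", "s2"], name="id_sample"),
)
meta = pd.DataFrame(
    {"cell": ["HeLa", "K562"], "antibody": ["CTCF", "POLR2A"]},
    index=pd.Index(["s1", "s2"], name="id_sample"),
)

projected = project_meta_sketch(regs, meta, ["cell"])
```

In this sketch `projected` keeps the three region rows and gains a `cell` column, while `antibody` is left out because it was not requested.
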
to_matrix(index_regs=None, index_meta=None, columns_regs=None, columns_meta=None, values_regs=None, values_meta=None, **kwargs)[source]

Transforms the GDataframe into a pivot matrix whose index, columns and values are built from the specified region fields and metadata attributes. This function is a wrapper around the pivot_table function of Pandas.

Parameters:
  • index_regs – list of region fields to use as index
  • index_meta – list of metadata attributes to use as index
  • columns_regs – list of region fields to use as columns
  • columns_meta – list of metadata attributes to use as columns
  • values_regs – list of region fields to use as values
  • values_meta – list of metadata attributes to use as values
  • kwargs – other parameters to pass to the pivot_table function
Returns:

a Pandas dataframe having as index the union of index_regs and index_meta, as columns the union of columns_regs and columns_meta, and as values the union of values_regs and values_meta
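
Since the method is described as a wrapper around Pandas' pivot_table, the kind of result to expect can be approximated directly in Pandas. The sketch below is not the library's code: it assumes hypothetical region and metadata tables sharing an `id_sample` index, flattens them into one table, and pivots with a metadata attribute as index, a region field as columns and a region field as values.

```python
import pandas as pd

# Hypothetical region table (fields: gene, score) and metadata table (attribute: cell).
regs = pd.DataFrame(
    {"gene": ["A", "B", "A", "B"], "score": [1.0, 2.0, 3.0, 4.0]},
    index=pd.Index(["s1", "s1", "s2", "s2"], name="id_sample"),
)
meta = pd.DataFrame(
    {"cell": ["HeLa", "K562"]},
    index=pd.Index(["s1", "s2"], name="id_sample"),
)

# Flatten both tables into one so region fields and metadata can be mixed freely.
flat = regs.join(meta, how="left").reset_index()

# Roughly analogous to index_meta=["cell"], columns_regs=["gene"], values_regs=["score"];
# aggfunc stands in for an extra keyword forwarded to pivot_table.
matrix = pd.pivot_table(
    flat,
    index=["cell"],
    columns=["gene"],
    values=["score"],
    aggfunc="mean",
)
```

Here `matrix` has one row per cell line and one column per (value, gene) pair, which mirrors the index/columns/values split described above.
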