Building expressionsΒΆ
When doing a selection (using meta_select()
, reg_select()
)
or a projection (using meta_project()
, reg_project()
)
you are required to specify an expression on the metadata or region fields.
An expression can therefore use metadata attributes or region fields.
Given a GMQLDataset dataset
, one can access its region fields by typing:
dataset.field1
dataset.field2
dataset.chr
dataset.start
...
and one can access its metadata attributes by typing:
dataset['metadata_attribute_1']
dataset['metadata_attribute_2']
dataset['metadata_attribute_3']
...
The expressions in PyGMQL can be of two types:
Predicate: a logical condition that enables to select a portion of the dataset. This expression is used in selection. Some example of predicates follow:
# region predicate (dataset.chr == 'chr1' || dataset.pValue < 0.9) # region predicate with access to metadata attributes dataset.score > dataset['size']It is possible, based on the function that requires a predicate, to mix region fields and metadata attributes in a region condition. Of course it is not possible to mix metadata and region conditions in a metadata selection (this is due to the fact that to each metadata attribute can be associated multiple values for each region field).
Extension: a mathematical expression describing how to build new metadata or region fields based on the existent ones. Some examples of expression follow:
# region expression dataset.start + dataset.stop dataset.p_value / dataset.q_value # metadata expression dataset['size'] * 8.9 dataset['score'] / dataset['size']It is possible to mix region fields and metadata attributes in region extensions:
# region expression using metadata attributes (dataset.pvalue / 2) + dataset['metadata'] + 1