Genometric Query Language

GMQL is a declarative language for genomic region and metadata manipulation with a SQL-inspired syntax. With GMQL the user can perform complex queries on the basis of positional, categorical and numeric features of the datasets.
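To make the three kinds of features concrete, the following is a minimal pure-Python sketch of a selection that combines a categorical metadata condition with positional and numeric region conditions. It is not GMQL syntax and not the PyGMQL API: the `Region`, `Sample` and `select` names, the attribute names and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    # Hypothetical region schema: coordinates plus one numeric attribute.
    chrom: str
    start: int
    stop: int
    score: float


@dataclass
class Sample:
    # A sample pairs free-form metadata with a list of regions.
    metadata: dict
    regions: list


def select(samples, meta_pred, region_pred):
    """Keep the samples whose metadata satisfy meta_pred and, inside
    each kept sample, only the regions that satisfy region_pred."""
    result = []
    for sample in samples:
        if meta_pred(sample.metadata):
            kept = [r for r in sample.regions if region_pred(r)]
            result.append(Sample(sample.metadata, kept))
    return result


# Illustrative data (made up for this sketch).
samples = [
    Sample(
        {"antibody": "CTCF", "cell": "K562"},
        [
            Region("chr1", 100, 200, 0.9),
            Region("chr1", 500, 650, 0.2),
            Region("chr2", 100, 200, 0.8),
        ],
    ),
    Sample(
        {"antibody": "POLR2A", "cell": "K562"},
        [Region("chr1", 120, 180, 0.7)],
    ),
]

selected = select(
    samples,
    # Categorical condition on the metadata.
    meta_pred=lambda m: m.get("antibody") == "CTCF",
    # Positional (chromosome, start) and numeric (score) conditions on regions.
    region_pred=lambda r: r.chrom == "chr1" and r.start >= 50 and r.score > 0.5,
)

for sample in selected:
    print(sample.metadata, sample.regions)
```

In this sketch the metadata condition decides which samples survive, while the region condition filters the regions inside each surviving sample.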

You can find more information about the language at the following links:

NB: To use PyGMQL, one should at least have a clear understanding of the semantics of the GMQL operators. However, the library is designed to be self-contained and can be used without strong background knowledge of the language.

GMQL engine

The GMQL engine is composed of several subsystems:
  • A repository, which lets users store their datasets and query results, and access the public datasets shared among the users of the same GMQL instance
  • An engine implementation, which implements the GMQL operators. Currently the Spark engine is the most complete and up-to-date implementation, and it is the one used by PyGMQL

GMQL Web interface

The GMQL system is publicly available at this link.