Parsers
Predefined parsers
Customizable parser
All the parsers in PyGMQL extend the RegionParser class.
class RegionParser(gmql_parser=None, chrPos=None, startPos=None, stopPos=None, strandPos=None, otherPos=None, delimiter='\t', coordinate_system='0-based', schema_format='del', parser_name='parser')
Creates a custom region dataset.
Parameters:
- chrPos – position of the chromosome column
- startPos – position of the start column
- stopPos – position of the stop column
- strandPos – (optional) position of the strand column. Default is None
- otherPos – (optional) list of tuples of the type [(pos, attr_name, typeFun), …]. Default is None
- delimiter – (optional) delimiter of the columns of the file. Default is ‘\t’ (tab)
- coordinate_system – can be {‘0-based’, ‘1-based’, ‘default’}. Default is ‘0-based’
- schema_format – (optional) type of file. Can be {‘tab’, ‘gtf’, ‘vcf’, ‘del’}. Default is ‘del’
- parser_name – (optional) name of the parser. Default is ‘parser’
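To make the positional parameters concrete, here is a minimal standard-library sketch of how column positions and `otherPos` tuples could map one delimited line to a region record. The helper `parse_line` and the sample line are hypothetical and are not part of PyGMQL. The sketch also assumes that positions are 0-based column indexes, which the parameter list above does not state.

```python
def parse_line(line, chrPos, startPos, stopPos, strandPos=None,
               otherPos=None, delimiter="\t"):
    """Hypothetical helper (not PyGMQL code): split one line into a region dict."""
    fields = line.rstrip("\n").split(delimiter)
    region = {
        "chr": fields[chrPos],
        "start": int(fields[startPos]),
        "stop": int(fields[stopPos]),
    }
    if strandPos is not None:
        region["strand"] = fields[strandPos]
    # Each otherPos entry is (pos, attr_name, typeFun), as in the parameter list.
    for pos, attr_name, type_fun in (otherPos or []):
        region[attr_name] = type_fun(fields[pos])
    return region


# Illustrative tab-delimited line: chr, start, stop, strand, name, score.
sample = "chr1\t100\t200\t+\tpeak_1\t0.75"
region = parse_line(sample, chrPos=0, startPos=1, stopPos=2, strandPos=3,
                    otherPos=[(4, "name", str), (5, "score", float)])
print(region)
```

Passing the conversion function inside each `otherPos` tuple keeps the type of every extra attribute explicit next to its column position and name.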
static parse_strand(strand)
Defines how to parse the strand column.
Parameters: strand – a string representing the strand
Returns: the parsed result
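As an illustration of what a strand parser may do, the sketch below normalizes a raw strand string. It assumes that '+', '-' and '*' are the accepted values and that anything else falls back to '*' (unstranded). The function name `normalize_strand` is hypothetical, and the actual behavior of `parse_strand` should be checked in the PyGMQL source.

```python
def normalize_strand(strand):
    """Hypothetical stand-in for a strand parser (assumed behavior, not PyGMQL's)."""
    valid = {"+", "-", "*"}
    # Assumption: unknown or missing strands are treated as unstranded ('*').
    return strand if strand in valid else "*"


results = [normalize_strand(s) for s in ["+", "-", "*", ".", ""]]
print(results)
```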
parse_regions(path)
Given a file path, loads the file into memory as a Pandas DataFrame.
Parameters: path – file path
Returns: a Pandas DataFrame