Introduction

This is a tool for querying data in CoNLL format.

It is composed of several submodules, each with different features.

They are listed below in order of increasing complexity and expressiveness, from the lowest level to the highest.

ConllGraphParser

The base class from which all the others inherit. It can be used to compute the syntactic path between two words of the CoNLL tree/graph.

Use it if you want to:

  • Find the ids of the nodes in the path from the ith word to the jth word
  • Nothing else…
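
To make the idea of a syntactic path concrete, here is a minimal standalone sketch. It does not use ConllGraphParser itself: the function name `path_between` and the `heads` mapping are assumptions made for illustration, and the sentence is assumed to form a single tree whose root has head 0.

```python
def path_between(heads, i, j):
    """Return the node ids on the path from token i to token j.

    `heads` maps each token id to the id of its head; 0 marks the root.
    (Hypothetical helper, not part of the library.)
    """
    def ancestors(node):
        # Walk upwards from `node` until the root is reached.
        chain = [node]
        while heads[chain[-1]] != 0:
            chain.append(heads[chain[-1]])
        return chain

    up_i, up_j = ancestors(i), ancestors(j)
    # The first ancestor of i that is also an ancestor of j is where the path turns.
    common = next(n for n in up_i if n in up_j)
    climb = up_i[:up_i.index(common) + 1]
    descend = up_j[:up_j.index(common)]
    return climb + descend[::-1]


# "The cat chased a mouse": The -> cat -> chased <- mouse <- a
HEADS = {1: 2, 2: 3, 3: 0, 4: 5, 5: 3}
print(path_between(HEADS, 1, 4))
```

The path climbs from the ith word to the lowest common ancestor and then descends to the jth word.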

ConllRelator

Inherits from ConllGraphParser and adds functionality for accessing the tokens of a sentence through parenthood relations.

Use this module if you want to do things like:

  • Find the parent of the 3rd token
  • Find the children and grandsons of the 5th token
  • Find the brothers of the 7th token
  • Find the brothers and uncles of the 7th token
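
The relations listed above can all be derived from a single head mapping. The sketch below is standalone and only illustrates the idea: the function names and the `heads` dictionary are assumptions, not the ConllRelator API.

```python
# Hypothetical head mapping for "The cat chased a mouse"; 0 marks the root.
HEADS = {1: 2, 2: 3, 3: 0, 4: 5, 5: 3}

def parent(heads, i):
    # The root has no parent, so head 0 is reported as None.
    return heads[i] or None

def children(heads, i):
    return sorted(t for t, h in heads.items() if h == i)

def grandsons(heads, i):
    return sorted(g for c in children(heads, i) for g in children(heads, c))

def brothers(heads, i):
    # Tokens that share the same head, excluding the token itself.
    return [t for t in children(heads, heads[i]) if t != i]

def uncles(heads, i):
    # Brothers of the parent; a child of the root has no uncles.
    p = heads[i]
    return brothers(heads, p) if p != 0 else []

print(parent(HEADS, 1), children(HEADS, 3), brothers(HEADS, 2), uncles(HEADS, 1))
```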

ConllSelector

Inherits from ConllGraphParser and adds functionality for accessing the attributes of tokens.

Use this module if you want to access the attributes of the tokens in a sentence. For example:

  • Find the Part-of-Speech of the 3rd token
  • Find the Dependency Relation of the 5th token
  • Find the Dependency Relation and Lemma of the 11th token
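
As a rough illustration of attribute access, the sketch below parses a sentence into one dictionary per token and reads fields from it. It assumes the ten-column CoNLL-U layout and plain integer ids (no multiword ranges); `parse`, `select` and the field names are hypothetical and are not the ConllSelector API.

```python
# Column names of the CoNLL-U layout (an assumption about the input format).
FIELDS = ["id", "form", "lemma", "upos", "xpos", "feats", "head", "deprel", "deps", "misc"]

ROWS = [
    "1 The the DET DT _ 2 det _ _",
    "2 cat cat NOUN NN _ 3 nsubj _ _",
    "3 chased chase VERB VBD _ 0 root _ _",
]
# Real files are tab-separated, so the rows are rebuilt with tabs here.
SENTENCE = "\n".join("\t".join(row.split()) for row in ROWS)

def parse(text):
    """Map each token id to a dict of its attributes."""
    tokens = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        row = dict(zip(FIELDS, line.split("\t")))
        tokens[int(row["id"])] = row
    return tokens

def select(tokens, i, *attrs):
    """Return the requested attributes of the ith token as a tuple."""
    return tuple(tokens[i][a] for a in attrs)

tokens = parse(SENTENCE)
print(select(tokens, 2, "upos"))
print(select(tokens, 3, "deprel", "lemma"))
```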

ConllRegExp

Inherits from ConllRelator and ConllSelector. It provides a very expressive RegExp module that lets you query complex patterns in the CoNLL tree/graph.

Use this module if you want to match patterns in a sentence. Examples:

  • Get all the tokens that begin with a consonant
  • Get all the Parts-of-Speech of tokens that begin with a consonant
  • Get all the tokens with a prepositional child and a parent that is a verb
  • Get all the tokens with a child that is an adverb and begins with ‘fa’

RegExp can be as complex as you wish…

  • Get all the tokens that do not contain numbers and have a child that is an adjective and a parent that is a verb that begins with ‘f’
  • Get all the tokens that do not contain numbers and have a child that is an adjective, where this child itself has another child that is a conjunction that starts with ‘a’ and has 3 letters
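
The pattern syntax of ConllRegExp is not shown in this introduction, so the sketch below does not attempt to reproduce it. It only illustrates, with Python's standard `re` module and hypothetical helper functions, what a few of the simpler patterns above compute on a small hand-written sentence.

```python
import re

# Hypothetical tokens as (id, form, upos, head) for "Alice walked to a park fast".
TOKENS = [
    (1, "Alice", "PROPN", 2),
    (2, "walked", "VERB", 0),
    (3, "to", "ADP", 5),
    (4, "a", "DET", 5),
    (5, "park", "NOUN", 2),
    (6, "fast", "ADV", 2),
]

CONSONANT = re.compile(r"[b-df-hj-np-tv-z]", re.IGNORECASE)

def children_of(tokens, i):
    return [t for t in tokens if t[3] == i]

def starts_with_consonant(tokens):
    # re.match anchors at the start of the form.
    return [form for _, form, _, _ in tokens if CONSONANT.match(form)]

def with_adp_child_and_verb_parent(tokens):
    by_id = {t[0]: t for t in tokens}
    hits = []
    for tid, form, _, head in tokens:
        has_adp_child = any(c[2] == "ADP" for c in children_of(tokens, tid))
        has_verb_parent = head in by_id and by_id[head][2] == "VERB"
        if has_adp_child and has_verb_parent:
            hits.append(form)
    return hits

def with_child_matching(tokens, upos, pattern):
    # Tokens that have at least one child with the given POS whose form matches.
    rx = re.compile(pattern)
    return [form for tid, form, _, _ in tokens
            if any(c[2] == upos and rx.match(c[1]) for c in children_of(tokens, tid))]

print(starts_with_consonant(TOKENS))
print(with_adp_child_and_verb_parent(TOKENS))
print(with_child_matching(TOKENS, "ADV", r"fa"))
```

Each constraint on a child or a parent becomes one more nested test, which is why a dedicated pattern language pays off as the queries grow.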

To learn all about RegExp in CoNLL trees, see ConllRegExp.

ConllSimpleQuery

Inherits from ConllRelator and ConllSelector. It provides a simple SQL-like language to index into a CoNLL tree/graph.

Use this module if you want to run simple SQL-like queries on your sentences. For example:

  • Select the POS and Lemmas of the Children of the ith word
  • Select the Dependency Relations of the Brothers of the ith word
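
The two queries above combine a relation (Children, Brothers) with a list of attributes to return. The sketch below shows that combination as a plain function; it is not the ConllSimpleQuery language, and the names `select`, `RELATIONS` and the token dictionaries are assumptions made for illustration.

```python
# Hypothetical tokens for "Alice walked to a park fast", keyed by token id.
TOKENS = {
    1: {"lemma": "Alice", "upos": "PROPN", "head": 2, "deprel": "nsubj"},
    2: {"lemma": "walk", "upos": "VERB", "head": 0, "deprel": "root"},
    3: {"lemma": "to", "upos": "ADP", "head": 5, "deprel": "case"},
    4: {"lemma": "a", "upos": "DET", "head": 5, "deprel": "det"},
    5: {"lemma": "park", "upos": "NOUN", "head": 2, "deprel": "obl"},
    6: {"lemma": "fast", "upos": "ADV", "head": 2, "deprel": "advmod"},
}

def _children(tokens, i):
    return [t for t, row in tokens.items() if row["head"] == i]

def _brothers(tokens, i):
    return [t for t in _children(tokens, tokens[i]["head"]) if t != i]

# Relation names resolved at query time.
RELATIONS = {"children": _children, "brothers": _brothers}

def select(tokens, attrs, relation, i):
    """Return one tuple of `attrs` per token related to the ith word."""
    ids = RELATIONS[relation](tokens, i)
    return [tuple(tokens[t][a] for a in attrs) for t in ids]

print(select(TOKENS, ["upos", "lemma"], "children", 5))
print(select(TOKENS, ["deprel"], "brothers", 5))
```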

ConllQuery

Inherits from ConllRegExp, ConllSelector and ResponsePostProcessor. It provides a more complex but complete SQL-like language to query a CoNLL tree/graph.

Use this module if you want to run full SQL-like queries on your sentences. For example:

  • Select the Lemmas of the Children of the ith word that are Prepositions
  • Select the first 3 DependencyRelations of the Brothers of the ith word sorted alphabetically
  • Select all the words that have a verb parent with a subject attachment and a prepositional child with a given lemma
  • Build the path of Lemmas and POS from ith token to jth token
  • Count how many Adjective children the kth word has
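
Compared with the simple queries, these examples add filtering, sorting, limiting and counting on top of a relation. The sketch below illustrates those post-processing steps with a hypothetical `query` function; it does not reproduce the ConllQuery syntax or the ResponsePostProcessor class.

```python
# Hypothetical tokens for "Alice walked to a big park fast", keyed by token id.
TOKENS = {
    1: {"lemma": "Alice", "upos": "PROPN", "head": 2, "deprel": "nsubj"},
    2: {"lemma": "walk", "upos": "VERB", "head": 0, "deprel": "root"},
    3: {"lemma": "to", "upos": "ADP", "head": 6, "deprel": "case"},
    4: {"lemma": "a", "upos": "DET", "head": 6, "deprel": "det"},
    5: {"lemma": "big", "upos": "ADJ", "head": 6, "deprel": "amod"},
    6: {"lemma": "park", "upos": "NOUN", "head": 2, "deprel": "obl"},
    7: {"lemma": "fast", "upos": "ADV", "head": 2, "deprel": "advmod"},
}

def children(tokens, i):
    return [t for t, row in tokens.items() if row["head"] == i]

def brothers(tokens, i):
    return [t for t in children(tokens, tokens[i]["head"]) if t != i]

def query(tokens, attr, ids, where=None, order=False, limit=None):
    """Select `attr` for the given ids, then filter, sort and truncate."""
    rows = [tokens[t] for t in ids if where is None or where(tokens[t])]
    values = [row[attr] for row in rows]
    if order:
        values.sort()
    return values[:limit]  # limit=None keeps everything

# Lemmas of the children of word 6 that are prepositions.
print(query(TOKENS, "lemma", children(TOKENS, 6), where=lambda r: r["upos"] == "ADP"))
# First 3 dependency relations of the brothers of word 6, sorted alphabetically.
print(query(TOKENS, "deprel", brothers(TOKENS, 6), order=True, limit=3))
# Number of adjective children of word 6.
print(len(query(TOKENS, "lemma", children(TOKENS, 6), where=lambda r: r["upos"] == "ADJ")))
```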