ConllSimpleQuery

Inherits from ConllRelator and ConllSelector. It provides a simple SQL-like language for querying a Conll tree or graph.

Use this module when you want to run SQL-like queries on your sentences. For example:

  • Select the POS and Lemmas of the Children of the ith word
  • Select the Dependency Relations of the Brothers of the ith word

Parameters

# the column indexes can be set explicitly if the input format is not Conll
ConllSimpleQuery( IDX_ID=0, IDX_FORM=1, IDX_LEMMA=2, 
                  IDX_UPOS=3, IDX_XPOS=4, IDX_MORPH=5, 
                  IDX_HEAD=6, IDX_DEPREL=7, IDX_GRAPH=8 )  

Methods

ConllSimpleQuery.set_sentence( sentence  )  
> sentence (object or list(list(str))): the Conll sentence to process
ConllSimpleQuery.set_target( target_index )  
> target_index (int): index of the target starting from zero
resp = ConllSimpleQuery.query( id_ , relation , feature )  
> id_ (int): index of the token we want to select
> relation (RELATION): relation or tuple of relations to find
> feature (FEATURE): feature or tuple of features to get from relation
> resp (list(str)): list with one string of selected feature values per token found
resp = ConllSimpleQuery.find( id_ , relation )  
> id_ (int): index of the token we want to select
> relation (RELATION): relation or tuple of relations to find
> resp (list(int)): list of the indexes of tokens found
resp = ConllSimpleQuery.select( id_ , feature )  
> id_ (int): index of the token we want to select
> feature (FEATURE): name of the feature or tuple of features to get
> resp (str): string containing the values of the selected features
path = ConllSimpleQuery.get_path_to_target( id_ )
> id_ (int): index of the token where the path to the target starts
> path (list(int)): list of ids of tokens in the path to target
path = ConllSimpleQuery.get_path_to_root( id_ )
> id_ (int): index of the token where the path to the root starts
> path (list(int)): list of ids of tokens in the path to root
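
The two path methods can be pictured as walks over the HEAD column. The sketch below is not the ConllQuery implementation: chain_to_root, path_to_root and path_to_target are hypothetical helpers, and it assumes zero-based token indexes, a well-formed tree, and a returned path that leaves out the starting token. HEADS holds the HEAD values of the example sentence in "A ConllSentence".

```python
# Illustrative only: hypothetical helpers, not the ConllQuery code.
# HEAD column of the example sentence (1-based ids, 0 marks the root).
HEADS = [2, 11, 5, 5, 2, 8, 8, 2, 11, 11, 0, 11, 16, 16, 16, 11]


def chain_to_root(heads, idx):
    """Zero-based indexes from idx up to the root, idx included."""
    chain = [idx]
    while heads[chain[-1]] != 0:
        chain.append(heads[chain[-1]] - 1)  # HEAD is 1-based
    return chain


def path_to_root(heads, idx):
    """Ancestors of idx, nearest first (assumption: idx itself is left out)."""
    return chain_to_root(heads, idx)[1:]


def path_to_target(heads, idx, target):
    """Climb from idx to the lowest common ancestor, then descend to target."""
    up = chain_to_root(heads, idx)
    down = chain_to_root(heads, target)
    position_in_down = {node: i for i, node in enumerate(down)}
    for i, node in enumerate(up):
        if node in position_in_down:
            climb = up[1:i + 1]
            descend = down[:position_in_down[node]][::-1]
            return climb + descend
    return []  # only reached if the two tokens share no ancestor


print(path_to_root(HEADS, 7))       # [1, 10]
print(path_to_target(HEADS, 11, 0))  # [10, 1, 0]
```

With the target set to index 0, the second call gives the same token sequence as the '10:1:0' string shown under Usage.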

A ConllSentence

ID FORM LEMMA UPOS XPOS MORPH HEAD DEPREL GRAPH
1 While while SCONJ IN _ 2 mark _
2 working work VERB VBG VerbForm=Ger 11 advcl _
3 at at ADP IN _ 5 case _
4 a a DET DT Definite=Ind|PronType=Art 5 det _
5 supermarket supermarket NOUN NN Number=Sing 2 obl _
6 as as ADP IN _ 8 case _
7 a a DET DT Definite=Ind|PronType=Art 8 det _
8 bagger bagger NOUN NN Number=Sing 2 obl _
9 , , PUNCT , _ 11 punct _
10 he he PRON PRP Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs 11 nsubj 2:nsubj
11 released release VERB VBD Mood=Ind|Tense=Past|VerbForm=Fin 0 root _
12 music music NOUN NN Number=Sing 11 obj _
13 as as ADP IN _ 16 case _
14 an a DET DT Definite=Ind|PronType=Art 16 det _
15 independent independent ADJ JJ Degree=Pos 16 amod _
16 artist artist NOUN NN Number=Sing 11 obl _
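
Since set_sentence accepts a list(list(str)), a table like the one above can be turned into that shape with plain string splitting. This is a minimal sketch under stated assumptions: parse_conll is a hypothetical helper (not part of ConllQuery), the header row is left out, and no field contains spaces. Only the first three rows are used to keep it short.

```python
# Hypothetical helper, shown for illustration; not part of ConllQuery.
CONLL_TEXT = """\
1 While while SCONJ IN _ 2 mark _
2 working work VERB VBG VerbForm=Ger 11 advcl _
3 at at ADP IN _ 5 case _
"""


def parse_conll(text):
    """Split whitespace-separated token lines into list(list(str))."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        rows.append(line.split())
    return rows


sentence = parse_conll(CONLL_TEXT)
print(sentence[0][1])  # While
print(sentence[1][6])  # 11 (HEAD of "working", still a string)
```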

Usage

from ConllQuery import ConllSimpleQuery
from ConllQuery.enumerations import FEATURES as F
from ConllQuery.enumerations import RELATION as R

# instantiation
parser = ConllSimpleQuery()

# set the sentence object that will be processed
parser.set_sentence( sentence ) 

# UPOS of the parent of the word at index 7 (indexes start from zero)
parser.query( 7, R.PARENT, F.UPOS ) 
> ['VERB']  # bagger (index 7) has a single parent, working, whose UPOS is VERB

# To refer to the word itself, use the TOKEN relation
parser.query( 7, R.TOKEN, F.FORM ) 
> ['bagger']  # Similar to ConllSelector, but returns a list

# Set the target to "While" (index 0)
parser.set_target( 0 )

# the feature argument accepts tuples; the selected values are joined
parser.query( 11, R.TOKEN, (F.PATH_TO_TARGET,F.DIRECTION_TO_TARGET) )
> ['10:1:0###uud'] # both strings are joined using '###' 

# the relation argument accepts tuples; here the tuple acts as a logical OR
parser.query( 7, (R.TOKEN,R.PARENT), F.LEMMA ) 
> ['bagger','work']  # retrieves both the token and the parent

# you can also provide tuples to both arguments
parser.query( 7, (R.TOKEN,R.PARENT), (F.UPOS,F.LEMMA) ) 
> ['NOUN###bagger','VERB###work']  # list of joined strings
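
The tuple behaviour above can be reproduced with a small stand-alone model, which may help when reasoning about what a query will return. This is a sketch, not the library code: the string keys, the CHILDREN relation and the query function are hypothetical stand-ins for the RELATION and FEATURES enumerations, and the de-duplication and ordering of tokens are assumptions.

```python
# Illustrative model of the tuple semantics; not the ConllQuery implementation.
SEP = "###"  # separator seen in the outputs above

# (FORM, LEMMA, UPOS, HEAD) for each token of the example sentence.
TOKENS = [
    ("While", "while", "SCONJ", 2), ("working", "work", "VERB", 11),
    ("at", "at", "ADP", 5), ("a", "a", "DET", 5),
    ("supermarket", "supermarket", "NOUN", 2), ("as", "as", "ADP", 8),
    ("a", "a", "DET", 8), ("bagger", "bagger", "NOUN", 2),
    (",", ",", "PUNCT", 11), ("he", "he", "PRON", 11),
    ("released", "release", "VERB", 0), ("music", "music", "NOUN", 11),
    ("as", "as", "ADP", 16), ("an", "a", "DET", 16),
    ("independent", "independent", "ADJ", 16), ("artist", "artist", "NOUN", 11),
]

# Hypothetical relation names mapped to functions returning zero-based indexes.
RELATIONS = {
    "TOKEN": lambda i: [i],
    "PARENT": lambda i: [TOKENS[i][3] - 1] if TOKENS[i][3] else [],
    "CHILDREN": lambda i: [j for j, t in enumerate(TOKENS) if t[3] - 1 == i],
}
FEATURES = {"FORM": 0, "LEMMA": 1, "UPOS": 2}


def as_tuple(value):
    """Wrap a single relation or feature so both cases share one code path."""
    return value if isinstance(value, tuple) else (value,)


def query(idx, relation, feature):
    # A tuple of relations is a logical OR: collect tokens matching any of them.
    found = []
    for rel in as_tuple(relation):
        for j in RELATIONS[rel](idx):
            if j not in found:  # assumption: each token is reported once
                found.append(j)
    # A tuple of features is a join: one SEP-joined string per token found.
    return [SEP.join(TOKENS[j][FEATURES[f]] for f in as_tuple(feature))
            for j in found]


print(query(7, "PARENT", "UPOS"))                       # ['VERB']
print(query(7, ("TOKEN", "PARENT"), ("UPOS", "LEMMA")))  # ['NOUN###bagger', 'VERB###work']
```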