ConllGraphParser

The base class from which all the other classes inherit. It can be used to compute the syntactic path between two words of the Conll tree/graph.

Use it if you want to:

  • Find the ids of the nodes in the path from the ith word to the jth word

Parameters

# the column indexes can be overridden if the input does not use the standard Conll column order
ConllGraphParser( IDX_ID=0, IDX_FORM=1, IDX_LEMMA=2, 
                  IDX_UPOS=3, IDX_XPOS=4, IDX_MORPH=5, 
                  IDX_HEAD=6, IDX_DEPREL=7, IDX_GRAPH=8 )  

Methods

ConllGraphParser.set_sentence( sentence  )  
> sentence (object or list(list(str))): the Conll sentence to process, one row of column values per word
ConllGraphParser.set_target( target_index )  
> target_index (int): zero-based index of the target word  
path = ConllGraphParser.get_path_to_target( id_ )
> id_ (int): zero-based index of the word where the path to the target starts  
> path (list(int)): list of ids of words in the path to target
path = ConllGraphParser.get_path_to_root( id_ )
> id_ (int): zero-based index of the word where the path to the root starts  
> path (list(int)): list of ids of words in the path to root

A ConllSentence

ID FORM LEMMA UPOS XPOS MORPH HEAD DEPREL GRAPH
1 While while SCONJ IN _ 2 mark _
2 working work VERB VBG VerbForm=Ger 11 advcl _
3 at at ADP IN _ 5 case _
4 a a DET DT Definite=Ind|PronType=Art 5 det _
5 supermarket supermarket NOUN NN Number=Sing 2 obl _
6 as as ADP IN _ 8 case _
7 a a DET DT Definite=Ind|PronType=Art 8 det _
8 bagger bagger NOUN NN Number=Sing 2 obl _
9 , , PUNCT , _ 11 punct _
10 he he PRON PRP Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs 11 nsubj 2:nsubj
11 released release VERB VBD Mood=Ind|Tense=Past|VerbForm=Fin 0 root _
12 music music NOUN NN Number=Sing 11 obj _
13 as as ADP IN _ 16 case _
14 an a DET DT Definite=Ind|PronType=Art 16 det _
15 independent independent ADJ JJ Degree=Pos 16 amod _
16 artist artist NOUN NN Number=Sing 11 obl _
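
The `sentence` argument is documented above as accepting a `list(list(str))`. As a minimal sketch, assuming each inner list holds one word's columns in the order shown in the header row, the example sentence can be turned into that shape as follows. The helper name `conll_to_rows` is hypothetical and not part of the library; real Conll files are usually tab-separated, so splitting on `"\t"` may be safer there than splitting on any whitespace.

```python
# The example sentence from above, one word per line, columns separated by spaces.
CONLL_TEXT = """\
1 While while SCONJ IN _ 2 mark _
2 working work VERB VBG VerbForm=Ger 11 advcl _
3 at at ADP IN _ 5 case _
4 a a DET DT Definite=Ind|PronType=Art 5 det _
5 supermarket supermarket NOUN NN Number=Sing 2 obl _
6 as as ADP IN _ 8 case _
7 a a DET DT Definite=Ind|PronType=Art 8 det _
8 bagger bagger NOUN NN Number=Sing 2 obl _
9 , , PUNCT , _ 11 punct _
10 he he PRON PRP Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs 11 nsubj 2:nsubj
11 released release VERB VBD Mood=Ind|Tense=Past|VerbForm=Fin 0 root _
12 music music NOUN NN Number=Sing 11 obj _
13 as as ADP IN _ 16 case _
14 an a DET DT Definite=Ind|PronType=Art 16 det _
15 independent independent ADJ JJ Degree=Pos 16 amod _
16 artist artist NOUN NN Number=Sing 11 obl _
"""


def conll_to_rows(text):
    """Hypothetical helper: split Conll text into one list of strings per word."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (e.g. "# text = ...").
        if not line or line.startswith("#"):
            continue
        rows.append(line.split())
    return rows


sentence = conll_to_rows(CONLL_TEXT)
print(sentence[4][1])  # FORM column of the word at index 4
```

Note that the ID column counts from one, while the indexes passed to the parser count from zero, so the word with ID 5 ('supermarket') sits at index 4.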

Usage

from ConllQuery import ConllGraphParser

# instantiation
parser = ConllGraphParser()

# set the sentence object that will be processed
parser.set_sentence( sentence ) 

# set the word at index 4 (indexes start from zero),
# i.e. 'supermarket', as the target
parser.set_target( 4 ) 

# Path from music to supermarket
parser.get_path_to_target( 11 ) 
> [10, 1, 4]  # the path is music->released->working->supermarket

# Path from music to the root    
parser.get_path_to_root( 11 ) 
> [10]  # the path is music->released

# Update the target, now it is set to 'While'
parser.set_target( 0 )

# Path from music to 'While'
parser.get_path_to_target( 11 )
> [10, 1, 0]  # the path is music->released->working->While

# Path from music to the root should remain unchanged 
parser.get_path_to_root( 11 ) 
> [10]  # the path is music->released (remains unchanged)

# Path from root to root leads to an empty list
parser.get_path_to_root( 10 )
> [] # is empty

# Path from target to target leads to an empty list
parser.get_path_to_target( parser.target_index )
> [] # is empty
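
The paths in the usage example can be reproduced from the HEAD column alone, which may help when checking results by hand. The following is a minimal standalone sketch, not the library's implementation: the functions `path_to_root` and `path_to_target` are hypothetical names, the sketch assumes a single-rooted tree, and it ignores the GRAPH column. It walks up the heads from both words and joins the two chains at their lowest common ancestor.

```python
# HEAD column of the example sentence (1-based ids, 0 marks the root).
HEADS = [2, 11, 5, 5, 2, 8, 8, 2, 11, 11, 0, 11, 16, 16, 16, 11]


def path_to_root(heads, start):
    """Zero-based indexes visited going from `start` up to the root, excluding `start`."""
    path = []
    node = start
    while heads[node] != 0:
        node = heads[node] - 1  # convert the 1-based head id to a 0-based index
        path.append(node)
    return path


def path_to_target(heads, start, target):
    """Zero-based indexes on the tree path from `start` to `target`, excluding `start`."""
    up = [start] + path_to_root(heads, start)
    down = [target] + path_to_root(heads, target)
    # Drop the shared ancestors; the last one dropped is the lowest common ancestor.
    common = None
    while up and down and up[-1] == down[-1]:
        common = up.pop()
        down.pop()
    # Climb from start to the common ancestor, then descend to the target.
    path = up + [common] + down[::-1]
    return path[1:]


print(path_to_target(HEADS, 11, 4))  # [10, 1, 4]
print(path_to_target(HEADS, 11, 0))  # [10, 1, 0]
print(path_to_root(HEADS, 11))       # [10]
print(path_to_root(HEADS, 10))       # []
```

As in the usage example, the starting word is left out of the returned list, so a path from a word to itself is empty.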