ConllRelator¶
ConllRelator inherits from ConllGraphParser and adds methods for selecting the tokens of a sentence through their parenthood relations (parent, children, brothers, ancestors and so on).
You should use this module if you want to do things like:
- Find the parent of the 3rd token
- Find the children and grandchildren of the 5th token
- Find the brothers of the 7th token
- Find the brothers and uncles of the 7th token
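All of these relations can be read off the HEAD column of a Conll sentence. The following is a minimal sketch of that idea in plain Python; it is not ConllRelator's implementation, and the helper names (`parent`, `children`, `brothers`, `grandchildren`, `uncles`) are hypothetical. It uses the HEAD values of the example sentence shown on this page and zero-based token indexes.

```python
# HEAD column of the example sentence (1-based ids, 0 marks the root).
HEADS = [2, 11, 5, 5, 2, 8, 8, 2, 11, 11, 0, 11, 16, 16, 16, 11]


def parent(idx):
    """Zero-based index of the parent of token idx, or None for the root."""
    head = HEADS[idx]
    return head - 1 if head > 0 else None


def children(idx):
    """Zero-based indexes of every token whose HEAD points at token idx."""
    return [i for i, head in enumerate(HEADS) if head - 1 == idx]


def brothers(idx):
    """Tokens sharing the parent of token idx, excluding idx itself."""
    p = parent(idx)
    if p is None:
        return []
    return [i for i in children(p) if i != idx]


def grandchildren(idx):
    """Children of the children of token idx."""
    return [g for c in children(idx) for g in children(c)]


def uncles(idx):
    """Brothers of the parent of token idx."""
    p = parent(idx)
    return brothers(p) if p is not None else []


print(parent(7))         # parent of 'bagger'
print(brothers(11))      # brothers of 'music'
print(grandchildren(1))  # grandchildren of 'working'
```

Each helper is a single scan over the HEAD column, which is enough for sentence-sized inputs.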
Parameters¶
# the column indexes can be overridden if the input does not follow the Conll column order
ConllRelator( IDX_ID=0, IDX_FORM=1, IDX_LEMMA=2,
IDX_UPOS=3, IDX_XPOS=4, IDX_MORPH=5,
IDX_HEAD=6, IDX_DEPREL=7, IDX_GRAPH=8 )
Methods¶
ConllRelator.set_sentence( sentence )
> sentence (object or list(list(str))): the Conll sentence to process, one list of column values per token
ConllRelator.set_target( target_index )
> target_index (int): zero-based index of the target token
resp = ConllRelator.find( id_ , relation )
> id_ (int): zero-based index of the token we want to select
> relation (RELATION): relation or tuple of relations to find
> resp (list(int)): list of the indexes of tokens found
path = ConllRelator.get_path_to_target( id_ )
> id_ (int): index of the token where we start the path to target
> path (list(int)): list of ids of tokens in the path to target
path = ConllRelator.get_path_to_root( id_ )
> id_ (int): index of the token where we start the path to root
> path (list(int)): list of ids of tokens in the path to root
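To make the two path methods concrete, here is a minimal sketch of how such paths can be computed from the HEAD column alone. It is an illustration under assumptions, not the library's code: the function names are hypothetical, the sentence is assumed to be a single-rooted tree, and the start token is assumed to be excluded from the returned path (which matches the results shown in the Usage section).

```python
# HEAD column of the example sentence (1-based ids, 0 marks the root).
HEADS = [2, 11, 5, 5, 2, 8, 8, 2, 11, 11, 0, 11, 16, 16, 16, 11]


def chain_to_root(heads, idx):
    """Zero-based indexes from idx (included) up to the root (included)."""
    chain = [idx]
    while heads[chain[-1]] > 0:
        chain.append(heads[chain[-1]] - 1)
    return chain


def path_to_root(heads, idx):
    """Ancestors of idx, nearest first; the start token is left out."""
    return chain_to_root(heads, idx)[1:]


def path_to_target(heads, idx, target):
    """Tokens visited when walking the tree from idx to target."""
    up = chain_to_root(heads, idx)
    down = chain_to_root(heads, target)
    on_target_chain = set(down)
    # First token on the way up that is the target or one of its ancestors.
    meet = next(i for i in up if i in on_target_chain)
    climb = up[1:up.index(meet) + 1]          # idx excluded, meeting point included
    descent = down[:down.index(meet)][::-1]   # from below the meeting point to target
    return climb + descent


print(path_to_root(HEADS, 7))        # from 'bagger' up to the root
print(path_to_target(HEADS, 11, 0))  # from 'music' to 'While'
```

The path to a target climbs to the lowest token that dominates both ends and then descends, so it can pass through tokens that are not ancestors of the start token.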
A ConllSentence¶
ID | FORM | LEMMA | UPOS | XPOS | MORPH | HEAD | DEPREL | GRAPH |
---|---|---|---|---|---|---|---|---|
1 | While | while | SCONJ | IN | _ | 2 | mark | _ |
2 | working | work | VERB | VBG | VerbForm=Ger | 11 | advcl | _ |
3 | at | at | ADP | IN | _ | 5 | case | _ |
4 | a | a | DET | DT | Definite=Ind\|PronType=Art | 5 | det | _ |
5 | supermarket | supermarket | NOUN | NN | Number=Sing | 2 | obl | _ |
6 | as | as | ADP | IN | _ | 8 | case | _ |
7 | a | a | DET | DT | Definite=Ind\|PronType=Art | 8 | det | _ |
8 | bagger | bagger | NOUN | NN | Number=Sing | 2 | obl | _ |
9 | , | , | PUNCT | , | _ | 11 | punct | _ |
10 | he | he | PRON | PRP | Case=Nom\|Gender=Masc\|Number=Sing\|Person=3\|PronType=Prs | 11 | nsubj | 2:nsubj |
11 | released | release | VERB | VBD | Mood=Ind\|Tense=Past\|VerbForm=Fin | 0 | root | _ |
12 | music | music | NOUN | NN | Number=Sing | 11 | obj | _ |
13 | as | as | ADP | IN | _ | 16 | case | _ |
14 | an | a | DET | DT | Definite=Ind\|PronType=Art | 16 | det | _ |
15 | independent | independent | ADJ | JJ | Degree=Pos | 16 | amod | _ |
16 | artist | artist | NOUN | NN | Number=Sing | 11 | obl | _ |
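The table above can be turned into the list(list(str)) form that set_sentence is documented to accept. The sketch below is one hedged way to build it: `to_rows` is a hypothetical helper, not part of ConllQuery, and it assumes one inner list per token with the columns in the default order. Columns are split on whitespace to keep the example readable; real Conll files are tab-separated, so splitting on "\t" would be the safer choice there.

```python
# The example sentence, one token per line, columns in the default order:
# ID FORM LEMMA UPOS XPOS MORPH HEAD DEPREL GRAPH
RAW = """\
1 While while SCONJ IN _ 2 mark _
2 working work VERB VBG VerbForm=Ger 11 advcl _
3 at at ADP IN _ 5 case _
4 a a DET DT Definite=Ind|PronType=Art 5 det _
5 supermarket supermarket NOUN NN Number=Sing 2 obl _
6 as as ADP IN _ 8 case _
7 a a DET DT Definite=Ind|PronType=Art 8 det _
8 bagger bagger NOUN NN Number=Sing 2 obl _
9 , , PUNCT , _ 11 punct _
10 he he PRON PRP Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs 11 nsubj 2:nsubj
11 released release VERB VBD Mood=Ind|Tense=Past|VerbForm=Fin 0 root _
12 music music NOUN NN Number=Sing 11 obj _
13 as as ADP IN _ 16 case _
14 an a DET DT Definite=Ind|PronType=Art 16 det _
15 independent independent ADJ JJ Degree=Pos 16 amod _
16 artist artist NOUN NN Number=Sing 11 obl _
"""


def to_rows(text):
    """Split Conll text into one list of column strings per token."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        rows.append(line.split())
    return rows


sentence = to_rows(RAW)
print(len(sentence), sentence[7])
```

The resulting `sentence` holds 16 rows of 9 strings each; all values, including ID and HEAD, stay as strings.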
Usage¶
from ConllQuery import ConllRelator
from ConllQuery.enumerations import RELATION as R
# instantiation
parser = ConllRelator()
# set the sentence object that will be processed
parser.set_sentence( sentence )
# Parent of the token at index 7 (bagger); indexes start from zero
parser.find( 7, R.PARENT )
> [1] # the parent of bagger(7) is working(1)
# Ancestors of the token at index 7 (bagger)
parser.find( 7, R.ANCESTORS )
> [1,10] # the ancestors of bagger(7) are working(1) and released(10)
# Set the target to 'While' (index 0)
parser.set_target( 0 )
# Path from music to the target 'While'
parser.find( 11, R.TARGET )
> [10,1,0] # the path is music->released->working->while
# If you pass a tuple of relations to find, it is interpreted as an OR
parser.find( 11, (R.BROTHER,R.PARENT) )
> [1,8,9,15,10] # brothers of music ([1, 8, 9, 15]) + parent([10])
# no match leads to an empty list
parser.find( 10, R.PARENT )
> [] # released(10) is the root, so it has no parent
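The OR behaviour of a tuple can be pictured as concatenating the results of each relation in the order given. The sketch below reproduces the two results above in plain Python. It is only an illustration: `Relation`, `FINDERS` and this `find` are hypothetical stand-ins for the library's RELATION enumeration and ConllRelator.find, and dropping duplicates is an assumption rather than documented behaviour.

```python
from enum import Enum


class Relation(Enum):
    """Hypothetical stand-in for ConllQuery's RELATION enumeration."""
    PARENT = "parent"
    BROTHER = "brother"


# HEAD column of the example sentence (1-based ids, 0 marks the root).
HEADS = [2, 11, 5, 5, 2, 8, 8, 2, 11, 11, 0, 11, 16, 16, 16, 11]


def _parent(idx):
    """Parent of idx as a list, empty for the root."""
    head = HEADS[idx]
    return [head - 1] if head > 0 else []


def _brothers(idx):
    """Tokens with the same HEAD as idx, excluding idx and the root case."""
    head = HEADS[idx]
    if head == 0:
        return []
    return [i for i, h in enumerate(HEADS) if h == head and i != idx]


FINDERS = {Relation.PARENT: _parent, Relation.BROTHER: _brothers}


def find(idx, relation):
    """Accept one relation or a tuple of relations (treated as an OR)."""
    relations = relation if isinstance(relation, tuple) else (relation,)
    found = []
    for rel in relations:
        for token in FINDERS[rel](idx):
            if token not in found:  # assumption: duplicates are dropped
                found.append(token)
    return found


print(find(11, (Relation.BROTHER, Relation.PARENT)))
print(find(10, Relation.PARENT))
```

Returning an empty list instead of None for a missing relation lets callers iterate over the result without a special case.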