ConllSelector¶

Inherits from ConllGraphParser and adds functionality for accessing the attributes of tokens.

Use this module when you need to access the attributes of tokens in a sentence. For example:

Find the Part-of-Speech of the 3rd token
Find the Dependency Relation of the 5th token
Find the Dependency Relation and Lemma of the 11th token

Parameters¶

# the column indexes can be set explicitly if the format is not CoNLL
ConllSelector( IDX_ID=0, IDX_FORM=1, IDX_LEMMA=2, 
               IDX_UPOS=3, IDX_XPOS=4, IDX_MORPH=5, 
               IDX_HEAD=6, IDX_DEPREL=7, IDX_GRAPH=8 )  
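
To illustrate what these indexes refer to, here is a minimal sketch in plain Python that does not use ConllQuery at all. It assumes each token is a row of tab-separated columns; the helper `field`, the three-column layout, and the name `CUSTOM_IDX_UPOS` are hypothetical and only show why the indexes would have to change for a non-CoNLL layout.

```python
# Default column positions, mirroring the ConllSelector defaults above.
IDX_FORM, IDX_LEMMA, IDX_UPOS = 1, 2, 3

def field(row, idx):
    """Return the column at position idx of one token row (hypothetical helper)."""
    return row[idx]

# A token row in the default layout (token 8 of the example sentence).
conll_row = "8\tbagger\tbagger\tNOUN\tNN\tNumber=Sing\t2\tobl\t_".split("\t")
print(field(conll_row, IDX_UPOS))          # NOUN

# A hypothetical three-column layout: FORM, UPOS, LEMMA.
custom_row = "bagger\tNOUN\tbagger".split("\t")
CUSTOM_IDX_UPOS = 1                        # UPOS moved from column 3 to column 1
print(field(custom_row, CUSTOM_IDX_UPOS))  # NOUN
```

In the second layout the default `IDX_UPOS=3` would point past the end of the row, which is the situation the constructor parameters are meant to handle.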

Methods¶

ConllSelector.set_sentence( sentence  )  
> sentence (object or list(list(str))): the CoNLL sentence to process

ConllSelector.set_target( target_index )  
> target_index (int): index of the target starting from zero

resp = ConllSelector.select( id_ , feature )  
> id_ (int): index of the token we want to select
> feature (FEATURE): name of the feature or tuple of features to get
> resp (str): string containing the values of the selected features

path = ConllSelector.get_path_to_target( id_ )
> id_ (int): index of the token where the path to the target starts
> path (list(int)): list of ids of tokens in the path to target

path = ConllSelector.get_path_to_root( id_ )
> id_ (int): index of the token where the path to the root starts
> path (list(int)): list of ids of tokens in the path to root
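
The two path methods can be pictured with a small stand-alone sketch. This is not the library's implementation, which is not shown here. It assumes that HEAD values are 1-based token IDs with 0 marking the root, that token indexes are zero-based, and that the starting token is left out of the returned path, as the '10:1:0' result in the Usage section suggests. The function names and the `heads` list are hypothetical.

```python
def path_to_root(heads, start):
    """Indexes visited when climbing from start to the root (start excluded)."""
    path = []
    current = start
    while heads[current] != 0:          # HEAD 0 marks the root token
        current = heads[current] - 1    # HEAD is a 1-based ID; convert to 0-based
        path.append(current)
    return path

def path_to_target(heads, start, target):
    """Indexes on the tree path from start to target (start excluded)."""
    up = [start] + path_to_root(heads, start)
    down = [target] + path_to_root(heads, target)
    # The first node of `up` that is also on target's chain is where the path turns.
    target_chain = set(down)
    turn = next(node for node in up if node in target_chain)
    climb = up[1:up.index(turn) + 1]            # from start up to the turning node
    descend = down[:down.index(turn)][::-1]     # from below the turning node to target
    return climb + descend

# HEAD column of the example sentence in "A ConllSentence", in token order.
heads = [2, 11, 5, 5, 2, 8, 8, 2, 11, 11, 0, 11, 16, 16, 16, 11]

print(path_to_root(heads, 7))        # [1, 10]     bagger -> working -> released
print(path_to_target(heads, 11, 0))  # [10, 1, 0]  music -> released -> working -> While
```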

A ConllSentence¶

ID	FORM	LEMMA	UPOS	XPOS	MORPH	HEAD	DEPREL	GRAPH
1	While	while	SCONJ	IN	_	2	mark	_
2	working	work	VERB	VBG	VerbForm=Ger	11	advcl	_
3	at	at	ADP	IN	_	5	case	_
4	a	a	DET	DT	Definite=Ind|PronType=Art	5	det	_
5	supermarket	supermarket	NOUN	NN	Number=Sing	2	obl	_
6	as	as	ADP	IN	_	8	case	_
7	a	a	DET	DT	Definite=Ind|PronType=Art	8	det	_
8	bagger	bagger	NOUN	NN	Number=Sing	2	obl	_
9	,	,	PUNCT	,	_	11	punct	_
10	he	he	PRON	PRP	Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs	11	nsubj	2:nsubj
11	released	release	VERB	VBD	Mood=Ind|Tense=Past|VerbForm=Fin	0	root	_
12	music	music	NOUN	NN	Number=Sing	11	obj	_
13	as	as	ADP	IN	_	16	case	_
14	an	a	DET	DT	Definite=Ind|PronType=Art	16	det	_
15	independent	independent	ADJ	JJ	Degree=Pos	16	amod	_
16	artist	artist	NOUN	NN	Number=Sing	11	obl	_
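
Since set_sentence accepts a list(list(str)), a sentence like the one above could be turned into that shape with a few lines of plain Python. The sketch below is only an illustration: `parse_rows` is a hypothetical helper, only the first three tokens are included, and whether set_sentence expects exactly this layout is an assumption based on its documented type.

```python
# First three rows of the example sentence, tab-separated as in a CoNLL file.
raw = (
    "1\tWhile\twhile\tSCONJ\tIN\t_\t2\tmark\t_\n"
    "2\tworking\twork\tVERB\tVBG\tVerbForm=Ger\t11\tadvcl\t_\n"
    "3\tat\tat\tADP\tIN\t_\t5\tcase\t_\n"
)

def parse_rows(text):
    """Split CoNLL text into one list of column strings per token,
    skipping blank lines and '#' comment lines."""
    return [
        line.split("\t")
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]

sentence = parse_rows(raw)
print(len(sentence))     # 3
print(sentence[1][1])    # working
```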

Usage¶

from ConllQuery import ConllSelector
from ConllQuery.enumerations import FEATURES as F

# instantiation
parser = ConllSelector()

# set the sentence object that will be processed
parser.set_sentence( sentence ) 

# UPOS of the token at index 7
parser.select( 7, F.UPOS ) 
> 'NOUN'  # the UPOS of bagger(7) is NOUN

# LEMMA of the token at index 7
parser.select( 7, F.LEMMA ) 
> 'bagger'  # the LEMMA of bagger(7) is bagger

# Set the target to while
parser.set_target( 0 )

# Path from music to the target 'While'    
parser.select( 11, F.PATH_TO_TARGET )
> '10:1:0' # the path is music->released->working->while

# Direction of each step on the path from music to the target 'While'
parser.select( 11, F.DIRECTION_TO_TARGET )
> 'uud' # the path is music->released->working->while

# If you pass a tuple of features, the selected values are joined into one string
parser.select( 7, (F.UPOS,F.LEMMA) ) 
> 'NOUN###bagger'  # both strings are joined using '###'
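
Because a tuple query returns a single string, the individual values have to be split out again before use. A minimal sketch, assuming the separator is always the literal '###' shown above and that no feature value contains it; the name `SEPARATOR` is hypothetical.

```python
SEPARATOR = "###"               # separator shown in the usage example above

resp = "NOUN###bagger"          # result of the (F.UPOS, F.LEMMA) query above
upos, lemma = resp.split(SEPARATOR)
print(upos, lemma)              # NOUN bagger
```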