I'm trying to understand how a Key-Bigram extractor works, and I cannot understand what the following block of code does.
Here is the source code.
import spacy
nlp = spacy.load("en_core_web_sm")
string = '1 2 3 4 5 6 7 8 9'
def textProcessing(doc):
    Words = []
    doc = nlp(doc)
    for possible_words in doc:
        Words.append([possible_words, [child for child in possible_words.children]])
    print(Words)
textProcessing(string)
Everything else is working fine and I understand it well; however, I cannot understand what [child for child in possible_words.children] does.
CodePudding user response:
I don't know spaCy's API very well, being a user of nltk ;)
But, simply put: since possible_words is a Token, its children property yields the tokens that are direct syntactic children of that token in the dependency parse.
Therefore, [child for child in possible_words.children] turns the iterable returned by this property into a list.
I'd have written it as list(possible_words.children), though.
CodePudding user response:
token.children uses the dependency parse to get all tokens that directly depend on the token in question. In a visualization (try displacy), these are all the tokens with arrows pointing away from the token: if the word is a verb, this could be the subject and any objects; if the word is a noun, it could be any adjectives modifying it, for example.
CodePudding user response:
It looks like possible_words is an instance of a class defined by the spaCy library, and children is an attribute of that class.