Python - Update tuple string element error

I have a dataframe in which every row holds a list of tuples of the form (word, pos_tag). In each row, I want to mark the word in some of the tuples and then update each of those tuples with the marked word. For example:

Initial dataframe row:

[('This', 'DET'), ('is', 'VERB'), ('an', 'DET'), ('example', 'NOUN'), ('text', 'NOUN'), ('that', 'DET'), ('I', 'PRON'), ('use', 'VERB'), ('in', 'ADP'), ('order', 'NOUN'), ('to', 'PART'), ('get', 'VERB'), ('an', 'DET'), ('answer', 'NOUN')]

Updated words:

updated_word: <IN>example</IN>
updated_word: <TAR>answer</TAR>

Desired output:

[('This', 'DET'), ('is', 'VERB'), ('an', 'DET'), ('<IN>example</IN>', 'NOUN'), ('text', 'NOUN'), ('that', 'DET'), ('I', 'PRON'), ('use', 'VERB'), ('in', 'ADP'), ('order', 'NOUN'), ('to', 'PART'), ('get', 'VERB'), ('an', 'DET'), ('<TAR>answer</TAR>', 'NOUN')]

But I get the error TypeError: 'tuple' object is not callable. Can someone help? Here's the code:

for idx, row in df.iterrows():
    doc = nlp(row['title'])
    pos_tags = [(token.text, token.pos_) for token in doc if not token.pos_ == "PUNCT"]

    for position, tuple in enumerate(pos_tags, start=1):
        word = tuple[0]
        spacy_pos_tag = tuple[1]
        word = re.sub(r'[^\w\s]', '', word)
        for col in cols:
            if position in row[col]:
                word = f'<{col.upper()}>{word}</{col.upper()}>'
            else:
                continue
            tuple = tuple(word, spacy_pos_tag)
            print(tuple)


 >>>>   Traceback (most recent call last):
 >>>>   tuple = tuple(word, spacy_pos_tag)
 >>>>   TypeError: 'tuple' object is not callable

Updated question

I have replaced tuple with tuple_ as suggested, but I still can't get the desired output, which is a list of tuples in every row. How can I update the dataframe rows? Here's the updated code:

for idx, row in df.iterrows():
    doc = nlp(row['title'])
    pos_tags = [(token.text, token.pos_) for token in doc if not token.pos_ == "PUNCT"]
    # print(idx, "tokens, pos : ", pos_tags, "\n")

    for position, tuple_ in enumerate(pos_tags, start=1):
        word = tuple_[0]
        spacy_pos_tag = tuple_[1]
        word = re.sub(r'[^\w\s]', '', word)
        for col in cols:
            if position in row[col]:
                word = f'<{col.upper()}>{word}</{col.upper()}>'
            else:
                continue
            tuple_ = (word, spacy_pos_tag)
        pos_tags.append(' '.join(position, tuple_))
    # pos_tags.append(' '.join(tuple_))
    print(idx, "tokens, pos : ", pos_tags, "\n")
    
    
>>>> Traceback (most recent call last):
>>>> pos_tag(df=df_matched)
>>>> pos_tags.append(' '.join(position, tuple_))
>>>> TypeError: join() takes exactly one argument (2 given)

CodePudding user response：

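Do not use tuple as a variable name, because it is the name of a built-in Python type. Once the loop variable is named tuple, the call tuple(word, spacy_pos_tag) tries to call the current (word, pos_tag) tuple instead of the type, which raises the error. Try the following instead: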

    for position, tuple_ in enumerate(pos_tags, start=1):
        word = tuple_[0]
        spacy_pos_tag = tuple_[1]
        word = re.sub(r'[^\w\s]', '', word)
        for col in cols:
            if position in row[col]:
                word = f'<{col.upper()}>{word}</{col.upper()}>'
            else:
                continue

            tuple_ = (word, spacy_pos_tag)
            print(tuple_)
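
To see why the original line fails, here is a minimal sketch that reproduces the error without spaCy or pandas. The function names and the sample data are made up for illustration. It also shows that tuple(word, tag) would not work even without the shadowing, since the tuple constructor accepts a single iterable and not two separate values.

```python
sample_tags = [("example", "NOUN"), ("answer", "NOUN")]  # illustrative data


def rebuild_with_shadowing(pairs):
    """Reproduce the error: the loop variable hides the built-in tuple type."""
    for tuple in pairs:  # 'tuple' now refers to a 2-tuple, not the type
        word, tag = tuple[0], tuple[1]
        try:
            tuple = tuple(word, tag)  # tries to call ('example', 'NOUN')
        except TypeError as exc:
            return str(exc)
    return None


def constructor_rejects_two_arguments():
    """Even unshadowed, tuple() takes one iterable, not two values."""
    try:
        tuple("example", "NOUN")
    except TypeError:
        return True
    return False


def rebuild_without_shadowing(pairs):
    """Use a different loop name and a tuple literal to build the new pair."""
    rebuilt = []
    for pair in pairs:
        word, tag = pair
        rebuilt.append((word, tag))  # parentheses create the tuple directly
    return rebuilt


error_message = rebuild_with_shadowing(sample_tags)
print(error_message)
print(rebuild_without_shadowing(sample_tags))
```

The tuple literal (word, tag) is usually the simplest way to build the replacement pair, which is what the corrected loop above does.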

CodePudding user response：

Don't use "tuple" as the name of a variable. It's the name of a built-in type.
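
For the updated question, the join() error comes from passing two arguments: str.join accepts a single iterable of strings. Appending to pos_tags while looping over it is also risky, because the loop then keeps visiting the newly appended items. A possible approach is to build a new list instead. The sketch below is only an illustration under assumptions: the helper name mark_words is hypothetical, and the dictionary of 1-based positions stands in for the row[col] values, with the column names 'in' and 'tar' guessed from the <IN> and <TAR> tags in the desired output.

```python
import re


def mark_words(pos_tags, marked_positions):
    """Return a new list of (word, pos_tag) tuples with selected words tagged.

    pos_tags: list of (word, pos_tag) tuples for one row.
    marked_positions: dict mapping a label (assumed column name, e.g. 'in')
        to the 1-based positions of the words to mark.
    """
    updated = []
    for position, (word, pos_tag) in enumerate(pos_tags, start=1):
        # Strip punctuation characters from the word, as in the question.
        word = re.sub(r"[^\w\s]", "", word)
        for label, positions in marked_positions.items():
            if position in positions:
                word = f"<{label.upper()}>{word}</{label.upper()}>"
        # Always append, so unmarked tuples are kept in the result too.
        updated.append((word, pos_tag))
    return updated


row_tags = [('This', 'DET'), ('is', 'VERB'), ('an', 'DET'), ('example', 'NOUN'),
            ('text', 'NOUN'), ('that', 'DET'), ('I', 'PRON'), ('use', 'VERB'),
            ('in', 'ADP'), ('order', 'NOUN'), ('to', 'PART'), ('get', 'VERB'),
            ('an', 'DET'), ('answer', 'NOUN')]

marked = mark_words(row_tags, {"in": [4], "tar": [14]})
print(marked)
```

To get the result back into the dataframe, one option is to collect the list returned for each row inside the iterrows() loop and assign the collected lists as a new column once the loop has finished, rather than modifying the row during iteration.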