Let's say I have text data in a pandas DataFrame where each row has multiple labels:
              Text    Label
0       I love you     A, B
1        Thank you     C, D
2  You are welcome  A, B, C
I want to convert it to a text file where each row holds the sentence, followed by the marker __label__, followed by the labels separated by single spaces.
The text file should look like this:
I love you __label__ A B
Thank you __label__ C D
You are welcome __label__ A B C
CodePudding user response:
import pandas as pd

# Sample data matching the question: labels are comma-separated strings
df = {
    'Text': ['I love you', 'Thank you', 'You are welcome'],
    'Label': ['A, B', 'C, D', 'A, B, C']
}
data = pd.DataFrame(df, columns=['Text', 'Label'])
print(data)

with open('read1me.txt', 'w') as f:
    for index, row in data.iterrows():
        text = row['Text']
        # Drop the commas so labels are separated by spaces only
        lbl = row['Label'].replace(',', '')
        # Write "text __label__ labels", one row per line
        f.write(f'{text} __label__ {lbl}\n')
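A slightly more defensive variant of the loop above, as a minimal sketch: it splits each label string on commas and strips whitespace, so 'A,B' and 'A , B' both come out as 'A B'. The helper name format_rows and the file name labeled.txt are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

def format_rows(frame, text_col="Text", label_col="Label"):
    """Return one 'text __label__ L1 L2 ...' string per row (hypothetical helper)."""
    lines = []
    for text, labels in zip(frame[text_col], frame[label_col]):
        # Split on commas, then drop surrounding whitespace and empty pieces.
        parts = [p.strip() for p in labels.split(",") if p.strip()]
        lines.append(f"{text} __label__ {' '.join(parts)}")
    return lines

data = pd.DataFrame({
    "Text": ["I love you", "Thank you", "You are welcome"],
    "Label": ["A, B", "C, D", "A, B, C"],
})
lines = format_rows(data)

# 'labeled.txt' is an assumed file name.
with open("labeled.txt", "w") as f:
    f.write("\n".join(lines) + "\n")
```

Building the list of lines first keeps the formatting logic separate from the file I/O, which makes it easy to inspect the result before writing.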
CodePudding user response:
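You might be tempted to use to_csv() with the separator set to ' __label__ ', but to_csv() only accepts a single-character separator, so that call raises a TypeError. Instead, build each line with string operations and write the lines out:
lines = df['Text'] + ' __label__ ' + df['Label'].str.replace(',', '', regex=False)
with open('filename.txt', 'w') as f:
    f.write('\n'.join(lines) + '\n')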
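Whichever approach writes the file, it can help to check that a line splits back into its text and labels. The following is a small sketch for that check; parse_line is a hypothetical helper, and it assumes the marker is surrounded by single spaces as in the desired output.

```python
def parse_line(line, marker=" __label__ "):
    """Split one output line into (text, [labels]); hypothetical helper."""
    # partition() splits on the first occurrence of the marker only.
    text, sep, labels = line.rstrip("\n").partition(marker)
    if not sep:
        raise ValueError(f"marker not found in line: {line!r}")
    # Labels are separated by whitespace, so a plain split() recovers them.
    return text, labels.split()

print(parse_line("You are welcome __label__ A B C\n"))
# → ('You are welcome', ['A', 'B', 'C'])
```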