I have a CSV file with a 'filename' column like the one below. I would like to fill a 'label' column with only the part of each filename that comes before the first run of four underscores (____). For example, the output should look like this:
filename label
1. trash____010_B201915134___page-2.png trash
2. DE_PH_Varodem_W408_2018-03____010_B211391275___page-1.png DE_PH_Varodem_W408_2018-03
How can I extract only that part of each value in the 'filename' column and put it into the 'label' column?
CodePudding user response:
Maybe you can try this:
import pandas as pd
data = pd.read_csv('file_path')
def extract_label(filename):
    # Keep only the text before the first run of four underscores.
    return filename.split('____')[0]
data['label'] = data['filename'].apply(extract_label)
CodePudding user response:
Using the str.split() method, and assuming that you import your CSV as a DataFrame:
import pandas as pd
df = pd.read_csv('file_path')
df['label'] = df['filename'].str.split('____').str[0]
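To check the idea without a CSV on disk, here is a minimal sketch that builds the DataFrame in memory from the two filenames in the question. The in-memory frame and the `n=1` argument are additions for illustration, not part of the answer above; `n=1` just limits the split to the first '____'.

```python
import pandas as pd

# Sample rows copied from the question; a real script would use pd.read_csv instead.
df = pd.DataFrame({
    "filename": [
        "trash____010_B201915134___page-2.png",
        "DE_PH_Varodem_W408_2018-03____010_B211391275___page-1.png",
    ]
})

# n=1 stops after the first '____', so at most one split is performed per row.
df["label"] = df["filename"].str.split("____", n=1).str[0]

print(df["label"].tolist())
```

A filename that contains no '____' at all comes back unchanged as its own label, so it may be worth checking for such rows afterwards.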
CodePudding user response:
df['label'] = df.apply(lambda row:row['filename'].split('____')[0], axis=1)
CodePudding user response:
Please try this line in your code.
df['label'] = [x.split("___")[0] for x in df['filename']]
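The same list-comprehension idea can be tried in plain Python, without pandas. The sketch below is only an illustration: the helper name `label_from_filename` and the choice of `str.partition` (instead of `split`) are assumptions, not something from the answers above. `partition` splits once at the first occurrence of the separator and returns the whole string as the head when the separator is missing.

```python
def label_from_filename(filename: str, sep: str = "____") -> str:
    """Return the text before the first occurrence of sep.

    If sep does not occur, the whole filename is returned unchanged.
    """
    head, _, _ = filename.partition(sep)
    return head


# Sample filenames copied from the question.
filenames = [
    "trash____010_B201915134___page-2.png",
    "DE_PH_Varodem_W408_2018-03____010_B211391275___page-1.png",
]

labels = [label_from_filename(name) for name in filenames]
print(labels)
```

With a DataFrame, the helper could be used in the same way as the one-liner, for example by building the list from df['filename'] and assigning it to the label column.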