Extracting certain words from a column

Time:12-21

I have a CSV file with a column 'filename', as shown below. I would like the 'label' column to contain only the part of each filename before the run of underscores (____). For example, the output should look like this:

filename                                                      label
1. trash____010_B201915134___page-2.png                       trash
2. DE_PH_Varodem_W408_2018-03____010_B211391275___page-1.png  DE_PH_Varodem_W408_2018-03

How can I extract only those words from the 'filename' column and put them into the 'label' column?

CodePudding user response:

Maybe you can try this:


import pandas as pd

data = pd.read_csv('file_path')

def extract_label(filename):
    # Keep everything before the first run of four underscores.
    return filename.split('____')[0]

# Apply the helper to every value in the 'filename' column.
data['label'] = data['filename'].apply(extract_label)

CodePudding user response:

Using the str.split() method, and assuming that you import your CSV as a DataFrame:

import pandas as pd
df = pd.read_csv('file_path')
df['label'] = df['filename'].str.split('____').str[0]
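
As a quick check, here is a minimal sketch that runs the same split on the two filenames from the question. The in-memory DataFrame is an assumption made for illustration; in practice the data would come from pd.read_csv as above.

```python
import pandas as pd

# Sample rows copied from the question (stand-in for the real CSV file).
df = pd.DataFrame({
    "filename": [
        "trash____010_B201915134___page-2.png",
        "DE_PH_Varodem_W408_2018-03____010_B211391275___page-1.png",
    ]
})

# Split on the four-underscore separator and keep the first piece.
df["label"] = df["filename"].str.split("____").str[0]

print(df["label"].tolist())
# ['trash', 'DE_PH_Varodem_W408_2018-03']
```

Single underscores inside the label (as in DE_PH_Varodem_W408_2018-03) are preserved because only the four-underscore run is used as the separator.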

CodePudding user response:

df['label'] = df.apply(lambda row:row['filename'].split('____')[0], axis=1)

CodePudding user response:

Please try this line in your code:

df['label'] = [x.split("___")[0] for x in df['filename']]
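
One thing to watch with a plain list comprehension: if the 'filename' column might contain missing values, calling .split on them raises an AttributeError, whereas the .str accessor passes missing values through. The sketch below illustrates this under the assumption of a hypothetical row with no filename; the guard shown is just one possible way to handle it.

```python
import pandas as pd

# Hypothetical data: one filename from the question plus a missing value.
df = pd.DataFrame({"filename": ["trash____010_B201915134___page-2.png", None]})

# The vectorized .str accessor leaves the missing value as missing.
labels = df["filename"].str.split("____").str[0]

# A list comprehension needs an explicit guard for non-string values.
safe = [
    x.split("____")[0] if isinstance(x, str) else None
    for x in df["filename"]
]
```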