Generating 3 columns from one with .apply on dataframe

Time:05-19

I want to extract some data from each row and turn it into new columns of an existing or new dataframe, without repeating the same re.match operation for every column.

Here's how one entry of the dataframe looks:

00:00 Someones_name: some text goes here

I have a regex that successfully captures the 3 groups I need:

re.match(r"^(\d{2}:\d{2}) (.*): (.*)$", x)

The problem is how to get matched_part[1], [2], and [3] into separate columns without running the match again for every new column.

The solution that I don't want is:

new_df['time'] = old_df['text'].apply(function1)
new_df['name'] = old_df['text'].apply(function2)
new_df['text'] = old_df['text'].apply(function3)

def function1(x):
  return re.match(r"^(\d{2}:\d{2}) (.*): (.*)$", x)[1]

CodePudding user response:

You can use str.extract with your pattern. It returns one column per capture group, so the regex only has to be written and applied once:

df[['time','name', 'text']] = df['col1'].str.extract(r"^(\d{2}:\d{2}) (.*): (.*)$")
print(df)
#                                        col1   time           name  \
# 0  00:00 Someones_name: some text goes here  00:00  Someones_name   

#                   text  
# 0  some text goes here  
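
For reference, here is a self-contained sketch of the same idea using named groups, so the column names come from the pattern itself. The sample rows and the names `parts`, `split_row`, and `via_apply` are made up for illustration; `col1` just mirrors the column name used above. It also shows one possible way to stay with .apply while matching only once per row, in case that is preferred.

```python
import re
import pandas as pd

# Hypothetical sample data; the second row deliberately does not match.
df = pd.DataFrame({"col1": [
    "00:00 Someones_name: some text goes here",
    "this row does not match the pattern",
]})

# Named groups become the column names of the frame returned by str.extract.
pattern = r"^(?P<time>\d{2}:\d{2}) (?P<name>.*): (?P<text>.*)$"
parts = df["col1"].str.extract(pattern)

# Attach the three new columns; non-matching rows get NaN in all of them.
df = df.join(parts)
print(df)

# Alternative that keeps .apply: match once per row and expand the groups.
compiled = re.compile(pattern)

def split_row(x):
    m = compiled.match(x)
    empty = {"time": None, "name": None, "text": None}
    return pd.Series(m.groupdict() if m else empty)

# Returning a Series from the function makes apply produce a DataFrame.
via_apply = df["col1"].apply(split_row)
```

Note that `(.*)` for the name is greedy, so if the message text itself contains ": " the split happens at the last occurrence; using `(.*?)` for the name group would split at the first one instead.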