Apply function to a whole dataframe

Time:10-11

I downloaded data from an API and it came back pretty messy. I have the DataFrame below and want to apply a function to every cell so that only the text after the last ":" is kept. The catch is that some cells contain more than one ":". Here is the data:

import pandas as pd

matrix = [('points:31','points:25'),
          ('name:Deportes wigan','name:Deportes loco'),
          ('team:id:1059','team:id:1057'),
          ('league:id:1000','league:id:1000')
         ]

Create a DataFrame object:

df = pd.DataFrame(matrix, columns=list('ab'))
df

I want the following DataFrame:

31, 25
Deportes wigan, Deportes loco
1059, 1057
1000, 1000

I tried:

new_df = df.apply(lambda x: x.split(":")[-1])
new_df

CodePudding user response:

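DataFrame.apply passes each whole column (a Series) to the function, so x.split fails there. You can use DataFrame.applymap instead, which calls the function on every cell (in pandas 2.1 and later it is renamed DataFrame.map):

new_df = df.applymap(lambda x: x.split(":")[-1])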
print(new_df)

                a              b
0              31             25
1  Deportes wigan  Deportes loco
2            1059           1057
3            1000           1000
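
For reference, here is a self-contained sketch of the element-wise approach. The helper name `after_last_colon` and the `hasattr` check are my own additions, not part of the answer above; the check assumes you may be on either side of pandas 2.1, where `applymap` was deprecated in favour of `DataFrame.map`.

```python
import pandas as pd

matrix = [
    ("points:31", "points:25"),
    ("name:Deportes wigan", "name:Deportes loco"),
    ("team:id:1059", "team:id:1057"),
    ("league:id:1000", "league:id:1000"),
]
df = pd.DataFrame(matrix, columns=list("ab"))


def after_last_colon(cell):
    # Hypothetical helper: split once from the right, keep the last piece.
    return cell.rsplit(":", 1)[-1]


# Use DataFrame.map where it exists (pandas 2.1+), else fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
new_df = elementwise(after_last_colon)
print(new_df)
```

Note that every cell must be a string for this to work; a cell holding a number or NaN would raise an AttributeError inside the helper.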

Or column by column with Series.str.split:

new_df = df.apply(lambda col: col.str.split(":").str[-1])
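
A runnable sketch of the column-wise variant, on a shortened version of the data. It swaps in `Series.str.rsplit` with `n=1`, which is my own substitution for the `str.split` call above: since only the last piece is needed, a single split from the right should be enough.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": ["points:31", "team:id:1059"],
        "b": ["name:Deportes loco", "league:id:1000"],
    }
)

# For each column: split once from the right on ":", then take the last piece.
new_df = df.apply(lambda col: col.str.rsplit(":", n=1).str[-1])
print(new_df)
```

Unlike the element-wise version, the `.str` accessor yields NaN for missing cells rather than raising.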

CodePudding user response:

You can post-process the data with str.extract:

df = (pd.DataFrame(matrix, columns = list('ab'))
        .apply(lambda s: s.str.extract('(?<=:)([^:]*$)', expand=False))
     )

output:

                a              b
0              31             25
1  Deportes wigan  Deportes loco
2            1059           1057
3            1000           1000
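
To see what the pattern does, here is a small sketch that applies the same regular expression to single strings with the standard `re` module. The names `PATTERN`, `after_last_colon`, and the sample strings are illustrative assumptions, not part of the answer.

```python
import re

# (?<=:)   the match must start right after a colon
# ([^:]*$) capture everything up to the end of the string that is not a colon
PATTERN = re.compile(r"(?<=:)([^:]*$)")


def after_last_colon(text):
    # Hypothetical helper: return the captured group, or None without a match.
    match = PATTERN.search(text)
    return match.group(1) if match else None


samples = ["points:31", "team:id:1059", "no colon here"]
results = [after_last_colon(s) for s in samples]
print(results)
```

A string without any ":" does not match at all, so `str.extract` returns NaN for such a cell, whereas the split-based answers return the cell unchanged. Which behaviour is preferable depends on how you want to treat malformed cells.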