How to extract a value after colon in all the rows from a pandas dataframe column?

Time:11-14

I have a pandas DataFrame named data with 200 rows. It has a column B that looks like this:

B
-----------------------------------
'animal':'cat', 'bird':'peacock'...

I want to extract the value of animal into a separate column C for every row.

I tried the code below, but it doesn't work.

data['C'] = data["B"].apply(lambda x: x.split(':')[-2] if ':' in x else x)

Please help.

CodePudding user response:

I'm not totally sure of the structure of your data. Does this look right?

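import re
import pandas as pd

# Sample data, assuming each row holds a single 'key':'value' pair
df = pd.DataFrame({
    "B": ["'animal':'cat'", "'bird':'peacock'"]
})

# Keep everything after the first colon, then drop the surrounding quotes
df["C"] = df["B"].apply(lambda x: re.sub(r"^.*?:(.*)$", r"\1", x).strip("'"))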
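
If each row instead holds several pairs in one string, as in the question's example, taking everything after the first colon would also pick up the later pairs. Likewise, split(':')[-2] returns the piece between the last two colons, not the animal value. Here is a minimal sketch that targets the 'animal' key directly with Series.str.extract. The sample rows are made up, and it assumes B holds plain strings with single-quoted keys and values.

```python
import pandas as pd

# Hypothetical sample rows; assumes column B contains plain strings
data = pd.DataFrame({
    "B": [
        "'animal':'cat', 'bird':'peacock'",
        "'animal':'dog', 'bird':'crow'",
    ]
})

# Capture the single-quoted text that follows the 'animal' key.
# expand=False returns a Series because there is only one capture group.
data["C"] = data["B"].str.extract(r"'animal'\s*:\s*'([^']*)'", expand=False)

print(data)
```

Rows that have no 'animal' key end up as NaN in C rather than raising an error.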

CodePudding user response:

If each cell of B holds a nested dictionary rather than a string, pd.json_normalize can unpack it:

import pandas as pd

data = pd.DataFrame({'B': [{0: {'animal': 'cat', 'bird': 'peacock'}}]})

data['C'] = pd.json_normalize(data['B'])['0.animal']
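
If B actually contains strings that only look like the inside of a dictionary, one option is to parse them first. This is a rough sketch and not necessarily how your data is stored: it assumes every row is a comma-separated list of quoted 'key':'value' pairs, so that wrapping it in braces gives a valid Python dict literal. The sample rows and the parse_pairs helper are made up for illustration.

```python
import ast
import pandas as pd

# Hypothetical sample rows; assumes each string is a valid dict body
data = pd.DataFrame({
    "B": [
        "'animal':'cat', 'bird':'peacock'",
        "'animal':'dog', 'bird':'crow'",
    ]
})

def parse_pairs(text):
    """Wrap the text in braces and parse it safely as a dict literal."""
    return ast.literal_eval("{" + text + "}")

# Parse every row into a dict, then look up the 'animal' key
parsed = data["B"].apply(parse_pairs)
data["C"] = parsed.apply(lambda d: d.get("animal"))

print(data)
```

ast.literal_eval only evaluates literals, so it is safer than eval here, but it raises an error on rows that do not form a valid dict literal once wrapped in braces.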