strip string from column that contains JSON

Time:09-16

I have the following dataframe:

  details
0 {"id":123,"code":"","name":"abc123","email":"[email protected]","status":"good"}
1 {"id":124,"code":"","name":"abc456","email":"[email protected]","status":"bad"}

I am looking to pull out abc123 and abc456 (the name values) for each of the rows in this dataframe. The column's data type is currently object. I tried converting it to a string and then stripping the surrounding text with the following:

lambda x: x.lstrip('name"":""').rstrip('"",""email"":')

But it does not capture the values.

The expected output should be a dataframe with just the code values:

  code
0 abc123
1 abc456

What would be the best way to accomplish this?

Any guidance is greatly appreciated.

CodePudding user response:

You could try something like:

>>> df['code'] = df.pop('details').str['name']
>>> df
     code
0  abc123
1  abc456
>>> 

Or if the dictionaries are strings:

>>> df['code'] = df.pop('details').str.extract(r'"name":\s*"(.*?)"', expand=False)
>>> df
     code
0  abc123
1  abc456
>>> 
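
As background on why the lstrip/rstrip attempt in the question leaves the value untouched: those methods treat their argument as a set of characters to trim from the ends, not as a literal prefix or suffix. A minimal sketch using only the standard library follows; the email address is a placeholder, since the real one is not shown in the question.

```python
import json

# One row from the question; the email value is a made-up placeholder.
row = '{"id":123,"code":"","name":"abc123","email":"user@example.com","status":"good"}'

# lstrip/rstrip remove leading/trailing characters that belong to the given
# character set. The row starts with "{" and ends with "}", neither of which
# is in the sets below, so nothing is removed at all.
stripped = row.lstrip('name"":""').rstrip('"",""email"":')
print(stripped == row)

# Parsing the string as JSON gives direct access to the field instead.
name = json.loads(row)["name"]
print(name)
```

This is why parsing (or a regular expression, as above) tends to be a better fit here than trimming characters from the ends.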

CodePudding user response:

You can use json.loads:

import json

df['code'] = df['details'].apply(lambda x: json.loads(x).get('name'))

Output:

>>> df
                                             details    code
0  {"id":123,"code":"","name":"abc123","email":"t...  abc123
1  {"id":124,"code":"","name":"abc456","email":"t...  abc456
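
If some rows might not hold valid JSON, the same idea can be wrapped in a small helper. This is only a sketch under assumptions: the column is assumed to hold JSON strings, the helper name extract_name is hypothetical, the email addresses are placeholders, and the third row is an invented malformed value added to show the fallback.

```python
import json

import pandas as pd

# Sample data modelled on the question; emails are placeholders and the
# last row is a made-up malformed value.
df = pd.DataFrame({
    "details": [
        '{"id":123,"code":"","name":"abc123","email":"a@example.com","status":"good"}',
        '{"id":124,"code":"","name":"abc456","email":"b@example.com","status":"bad"}',
        "not json",
    ]
})


def extract_name(raw):
    """Return the "name" field of a JSON string, or None if parsing fails."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):  # JSONDecodeError is a ValueError subclass
        return None
    return parsed.get("name") if isinstance(parsed, dict) else None


# Build a dataframe that contains only the extracted values.
result = pd.DataFrame({"code": df["details"].apply(extract_name)})
print(result)
```

Rows that cannot be parsed end up as missing values, which can then be dropped with result.dropna() if desired.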