I have the following dataframe:
details
0 {"id":123,"code":"","name":"abc123","email":"[email protected]","status":"good"}
1 {"id":124,"code":"","name":"abc456","email":"[email protected]","status":"bad"}
I am looking to extract abc123 and abc456 for each of the rows in this dataframe. The data type of the column is currently object. I've tried converting it to a string and stripping the surrounding text, using the following:
lambda x: x.lstrip('name"":""').rstrip('"",""email"":'))
But it does not capture the values.
The expected output should be a dataframe with just the code values:
code
0 abc123
1 abc456
What would be the best way to accomplish this?
Any guidance is greatly appreciated.
CodePudding user response:
You could try something like:
>>> df['code'] = df.pop('details').str['name']
>>> df
code
0 abc123
1 abc456
Or, if the dictionaries are stored as JSON strings (as they appear in the question):
>>> df['code'] = df.pop('details').str.extract(r'"name":"(.*?)"', expand=False)
>>> df
code
0 abc123
1 abc456
>>>
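For reference, here is a self-contained sketch of the string case. It assumes the details column holds JSON text formatted exactly as in the question (double quotes, no space after the colons); the email addresses are placeholders, since the originals are not visible.

```python
import pandas as pd

# Sample data modelled on the question; the emails are made-up placeholders.
df = pd.DataFrame({
    "details": [
        '{"id":123,"code":"","name":"abc123","email":"a@example.com","status":"good"}',
        '{"id":124,"code":"","name":"abc456","email":"b@example.com","status":"bad"}',
    ]
})

# pop() removes the details column; expand=False makes extract() return
# a Series (rather than a one-column DataFrame) for the single capture group.
df["code"] = df.pop("details").str.extract(r'"name":"(.*?)"', expand=False)
print(df)
```

Note that the pattern is tied to this exact formatting. If the strings may contain whitespace around the colon or escaped quotes inside values, parsing the JSON is likely the safer route.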
CodePudding user response:
You can use json.loads:
import json
df['code'] = df['details'].apply(lambda x: json.loads(x).get('name'))
Output:
>>> df
details code
0 {"id":123,"code":"","name":"abc123","email":"t... abc123
1 {"id":124,"code":"","name":"abc456","email":"t... abc456
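If some rows might be missing or not valid JSON, the same idea can be wrapped in a small helper that returns None instead of raising. This is only a sketch: `extract_name` is a hypothetical helper name, and the sample rows (including the emails) are invented for illustration.

```python
import json
import pandas as pd

def extract_name(raw):
    """Return the 'name' field of a JSON string, or None if it cannot be read."""
    try:
        return json.loads(raw).get("name")
    except (TypeError, ValueError, AttributeError):
        # TypeError: raw is not a string (e.g. None or NaN)
        # ValueError: raw is not valid JSON (JSONDecodeError subclasses it)
        # AttributeError: the JSON parsed to something other than an object
        return None

# Made-up sample: two valid rows, one malformed string, one missing value.
df = pd.DataFrame({
    "details": [
        '{"id":123,"code":"","name":"abc123","email":"a@example.com","status":"good"}',
        '{"id":124,"code":"","name":"abc456","email":"b@example.com","status":"bad"}',
        "not json",
        None,
    ]
})

names = df["details"].apply(extract_name).tolist()
df["code"] = names
print(df["code"])
```

Rows that cannot be parsed end up as missing values in the code column, so they can be inspected or dropped afterwards.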