I have a dataframe with the following structure:
[
{
"name": "Book1",
"details": [
{
"id": 30278752,
"isbn": " 1594634025",
"isbn13": "9781594634024 ",
"text_reviews_count": 417,
"work_reviews_count": 3313007,
"work_text_reviews_count": 109912,
"average_rating": " 3.92"
}
]
},
{
"name": "Book2",
"details": [
{
"id": 34006942,
"isbn": " 1501173219",
"isbn13": " 9781501173219 ",
"text_reviews_count": 565,
"work_reviews_count": 2142280,
"work_text_reviews_count": 75053,
"average_rating": "4.33 "
}
]
}
]
How can i remove whitespaces (there are spaces after and before the string) please!
CodePudding user response:
names = ["isbn","isbn13" ,"average_rating"]
for i in v:
for j in names:
i["details"][0][j] = i["details"][0][j].replace(" ","")
You can try to replace them this way.
Out[17]:
[{'name': 'Book1',
'details': [{'id': 30278752,
'isbn': '1594634025',
'isbn13': '9781594634024',
'text_reviews_count': 417,
'work_reviews_count': 3313007,
'work_text_reviews_count': 109912,
'average_rating': '3.92'}]},
{'name': 'Book2',
'details': [{'id': 34006942,
'isbn': '1501173219',
'isbn13': '9781501173219',
'text_reviews_count': 565,
'work_reviews_count': 2142280,
'work_text_reviews_count': 75053,
'average_rating': '4.33'}]}]
here is the output.V is the list you have mentioned
CodePudding user response:
You should place json data into columns. Here is my approach:
# Pull first element from list
df.loc[:, 'spread_details'] = df.details.map(lambda x: x[0])
# Spread JSON into columns
df1 = pd.json_normalize(df['spread_details'])
merged_df = pd.concat([df, df1], axis=1)
merged_df = merged_df.drop(columns=['details', 'spread_details'])
# Trim required columns
cols_to_be_trimmed = ["isbn", "isbn13"]
merged_df[cols] = merged_df[cols_to_be_trimmed].apply(lambda x: x.str.strip())
merged_df.head(2)
CodePudding user response:
We can create a lambda function to strip the spaces from string values in dictionary, then map
this function on the details
column of dataframe:
strip = lambda d: {k: v.strip() if isinstance(v, str) else v for k, v in d.items()}
df['details'] = df['details'].map(lambda L: [strip(d) for d in L])
Result
>>> df.to_dict('r')
[{'name': 'Book1',
'details': [{'id': 30278752,
'isbn': '1594634025',
'isbn13': '9781594634024',
'text_reviews_count': 417,
'work_reviews_count': 3313007,
'work_text_reviews_count': 109912,
'average_rating': '3.92'}]},
{'name': 'Book2',
'details': [{'id': 34006942,
'isbn': '1501173219',
'isbn13': '9781501173219',
'text_reviews_count': 565,
'work_reviews_count': 2142280,
'work_text_reviews_count': 75053,
'average_rating': '4.33'}]}]