I have a dataframe where some of the values are empty lists and others are lists of dicts, like this:
0 [{'text': 'Improvement in steam-engine side-va... [] [] [{'text': '@einen tetes strut ffice. IMPROV...
1 [{'text': 'Gate.', 'language': 'en', 'truncate... [] [] [{'text': 'No. 645,359. Patented Mar. 13, I900...
2 [{'text': 'Overseaming sewing-machine.', 'lang... [] [] [{'text': 'No. 64 5,8l5. Patented Mar. 20, I90...
I want to change the values where they are lists of dicts to be just one value from the first dict of the list. I would have liked to do something like this:
df.loc[df!=[]] = df[0]['text']
Which obviously doesn't work.
CodePudding user response:
So, given this toy dataframe:
import pandas as pd

df = pd.DataFrame(
    [
        [
            [{"text": "Improvement ..."}],
            [],
            [],
            [{"text": "@einen tete..."}],
        ],
        [
            [{"text": "Overseaming..."}],
            [],
            [],
            [{"text": "No. 64 5,8l5..."}],
        ],
    ]
)
print(df)
# Outputs
0 1 2 3
0 [{'text': 'Improvement ...'}] [] [] [{'text': '@einen tete...'}]
1 [{'text': 'Overseaming...'}] [] [] [{'text': 'No. 64 5,8l5...'}]
You could do this:
df = df.applymap(lambda x: x[0]["text"] if x != [] else x)
print(df)
# Outputs
0 1 2 3
0 Improvement ... [] [] @einen tete...
1 Overseaming... [] [] No. 64 5,8l5...
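If the cells are less uniform than in the toy example, a named function may be easier to maintain than the lambda. The following is only a sketch under a few assumptions: first_text and apply_elementwise are hypothetical helper names, and cells that are not non-empty lists of dicts are assumed to be left untouched. It also accounts for the fact that applymap was deprecated in pandas 2.1 in favour of DataFrame.map.

```python
import pandas as pd


def first_text(cell):
    """Return the "text" of the first dict in a non-empty list, else the cell itself."""
    # Hypothetical helper: anything that is not a non-empty list of dicts is kept as-is.
    if isinstance(cell, list) and cell and isinstance(cell[0], dict):
        # Fall back to the original cell if the dict has no "text" key.
        return cell[0].get("text", cell)
    return cell


def apply_elementwise(frame, func):
    """Apply func to every cell, using DataFrame.map where it exists."""
    # DataFrame.map was added in pandas 2.1; older versions only have applymap.
    if hasattr(pd.DataFrame, "map"):
        return frame.map(func)
    return frame.applymap(func)


df = pd.DataFrame(
    [
        [[{"text": "Improvement ..."}], [], [], [{"text": "@einen tete..."}]],
        [[{"text": "Overseaming..."}], [], [], [{"text": "No. 64 5,8l5..."}]],
    ]
)

result = apply_elementwise(df, first_text)
print(result)
```

Because first_text checks the type of each cell, applying it a second time to result should leave the strings as they are.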
Alternatively, starting again from the original dataframe, you could iterate and update values like this:
for col in df.columns:
    for i in df.index:
        try:
            # Take the "text" of the first dict in the list
            df.loc[i, col] = df.loc[i, col][0]["text"]
        except IndexError:
            # Empty list: leave the cell unchanged
            continue
print(df)
# Outputs
0 1 2 3
0 Improvement ... [] [] @einen tete...
1 Overseaming... [] [] No. 64 5,8l5...
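One caveat with the try/except loop: if a cell already holds a string, indexing it with [0]["text"] raises a TypeError, which the except IndexError clause does not catch. A possible variant, sketched here with an explicit type check and df.at for scalar access (my own choices, not part of the answer above), skips anything that is not a non-empty list:

```python
import pandas as pd

df = pd.DataFrame(
    [
        [[{"text": "Improvement ..."}], [], [], [{"text": "@einen tete..."}]],
        [[{"text": "Overseaming..."}], [], [], [{"text": "No. 64 5,8l5..."}]],
    ]
)


def replace_in_place(frame):
    """Overwrite each non-empty list cell with the "text" of its first dict."""
    for col in frame.columns:
        for i in frame.index:
            cell = frame.at[i, col]
            # Strings and empty lists are skipped, so nothing is raised for them.
            if isinstance(cell, list) and cell:
                frame.at[i, col] = cell[0]["text"]


replace_in_place(df)
replace_in_place(df)  # a second pass finds only strings and empty lists
print(df)
```

For large frames the element-wise version is usually the more convenient choice; the explicit loop is mainly useful when you want to update the frame in place.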
CodePudding user response:
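Building on Laurent's great answer, this solves the problem in one line using DataFrame functionality. The conditional expression needs an else branch (without it the lambda is a syntax error), so empty lists are returned unchanged:
df = df.applymap(lambda x: x[0]["text"] if x != [] else x)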