I need to add a new key-value pair to a pandas DataFrame column based on a condition. The destination column holds dictionaries. If the condition is true, the pair should be added; otherwise, no action is needed. I am trying to do this with np.where:
import numpy as np
import pandas as pd
df = pd.DataFrame({"amenity": ["1","2","3","4"], "tags": [{"building":"yes"},{"entrance": "yes"},{},{}], "sport": [None, "hockey", "football", None], "leisure":["multi", "some", "field", "wake"]})
leisure_var_add = ["field", "multi"]
df['tags']['sport'] = np.where((df['sport'] != None) | (df['leisure'].isin(leisure_var_add)), df['sport'], None)
df['tags']['leisure'] = np.where((df['sport'] == None) & (df['leisure'] !=None) & (~df['leisure'].isin(leisure_var_add)), df['leisure'], None)
I would like to get something like this:
   amenity  tags                                       sport     leisure
0  1        {'building': 'yes', 'sport': 'multi'}      None      multi
1  2        {'entrance': 'yes', 'sport': 'hockey'}     hockey    some
2  3        {'sport': 'football', 'leisure': 'field'}  football  field
3  4        {'leisure': 'wake'}                        None      wake
I have implemented this with a loop over each row and index operations, but that way I lose all the benefits of pandas. Do you have any idea how it can be implemented?
CodePudding user response:
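Use a dict comprehension inside apply. Note that this rebuilds tags from the sport and leisure columns only, so keys already stored in tags are replaced: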
df['tags'] = df[['sport', 'leisure']] \
.apply(lambda x: {k: v for k, v in x[x.notna()].items()}, axis=1)
Output:
>>> df
   amenity  tags                                       sport     leisure
0  1        {'leisure': 'multi'}                       None      multi
1  2        {'sport': 'hockey', 'leisure': 'some'}     hockey    some
2  3        {'sport': 'football', 'leisure': 'field'}  football  field
3  4        {'leisure': 'wake'}                        None      wake
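The comprehension above builds tags from sport and leisure only, so keys such as 'building' and 'entrance' do not survive, as the output shows. If the existing keys should be kept, one possible variation is to merge each row's existing dict with the new pairs. This is only a sketch: it assumes every non-null sport or leisure value should be copied, and it does not apply the leisure_var_add rule from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "amenity": ["1", "2", "3", "4"],
    "tags": [{"building": "yes"}, {"entrance": "yes"}, {}, {}],
    "sport": [None, "hockey", "football", None],
    "leisure": ["multi", "some", "field", "wake"],
})

# Assumption: every non-null sport/leisure value is copied into tags;
# the leisure_var_add condition from the question is not applied here.
df["tags"] = [
    # Start from the existing dict, then add the non-null column values.
    {**tags, **{k: v for k, v in (("sport", sport), ("leisure", leisure)) if pd.notna(v)}}
    for tags, sport, leisure in zip(df["tags"], df["sport"], df["leisure"])
]

print(df["tags"].tolist())
```

Building new dicts with {**a, **b} leaves the original dict objects untouched, which avoids surprises if they are referenced elsewhere.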
CodePudding user response:
I moved all the tag data into columns using apply, then iterated over the rows, building a dictionary of tags from the column data (excluding amenity):
import pandas as pd

df = pd.DataFrame({"amenity": ["1","2","3","4"], "tags": [{"building":"yes"},{"entrance": "yes"},{},{}], "sport": [None, "hockey", "football", None], "leisure":["multi", "some", "field", "wake"]})

def EmptyList(x):
    # Return the first element of the list, or None if the list is empty
    if len(x) > 0:
        return x[0]
    else:
        return None

# Pull the 'building' and 'entrance' values out of the tags dicts into their own columns
df['building'] = df['tags'].apply(lambda x: [v for k, v in x.items() if k == 'building']).apply(EmptyList)
df['entrance'] = df['tags'].apply(lambda x: [v for k, v in x.items() if k == 'entrance']).apply(EmptyList)
df.drop(['tags'], inplace=True, axis=1)
print(df)

tags_dict = {}
columns = df.columns
for key, value in df.iterrows():
    # Collect every non-None value in the row, skipping the amenity column
    for column in columns:
        if value[column] is not None and column != 'amenity':
            tags_dict[column] = value[column]
    # Store the dictionary as a string in the new tags column
    df.loc[key, 'tags'] = str(tags_dict)
    tags_dict.clear()
print(df)
Output:
amenity sport leisure building entrance \
0 1 None multi yes None
1 2 hockey some None yes
2 3 football field None None
3 4 None wake None None
tags
0 {'leisure': 'multi', 'building': 'yes'}
1 {'sport': 'hockey', 'leisure': 'some', 'entran...
2 {'sport': 'football', 'leisure': 'field'}
3 {'leisure': 'wake'}
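The same idea (expand the tags into columns, then rebuild one dictionary per row) can be sketched more compactly. This is an illustration under assumptions rather than a drop-in replacement: it assumes the tags dicts never already contain a 'sport' or 'leisure' key (join would raise on overlapping column names), and it stores real dicts instead of strings.

```python
import pandas as pd

df = pd.DataFrame({
    "amenity": ["1", "2", "3", "4"],
    "tags": [{"building": "yes"}, {"entrance": "yes"}, {}, {}],
    "sport": [None, "hockey", "football", None],
    "leisure": ["multi", "some", "field", "wake"],
})

# Expand every dict in "tags" into columns; keys missing from a row become NaN.
expanded = pd.DataFrame(df["tags"].tolist(), index=df.index)

# Assumption: no tag key collides with "sport" or "leisure".
combined = expanded.join(df[["sport", "leisure"]])

# Rebuild one dict per row, dropping the missing (NaN/None) entries.
df["tags"] = [row.dropna().to_dict() for _, row in combined.iterrows()]

print(df["tags"].tolist())
```

This still iterates over rows, so it is not faster in any meaningful way; it mainly avoids hard-coding the 'building' and 'entrance' keys.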