How to create a new key-value pair in a dictionary in a cell in Pandas DataFrame based on a conditio-CodePudding

I need to add a new key-value pair to the pandas dataframe column based on the condition. Destination column data is in dictionary format. So if the conditon is true, it is necessary to create pair otherwise, no action is needed. I am trying to make it through np.where:

df = pd.DataFrame({"amenity": ["1","2","3","4"], "tags": [{"building":"yes"},{"entrance": "yes"},{},{}], "sport": [None, "hockey", "football", None], "leisure":["multi", "some", "field", "wake"]})

leisure_var_add = ["field", "multi"]

df['tags']['sport'] = np.where((df['sport'] != None) | (df['leisure'].isin(leisure_var_add))), df['sport'], None)
df['tags']['leisure'] = np.where((df['sport'] == None) & (df['leisure'] !=None) & (~df['leisure'].isin(leisure_var_add)), df['leisure'], None)

I would like to get something like this:

  amenity                                         tags     sport leisure
0       1          {'building':'yes','sport': 'multi'}      None   multi
1       2        {'entrance': 'yes','sport': 'hockey'}    hokkey    some
2       3    {'sport': 'football', 'leisure': 'field'}  football   field
3       4                          {'leisure': 'wake'}      None    wake

I have implemented this task with a loop over each row and operations with indexes, but in this case, I lose all benefits of Pandas. Do you have any idea how it can be implemented?

CodePudding user response：

Use a comprehension:

df['tags'] = df[['sport', 'leisure']] \
                 .apply(lambda x: {k: v for k, v in x[x.notna()].items()}, axis=1)

Output:

>>> df
  amenity                                       tags     sport leisure
0       1                       {'leisure': 'multi'}      None   multi
1       2     {'sport': 'hokkey', 'leisure': 'some'}    hokkey    some
2       3  {'sport': 'football', 'leisure': 'field'}  football   field
3       4                        {'leisure': 'wake'}      None    wake

CodePudding user response：

i moved all the data to columns using an apply then iterrate the rows building a dictionary of tags using the column data excluding amenity

df = pd.DataFrame({"amenity": ["1","2","3","4"], "tags": [{"building":"yes"},{"entrance": "yes"},{},{}], "sport": [None, "hockey", "football", None], "leisure":["multi", "some", "field", "wake"]})

def EmptyList(x):
    if len(x)>0:
        return x[0]
    else:
        return None

 df['building']=df['tags'].apply(lambda x: [v for k,v in x.items() if k=='building']).apply(EmptyList)
 df['entrance']=df['tags'].apply(lambda x: [v for k,v in x.items() if k=='entrance']).apply(EmptyList)
 df.drop(['tags'],inplace=True,axis=1)
 print(df)

 tags_dict={}
 columns=df.columns
 for key,value in df.iterrows():
    for column in columns:
        if value[column]!=None and column != 'amenity':
            #print(value[column])
            tags_dict[column]=value[column]
     #print(tags_dict)
     df.loc[key,'tags']=str(tags_dict)
     tags_dict.clear()

 print(df)

output

  amenity     sport leisure building entrance  \
0       1      None   multi      yes     None   
1       2    hockey    some     None      yes   
2       3  football   field     None     None   
3       4      None    wake     None     None   

                                            tags  
0            {'leisure': 'multi', 'building': 'yes'}  
1  {'sport': 'hockey', 'leisure': 'some', 'entran...  
2          {'sport': 'football', 'leisure': 'field'}  
3                                {'leisure': 'wake'}