I have a data frame with the columns Id, Neighbors (the Ids of each row's neighbours), and Center.
import pandas as pd
data = [[1, '[2,3,5]', 'c0'], [2, '[3,5]', 'c1'], [3, '[2]', 'c2'],
        [4, '[5]', 'c1'], [5, '[1,2]', 'c2']]
df = pd.DataFrame(data, columns=['Id', 'Neighbors', 'Center'])
df
Now I need to reassign the Center values for each Id based on its neighbours.
My expected output is:
Id Neighbors Center
1 [2,3,5] c0
2 [3,5] c1
3 [2] c2
4 [5] c1
5 [1,2] c2
1 [2,3,5] c1
1 [2,3,5] c2
1 [2,3,5] c2
2 [3,5] c2
2 [3,5] c2
3 [2] c1
4 [5] c2
5 [1,2] c0
5 [1,2] c1
For example, the neighbours of Id 1 are 2, 3, and 5. I need to append one new row per neighbour, each keeping the Id and Neighbors values of Id 1 but taking the Center of Id 2, 3, and 5 respectively.
CodePudding user response:
This is a bit like a graph problem: you need to work out the parent-child relationship. Id is the parent level and Neighbors is the child level. The goal is to look up each child's Center, attach it to the original parent's information, and concatenate the result back onto the original frame.
import ast
# Parse the string "[2,3,5]" into a real list (safer than eval)
df['Neighbors'] = df['Neighbors'].apply(ast.literal_eval)
# One row per (Id, neighbour) pair
df_exp = df.explode('Neighbors')
# Attach each neighbour's Center by joining the neighbour onto Id
df_merged = df_exp.merge(df_exp[['Id','Center']], left_on='Neighbors', right_on='Id', suffixes=('_left', '_right')).drop_duplicates().sort_values('Id_left')
# Restore the full neighbour list for each parent Id
df_merged['Neighbors'] = df_merged['Id_left'].map(df.set_index('Id')['Neighbors'])
df_merged.drop(columns=['Center_left', 'Id_right'], inplace=True)
df_merged.rename(columns={'Id_left':'Id', 'Center_right':'Center'}, inplace=True)
# Append the new rows below the original ones
output = pd.concat([df, df_merged], axis=0).reset_index(drop=True)
output
###
Id Neighbors Center
0 1 [2, 3, 5] c0
1 2 [3, 5] c1
2 3 [2] c2
3 4 [5] c1
4 5 [1, 2] c2
5 1 [2, 3, 5] c1
6 1 [2, 3, 5] c2
7 1 [2, 3, 5] c2
8 2 [3, 5] c2
9 2 [3, 5] c2
10 3 [2] c1
11 4 [5] c2
12 5 [1, 2] c1
13 5 [1, 2] c0
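As an alternative to the merge, the same lookup can probably be done with a plain Series.map. The sketch below is only one possible variant, not the approach used above: it assumes Id values are unique and every neighbour list is non-empty, and the names center_of, pairs, Neighbor, and new_rows are made up for illustration.

```python
import ast
import pandas as pd

data = [[1, '[2,3,5]', 'c0'], [2, '[3,5]', 'c1'], [3, '[2]', 'c2'],
        [4, '[5]', 'c1'], [5, '[1,2]', 'c2']]
df = pd.DataFrame(data, columns=['Id', 'Neighbors', 'Center'])

# Parse the string "[2,3,5]" into a real list without calling eval.
df['Neighbors'] = df['Neighbors'].apply(ast.literal_eval)

# Lookup table Id -> Center (assumption: Id values are unique).
center_of = df.set_index('Id')['Center']

# One row per (Id, neighbour) pair; the helper column "Neighbor" is
# hypothetical and lets the full list stay intact in "Neighbors".
pairs = df.assign(Neighbor=df['Neighbors']).explode('Neighbor')

# Replace each row's Center with the Center of its neighbour
# (assumption: no empty lists, otherwise the cast to int would fail).
pairs['Center'] = pairs['Neighbor'].astype('int64').map(center_of)

# Drop the helper column and append the new rows below the originals.
new_rows = pairs.drop(columns='Neighbor')
output = pd.concat([df, new_rows], ignore_index=True)
print(output)
```

Because explode keeps the rows in their original order, the appended rows follow the order of each neighbour list, so no sort or drop_duplicates step is needed.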
This kind of structure is not rare. In property declarations for the Ministry of Finance, for example, it is a way to link the wealth of family members: Id could be the parent's Id, Neighbors could be the children's Ids, and Center could be the file name of a property inventory.