I have the following dataframe:
{'POINT_ID': {0: 'ABC B4-14 c1-1', 1: 'ABC B4-14 c1-2', 2: 'ABC 14-10 c6-2', 3: 'ABC 14-10 c6-3', 4: 'ABC 14-03 c12-1', 5: 'ABC 14-03 c12-2', 6: 'ABC 14-01A c9-2', 7: 'ABC 14-01A c9-3', 8: 'DE 368 c1-1', 9: 'DE 368 c1-2', 10: 'DE 483 c3-2', 11: 'DE 483 c3-3', 12: 'FG 1 c4-1', 13: 'HI 1A c4-3'}, 'Count': {0: 1.25, 1: 2.66, 2: 3.5, 3: 6.6, 4: 9.54, 5: 10.57, 6: 11.8, 7: 2.5, 8: 1.1, 9: 1.04, 10: 2.22, 11: 3.22, 12: 1.5, 13: 1.2}}
I want to map the values in the column POINT_ID, using this dictionary:
labels = {'ABC B4-14': ['Local 1', 'Context 1'],
'ABC 14-10': ['Local 2', 'Context 2'],
'ABC 14-03': ['Local 2', 'Context 2'],
'ABC 14-01A': ['Local 1', 'Context 1'],
'DE 368': ['Local 3', 'Context 3'],
'DE 483': ['Local 3', 'Context 4'],
'FG 1': ['Local 4', 'Context 5'],
'HI 1A': ['Local 5', 'Context 6']}
So I can generate the following dataframe:
{'POINT_ID': {0: 'ABC B4-14 c1-1', 1: 'ABC B4-14 c1-2', 2: 'ABC 14-10 c6-2', 3: 'ABC 14-10 c6-3', 4: 'ABC 14-03 c12-1', 5: 'ABC 14-03 c12-2', 6: 'ABC 14-01A c9-2', 7: 'ABC 14-01A c9-3', 8: 'DE 368 c1-1', 9: 'DE 368 c1-2', 10: 'DE 483 c3-2', 11: 'DE 483 c3-3', 12: 'FG 1 c4-1', 13: 'HI 1A c4-3'}, 'Count': {0: 1.25, 1: 2.66, 2: 3.5, 3: 6.6, 4: 9.54, 5: 10.57, 6: 11.8, 7: 2.5, 8: 1.1, 9: 1.04, 10: 2.22, 11: 3.22, 12: 1.5, 13: 1.2}, 'Local': {0: 'Local 1', 1: 'Local 1', 2: 'Local 2', 3: 'Local 2', 4: 'Local 2', 5: 'Local 2', 6: 'Local 1', 7: 'Local 1', 8: 'Local 3', 9: 'Local 3', 10: 'Local 3', 11: 'Local 3', 12: 'Local 4', 13: 'Local 5'}, 'Context': {0: 'Context 1', 1: 'Context 1', 2: 'Context 2', 3: 'Context 2', 4: 'Context 2', 5: 'Context 2', 6: 'Context 1', 7: 'Context 1', 8: 'Context 3', 9: 'Context 3', 10: 'Context 4', 11: 'Context 4', 12: 'Context 5', 13: 'Context 6'}}
The dictionary keys are contained within the POINT_ID values, but they don't match exactly: each POINT_ID has an extra trailing token such as 'c1-1'. Also, each dictionary value is a list, and each element of that list needs to go into a different column.
CodePudding user response:
You could craft a DataFrame from the dictionary and merge on the extracted first part of POINT_ID:
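import pandas as pd
# one row per dictionary key, with columns 'Local' and 'Context'
df2 = pd.DataFrame(labels).set_axis(['Local', 'Context']).T
# everything before the last space is the lookup key
ID = df['POINT_ID'].str.extract(r'^(.*) \S+$', expand=False)
out = df.merge(df2, left_on=ID, right_index=True).drop(columns='key_0')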
Another option, using map and join:
ID = df['POINT_ID'].str.extract(r'^(.*) \S+$', expand=False)
# map each key to its [Local, Context] list, then expand the lists into two columns
out = df.join(pd.DataFrame(ID.map(labels).to_list(), index=df.index, columns=['Local', 'Context']))
output:
POINT_ID Count Local Context
0 ABC B4-14 c1-1 1.25 Local 1 Context 1
1 ABC B4-14 c1-2 2.66 Local 1 Context 1
2 ABC 14-10 c6-2 3.50 Local 2 Context 2
3 ABC 14-10 c6-3 6.60 Local 2 Context 2
4 ABC 14-03 c12-1 9.54 Local 2 Context 2
5 ABC 14-03 c12-2 10.57 Local 2 Context 2
6 ABC 14-01A c9-2 11.80 Local 1 Context 1
7 ABC 14-01A c9-3 2.50 Local 1 Context 1
8 DE 368 c1-1 1.10 Local 3 Context 3
9 DE 368 c1-2 1.04 Local 3 Context 3
10 DE 483 c3-2 2.22 Local 3 Context 4
11 DE 483 c3-3 3.22 Local 3 Context 4
12 FG 1 c4-1 1.50 Local 4 Context 5
13 HI 1A c4-3 1.20 Local 5 Context 6
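For reference, here is a self-contained sketch of the merge approach that can be run as-is. It uses only a four-row subset of the question's data, builds the lookup table with pd.DataFrame.from_dict, and stores the extracted key in a temporary column named 'key' (a name chosen here for illustration). It also uses how='left', which is an assumption about the desired behaviour: rows without a matching key are kept and get NaN instead of being dropped.

```python
import pandas as pd

# Small subset of the question's data, for illustration only.
df = pd.DataFrame({
    'POINT_ID': ['ABC B4-14 c1-1', 'ABC 14-01A c9-3', 'DE 483 c3-2', 'HI 1A c4-3'],
    'Count': [1.25, 2.5, 2.22, 1.2],
})
labels = {
    'ABC B4-14': ['Local 1', 'Context 1'],
    'ABC 14-01A': ['Local 1', 'Context 1'],
    'DE 483': ['Local 3', 'Context 4'],
    'HI 1A': ['Local 5', 'Context 6'],
}

# Greedy (.*) captures everything before the last space;
# \S+ consumes the trailing token such as 'c1-1'.
ID = df['POINT_ID'].str.extract(r'^(.*) \S+$', expand=False)

# One row per dictionary key, one column per list element.
df2 = pd.DataFrame.from_dict(labels, orient='index', columns=['Local', 'Context'])

# 'key' is a temporary helper column (hypothetical name), dropped after the merge.
out = (
    df.assign(key=ID)
      .merge(df2, left_on='key', right_index=True, how='left')
      .drop(columns='key')
)
print(out)
```

Storing the key as a named column avoids relying on the automatically generated 'key_0' column name.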
CodePudding user response:
I'm assuming the lookup key is everything up to the last space in POINT_ID.
df2 = pd.DataFrame(
df.POINT_ID.str.rsplit(n=1).str[0].map(labels).tolist(), # Where magic happens
index=df.index, columns=['Local', 'Context']
)
df.join(df2)
output:
POINT_ID Count Local Context
0 ABC B4-14 c1-1 1.25 Local 1 Context 1
1 ABC B4-14 c1-2 2.66 Local 1 Context 1
2 ABC 14-10 c6-2 3.50 Local 2 Context 2
3 ABC 14-10 c6-3 6.60 Local 2 Context 2
4 ABC 14-03 c12-1 9.54 Local 2 Context 2
5 ABC 14-03 c12-2 10.57 Local 2 Context 2
6 ABC 14-01A c9-2 11.80 Local 1 Context 1
7 ABC 14-01A c9-3 2.50 Local 1 Context 1
8 DE 368 c1-1 1.10 Local 3 Context 3
9 DE 368 c1-2 1.04 Local 3 Context 3
10 DE 483 c3-2 2.22 Local 3 Context 4
11 DE 483 c3-3 3.22 Local 3 Context 4
12 FG 1 c4-1 1.50 Local 4 Context 5
13 HI 1A c4-3 1.20 Local 5 Context 6
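One caveat with mapping straight to the lists: a POINT_ID whose prefix is not in labels maps to NaN rather than to a two-element list, which can break the DataFrame construction. A possible variant, sketched below, maps each output column separately so that unmatched rows simply get NaN. The 'ZZ 9 c1-1' row is made up to show that case, and per_column is a helper name introduced for this example.

```python
import pandas as pd

df = pd.DataFrame({
    # The last row is invented for illustration: its prefix has no entry in labels.
    'POINT_ID': ['ABC B4-14 c1-1', 'DE 483 c3-2', 'ZZ 9 c1-1'],
    'Count': [1.25, 2.22, 0.5],
})
labels = {
    'ABC B4-14': ['Local 1', 'Context 1'],
    'DE 483': ['Local 3', 'Context 4'],
}

# Everything up to the last whitespace-separated token is the lookup key.
key = df['POINT_ID'].str.rsplit(n=1).str[0]

out = df.copy()
for i, col in enumerate(['Local', 'Context']):
    # Dictionary from key to the i-th list element only.
    per_column = {k: v[i] for k, v in labels.items()}
    # Series.map leaves NaN where the key is missing from the dictionary.
    out[col] = key.map(per_column)

print(out)
```

This trades the single map call for one call per output column, which is usually fine when the lists are short.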