I would like do a one to many mapping with the following list and mapping dictonary:
l1 = ['a', 'b', 'c']
l2 = ['a', 'c', 'd']
l3 = ['d', 'e', 'f']
mapping_dict = ChainMap(
dict.fromkeys(l1, 'A'),
dict.fromkeys(l2, 'B'),
dict.fromkeys(l3, 'C'))
This is my dataframe:
df = pd.DataFrame({'code': ['a', 'b', 'c', 'd', 'e', 'f'], 'value': [1, 2, 3, 4, 5, 6]})
print(df)
code value
0 a 1
1 b 2
2 c 3
3 d 4
4 e 5
5 f 6
When I do the mapping as follows:
df['mapping'] = df['code'].map(mapping_dict.get)
code value mapping
0 a 1 A
1 b 2 A
2 c 3 A
3 d 4 B
4 e 5 C
5 f 6 C
The problem is that I want to do a one to many mapping and I don't capture the relationship. The desired outcome would be something like this. Which creates a new line when there are multiple relationships.
code value mapping
0 a 1 A
1 a 1 B
2 b 2 A
3 c 3 A
4 c 3 B
5 d 4 B
6 d 4 C
7 e 5 C
8 e 5 C
9 f 6 C
Thank you for your support.
CodePudding user response:
Here ChainMap
can't be used since it will not preserve all the duplicate keys. The solution is to create an intermediate dataframe from each pairs of (mapping, code) and then left merge
that with the original dataframe
pairs = [('A', l1), ('B', l2), ('C', l3)]
mapping = pd.DataFrame(pairs, columns=['mapping', 'code'])
df.merge(mapping.explode('code'), how='left')
Result
code value mapping
0 a 1 A
1 a 1 B
2 b 2 A
3 c 3 A
4 c 3 B
5 d 4 B
6 d 4 C
7 e 5 C
8 f 6 C