I have a dataframe that is structured like this:
ID | CustRef |
---|---|
0 | 111 |
1 | 222 |
2 | 333, 444, 555, 666 |
It is simple enough to convert it to a dictionary using to_dict,
but where there are multiple CustRefs I would like those values to be converted to a list.
So, in this example the dict would be:
result_dict = {'0': 111, '1': 222, '2': [333, 444, 555, 666]}
Is that possible?
Answer:
You can split the strings on ', '
and then rework the result depending on the number of items, keeping a list only where there is more than one:
s = df['CustRef'].str.split(', ')
s.loc[s.str.len().le(1)] = s.str[0]
out = dict(zip(df['ID'], s))
Or, using pure Python (3.8+, because of the := operator):
out = {k: next(iter(l), None) if len(l := v.split(', ')) < 2 else l
       for k, v in zip(df['ID'], df['CustRef'])}
Output (the values stay strings):
{0: '111', 1: '222', 2: ['333', '444', '555', '666']}
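The question asks for integers rather than strings. Here is a minimal sketch of one way to get them, assuming every CustRef value is a string of digits separated by ', '. The helper name to_int_or_list is made up for illustration, and the two plain lists stand in for df['ID'] and df['CustRef']:

```python
# Stand-ins for df['ID'] and df['CustRef'] (assumed: CustRef holds strings)
ids = [0, 1, 2]
cust_refs = ['111', '222', '333, 444, 555, 666']

def to_int_or_list(value):
    """Hypothetical helper: an int for one reference, a list of ints for several."""
    parts = [int(p) for p in value.split(', ')]
    return parts[0] if len(parts) == 1 else parts

out = {k: to_int_or_list(v) for k, v in zip(ids, cust_refs)}
```

With a real dataframe, the same comprehension should work with zip(df['ID'], df['CustRef']) in place of the two lists.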
If you really need to use to_dict
(e.g., so that the code can be adapted to handle more columns):
s = df['CustRef'].str.split(', ')
out = (df
       .assign(CustRef=df['CustRef'].where(s.str.len().le(1), s))
       .set_index('ID')['CustRef']
       .to_dict()
      )
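For the "more columns" case, one possible adaptation is to drop the ['CustRef'] selection and pass orient='index' to to_dict, so that each ID maps to a dict of its remaining columns. This is only a sketch: the Region column and the sample data are hypothetical, added purely for illustration.

```python
import pandas as pd

# Sample frame mirroring the question, plus a hypothetical extra column
df = pd.DataFrame({
    'ID': [0, 1, 2],
    'CustRef': ['111', '222', '333, 444, 555, 666'],
    'Region': ['north', 'south', 'east'],  # hypothetical, not in the question
})

s = df['CustRef'].str.split(', ')
out = (df
       # keep the original string for single references, the split list otherwise
       .assign(CustRef=df['CustRef'].where(s.str.len().le(1), s))
       .set_index('ID')
       # one inner dict per ID, keyed by column name
       .to_dict('index')
      )
```

Here out[2]['CustRef'] would be the list of references for ID 2, while out[0]['CustRef'] stays a single string.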