My dataframe has a column of lists but let's use a series (v) for simplicity. Each list includes the values of the dictionaries that I want to create.
v = pd.Series(data=([10, 20], [10,20,30], [50]), index=['a','b','c'])
All the dictionaries need to have the same keys (k).
k = [2022, 2023, 2024]
How do I create a column of dictionaries? The result would look like this series (d):
d = pd.Series(data=({2022: 10, 2023: 20, 2024: np.nan},
                    {2022: 10, 2023: 20, 2024: 30},
                    {2022: 50, 2023: np.nan, 2024: np.nan}), index=['a','b','c'])
CodePudding user response:
from itertools import zip_longest
d = pd.Series(dict(zip_longest(k, x, fillvalue=np.nan)) for x in v)
output:
0 {2022: 10, 2023: 20, 2024: nan}
1 {2022: 10, 2023: 20, 2024: 30}
2 {2022: 50, 2023: nan, 2024: nan}
dtype: object
Edit: to keep the original index:
d = pd.Series([dict(zip_longest(k, x, fillvalue=np.nan)) for x in v], index=v.index)
Within a DataFrame:
df['new_col'] = [dict(zip_longest(k, x, fillvalue=np.nan)) for x in df['col']]
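For anyone who wants to run the DataFrame variant end to end, here is a minimal sketch using the data from the question. The column names `col` and `new_col` are only placeholders, and the index labels are the ones from the question.

```python
from itertools import zip_longest

import numpy as np
import pandas as pd

# Data from the question; "col" is a placeholder column name.
k = [2022, 2023, 2024]
df = pd.DataFrame({"col": [[10, 20], [10, 20, 30], [50]]}, index=["a", "b", "c"])

# Pair each key with the value at the same position; missing values become NaN.
df["new_col"] = [dict(zip_longest(k, x, fillvalue=np.nan)) for x in df["col"]]

# Look up the dictionary built for row "a".
first = df.loc["a", "new_col"]
```

Because the list comprehension walks `df["col"]` in row order, the new column lines up with the existing index without any extra work.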
CodePudding user response:
Use:
d = pd.Series({key: {k1: value[i] if i < len(value) else np.nan
                     for i, k1 in enumerate(k)}
               for key, value in v.items()})
#0 {2022: 10, 2023: 20, 2024: nan}
#1 {2022: 10, 2023: 20, 2024: 30}
#2 {2022: 50, 2023: nan, 2024: nan}
#dtype: object
Or with Series.apply:
v.apply(lambda l_values: {k1: l_values[i] if i < len(l_values) else np.nan
                          for i, k1 in enumerate(k)})
CodePudding user response:
Using itertools.zip_longest with Series.apply:
from itertools import zip_longest
res = v.apply(lambda vals: dict(zip_longest(k, vals, fillvalue=np.nan)))
Or without using itertools:
def make_dict(vals, keys):
    # Start with every key mapped to NaN, then overwrite the ones that have a value.
    d = dict.fromkeys(keys, np.nan)
    for key, val in zip(keys, vals):
        d[key] = val
    return d
res = v.apply(make_dict, keys=k)
Output:
>>> res
0 {2022: 10, 2023: 20, 2024: nan}
1 {2022: 10, 2023: 20, 2024: 30}
2 {2022: 50, 2023: nan, 2024: nan}
dtype: object
Setup:
import pandas as pd
import numpy as np
v = pd.Series(([10, 20], [10,20,30], [50]))
k = [2022, 2023, 2024]
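One caveat on the two variants above, in case a list can ever hold more values than there are keys (the sample data in the question never does, so this is an assumption about other inputs): `zip_longest` then pads the keys side with the fill value, so the surplus value ends up under a NaN key, whereas the plain `zip` in `make_dict` simply drops it. A small sketch, where `too_long` is a made-up row:

```python
from itertools import islice, zip_longest

import numpy as np

k = [2022, 2023, 2024]
too_long = [1, 2, 3, 4]  # hypothetical row with more values than keys

# zip_longest pads the shorter side, so the 4th value is stored under a NaN key.
padded = dict(zip_longest(k, too_long, fillvalue=np.nan))

# zip stops at the shorter input, so the surplus value is dropped.
truncated = dict(zip(k, too_long))

# Defensive variant: cut the values at len(k) before padding.
safe = dict(zip_longest(k, islice(too_long, len(k)), fillvalue=np.nan))
```

If the lists are guaranteed to be no longer than `k`, none of this matters and the shorter `zip_longest` one-liner is fine.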
CodePudding user response:
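Please try this:
import numpy as np
def create_dict(ls):
    # Uses the list of keys k defined in the question.
    dict_ls = {}
    for i, value in enumerate(k):
        try:
            dict_ls.update({value: ls[i]})
        except IndexError:
            # The list is shorter than k, so fill the missing value with NaN.
            dict_ls.update({value: np.nan})
    return dict_ls
d = v.apply(create_dict)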
CodePudding user response:
Here is an indirect way of doing it, probably not the best one:
import pandas as pd
import numpy as np
v = pd.Series(([10, 20], [10,20,30], [50]))
k = [2022, 2023, 2024]
First I pad each list with NaN up to the length of k and build a list of (key, value) tuples for each row:
c = [[(k_, v__) for k_, v__ in zip(k, v_ + [np.nan]*(len(k)-len(v_)))] for v_ in v]
Then I convert each list of tuples to a dictionary and pass the result to pd.Series:
def convert_to_col(col):
    # Turn a list of (key, value) tuples into a dictionary.
    d = {}
    for itm in col:
        d[itm[0]] = itm[1]
    return d
d = pd.Series([convert_to_col(c_) for c_ in c])
This gives the result you asked for, if I am not mistaken:
0 {2022: 10, 2023: 20, 2024: nan}
1 {2022: 10, 2023: 20, 2024: 30}
2 {2022: 50, 2023: nan, 2024: nan}
dtype: object
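The same padding idea can probably be squeezed into one step, since `dict` accepts an iterable of key-value pairs directly. This is only a sketch of that shortcut, not a different method; it assumes no list is longer than `k`, as in the sample data.

```python
import numpy as np
import pandas as pd

v = pd.Series(([10, 20], [10, 20, 30], [50]))
k = [2022, 2023, 2024]

# Pad each list with NaN up to len(k), then zip keys and values straight into a dict.
d = pd.Series(
    [dict(zip(k, list(v_) + [np.nan] * (len(k) - len(v_)))) for v_ in v],
    index=v.index,
)
```

Passing `index=v.index` keeps whatever index the original series had.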