Create a column of dictionaries from a column of lists that tells the values and a list that tells t

Time:03-24

My dataframe has a column of lists but let's use a series (v) for simplicity. Each list includes the values of the dictionaries that I want to create.

v = pd.Series(data=([10, 20], [10,20,30], [50]), index=['a','b','c'])

All the dictionaries need to have the same keys (k).

k = [2022, 2023, 2024]

How do I create a column of dictionaries? The result would look like this series (d):

d = pd.Series(data=({2022: 10, 2023: 20, 2024: np.nan},
                    {2022: 10, 2023: 20, 2024: 30},
                    {2022: 50, 2023: np.nan, 2024: np.nan}), index=['a', 'b', 'c'])

CodePudding user response:

Use itertools.zip_longest:

from itertools import zip_longest

d = pd.Series(dict(zip_longest(k, x, fillvalue=np.nan)) for x in v)

output:

0     {2022: 10, 2023: 20, 2024: nan}
1      {2022: 10, 2023: 20, 2024: 30}
2    {2022: 50, 2023: nan, 2024: nan}
dtype: object

Edit:

keeping the index:

d = pd.Series([dict(zip_longest(k,x, fillvalue=np.nan)) for x in v], index=v.index)

within a DataFrame:

df['new_col'] = [dict(zip_longest(k, x, fillvalue=np.nan)) for x in df['col']]
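One caveat worth checking before relying on this: zip_longest pads whichever side is shorter, so a list with more values than there are keys would end up with a NaN key. Below is a minimal sketch that truncates the values first with itertools.islice. The frame contents, the extra row 'e' and the helper name to_dict are made up for illustration; 'col' and 'new_col' are the names used above.

```python
from itertools import islice, zip_longest

import numpy as np
import pandas as pd

k = [2022, 2023, 2024]

# Hypothetical frame: row 'e' has more values than there are keys.
df = pd.DataFrame(
    {'col': [[10, 20], [10, 20, 30], [50], [1, 2, 3, 4]]},
    index=['a', 'b', 'c', 'e'],
)

def to_dict(values, keys=k):
    # islice drops values beyond len(keys), so zip_longest only ever
    # pads the values side with NaN and never produces a NaN key.
    return dict(zip_longest(keys, islice(values, len(keys)), fillvalue=np.nan))

# Assigning a plain list keeps the existing index of df.
df['new_col'] = [to_dict(x) for x in df['col']]
```

Whether truncating is the right behaviour depends on your data; raising an error for over-long lists may be preferable.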

CodePudding user response:

Use:

d = \
pd.Series({key: {k1 : value[i] if i < len(value) else np.nan 
                 for i, k1 in enumerate(k)} 
           for key, value in v.items()})

#0     {2022: 10, 2023: 20, 2024: nan}
#1      {2022: 10, 2023: 20, 2024: 30}
#2    {2022: 50, 2023: nan, 2024: nan}
#dtype: object

Or with Series.apply

v.apply(lambda l_values: {k1 : l_values[i] if i < len(l_values) else np.nan 
                          for i, k1 in enumerate(k)})

CodePudding user response:

Using itertools.zip_longest with Series.apply:

from itertools import zip_longest 

res = v.apply(lambda vals: dict(zip_longest(k, vals, fillvalue=np.nan)))

Or without using itertools

def make_dict(vals, keys):
    d = dict.fromkeys(keys, np.nan)
    for key, val in zip(keys, vals):
        d[key] = val
    return d

res = v.apply(make_dict, keys=k)

Output:

>>> res

0     {2022: 10, 2023: 20, 2024: nan}
1      {2022: 10, 2023: 20, 2024: 30}
2    {2022: 50, 2023: nan, 2024: nan}
dtype: object

Setup:

import pandas as pd 
import numpy as np

v = pd.Series(([10, 20], [10,20,30], [50]))
k = [2022, 2023, 2024]

CodePudding user response:

Please try this:

import numpy as np

def create_dict(ls):
    # Map each key in k to the value at the same position,
    # or to NaN when the list is too short.
    dict_ls = {}
    for i, key in enumerate(k):
        try:
            dict_ls[key] = ls[i]
        except IndexError:
            dict_ls[key] = np.nan
    return dict_ls

d = v.apply(create_dict)

CodePudding user response:

Here is an indirect way of doing it, not the best I suppose

import pandas as pd
import numpy as np

v = pd.Series(([10, 20], [10,20,30], [50]))
k = [2022, 2023, 2024]

First I create, for each row, a list of (key, value) tuples, padding short lists with NaN:

c = [[(k_, v__) for k_, v__ in zip(k, v_ + [np.nan]*(len(k)-len(v_)))] for v_ in v]

Then I convert it to a list of dictionaries and pass it to pd.Series:

def convert_to_col(col):
    d = {}
    for itm in col:
        d[itm[0]] = itm[1]
    return d

d = pd.Series([convert_to_col(c_) for c_ in c])

and you want this if I am not wrong

0     {2022: 10, 2023: 20, 2024: nan}
1      {2022: 10, 2023: 20, 2024: 30}
2    {2022: 50, 2023: nan, 2024: nan}
dtype: object
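Since dict() accepts an iterable of key-value pairs directly, the padding approach can probably be collapsed into one step. This is only a sketch: the helper name pad is made up here, and passing index=v.index is an addition to keep the 'a', 'b', 'c' labels from the question.

```python
import numpy as np
import pandas as pd

v = pd.Series([[10, 20], [10, 20, 30], [50]], index=['a', 'b', 'c'])
k = [2022, 2023, 2024]

def pad(values, length):
    # Append NaN until the list has `length` items; a negative
    # difference multiplies to an empty list, so nothing is added.
    return list(values) + [np.nan] * (length - len(values))

# zip stops at the shorter input, so any extra values are ignored.
d = pd.Series([dict(zip(k, pad(v_, len(k)))) for v_ in v], index=v.index)
```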