Pandas create new column with specific row values from dict

Time:06-15

I have a dataframe:

ID  val
1   a
2   b
3   c
4   d
5   a
7   d
6   v
8   j
9   k
10  a

I have a dictionary as follows:

{'aa': 3, 'bb': 3, 'cc': 4}

In the dictionary, the numeric values indicate the number of records. The sum of these values equals the number of rows in the data frame. In this example, 3 + 3 + 4 = 10, and the data frame has 10 rows.

I am trying to split the data frame into groups of rows whose sizes match the counts given in the dictionary, and to write each key into a new column for the rows in its group. The desired output is as follows:

ID  val  new_col
1   a    aa
2   b    aa
3   c    aa
4   d    bb
5   a    bb
6   v    bb
7   d    cc
8   j    cc
9   k    cc
10  a    cc

The order of the fill is not important as long as the number of records per key matches the count given in the dict. I tried to solve this by iterating through the dict, but I could not isolate the required number of rows for each key-value pair.

I have also tried pd.cut, using the dict values as bins and the keys as labels. However, I get the error ValueError: bins must increase monotonically.
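
A possible reason the pd.cut attempt fails is that pd.cut expects increasing bin edges, whereas the dict values are group sizes. Below is a minimal sketch of one way it could work, assuming the groups are taken in dict order and rows are binned by position rather than by the ID column. The names edges and positions are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 7, 6, 8, 9, 10],
    "val": list("abcdadvjka"),
})
d = {"aa": 3, "bb": 3, "cc": 4}

# The counts are group sizes; their running total gives increasing edges.
# Here edges is [0, 3, 6, 10].
edges = np.concatenate(([0], np.cumsum(list(d.values()))))

# Bin the row positions 0..len(df)-1 (an assumption: position, not ID).
# right=False makes the intervals [0, 3), [3, 6), [6, 10).
positions = np.arange(len(df))
df["new_col"] = pd.cut(positions, bins=edges, right=False, labels=list(d.keys()))
```

The resulting column is categorical; calling .astype(str) on it gives plain strings if that is preferred.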

CodePudding user response:

import numpy as np
import pandas as pd
d = {'aa': 3, 'bb': 3, 'cc': 4}  # key -> number of rows that get this key
df['new_col'] = pd.Series([np.repeat(i, j) for i, j in d.items()]).explode().to_numpy()

df
Out[64]: 
   ID val new_col
0   1   a      aa
1   2   b      aa
2   3   c      aa
3   4   d      bb
4   5   a      bb
5   7   d      bb
6   6   v      cc
7   8   j      cc
8   9   k      cc
9  10   a      cc
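
The same result can probably be reached without the intermediate Series, since np.repeat accepts one repeat count per element. This is a sketch of an alternative, not the answer's own method; it assumes the dict keeps its insertion order (Python 3.7+), and the length check is an added safeguard based on the question's statement that the counts sum to the row count.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 7, 6, 8, 9, 10],
    "val": list("abcdadvjka"),
})
d = {"aa": 3, "bb": 3, "cc": 4}

# Guard (an added assumption): the counts must add up to the number of rows,
# otherwise the column assignment would fail with a length mismatch.
if sum(d.values()) != len(df):
    raise ValueError("counts in d must sum to len(df)")

# Repeat each key by its own count: 'aa' x3, 'bb' x3, 'cc' x4
df["new_col"] = np.repeat(list(d.keys()), list(d.values()))
```

Because the labels are assigned by row position, the existing row order of df is left unchanged.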