How to fill a df column with range of values of 2 columns from another df?-CodePudding

Df1

A   B  C1  C2   D   E
a1  b1  2   4   d1  e1
a2  b2  1   2   d2  e2

Df2

A   B   C   D   E
a1  b1  2   d1  e1
a1  b1  3   d1  e1
a1  b1  4   d1  e1
a2  b2  1   d2  e2
a2  b2  2   d2  e2

How to make Df2 from Df1 in the fastest possible way? I tried using groupby and then within for loop used np.arange to fill Df2.C and then used pd.concat to make the final Df2. But this approach is very slow and doesn't seem very elegant and pythonic as well. Can somebody please help with this problem.

CodePudding user response：

One way is to melt df1, use groupby.apply to add ranges; then explode for the final output:

cols = ['A','B','D','E']
out = (df1.melt(cols, value_name='C').groupby(cols)['C']
       .apply(lambda x: range(x.min(), x.max() 1))
       .explode().reset_index(name='C'))

Output:

    A   B   D   E  C
0  a1  b1  d1  e1  2
1  a1  b1  d1  e1  3
2  a1  b1  d1  e1  4
3  a2  b2  d2  e2  1
4  a2  b2  d2  e2  2

CodePudding user response：

Try this:

df1.assign(C = [np.arange(s, e 1) for s, e in zip(df1['C1'], df1['C2'])])\
   .explode('C')

Output:

    A   B  C1  C2   D   E  C
0  a1  b1   2   4  d1  e1  2
0  a1  b1   2   4  d1  e1  3
0  a1  b1   2   4  d1  e1  4
1  a2  b2   1   2  d2  e2  1
1  a2  b2   1   2  d2  e2  2