Home > Blockchain >  Expand/ungroup dataframe
Expand/ungroup dataframe

Time:11-12

I have a dataframe with regions and their yearly goals. I would like to "ungroup" this into monthly goals.

There must be some easier way than this:

import pandas as pd

df = pd.DataFrame.from_records(data=[['region1',60000],['region2',120000]], columns=['region','goal'])

#    region    goal
#0  region1   60000
#1  region2  120000


data = {'month':[1,2,3,4,5,6,7,8,9,10,11,12]}
df2 = pd.DataFrame.from_dict(data)

df3 = df.merge(df2, how='cross')
df3['goal'] = (df3['goal']/12).astype(int)

#      region   goal  month
# 0   region1   5000      1
# 1   region1   5000      2
# 2   region1   5000      3
# 3   region1   5000      4
# 4   region1   5000      5
# 5   region1   5000      6
# 6   region1   5000      7
# 7   region1   5000      8
# 8   region1   5000      9
# 9   region1   5000     10
# 10  region1   5000     11
# 11  region1   5000     12
# 12  region2  10000      1
# 13  region2  10000      2
# 14  region2  10000      3
# 15  region2  10000      4
# 16  region2  10000      5
# 17  region2  10000      6
# 18  region2  10000      7
# 19  region2  10000      8
# 20  region2  10000      9
# 21  region2  10000     10
# 22  region2  10000     11
# 23  region2  10000     12

It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.

CodePudding user response:

My opinion your solution is good, alternative with range instead list is better:

df3 = df.merge(pd.DataFrame({'month':range(1, 13)}), how='cross')
df3['goal'] = (df3['goal']/12).astype(int)

Another idea without cross join with Index.repeat for new rows and DataFrame.assign with numpy.tile for repeat months:

df = (df.loc[df.index.repeat(12)]
        .reset_index(drop=True)
        .assign(goal = lambda x: x['goal']/12,
                month = np.tile(range(1, 13), len(df.index))))
  • Related