I have a dataframe with regions and their yearly goals. I would like to "ungroup" this into monthly goals.
There must be some easier way than this:
import pandas as pd
df = pd.DataFrame.from_records(data=[['region1',60000],['region2',120000]], columns=['region','goal'])
# region goal
#0 region1 60000
#1 region2 120000
data = {'month':[1,2,3,4,5,6,7,8,9,10,11,12]}
df2 = pd.DataFrame.from_dict(data)
df3 = df.merge(df2, how='cross')
df3['goal'] = (df3['goal']/12).astype(int)
# region goal month
# 0 region1 5000 1
# 1 region1 5000 2
# 2 region1 5000 3
# 3 region1 5000 4
# 4 region1 5000 5
# 5 region1 5000 6
# 6 region1 5000 7
# 7 region1 5000 8
# 8 region1 5000 9
# 9 region1 5000 10
# 10 region1 5000 11
# 11 region1 5000 12
# 12 region2 10000 1
# 13 region2 10000 2
# 14 region2 10000 3
# 15 region2 10000 4
# 16 region2 10000 5
# 17 region2 10000 6
# 18 region2 10000 7
# 19 region2 10000 8
# 20 region2 10000 9
# 21 region2 10000 10
# 22 region2 10000 11
# 23 region2 10000 12
It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.
CodePudding user response:
My opinion your solution is good, alternative with range
instead list is better:
df3 = df.merge(pd.DataFrame({'month':range(1, 13)}), how='cross')
df3['goal'] = (df3['goal']/12).astype(int)
Another idea without cross join with Index.repeat
for new rows and DataFrame.assign
with numpy.tile
for repeat months:
df = (df.loc[df.index.repeat(12)]
.reset_index(drop=True)
.assign(goal = lambda x: x['goal']/12,
month = np.tile(range(1, 13), len(df.index))))