I'm trying to come up with a program that creates multiple rows and columns for each row based on a value in a column.
Here's a look at my data
import pandas as pd
data = pd.read_excel("test data.xlsx")
Id | #ofweeks | Manhours | StartDate | EndDate | Startingyear | StartingWeek |
---|---|---|---|---|---|---|
aaa | 2 | 10 | 1/15/2023 | 1/29/2023 | 2023 | 3 |
bbb | 3 | 12 | 2/12/2023 | 3/05/2023 | 2023 | 7 |
The table needs to be expanded so that every row is expanded by the number of weeks. There needs to be columns added for the Labor hours per week and columns which count the number of weeks for each Id.
The results should look like this
Id | #ofweeks | Manhours | StartDate | EndDate | Startingyear | StartingWeek | WeekCount | Labor | Week# |
---|---|---|---|---|---|---|---|---|---|
aaa | 2 | 10 | 1/15/2023 | 1/29/2023 | 2023 | 3 | 1 | 5 | 3 |
aaa | 2 | 10 | 1/15/2023 | 1/29/2023 | 2023 | 3 | 2 | 5 | 4 |
bbb | 3 | 12 | 2/12/2023 | 3/05/2023 | 2023 | 7 | 1 | 4 | 7 |
bbb | 3 | 12 | 2/12/2023 | 3/05/2023 | 2023 | 7 | 2 | 4 | 8 |
bbb | 3 | 12 | 2/12/2023 | 3/05/2023 | 2023 | 7 | 3 | 4 | 10 |
I was able to expand the table using:
data2 = data.loc[data.index.repeat(data["#ofweeks"])].reset_index(drop=True)
data2
I also added Labor by using:
data2["Labor"] = data2["Manhours"]/data3["#ofweeks"]
data2
How can I added the WeekCount and Week# columns?
Additionally how can I account for records that go past one calendar year?
CodePudding user response:
Once you have expended the dataframe, I would add the columns such as:
df['WeekCount'] = df.groupby('Id')['Id'].cumcount() 1
To account for activities that go past a calendar year you can try:
df['Week#'] = np.where((df['StartingWeek'] df['WeekCount']-1) > 52,
(df['StartingWeek'] df['WeekCount']-53),
df['StartingWeek'] df['WeekCount']-1)
CodePudding user response:
Using the range function, create a new column called 'WeekCount' and repeat for each '#ofweeks'.
data2['WeekCount'] = list(range(1, data2['#ofweeks'].max() 1)) * len(data2['#ofweeks'].unique())
data2['WeekCount'] = data2['WeekCount'][:len(data2)]
Create a new column 'Week#' by combining the 'StartingWeek' and 'WeekCount' columns.
data2['Week#'] = data2['StartingWeek'] data2['WeekCount'] - 1