Home > OS >  How to create miultiple rows from a single row?
How to create miultiple rows from a single row?

Time:02-02

I'm trying to come up with a program that creates multiple rows and columns for each row based on a value in a column.

Here's a look at my data

import pandas as pd

data = pd.read_excel("test data.xlsx")
Id #ofweeks Manhours StartDate EndDate Startingyear StartingWeek
aaa 2 10 1/15/2023 1/29/2023 2023 3
bbb 3 12 2/12/2023 3/05/2023 2023 7

The table needs to be expanded so that every row is expanded by the number of weeks. There needs to be columns added for the Labor hours per week and columns which count the number of weeks for each Id.

The results should look like this

Id #ofweeks Manhours StartDate EndDate Startingyear StartingWeek WeekCount Labor Week#
aaa 2 10 1/15/2023 1/29/2023 2023 3 1 5 3
aaa 2 10 1/15/2023 1/29/2023 2023 3 2 5 4
bbb 3 12 2/12/2023 3/05/2023 2023 7 1 4 7
bbb 3 12 2/12/2023 3/05/2023 2023 7 2 4 8
bbb 3 12 2/12/2023 3/05/2023 2023 7 3 4 10

I was able to expand the table using:

data2 = data.loc[data.index.repeat(data["#ofweeks"])].reset_index(drop=True)
data2

I also added Labor by using:

data2["Labor"] = data2["Manhours"]/data3["#ofweeks"]
data2

How can I added the WeekCount and Week# columns?

Additionally how can I account for records that go past one calendar year?

CodePudding user response:

Once you have expended the dataframe, I would add the columns such as:

df['WeekCount'] = df.groupby('Id')['Id'].cumcount()   1

To account for activities that go past a calendar year you can try:

df['Week#'] = np.where((df['StartingWeek']   df['WeekCount']-1) > 52,
                       (df['StartingWeek']   df['WeekCount']-53),
                        df['StartingWeek']   df['WeekCount']-1)

CodePudding user response:

Using the range function, create a new column called 'WeekCount' and repeat for each '#ofweeks'.

data2['WeekCount'] = list(range(1, data2['#ofweeks'].max()   1)) * len(data2['#ofweeks'].unique())
data2['WeekCount'] = data2['WeekCount'][:len(data2)]

Create a new column 'Week#' by combining the 'StartingWeek' and 'WeekCount' columns.

data2['Week#'] = data2['StartingWeek']   data2['WeekCount'] - 1
  • Related