I have a data frame like below. I have made the data frame shorter to put the idea across easily
`Category | Issue | WK 1 | WK 2 |
---|---|---|---|
Pending B | C | 1 | 2 |
Pending B | E | 3 | 4 |
Pending B | R | 4 | 5 |
Pending C | C | 1 | 2 |
Pending C | E | 3 | 4 |
Pending C | R | 4 | 5 |
Resolved | C | 1 | 2 |
Resolved | E | 3 | 4 |
Resolved | R | 4 | 5 |
----------: | --------------: | ----: | :---: |
Total | 24 | 33 |
` Using the formulars below:
formular for WK 1 column
C-WK 1/(Total WK 1 - (sum of pending C WK 1))
E-WK 1/(Total WK 1 - (sum of pending C WK 1))
R-WK 1/(Total WK 1 - (sum of pending C WK 1))
formular for WK 2 column
C-WK 2/(Total WK 2 - (sum of pending C WK 2))
E-WK 2/(Total WK 2 - (sum of pending C WK 2))
R-WK 2/(Total WK 2 - (sum of pending C WK 2))`
at the end i want to have a data frame like below.
Category | Issue | WK 1 | WK 2 | WK 1(R) | WK 2(R) |
---|---|---|---|---|---|
Resolved | C | 1 | 2 | 0.0625 | 0.090909 |
Resolved | E | 3 | 4 | 0.1875 | 0.181818 |
Resolved | R | 4 | 5 | 0.25 | 0.227273 |
----------: | --------------: | ----: | :---: | -------: | :------: |
CodePudding user response:
You can construct a function to calculate the columns of WK 1(R)
and WK 2(R)
and after that, use .loc
to select the rows where "Category"
is "Resolved"
.
def calculate_wk_resolution(df):
# Calculate WK 1(R) and WK 2(R)
df["WK 1(R)"] = df["WK 1"] / (df["WK 1"].sum() - df[df["Category"] == "Pending C"]["WK 1"].sum())
df["WK 2(R)"] = df["WK 2"] / (df["WK 2"].sum() - df[df["Category"] == "Pending C"]["WK 2"].sum())
calculate_wk_resolution(df)
out = df.loc[df["Category"] == "Resolved", ["Category", "Issue", "WK 1", "WK 2", "WK 1(R)", "WK 2(R)"]]
print(out)
Output:
Category Issue WK 1 WK 2 WK 1(R) WK 2(R)
6 Resolved C 1 2 0.0625 0.090909
7 Resolved E 3 4 0.1875 0.181818
8 Resolved R 4 5 0.2500 0.227273