I want to filter all qualified boss in the dataset. The rule is, each group has one boss, if the status of the boss is "Active" then its valid, if the statue of boss is "Inactive" then we need to replace it using the second row of that group.
For example, in row 1 group A, Allen is Boss but Inactive, then the second row of group A need to replace the first row. In group B, Kim is Boss and Active, so its valid.
Group | Name | Title | Status |
---|---|---|---|
A | Allen | Boss | Inactive |
A | Lisa | n | Active |
A | Louis | n | Active |
B | Kim | Boss | Active |
B | Derek | n | Active |
B | Andrew | n | Active |
B | Katie | n | Active |
C | Lux | Boss | Inactive |
In the end, all inactive Boss row should be filtered out, end result should be like below.
Group | Name | Title | Status |
---|---|---|---|
A | Lisa | n | Active |
B | Kim | Boss | Active |
Thanks in advance!
CodePudding user response:
How's something like this?
df[df.Status.eq("Active")].drop_duplicates(subset=["Group"], ignore_index=True)
Output:
Group Name Title Status
0 A Lisa n Active
1 B Kim Boss Active
Stepping through it:
df[df.Status.eq("Active")]
grabs only the rows where "Status" is "Active"drop_duplicates(subset=["Group"]
drops all rows after the first occurrence of a new value in "Group" ... e.g returns the first row with group A, then the first row with Group B, etcignore_index=True)
ignores the above rows' index and resets the index to start back at 0. Without this the index would be[1, 3]
CodePudding user response:
Drop the Inactive
rows, and then take the first line from each group.
df[df.Status.ne('Inactive')].groupby('Group', as_index=False).first()
Output:
Group Name Title Status
0 A Lisa n Active
1 B Kim Boss Active
CodePudding user response:
solustion
import pandas as pd
data=pd.read_csv("./level.csv")
# print(data)
group_A=data[data["Group"]=="A"]
group_B=data[data["Group"]=="B"]
group_C=data[data["Group"]=="C"]
for (key,value) in group_A.iterrows():
if value["Status"]=="Active":
value["Title"]="Boss"
break
print(group_A)
output
Group Name Title Status
0 A Allen Boss Inactive
1 A Lisa Boss Active
2 A Louis n Active
CodePudding user response:
Extract rows with specific property
import pandas
with open("./group.csv") as group:
data = pandas.read_csv(group)
# get a row data having Boss title
row_boss = data[data.Title == "Boss"]
print(row_boss)
# get a row data having Active status
row_active = data[data["Status"] == "Active"]
print(row_active)
no difference each of other ▶ data["Status"]
= data.Status