How to count duplicated value as one per day?

Time:10-11

A cat shelter houses many cats. Each stray cat that frequently enters the shelter is given a collar containing an RF (Radio Frequency) tag. However, not every cat enters the shelter every day.

A sensor at the shelter door detects the RF tag in a collar whenever a cat passes through. Because a cat can go in and out several times, the same cat can be detected several times per day. This is the sensor data (simplified).

import pandas as pd

# One row per sensor detection: the date and the name of the detected cat
data = {'Date': ['01-09-2022', '01-09-2022', '01-09-2022', '01-09-2022', '02-09-2022', '02-09-2022', '02-09-2022', '02-09-2022', '03-09-2022', '03-09-2022', '03-09-2022', '03-09-2022', '03-09-2022'],
        'Name': ['A', 'A', 'A', 'A', 'B', 'C', 'C', 'B', 'D', 'C', 'C', 'D', 'A']}
df = pd.DataFrame(data)
df

The data looks like this:

    Date        Name
0   01-09-2022  A
1   01-09-2022  A
2   01-09-2022  A
3   01-09-2022  A
4   02-09-2022  B
5   02-09-2022  C
6   02-09-2022  C
7   02-09-2022  B
8   03-09-2022  D
9   03-09-2022  C
10  03-09-2022  C
11  03-09-2022  D
12  03-09-2022  A

The question is: over these three days, on how many days did the cat named "A" come to the shelter? (Even if cat "A" appears several times on one day, that day still counts as one.)

CodePudding user response:

You can use .groupby() to see how many times each cat was detected on each day, and then on how many different days each cat entered.

print(df.groupby(["Name", "Date"])["Date"].count())

You get an output like this:

Name  Date      
A     01-09-2022    4
      03-09-2022    1
B     02-09-2022    2
C     02-09-2022    2
      03-09-2022    2
D     03-09-2022    2
Name: Date, dtype: int64

Then save this result and count, for each cat, the number of different days on which it entered.

df_count = df.groupby(["Name", "Date"]).agg({"Date": "count"})
print(df_count.groupby("Name").count())

I think this answers your question:

      Date
Name      
A        2
B        1
C        2
D        1
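
As a more direct alternative (a sketch, not part of the original answer), `nunique()` counts the distinct dates per cat in a single step. The frame is rebuilt from the question's sample data so the snippet runs on its own; the name `days_per_cat` is only illustrative.

```python
import pandas as pd

# Sample sensor data from the question: one row per detection
data = {'Date': ['01-09-2022', '01-09-2022', '01-09-2022', '01-09-2022',
                 '02-09-2022', '02-09-2022', '02-09-2022', '02-09-2022',
                 '03-09-2022', '03-09-2022', '03-09-2022', '03-09-2022',
                 '03-09-2022'],
        'Name': ['A', 'A', 'A', 'A', 'B', 'C', 'C', 'B', 'D', 'C', 'C', 'D', 'A']}
df = pd.DataFrame(data)

# Count distinct dates per cat; repeated detections on the same day
# collapse into a single day.
days_per_cat = df.groupby("Name")["Date"].nunique()
print(days_per_cat)

# Look up a single cat by name
print(days_per_cat["A"])  # 2
```

This yields a Series indexed by cat name, with the same counts as the table above.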

CodePudding user response:

What is your desired output?

The number of rows in df[df['Name']=='A'].drop_duplicates() tells you how many days cat 'A' entered the shelter.
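
Spelled out as a minimal sketch on the question's sample data: dropping duplicate rows leaves one row per day for the selected cat, so the row count is the number of days. The name `a_days` is illustrative, and the approach assumes the frame holds only the `Date` and `Name` columns.

```python
import pandas as pd

# Sample sensor data from the question: one row per detection
data = {'Date': ['01-09-2022', '01-09-2022', '01-09-2022', '01-09-2022',
                 '02-09-2022', '02-09-2022', '02-09-2022', '02-09-2022',
                 '03-09-2022', '03-09-2022', '03-09-2022', '03-09-2022',
                 '03-09-2022'],
        'Name': ['A', 'A', 'A', 'A', 'B', 'C', 'C', 'B', 'D', 'C', 'C', 'D', 'A']}
df = pd.DataFrame(data)

# Keep only cat A's detections, then drop repeated (Date, Name) rows
a_days = df[df["Name"] == "A"].drop_duplicates()
print(a_days)

# One remaining row per day, so the length is the number of days
print(len(a_days))  # 2
```

If the real data has extra columns such as a timestamp, passing `subset=["Date", "Name"]` to `drop_duplicates()` would keep the comparison limited to those two columns.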
