In the following dataframe, "day" is a string column holding a 7-character binary code that specifies whether or not an event occurs on a particular day. The first character indicates whether the event occurs on Monday, and the final character indicates whether it occurs on Sunday.
For example:
  event      day
0     A  1000010
1     B  1010100
2     C  0100010
3     D  0000011
Event A occurs on Monday and Saturday, event B occurs on Monday, Wednesday and Friday, and event D occurs on Saturday and Sunday.
Question: How can I filter a dataframe using a specific character of the "day" column? For example, if I want to show all rows for events on Saturday, something like day[5]=="1"
should output rows 0, 2 and 3 (containing events "A", "C" and "D").
I've tried various combinations such as df.loc[(df['day'][5]=="1")]
based on other examples but they don't work for filtering by a single character of a string.
(I know it's unconventional but the system has served me well using Bash scripts with Awk; just trying to develop it further in Python with Pandas).
CodePudding user response:
Since the values are strings, you can use the .str accessor to slice out a single character and compare it to '1':
day = 0
df[df['day'].str[day].eq('1')] # if Monday = 0
# or
day = 1
df[df['day'].str[day-1].eq('1')] # if Monday = 1
output:
  event      day
0     A  1000010
1     B  1010100
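For the Saturday case from the question, the same idea can be sketched as a self-contained script. The dataframe is rebuilt from the sample data in the question, and the variable name saturday is only illustrative:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "event": ["A", "B", "C", "D"],
    "day": ["1000010", "1010100", "0100010", "0000011"],
})

# .str[5] takes the character at zero-based position 5 of every string
# (Monday = 0, so position 5 is Saturday); .eq("1") builds the boolean mask.
saturday = df[df["day"].str[5].eq("1")]
print(saturday)
```

The attempt in the question, df['day'][5], looks up the row labelled 5 in the column rather than the sixth character of each string, which is why it does not work as a filter.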
CodePudding user response:
You can expand the string into a dataframe with one column per weekday, then filter on the column you need:
s = df.day.apply(lambda x : pd.Series(list(x)))
df[s[0]=='1']
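A possible variation on this, assuming you would rather address the days by name than by position: build a boolean table with labelled columns and combine the columns as needed. The labels in names and the variables flags and weekend are my own choices, not something pandas provides:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "event": ["A", "B", "C", "D"],
    "day": ["1000010", "1010100", "0100010", "0000011"],
})

# Hypothetical column labels, Monday first to match the code layout.
names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Split every code into its 7 characters and turn them into True/False flags.
flags = pd.DataFrame(
    [list(code) for code in df["day"]],
    index=df.index,
    columns=names,
).eq("1")

# Boolean columns combine with | and &, e.g. events on Saturday or Sunday.
weekend = df[flags["Sat"] | flags["Sun"]]
print(weekend)
```

Keeping the flags in a separate table leaves the original dataframe unchanged, at the cost of holding a second object in memory.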
CodePudding user response:
You can use this to check for a "1" at index 1 (the second character, i.e. Tuesday):
index = 1
df.loc[df['day'].str[index] == "1"]
output:
  event      day
2     C  0100010
CodePudding user response:
You could create a column for each day:
import pandas as pd
data = {'event': ['A','B','C','D'], 'day': ['1000010','1010100','0100010','0000011']}
df = pd.DataFrame(data)
df['Mon'] = df['day'].astype(str).str[0]
df['Tue'] = df['day'].astype(str).str[1]
df['Wed'] = df['day'].astype(str).str[2]
df['Thu'] = df['day'].astype(str).str[3]
df['Fri'] = df['day'].astype(str).str[4]
df['Sat'] = df['day'].astype(str).str[5]
df['Sun'] = df['day'].astype(str).str[6]
print(df)
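Once the per-day columns exist, filtering is an ordinary column comparison. A minimal sketch, assuming the same column names as above and building them in a loop instead of seven separate assignments (sat_events is just an illustrative name):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "event": ["A", "B", "C", "D"],
    "day": ["1000010", "1010100", "0100010", "0000011"],
})

# One column per weekday; position 0 is Monday, position 6 is Sunday.
for position, name in enumerate(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]):
    df[name] = df["day"].str[position]

# The new columns hold the strings "0" and "1", so compare against "1".
sat_events = df.loc[df["Sat"] == "1", "event"].tolist()
print(sat_events)
```

Note that the columns contain strings, not integers, so comparing against the number 1 would match nothing.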