I have a df:
           pid  ts
sid vid
1   A    page1  t1
    A    page2  t2
    A    page3  t3
2   B    page1  t4
3   C    page1  t5
I want to drop all rows of every sid whose group size is equal to some number, let's say 1.
Pseudo-code:
for every sid in df:
    if sid.size() == 1:
        remove sid from df
The result would look like:
           pid  ts
sid vid
1   A    page1  t1
    A    page2  t2
    A    page3  t3
CodePudding user response:
You could group by the first index level and keep only the groups with a length greater than 1:
df.groupby(level=0).filter(lambda g: len(g)>1)
output:
           pid  ts
sid vid
1   A    page1  t1
    A    page2  t2
    A    page3  t3
NB. you could also use the level name: df.groupby(level='sid').filter(lambda g: len(g)>1)
used input:
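import pandas as pd
# a dict keyed by (sid, vid) cannot hold the repeated (1, 'A') rows,
# so the index is built explicitly
df = pd.DataFrame(
    {'pid': ['page1', 'page2', 'page3', 'page1', 'page1'],
     'ts': ['t1', 't2', 't3', 't4', 't5']},
    index=pd.MultiIndex.from_tuples(
        [(1, 'A'), (1, 'A'), (1, 'A'), (2, 'B'), (3, 'C')], names=['sid', 'vid']),
)
#            pid  ts
# sid vid
# 1   A    page1  t1
#     A    page2  t2
#     A    page3  t3
# 2   B    page1  t4
# 3   C    page1  t5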
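If the goal is to drop the groups whose size is exactly some number n (as in the question) rather than to keep everything larger than 1, one possible variant is a boolean mask built from the per-row group size. This is only a sketch of a different technique than the filter call above: the helper name drop_groups_of_size and its parameters are made up for illustration, and it assumes the index has a level named 'sid'.

```python
import pandas as pd

def drop_groups_of_size(df, n, level="sid"):
    """Hypothetical helper: drop every group (by index level) with exactly n rows."""
    # transform("size") returns, for each row, the size of the group it belongs to
    sizes = df.groupby(level=level)[df.columns[0]].transform("size")
    # use a plain boolean array so the duplicated index labels are not re-aligned
    return df[(sizes != n).to_numpy()]

# same data as in the question
df = pd.DataFrame(
    {"pid": ["page1", "page2", "page3", "page1", "page1"],
     "ts": ["t1", "t2", "t3", "t4", "t5"]},
    index=pd.MultiIndex.from_tuples(
        [(1, "A"), (1, "A"), (1, "A"), (2, "B"), (3, "C")],
        names=["sid", "vid"],
    ),
)

result = drop_groups_of_size(df, 1)
print(result)
```

On larger frames a mask like this may be faster than filter with a Python lambda, since the group sizes are computed without calling a function per group, but that depends on the data and is worth measuring.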