Let's say I have the following pandas DataFrame:
import pandas as pd
d = [0.0, 1.0, 2.0]
e = pd.Series(d, index = ['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})
Now I have another list:
select = ['c', 'a', 'x']
Clearly, the label 'x' is not present in the index of my original df. How can I select rows of df based on select, keeping only the labels that exist, without raising an error? In this case, I want to select only the rows corresponding to 'c' and 'a', maintaining this order.
Any pointer will be very helpful.
CodePudding user response:
You could use reindex followed by dropna:
out = df.reindex(select).dropna()
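To see why this works, it may help to look at the two steps separately. The sketch below rebuilds the df from the question; step1, df2 and out2 are names introduced here for illustration only. It also shows one caveat worth knowing: dropna removes every row that contains a missing value, so a row that already had a NaN in the original data would be dropped as well.

```python
import numpy as np
import pandas as pd

e = pd.Series([0.0, 1.0, 2.0], index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})
select = ['c', 'a', 'x']

# Step 1: reindex returns one row per label in `select`.
# 'x' is not in df.index, so its row is filled with NaN/NaT.
step1 = df.reindex(select)

# Step 2: dropna removes rows containing any missing value, which discards 'x'.
out = step1.dropna()
print(out)

# Caveat (illustrative): if 'a' already had a missing value,
# dropna would remove it too, even though 'a' exists in df.
df2 = df.copy()
df2.loc['a', 'B'] = np.nan
out2 = df2.reindex(select).dropna()
print(out2)
```

If the data can legitimately contain missing values, filtering the labels first is probably the safer choice.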
You could also filter select before calling reindex, which keeps rows that exist in df even if they contain missing values:
out = df.reindex([i for i in select if i in df.index])
Output:
     A    B          C
c  1.0  2.0 2013-01-02
a  1.0  0.0 2013-01-02
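As a further variation on the filtering idea (not part of the answer above, just a sketch), the labels can be filtered with Index.isin and then passed to loc. The names wanted and present are introduced here for illustration. Because the boolean mask is applied to select itself, the order of select is preserved.

```python
import pandas as pd

e = pd.Series([0.0, 1.0, 2.0], index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})
select = ['c', 'a', 'x']

# Keep only the labels that exist in df.index, in the order given by `select`.
wanted = pd.Index(select)
present = wanted[wanted.isin(df.index)]

# Every label in `present` exists, so loc does not raise a KeyError.
out = df.loc[present]
print(out)
```

Unlike the dropna approach, this never drops a row just because it happens to contain a missing value.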