Filtering the dataframe with conditions in Pandas and reaching the rest of the dataframe (.loc)

Time:08-17

I would like to create another dataframe by applying a condition to one column. The method I am trying works:

Y = X.loc[(X['ColumnnA'] == "22.33.44.55")
          | (X['ColumnnA'] == "12.12.32.44")
          | (X['ColumnnA'] == "45.142.22.22")
          | (X['ColumnnA'] == "55.197.55.8")
          | (X['ColumnnA'] == "44.44.211.254")
          | (X['ColumnnA'] == "33.44.234.83")
          | (X['ColumnnA'] == "33.33.221.240")
          | (X['ColumnnA'] == "33.33.33.1")
          ]

But when Y comes from .loc, I cannot use this:

restdataframe = X[~Y]
Y=X[Y]

How can I use this with .loc?

Strangely, I was using the method below last week and it worked for another dataframe with the same columns. Now it runs, but it gives me the wrong "shape". With .loc, it gives the correct answer. I want to understand what I am doing wrong with the code below. Why does it not work properly?

Y = (X['ColumnnA'] == "22.33.44.55")
| (X['ColumnnA'] == "12.12.32.44") 
| (X['ColumnnA'] == "45.142.22.22") 
| (X['ColumnnA'] == "55.197.55.8") 
| (X['ColumnnA'] == "44.44.211.254") 
| (X['ColumnnA'] == "33.44.234.83") 
| (X['ColumnnA'] == "33.33.221.240") 
| (X['ColumnnA'] == "33.33.33.1") 

Note: I run it on one line, because splitting it across lines as shown gives an invalid syntax error.

Example of X: [image not included]

Answer:

Without code to reproduce your X dataframe it is hard to say for sure, but you probably want to use:

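# X != Y raises an error when X and Y have different rows, so compare index labels instead
# (this assumes X has a unique index)
restdataframe = X[~X.index.isin(Y.index)]
Y = X[X.index.isin(Y.index)]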

When you use Y = X.loc[...], you don't create a mask; you select the rows of X that meet your conditions. Therefore you have to recreate a mask to make your selection from X. You can do this by checking which rows of X are and aren't also in Y.
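To make the difference between a mask and a selection concrete, here is a minimal sketch. The small dataframe, the second column `ColumnB`, and the variable names `mask`, `selected`, `rebuilt` and `rest` are made up for illustration, since the real X is not shown, and the index-based rebuild assumes X has a unique index.

```python
import pandas as pd

# Hypothetical stand-in for X; the real data is not shown in the question.
X = pd.DataFrame({
    "ColumnnA": ["22.33.44.55", "10.0.0.1", "33.33.33.1", "10.0.0.2"],
    "ColumnB": [1, 2, 3, 4],
})

# A mask is a boolean Series with one entry per row of X.
mask = (X["ColumnnA"] == "22.33.44.55") | (X["ColumnnA"] == "33.33.33.1")

# .loc[mask] returns the matching rows, not a mask.
selected = X.loc[mask]

print(type(mask).__name__, mask.shape)
print(type(selected).__name__, selected.shape)

# Rebuild a boolean mask from the selection's index labels
# (only reliable when the index of X has no duplicates).
rebuilt = X.index.isin(selected.index)
rest = X.loc[~rebuilt]
```

Because `selected` is a dataframe of rows rather than a boolean Series, `X[~selected]` cannot act as a row filter, which is why the mask has to be kept or rebuilt.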

Alternatively you can create Y as a mask:

Y = ((X['ColumnnA'] == "22.33.44.55")
| (X['ColumnnA'] == "12.12.32.44") 
| (X['ColumnnA'] == "45.142.22.22") 
| (X['ColumnnA'] == "55.197.55.8") 
| (X['ColumnnA'] == "44.44.211.254") 
| (X['ColumnnA'] == "33.44.234.83") 
| (X['ColumnnA'] == "33.33.221.240") 
| (X['ColumnnA'] == "33.33.33.1"))
                                                 

Then run your original code:

restdataframe = X[~Y]
Y=X[Y]
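As a possible shorter variant of the mask approach, the chain of `==` comparisons can be replaced with `Series.isin`. This is only a sketch: the sample dataframe and the names `wanted` and `mask` are assumptions for illustration, not part of the original question.

```python
import pandas as pd

# Hypothetical stand-in for X; the real data is not shown in the question.
X = pd.DataFrame({
    "ColumnnA": ["22.33.44.55", "10.0.0.1", "33.33.33.1", "10.0.0.2"],
})

# The values listed in the question.
wanted = [
    "22.33.44.55", "12.12.32.44", "45.142.22.22", "55.197.55.8",
    "44.44.211.254", "33.44.234.83", "33.33.221.240", "33.33.33.1",
]

# One boolean mask, reused for both halves of the split.
mask = X["ColumnnA"].isin(wanted)

Y = X.loc[mask]               # rows whose ColumnnA is in the list
restdataframe = X.loc[~mask]  # every other row

# The two parts together cover X exactly once.
assert len(Y) + len(restdataframe) == len(X)
```

Keeping the mask in its own variable means the same condition works with both `X[mask]` and `X.loc[mask]`, and its negation gives the rest of the dataframe.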