How to delete specific columns in large measurement data that do not contain certain values?

Time:03-24

I have a large measurement data set that contains 350 columns after filtering (for example A0 to A49, B0 to B49, F0 to F49), filled with some random numbers. Now I want to check whether each of the columns B0 to B49 has any values in a given range (say, between 20 and 30). If a column does not, I want to delete that column from the measurement data.

How can I do this in Python with pandas? I would also like to know if there are faster methods for this kind of filtering.
Sample data: https://docs.google.com/spreadsheets/d/17Xjc81jkjS-64B4FGZ06SzYDRnc6J27m/edit?usp=sharing&ouid=106137353367530025738&rtpof=true&sd=true

CodePudding user response:

In pandas, you can apply a function to every element of a DataFrame with applymap (since pandas 2.1 this method is deprecated in favor of DataFrame.map, which does the same thing). You can also apply an aggregation, such as any, to reduce each column to a single value. Put those two things together and you have what you want.

For instance, you want to know whether each column in a given set (the "B" ones) has any value in some range (say, between 20 and 30). So you check the values at the element level, but collect the column names as output.

You can do that with the following code. Run the lines one at a time to see what each step does.

>>> b_cols_of_interest_indx = df.filter(regex='^B').applymap(lambda x:20<x<30).any()
>>> b_cols_of_interest_indx[b_cols_of_interest_indx]
B19    True
B21    True
dtype: bool
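
The lines above only show which B columns have a value in the range; they do not delete anything yet. Here is a minimal sketch of the remaining step, dropping the B columns that have no value in the range. It uses vectorized comparisons on the whole sub-frame instead of a per-element lambda, which is usually faster on large data. The helper name drop_columns_without_values_in_range, the regex r"^B\d+$" and the tiny sample frame are my own assumptions for illustration, not your real data.

```python
import pandas as pd


def drop_columns_without_values_in_range(df, pattern, low, high):
    """Hypothetical helper: among the columns whose names match `pattern`,
    drop those with no value strictly between `low` and `high`."""
    subset = df.filter(regex=pattern)
    # Element-wise boolean frame, computed without a Python-level loop.
    in_range = (subset > low) & (subset < high)
    # One boolean per column: True if at least one value is in range.
    has_hit = in_range.any()
    cols_to_drop = has_hit.index[~has_hit]
    return df.drop(columns=cols_to_drop), list(cols_to_drop)


# Made-up sample data (an assumption, standing in for the measurement file).
df = pd.DataFrame({
    "A0": [1.0, 25.0, 3.0],    # not a B column, never touched
    "B0": [5.0, 22.5, 40.0],   # 22.5 is in range, so B0 is kept
    "B1": [10.0, 35.0, 50.0],  # nothing in range, so B1 is dropped
    "B2": [20.0, 30.0, 0.0],   # only the boundary values, dropped (strict bounds)
    "F0": [7.0, 8.0, 9.0],     # not a B column, never touched
})

result, dropped = drop_columns_without_values_in_range(df, r"^B\d+$", 20, 30)
print(dropped)
print(list(result.columns))
```

The bounds are strict here, matching 20<x<30 in the code above; if 20 and 30 themselves should count, use >= and <= instead. Comparisons with NaN are False, so a B column that contains only NaN would also be dropped.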