Find and remove columns from an array based on another array in Python

Time:08-26

I have two .CSV files:

One is a dataset with over 1500 features and 300 samples,

the second is an RFECV ranking of the features. [Image: example of the two files]

I'm trying to remove from the dataset every feature column that does not have a ranking of 1.

So we should end up with only the columns whose features are ranked 1.

What would be the proper way of doing something like that in Python?

I was thinking of transposing the second array, finding the indexes of the ones, and moving the columns at those indexes from the dataset to another array.

CodePudding user response:

Try:

rank_1 = df2[df2.Ranking == 1].Features
new_df = df1[rank_1]
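
To see the idea end to end, here is a minimal sketch on made-up data. It assumes, as the snippet above does, that the ranking file has a "Features" column holding the dataset's column names and a "Ranking" column holding the RFECV ranks. The feature names f1..f4 and all values are hypothetical stand-ins, since the real files were only shown in an image.

```python
import pandas as pd

# Hypothetical stand-in for the dataset: 3 samples, 4 features.
df1 = pd.DataFrame({
    "f1": [0.1, 0.2, 0.3],
    "f2": [1.0, 2.0, 3.0],
    "f3": [5, 6, 7],
    "f4": [9, 8, 7],
})

# Hypothetical stand-in for the RFECV ranking file
# (column names "Features" and "Ranking" are assumed).
df2 = pd.DataFrame({
    "Features": ["f1", "f2", "f3", "f4"],
    "Ranking": [1, 3, 1, 2],
})

# Names of the features whose ranking is exactly 1.
rank_1 = df2.loc[df2["Ranking"] == 1, "Features"]

# Select those columns from the dataset; a plain list of labels
# makes it explicit that this is a column selection.
new_df = df1[rank_1.tolist()]

print(list(new_df.columns))  # ['f1', 'f3']
```

No transposing or index bookkeeping is needed, because the selection is done by column name rather than by position.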

CodePudding user response:

import pandas as pd

df1 = pd.read_csv("path-to-first-csv-file.csv")
df2 = pd.read_csv("path-to-second-csv-file.csv")

result = df1[df2[df2["Ranking"] == 1]["Features"]]
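
One caveat: selecting by a list of names raises a KeyError if the ranking file mentions a feature that is not a column of the dataset. If that can happen, a boolean mask over the dataset's columns is a possible alternative. The sketch below uses small made-up frames (the names f1, f3, f9 are hypothetical) and again assumes the "Features" and "Ranking" column names.

```python
import pandas as pd

# Hypothetical data: "f9" is ranked 1 but is absent from the dataset.
df1 = pd.DataFrame({"f1": [1, 2], "f2": [3, 4], "f3": [5, 6]})
df2 = pd.DataFrame({"Features": ["f3", "f1", "f9"], "Ranking": [1, 1, 1]})

# Feature names with ranking 1.
selected = df2.loc[df2["Ranking"] == 1, "Features"]

# True for every dataset column that appears among the selected names.
mask = df1.columns.isin(selected)

# Keep only the masked columns; the dataset's column order is preserved.
result = df1.loc[:, mask]

# Names that were ranked 1 but could not be found in the dataset.
missing = sorted(set(selected) - set(df1.columns))

print(list(result.columns))  # ['f1', 'f3']
print(missing)               # ['f9']
```

Reporting the missing names separately makes a mismatch between the two files visible instead of silently dropping features.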