Select columns in a DataFrame using column names stored in another DataFrame

Time:10-08

I have two pandas DataFrames. Each row of the first one lists the column names I want to select from the second one, and I want to make one selection per row in a loop.

For example:

df1:

selected column 1   selected column 2
A                   C
B                   C

df2:

A      B      C      D
value  value  value  value
value  value  value  value

I want (first run):

A      C
value  value
value  value

(second run):

B      C
value  value
value  value

CodePudding user response:

If you don't want to use iterrows (generally frowned upon when you're working with DataFrames), you can build the selected DataFrames with a list comprehension, like so:

import pandas as pd

records = [
    {
        'A': 'A1', 'B': 'B1', 'C': 'C1', 'D': 'D1'
    },
    {
        'A': 'A2', 'B': 'B2', 'C': 'C2', 'D': 'D2'
    }
]

selectors = [
    {'C1': 'A', 'C2': 'C'},
    {'C1': 'B', 'C2': 'C'}
]

data_df = pd.DataFrame.from_records(records)
sel_df = pd.DataFrame.from_records(selectors)

# Each row of sel_df holds the column names for one selection.
# to_numpy() yields the rows by position, so this works for any index labels.
dfs = [data_df[list(cols)] for cols in sel_df.to_numpy()]

for df in dfs:
    print(df)
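
The same idea can be wrapped in a small helper. This is only a sketch: the name select_per_row is made up for illustration, and the selector frame is given non-default index labels on purpose to show that the selection depends on row position, not on the labels.

```python
import pandas as pd


def select_per_row(data_df, sel_df):
    """Return one sub-DataFrame of data_df per row of sel_df (hypothetical helper)."""
    # to_numpy() yields the selector rows positionally, so index labels do not matter
    return [data_df[list(cols)] for cols in sel_df.to_numpy()]


data_df = pd.DataFrame({"A": ["A1", "A2"], "B": ["B1", "B2"],
                        "C": ["C1", "C2"], "D": ["D1", "D2"]})

# Non-default index labels, chosen only for this illustration
sel_df = pd.DataFrame([["A", "C"], ["B", "C"]], index=["first", "second"])

result = select_per_row(data_df, sel_df)
for sub in result:
    print(sub)
```

Every name listed in the selector frame has to exist as a column in data_df; otherwise the selection raises a KeyError.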

CodePudding user response:

Try this:

import pandas as pd

df1 = pd.DataFrame([["A", "C"], ["B", "C"]],
                    columns=["selected_columns_1", "selected_columns_2"])
df2 = pd.DataFrame([["value", "value", "value", "value"],
                    ["value", "value", "value", "value"]], columns=["A", "B", "C", "D"])

for index, row in df1.iterrows():
    print(df2[[row["selected_columns_1"], row["selected_columns_2"]]])
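
If you want to keep an explicit loop but avoid iterrows, itertuples is a possible middle ground. The sketch below reuses the df1 and df2 from this answer; the subsets list is added only for illustration. With index=False and name=None, each row comes back as a plain tuple of column names, so it also works when df1 has more than two selector columns.

```python
import pandas as pd

df1 = pd.DataFrame([["A", "C"], ["B", "C"]],
                   columns=["selected_columns_1", "selected_columns_2"])
df2 = pd.DataFrame([["value", "value", "value", "value"],
                    ["value", "value", "value", "value"]],
                   columns=["A", "B", "C", "D"])

subsets = []  # hypothetical container, one selection per run
for cols in df1.itertuples(index=False, name=None):
    # cols is a plain tuple such as ("A", "C")
    subset = df2[list(cols)]
    subsets.append(subset)
    print(subset)
```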