I have two pandas DataFrames. Each row of the first lists the column names I want to select from the second as I loop over the first.
For example:
df1:
selected column 1 | selected column 2 |
---|---|
A | C |
B | C |
df2:
A | B | C | D |
---|---|---|---|
value | value | value | value |
value | value | value | value |
I want (first run):
A | C |
---|---|
value | value |
value | value |
(second run):
B | C |
---|---|
value | value |
value | value |
CodePudding user response:
If you don't want to use iterrows (generally discouraged when you're working with DataFrames), you can build the selected DataFrames with a list comprehension like so:
import pandas as pd

records = [
    {'A': 'A1', 'B': 'B1', 'C': 'C1', 'D': 'D1'},
    {'A': 'A2', 'B': 'B2', 'C': 'C2', 'D': 'D2'},
]
selectors = [
    {'C1': 'A', 'C2': 'C'},
    {'C1': 'B', 'C2': 'C'},
]
data_df = pd.DataFrame.from_records(records)
sel_df = pd.DataFrame.from_records(selectors)

# Iterating over sel_df.T yields its column labels, which are the row indexes of sel_df.
# sel_df.iloc[s] is then one row of column names to select from data_df.
dfs = [pd.DataFrame(data=data_df, columns=sel_df.iloc[s]) for s in sel_df.T]

for df in dfs:
    print(df)
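The comprehension above can arguably be written more directly by turning each row of the selector frame into a plain list of column names and indexing the data frame with it. A minimal sketch, reusing the data_df and sel_df names from this answer and assuming every selector cell is a valid column name; the name subsets is only illustrative:

```python
import pandas as pd

data_df = pd.DataFrame.from_records([
    {'A': 'A1', 'B': 'B1', 'C': 'C1', 'D': 'D1'},
    {'A': 'A2', 'B': 'B2', 'C': 'C2', 'D': 'D2'},
])
sel_df = pd.DataFrame.from_records([
    {'C1': 'A', 'C2': 'C'},
    {'C1': 'B', 'C2': 'C'},
])

# Each row of sel_df becomes a plain list of column names, e.g. ['A', 'C'].
# 'subsets' is a hypothetical name chosen for this sketch.
subsets = [data_df[cols] for cols in sel_df.to_numpy().tolist()]

for subset in subsets:
    print(subset)
```

Indexing with a list of labels raises a KeyError if a selector names a column that does not exist, which may be preferable to silently getting an empty column.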
CodePudding user response:
Try this:
import pandas as pd

df1 = pd.DataFrame([["A", "C"], ["B", "C"]],
                   columns=["selected_columns_1", "selected_columns_2"])
df2 = pd.DataFrame([["value", "value", "value", "value"],
                    ["value", "value", "value", "value"]],
                   columns=["A", "B", "C", "D"])

# Each row of df1 names the two columns to pull from df2.
for index, row in df1.iterrows():
    print(df2[[row["selected_columns_1"], row["selected_columns_2"]]])
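If df1 might later gain more selector columns, hard-coding the two column names inside the loop gets brittle. One possible variation, sketched under the assumption that every cell in df1 holds a valid column name of df2, uses itertuples so the loop works for any number of selector columns. The results list is only there to keep the selections around for illustration:

```python
import pandas as pd

df1 = pd.DataFrame([["A", "C"], ["B", "C"]],
                   columns=["selected_columns_1", "selected_columns_2"])
df2 = pd.DataFrame([["value"] * 4, ["value"] * 4],
                   columns=["A", "B", "C", "D"])

results = []  # hypothetical container, not part of the original answer
for row in df1.itertuples(index=False):
    # row is a namedtuple of column names; list(row) turns it into a selector.
    selected = df2[list(row)]
    results.append(selected)
    print(selected)
```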