Given the following list of lists representing column names:
names = [['a','b'],['c','c'],['b','c']]
and the following dataframe
df
   a  b  c
0  1  2  6
1  1  3  2
2  4  6  4
I would like to generate the list with the same dimensions as names
with the following functionality:
lst = []
for idx, cols in enumerate(names):
    lst.append([])
    for col in cols:
        lst[-1].append(df.iloc[idx][col])
lst:
[[1,2],[2,2],[6,4]]
I.e., each inner list of names gives the columns to pull from df at the row with the matching index.
I'm trying to avoid the nested loop.
CodePudding user response:
You can select multiple columns at once by passing a list of labels:
lst = []
for idx, cols in enumerate(names):
    lst.append(df.iloc[idx][cols].tolist())
# or list comprehension
lst = [df.iloc[idx][cols].tolist() for idx, cols in enumerate(names)]
print(lst)
[[1, 2], [2, 2], [6, 4]]
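For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is my reconstruction of the frame printed in the question, and `expected` is a hypothetical name for the result of the question's nested loop, so treat both as assumptions.

```python
import pandas as pd

# Reconstruction of the question's data (assumed from the printed frame).
names = [['a', 'b'], ['c', 'c'], ['b', 'c']]
df = pd.DataFrame({'a': [1, 1, 4], 'b': [2, 3, 6], 'c': [6, 2, 4]})

# One pass over the rows; the inner loop is replaced by list-based selection.
lst = [df.iloc[idx][cols].tolist() for idx, cols in enumerate(names)]

# Reference result built the way the question's nested loop builds it.
expected = [[df.iloc[idx][col] for col in cols] for idx, cols in enumerate(names)]

print(lst)
```

Note that this still loops once over the rows in Python; only the inner loop over columns is gone.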
CodePudding user response:
You said that names has the same length as the dataframe, and that you want to avoid both looping over names and a nested loop. In that case, would looping over a range be allowed?
index = range(len(names))
[df.iloc[i][names[i]].tolist() for i in index]
Out[16]: [[1, 2], [2, 2], [6, 4]]
Or with df.loc (this relies on the default integer index, so that label i is also row i):
[df.loc[i,names[i]].tolist() for i in index]
Out[35]: [[1, 2], [2, 2], [6, 4]]
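If the goal is to avoid a Python-level loop altogether, one possible sketch uses NumPy integer indexing. This is not taken from the answers above. It assumes that every inner list in names has the same length, that the column labels of df are unique, and that all columns share one dtype (otherwise to_numpy() falls back to object). The names `labels`, `col_idx` and `rows` are hypothetical, and the data is reconstructed from the question.

```python
import numpy as np
import pandas as pd

# Reconstruction of the question's data (assumed from the printed frame).
names = [['a', 'b'], ['c', 'c'], ['b', 'c']]
df = pd.DataFrame({'a': [1, 1, 4], 'b': [2, 3, 6], 'c': [6, 2, 4]})

# Translate column labels into integer positions, keeping the shape of names.
labels = np.asarray(names)  # shape (n_rows, k)
col_idx = df.columns.get_indexer(labels.ravel()).reshape(labels.shape)

# Pair row i with the i-th list of column positions via broadcasting.
rows = np.arange(len(names))[:, None]  # shape (n_rows, 1)
lst = df.to_numpy()[rows, col_idx].tolist()

print(lst)
```

One caveat: get_indexer returns -1 for a label it cannot find, and -1 would silently select the last column, so it is worth checking `(col_idx >= 0).all()` before indexing if names might contain unknown labels.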