How to convert a list of DataFrame columns (len(list) == len(df)) into a list of column values at th

Time:08-05

Given a DataFrame

import pandas as pd

df = pd.DataFrame(
    {'col0': [3, 4, 5, -3, -4, -5],
     'col1': [6, 7, 8, -6, -7, -8],
     'col2': [9, 10, 11, -9, -10, -11],
     'col3': [12, 13, 14, -12, -13, -14],
     'col4': [15, 16, 17, -15, -16, -17]}
)

and a list of df column names that must be exactly as long as df (one column name per row)

cols = ['col4', 'col4', 'col1', 'col3', 'col0', 'col2']

The following list comprehension picks, for each row label i of df, the value in the column named at the same position in cols:

vals = [df[col][i] for col, i in zip(cols, df.index)]

Resulting in a list of values

vals
>>> [15, 16, 8, -12, -4, -11]

What techniques would you use to get the same results as vals? Any modules are welcome, and execution time comparisons are ultra welcome.

CodePudding user response:

Transpose the DataFrame so that the column names become the index, then select the rows named in cols. Row k of the result is the full column cols[k], so the values you want sit on the main diagonal, which numpy.diag extracts. Note that this builds a len(cols) x len(df) intermediate.

import numpy as np

np.diag(df.T.loc[cols]).tolist()

output:

[15, 16, 8, -12, -4, -11]
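
To see why the diagonal holds the answer, here is a minimal sketch that splits the one-liner into two steps. The names square and vals_diag are only illustrative and do not come from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'col0': [3, 4, 5, -3, -4, -5],
     'col1': [6, 7, 8, -6, -7, -8],
     'col2': [9, 10, 11, -9, -10, -11],
     'col3': [12, 13, 14, -12, -13, -14],
     'col4': [15, 16, 17, -15, -16, -17]}
)
cols = ['col4', 'col4', 'col1', 'col3', 'col0', 'col2']

# Row k of the transposed, reordered frame is the whole column cols[k],
# so the array has shape (len(cols), len(df)).
square = df.T.loc[cols].to_numpy()

# Entry [k, k] is df[cols[k]] at positional row k: exactly the wanted value.
vals_diag = np.diag(square).tolist()
print(square.shape, vals_diag)
```

Only the len(df) diagonal entries are used, while the intermediate array holds len(df) squared entries, which is the cost the next answer avoids.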

CodePudding user response:

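An efficient method that does not expand the DataFrame into a len(df) x len(df) intermediate is a positional lookup: factorize cols into integer codes, reorder the columns to match the codes, and index the underlying array with (row, code) pairs: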

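# Wrap the list in an Index: recent pandas versions deprecate passing a plain list to factorize
idx, col = pd.factorize(pd.Index(cols))
vals = df.reindex(col, axis=1).to_numpy()[np.arange(len(df)), idx]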

output:

array([ 15,  16,   8, -12,  -4, -11])
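
Since the question asks for execution time comparisons, the following is a rough sketch of how the three approaches on this page could be checked against each other and timed with the standard timeit module. The function names, the results and timings dictionaries, and the choice of 100 repetitions are my own assumptions. No timings are quoted here because they depend on the machine, the pandas version, and above all the size of df.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'col0': [3, 4, 5, -3, -4, -5],
     'col1': [6, 7, 8, -6, -7, -8],
     'col2': [9, 10, 11, -9, -10, -11],
     'col3': [12, 13, 14, -12, -13, -14],
     'col4': [15, 16, 17, -15, -16, -17]}
)
cols = ['col4', 'col4', 'col1', 'col3', 'col0', 'col2']


def by_comprehension():
    # The question's approach: one label-based lookup per row.
    return [df[col][i] for col, i in zip(cols, df.index)]


def by_diag():
    # First answer: diagonal of the transposed, reordered frame.
    return np.diag(df.T.loc[cols]).tolist()


def by_lookup():
    # Second answer: integer codes plus NumPy integer-array indexing.
    idx, col = pd.factorize(pd.Index(cols))
    arr = df.reindex(col, axis=1).to_numpy()
    return arr[np.arange(len(df)), idx].tolist()


methods = (by_comprehension, by_diag, by_lookup)

# Normalise to plain ints so the three results can be compared directly.
results = {f.__name__: [int(v) for v in f()] for f in methods}

# Total seconds for 100 calls of each function.
timings = {f.__name__: timeit.timeit(f, number=100) for f in methods}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 100 calls")
```

A six-row frame is too small to say much about scaling, so for a meaningful comparison you would probably rerun this on a df with many more rows.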