How can I create this nested list using list comprehension?

Every time I try to create a nested list using list comprehension, it either turns into a major headache or comes out incorrectly. I'm working with a transposed data frame of four variables, with 9 columns for each variable. For example:

Date0, Date1, Date2, Date3 ... Date9
GMV0, GMV1, GMV2, GMV3 ... GMV9
Revenue0, Revenue1, Revenue2, Revenue3 ... Revenue9

I am trying to create a nested list for each of these columns. The desired list is as follows:

[[Date0, GMV0, Revenue0], [Date1, GMV1, Revenue1], [Date2, GMV2, Revenue2] ... [Date9, GMV9, Revenue9]]

I can currently create the desired list using

date = [col for col in test.columns if 'Date' in col]
gmv = [col for col in test.columns if 'GMV' in col]
rev = [col for col in test.columns if 'Gross Revenue' in col]

vars = [[date[i], gmv[i], rev[i]] for i in range(len(date))]

But this is quite inefficient, and I'm fairly sure it can be written as a one-liner.

Can someone help with the correct list comprehension (or possibly some other method that is specific to transposed data) and help me wrap my head around it?

CodePudding user response：

You can use to_dict:

>>> df
          0         1         2         3         4
0     Date0     Date1     Date2     Date3     Date9
1      GMV0      GMV1      GMV2      GMV3      GMV9
2  Revenue0  Revenue1  Revenue2  Revenue3  Revenue9

>>> list(df.to_dict(orient='list').values())

[['Date0', 'GMV0', 'Revenue0'],
 ['Date1', 'GMV1', 'Revenue1'],
 ['Date2', 'GMV2', 'Revenue2'],
 ['Date3', 'GMV3', 'Revenue3'],
 ['Date9', 'GMV9', 'Revenue9']]

Update

>>> df
  Date0 Date1 Date2 Date3 GMV0 GMV1 GMV2 GMV3 Revenue0 Revenue1 Revenue2 Revenue3
0     A     B     C     D    E    F    G    H        I        J        K        L


>>> [list(t.columns) for _, t in df.groupby(df.columns.str.extract(r'(\d+)', expand=False), axis=1)]

[['Date0', 'GMV0', 'Revenue0'],
 ['Date1', 'GMV1', 'Revenue1'],
 ['Date2', 'GMV2', 'Revenue2'],
 ['Date3', 'GMV3', 'Revenue3']]
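Note that groupby with axis=1 is deprecated in recent pandas releases (2.1 and later), and that the group keys extracted above are strings, so a suffix such as 10 would sort before 2. As a rough sketch of the same grouping idea without pandas, the snippet below groups column names by their trailing number. The column names are hypothetical stand-ins for the ones in the question.

```python
import re
from collections import defaultdict

# Hypothetical column names, mirroring the layout in the question.
columns = ["Date0", "Date1", "Date2", "GMV0", "GMV1", "GMV2",
           "Revenue0", "Revenue1", "Revenue2"]

groups = defaultdict(list)
for col in columns:
    match = re.search(r"(\d+)$", col)  # trailing digits identify the group
    if match:
        groups[int(match.group(1))].append(col)

# Sort by the numeric suffix so that 10 would come after 9, not after 1.
nested = [groups[key] for key in sorted(groups)]
print(nested)
```

Within each group, the names keep the order in which they appear in the column list.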

CodePudding user response：

You can use a list comprehension with a nested for clause.

vars = [
    col
    for key in ['Date', 'GMV', 'Gross Revenue']
    for col in test.columns if key in col
]

reference: https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
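One thing to keep in mind is that a nested for clause produces a single flat list, not a list of lists. A small sketch with a plain list of hypothetical column names (standing in for test.columns) shows the shape of the result:

```python
# Hypothetical column names standing in for test.columns.
columns = ["Date0", "Date1", "GMV0", "GMV1", "Gross Revenue0", "Gross Revenue1"]

flat = [
    col
    for key in ["Date", "GMV", "Gross Revenue"]  # outer loop: one pass per variable
    for col in columns if key in col             # inner loop: matching columns
]
print(flat)
```

The result is grouped by variable (all Date columns, then all GMV columns, and so on) in one flat list.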

Or, if you already have the three lists, you can use the built-in function zip, which acts like a transpose.

vars = list(zip(date, gmv, rev))

Update:

Sorry for misunderstanding the question. If you need a nested list, the following code will work.

vars = list(zip(*(
    [col for col in test if key in col]
    for key in ['Date', 'GMV', 'Gross Revenue']
)))

If you are already using a DataFrame, @Corralien's answer would be better. This answer is useful when you want to do it in vanilla Python.
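Note that zip yields tuples, so the zip-based code above gives a list of tuples rather than a list of lists. If inner lists are required, each tuple can be converted, as in this sketch. The column names are hypothetical and a plain list stands in for the DataFrame.

```python
# Hypothetical column names standing in for the DataFrame's columns.
columns = ["Date0", "Date1", "GMV0", "GMV1", "Gross Revenue0", "Gross Revenue1"]
keys = ["Date", "GMV", "Gross Revenue"]

# One list of matching columns per variable.
per_key = [[col for col in columns if key in col] for key in keys]

# zip(*per_key) transposes; list(...) turns each tuple into a list.
nested = [list(group) for group in zip(*per_key)]
print(nested)
```

zip stops at the shortest input, so a variable with fewer matching columns would silently truncate the result.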

CodePudding user response：

If the input list is:

>>> test = [['a1','a2','a3'],['b1', 'b2','b3'],['c1','c2','c3']]

then

>>> b = [[test[x][i] for x in range(len(test))] for i in range(len(test[0]))]

>>> b
[['a1', 'b1', 'c1'], ['a2', 'b2', 'c2'], ['a3', 'b3', 'c3']]
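As a cross-check, the same transposition can be written with zip(*test). This is an alternative to the index-based comprehension above, not a replacement for it, and the sketch below only confirms that the two agree on this input.

```python
test = [['a1', 'a2', 'a3'], ['b1', 'b2', 'b3'], ['c1', 'c2', 'c3']]

# Index-based version: for each column index i, collect element i of every row.
by_index = [[test[x][i] for x in range(len(test))] for i in range(len(test[0]))]

# zip-based version: unpack the rows and let zip pair up elements by position.
by_zip = [list(row) for row in zip(*test)]

print(by_index == by_zip)
```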

To understand the nested comprehension, evaluate the inner comprehension on its own with i = 0, then i = 1, and so on:

>>> i = 0
>>> [test[x][i] for x in range(len(test))]