How to stack observations from different dataframes by the positions of their columns and not the na-CodePudding

I am trying to stack the values of several dataframes into one single dataframe. They all have 65 columns but with different names. For instance, lets say that I have two df1 and df2

df1 = pd.DataFrame(
        {"A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
        "C": ["C0", "C1", "C2", "C3"],
        "D": ["D0", "D1", "D2", "D3"], })

df2 = pd.DataFrame(
        {"F": ["A0", "A1", "A2", "A3"],
        "G": ["B0", "B1", "B2", "B3"],
        "K": ["C0", "C1", "C2", "C3"],
        "Z": ["D0", "D1", "D2", "D3"], })

What I want is to concatenate rows using the position of the columns as keys, and not the column names. So A has the first position in df1, and F has the first position in df2. I want to stack the values of columns A and F together given that they are both in the first position of their data frames.

I want to do the same for every column: stack B in df1 with G in df2 and so on. The result would be something like this:


A0 B0 C0 D0
A1 B1 C1 D1
A2 B2 C2 D2
A3 B3 C3 D3
A0 B0 C0 D0
A1 B1 C1 D1
A2 B2 C2 D2
A3 B3 C3 D3

And how could I pass this instruction so that multiple datasets follow the same instruction with some type of for loop? I have been looking at the documentation for the merge(), concat(), and join() functions but none seem to do what I want.

CodePudding user response：

You can concat after changing the df2 column names to match the df1 columns by position

df1 = pd.DataFrame(
        {"A": ["A00", "A01", "A02", "A03"],
        "B": ["B00", "B01", "B02", "B03"],
        "C": ["C00", "C01", "C02", "C03"],
        "D": ["D00", "D01", "D02", "D03"], })

df2 = pd.DataFrame(
        {"F": ["A10", "A11", "A12", "A13"],
        "G": ["B10", "B11", "B12", "B13"],
        "K": ["C10", "C11", "C12", "C13"],
        "Z": ["D10", "D11", "D12", "D13"], })

df3 = pd.DataFrame(
        {"Q1": ["A20", "A21", "A22", "A23"],
        "Q2": ["B20", "B21", "B22", "B23"],
        "Q3": ["C20", "C21", "C22", "C23"],
        "Q4": ["D20", "D21", "D22", "D23"], })


pd.concat(
    (
        df1,
        df2.set_axis(df1.columns, axis=1),
        df3.set_axis(df1.columns, axis=1),
    ),
    ignore_index=True,
)

Output

CodePudding user response：

You can use np.concatenate() or np.r_ to combine multiple df's

df_list = [df1,df2]
np.concatenate(df_list)

np.r_[df1,df2]

Then pass either one into pd.DataFrame() For example:

pd.DataFrame(np.concatenate(df_list),columns = df1.columns)

Output:

    A   B   C   D
0  A0  B0  C0  D0
1  A1  B1  C1  D1
2  A2  B2  C2  D2
3  A3  B3  C3  D3
4  A0  B0  C0  D0
5  A1  B1  C1  D1
6  A2  B2  C2  D2
7  A3  B3  C3  D3