Home > Mobile >  How can we can select and concat a serie of columns in a single dataframe
How can we can select and concat a serie of columns in a single dataframe

Time:07-23

I'm trying to achieve this :

screenshot of the desired output

But when I run the script below, I get an empty dataframe.

import pandas as pd

df1 = pd.DataFrame({'Column1': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                   'Column2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'Column3': ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X'],
                   'Column4': [pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA],
                   'Column5': ['K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T'],
                   'Column6': [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
                   'Column7': ['XI', 'XII', 'XIII', 'XIV', 'XV', 'XVI', 'XVII', 'XVIII', 'XIX', 'XX'],
                   'Column8': [pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA],
                   'Column9': ['U', 'V', 'W', 'X', 'Y', 'Z', '', '', '', ''],
                   'Column10': [21, 22, 23, 24, 25, 26, pd.NA, pd.NA, pd.NA, pd.NA],
                   'Column11': ['XXI', 'XXII', 'XXIII', 'XXIV', 'XXV', 'XXVI', '', '', '', '']})

column_names = ['Letters', 'Numbers', 'RomanNumerals']
df4 = pd.DataFrame(columns = column_names)

while i<len(df1.columns):
    df2 = df1.iloc[:, i:i 3]
    df3 = df2.rename(index={0: 'Letters', 1: 'Numbers', 2: 'RomanNumerals'})
    df4 = pd.concat(df4, df3)
    i =4
    
print(df4)
Empty DataFrame
Columns: [Letters, Numbers, RomanNumerals]
Index: []

Did I miss something ?

CodePudding user response:

Your code looks okay. You dont need a df3. just name the columns appropriately and it works.

i = 0
column_names = ['Letters', 'Numbers', 'RomanNumerals']
result = pd.DataFrame(columns=column_names)
while i<len(df1.columns):
    df2 = df1.iloc[:, i:i 3]
    df2.columns = column_names
    result = pd.concat([result, df2], axis=0)
    i  = 4

result.dropna(inplace=True)

CodePudding user response:

df1 = df1.replace([pd.NA, ''], np.nan)
df1 = df1.dropna(axis=1, how='all')

num_cols = 3
column_names = ['Letters', 'Numbers', 'RomanNumerals']
columns = df1.columns

dfs = []
for i in range(len(columns)//num_cols):
    temp_df = df1[columns[i*num_cols:(i 1)*num_cols]]
    temp_df.columns = column_names
    dfs.append(temp_df)

df1 = pd.concat(dfs, ignore_index=True)
df1 = df1.dropna(how='all')
print(df1)

Output:

   Letters  Numbers RomanNumerals
0        A      1.0             I
1        B      2.0            II
2        C      3.0           III
3        D      4.0            IV
4        E      5.0             V
5        F      6.0            VI
6        G      7.0           VII
7        H      8.0          VIII
8        I      9.0            IX
9        J     10.0             X
10       K     11.0            XI
11       L     12.0           XII
12       M     13.0          XIII
13       N     14.0           XIV
14       O     15.0            XV
15       P     16.0           XVI
16       Q     17.0          XVII
17       R     18.0         XVIII
18       S     19.0           XIX
19       T     20.0            XX
20       U     21.0           XXI
21       V     22.0          XXII
22       W     23.0         XXIII
23       X     24.0          XXIV
24       Y     25.0           XXV
25       Z     26.0          XXVI

CodePudding user response:

Try this:

df1 = df1.replace([pd.NA, ""], np.nan)
df1 = df1.dropna(axis=1, how="all")

num_cols = 3
column_names = ["Letters", "Numbers", "RomanNumerals"]
columns = df1.columns


pd.concat(
    [
        df1[df1.columns[i :: num_cols]].unstack().reset_index(drop=True)
        for i in range(num_cols)
    ],
    axis=1,
).dropna().set_axis(column_names, axis=1)

Output:

    Letters  Numbers RomanNumerals
0        A      1.0             I
1        B      2.0            II
2        C      3.0           III
3        D      4.0            IV
4        E      5.0             V
5        F      6.0            VI
6        G      7.0           VII
7        H      8.0          VIII
8        I      9.0            IX
9        J     10.0             X
10       K     11.0            XI
11       L     12.0           XII
12       M     13.0          XIII
13       N     14.0           XIV
14       O     15.0            XV
15       P     16.0           XVI
16       Q     17.0          XVII
17       R     18.0         XVIII
18       S     19.0           XIX
19       T     20.0            XX
20       U     21.0           XXI
21       V     22.0          XXII
22       W     23.0         XXIII
23       X     24.0          XXIV
24       Y     25.0           XXV
25       Z     26.0          XXVI
  • Related