I'm trying to achieve this :
screenshot of the desired output
But when I run the script below, I get an empty dataframe.
import pandas as pd
df1 = pd.DataFrame({'Column1': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
'Column2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'Column3': ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X'],
'Column4': [pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA],
'Column5': ['K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T'],
'Column6': [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
'Column7': ['XI', 'XII', 'XIII', 'XIV', 'XV', 'XVI', 'XVII', 'XVIII', 'XIX', 'XX'],
'Column8': [pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA, pd.NA],
'Column9': ['U', 'V', 'W', 'X', 'Y', 'Z', '', '', '', ''],
'Column10': [21, 22, 23, 24, 25, 26, pd.NA, pd.NA, pd.NA, pd.NA],
'Column11': ['XXI', 'XXII', 'XXIII', 'XXIV', 'XXV', 'XXVI', '', '', '', '']})
column_names = ['Letters', 'Numbers', 'RomanNumerals']
df4 = pd.DataFrame(columns = column_names)
while i<len(df1.columns):
df2 = df1.iloc[:, i:i 3]
df3 = df2.rename(index={0: 'Letters', 1: 'Numbers', 2: 'RomanNumerals'})
df4 = pd.concat(df4, df3)
i =4
print(df4)
Empty DataFrame
Columns: [Letters, Numbers, RomanNumerals]
Index: []
Did I miss something ?
CodePudding user response:
Your code looks okay. You dont need a df3. just name the columns appropriately and it works.
i = 0
column_names = ['Letters', 'Numbers', 'RomanNumerals']
result = pd.DataFrame(columns=column_names)
while i<len(df1.columns):
df2 = df1.iloc[:, i:i 3]
df2.columns = column_names
result = pd.concat([result, df2], axis=0)
i = 4
result.dropna(inplace=True)
CodePudding user response:
df1 = df1.replace([pd.NA, ''], np.nan)
df1 = df1.dropna(axis=1, how='all')
num_cols = 3
column_names = ['Letters', 'Numbers', 'RomanNumerals']
columns = df1.columns
dfs = []
for i in range(len(columns)//num_cols):
temp_df = df1[columns[i*num_cols:(i 1)*num_cols]]
temp_df.columns = column_names
dfs.append(temp_df)
df1 = pd.concat(dfs, ignore_index=True)
df1 = df1.dropna(how='all')
print(df1)
Output:
Letters Numbers RomanNumerals
0 A 1.0 I
1 B 2.0 II
2 C 3.0 III
3 D 4.0 IV
4 E 5.0 V
5 F 6.0 VI
6 G 7.0 VII
7 H 8.0 VIII
8 I 9.0 IX
9 J 10.0 X
10 K 11.0 XI
11 L 12.0 XII
12 M 13.0 XIII
13 N 14.0 XIV
14 O 15.0 XV
15 P 16.0 XVI
16 Q 17.0 XVII
17 R 18.0 XVIII
18 S 19.0 XIX
19 T 20.0 XX
20 U 21.0 XXI
21 V 22.0 XXII
22 W 23.0 XXIII
23 X 24.0 XXIV
24 Y 25.0 XXV
25 Z 26.0 XXVI
CodePudding user response:
Try this:
df1 = df1.replace([pd.NA, ""], np.nan)
df1 = df1.dropna(axis=1, how="all")
num_cols = 3
column_names = ["Letters", "Numbers", "RomanNumerals"]
columns = df1.columns
pd.concat(
[
df1[df1.columns[i :: num_cols]].unstack().reset_index(drop=True)
for i in range(num_cols)
],
axis=1,
).dropna().set_axis(column_names, axis=1)
Output:
Letters Numbers RomanNumerals
0 A 1.0 I
1 B 2.0 II
2 C 3.0 III
3 D 4.0 IV
4 E 5.0 V
5 F 6.0 VI
6 G 7.0 VII
7 H 8.0 VIII
8 I 9.0 IX
9 J 10.0 X
10 K 11.0 XI
11 L 12.0 XII
12 M 13.0 XIII
13 N 14.0 XIV
14 O 15.0 XV
15 P 16.0 XVI
16 Q 17.0 XVII
17 R 18.0 XVIII
18 S 19.0 XIX
19 T 20.0 XX
20 U 21.0 XXI
21 V 22.0 XXII
22 W 23.0 XXIII
23 X 24.0 XXIV
24 Y 25.0 XXV
25 Z 26.0 XXVI