I'm trying to add an 'Information' column to my dataframe (df3) and fill it with string values ('True' if the index is 0, 'False' otherwise). The problem is that pandas puts 'False'
in every single row, even in the rows whose index is 0 (see the output below).
Input:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'Column1': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
'Column2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'Column3': ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X'],
'Column4': ['K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T'],
'Column5': [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
'Column6': ['XI', 'XII', 'XIII', 'XIV', 'XV', 'XVI', 'XVII', 'XVIII', 'XIX', 'XX'],
'Column7': ['U', 'V', 'W', 'X', 'Y', 'Z', '', '', '', ''],
'Column8': [21, 22, 23, 24, 25, 26, pd.NA, pd.NA, pd.NA, pd.NA],
'Column9': ['XXI', 'XXII', 'XXIII', 'XXIV', 'XXV', 'XXVI', '', '', '', '']})
column_names = ['Letters', 'Numbers', 'RomanNumerals']
df3 = pd.DataFrame(columns = column_names)
i = 0
while i < len(df1.columns):
    df2 = df1.iloc[:, i:i+3]
    df2.columns = column_names
    df3 = pd.concat([df3, df2])
    i += 3
df3.dropna(inplace=True)
for index, row in df3.iterrows():
    df3['Information'] = np.where(index == 0, True, False)
display(df3)
Output:
| | Letters | Numbers | RomanNumerals | Information |
---|---|---|---|---|
0 | A | 1 | I | FALSE |
1 | B | 2 | II | FALSE |
2 | C | 3 | III | FALSE |
3 | D | 4 | IV | FALSE |
4 | E | 5 | V | FALSE |
5 | F | 6 | VI | FALSE |
6 | G | 7 | VII | FALSE |
7 | H | 8 | VIII | FALSE |
8 | I | 9 | IX | FALSE |
9 | J | 10 | X | FALSE |
0 | K | 11 | XI | FALSE |
1 | L | 12 | XII | FALSE |
2 | M | 13 | XIII | FALSE |
3 | N | 14 | XIV | FALSE |
4 | O | 15 | XV | FALSE |
5 | P | 16 | XVI | FALSE |
6 | Q | 17 | XVII | FALSE |
7 | R | 18 | XVIII | FALSE |
8 | S | 19 | XIX | FALSE |
9 | T | 20 | XX | FALSE |
0 | U | 21 | XXI | FALSE |
1 | V | 22 | XXII | FALSE |
2 | W | 23 | XXIII | FALSE |
3 | X | 24 | XXIV | FALSE |
4 | Y | 25 | XXV | FALSE |
5 | Z | 26 | XXVI | FALSE |
Is there an explanation for this behavior?
CodePudding user response:
Replace the for loop with this snippet:
df3['Information']= df3.index.map(lambda x: x==0)
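What happens in the for loop is that each iteration assigns a single scalar to the entire 'Information' column, overwriting whatever the previous iteration wrote. The index of the last row is 5, so the final assignment fills every row with False. Note that you typed
df3['Information'] = np.where(index == 0, True, False)
instead of
row['Information'] = np.where(index == 0, True, False)
But even the second version won't work, because row is a temporary Series produced by iterrows(), and adding a key to it never changes df3.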
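To make the overwrite visible, here is a minimal sketch on a small stand-in frame. The names demo, fresh, and snapshots are made up for illustration, and bool(index == 0) is used in place of the np.where call since both produce one scalar per iteration.

```python
import pandas as pd

# Hypothetical stand-in for df3: the index repeats, as it does after pd.concat.
demo = pd.DataFrame({"Letters": ["A", "B", "K", "L"]}, index=[0, 1, 0, 1])

snapshots = []
for index, row in demo.iterrows():
    # One scalar on the right-hand side is broadcast to the WHOLE column.
    # bool(index == 0) stands in for np.where(index == 0, True, False).
    demo["Information"] = bool(index == 0)
    snapshots.append(demo["Information"].tolist())

# Every snapshot is uniform; the last iteration (index 1) leaves all False.
print(snapshots)

# Assigning to `row` instead only changes the temporary Series.
fresh = pd.DataFrame({"Letters": ["A", "B"], "Numbers": [1, 2]})
for index, row in fresh.iterrows():
    row["Information"] = bool(index == 0)

print("Information" in fresh.columns)  # False
```

Whatever the loop did for earlier rows is lost, which is why only the index of the last row decides the final column.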
Edit:
Another way to do this (for further explanation, see the pandas documentation on DataFrame.apply):
def get_information(index):
    if index == 0:
        return True
    else:
        return False

df3['Information'] = df3.index.map(get_information)
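A vectorized comparison should give the same result without calling a function per row. The sketch below uses a small stand-in for df3; the 'InformationText' column name is hypothetical and only there to show the string variant the question asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df3 with a repeated index.
df3 = pd.DataFrame(
    {"Letters": ["A", "B", "K", "L"], "Numbers": [1, 2, 11, 12]},
    index=[0, 1, 0, 1],
)

# Boolean flag: compare the whole index at once.
df3["Information"] = df3.index == 0

# String labels: np.where receives an array condition here, not a scalar.
df3["InformationText"] = np.where(df3.index == 0, "True", "False")

print(df3)
```

The difference from the original loop is that np.where is given the full boolean array df3.index == 0, so it returns one value per row.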