How to fill (based on the index of a dataframe) an empty column

Time:07-24

I'm trying to add the column 'Information' to my dataframe (df3) and fill it with string values ('True' if the index is 0, 'False' otherwise). The problem is that pandas puts 'False' in every single row, even in the ones with index 0 (see the output below).

Input :

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'Column1': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                    'Column2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                    'Column3': ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X'],
                    'Column4': ['K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T'],
                    'Column5': [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
                    'Column6': ['XI', 'XII', 'XIII', 'XIV', 'XV', 'XVI', 'XVII', 'XVIII', 'XIX', 'XX'],
                    'Column7': ['U', 'V', 'W', 'X', 'Y', 'Z', '', '', '', ''],
                    'Column8': [21, 22, 23, 24, 25, 26, pd.NA, pd.NA, pd.NA, pd.NA],
                    'Column9': ['XXI', 'XXII', 'XXIII', 'XXIV', 'XXV', 'XXVI', '', '', '', '']})

column_names = ['Letters', 'Numbers', 'RomanNumerals']
df3 = pd.DataFrame(columns = column_names)

i=0
while i<len(df1.columns):
    df2 = df1.iloc[:, i:i+3]
    df2.columns = column_names
    df3 = pd.concat([df3, df2])
    i += 3

df3.dropna(inplace=True)

for index, row in df3.iterrows():
    df3['Information'] = np.where(index == 0, True,  False)

display(df3)

Output :

   Letters  Numbers  RomanNumerals  Information
0        A        1              I        FALSE
1        B        2             II        FALSE
2        C        3            III        FALSE
3        D        4             IV        FALSE
4        E        5              V        FALSE
5        F        6             VI        FALSE
6        G        7            VII        FALSE
7        H        8           VIII        FALSE
8        I        9             IX        FALSE
9        J       10              X        FALSE
0        K       11             XI        FALSE
1        L       12            XII        FALSE
2        M       13           XIII        FALSE
3        N       14            XIV        FALSE
4        O       15             XV        FALSE
5        P       16            XVI        FALSE
6        Q       17           XVII        FALSE
7        R       18          XVIII        FALSE
8        S       19            XIX        FALSE
9        T       20             XX        FALSE
0        U       21            XXI        FALSE
1        V       22           XXII        FALSE
2        W       23          XXIII        FALSE
3        X       24           XXIV        FALSE
4        Y       25            XXV        FALSE
5        Z       26           XXVI        FALSE

Is there an explanation for this behavior?

CodePudding user response:

Replace the for loop with this snippet:


df3['Information']= df3.index.map(lambda x: x==0)

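What happens in the for loop is that each iteration assigns a single scalar to the entire 'Information' column, overwriting whatever the previous iteration wrote. Only the last iteration survives, and the last index in df3 is 5, so every row ends up False. Note that you typed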


df3['Information'] = np.where(index == 0, True,  False)

Instead of


row['Information'] = np.where(index == 0, True,  False)

But even the code above won't work, because row is a temporary Series produced by iterrows(), so assigning to it does not change df3.
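Both failure modes described above can be reproduced on a tiny frame. This is only a sketch: the four-row `df` and the variable names are made up for illustration and stand in for df3 with its repeating index.

```python
import pandas as pd

# Hypothetical stand-in for df3: the index repeats, as it does after pd.concat.
df = pd.DataFrame({'Letters': ['A', 'B', 'K', 'L']}, index=[0, 1, 0, 1])

# Each iteration assigns one scalar to the whole column,
# so only the comparison from the last iteration survives.
for index, row in df.iterrows():
    df['Information'] = (index == 0)

last_index_wins = df['Information'].tolist()

# Writing to `row` only changes the temporary Series, not the DataFrame.
df2 = df[['Letters']].copy()
for index, row in df2.iterrows():
    row['Information'] = (index == 0)

row_assignment_added_column = 'Information' in df2.columns
```

Here the last index visited is 1, so `last_index_wins` holds False for every row, and `df2` never gains an 'Information' column.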

Edit:

Another way to do this, using a named function instead of a lambda (for further explanation you can check the pandas documentation on Index.map and DataFrame.apply):


def get_information(index):
    if index==0:
        return True
    else:
        return False

df3['Information']= df3.index.map(get_information)
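A vectorized comparison on the index gives the same result without calling a Python function per row, and `np.where` can turn the mask into the string values the question asks for. This is a sketch on a made-up four-row frame; the column name 'InformationText' is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df3 with a repeating index.
df = pd.DataFrame({'Letters': ['A', 'B', 'K', 'L']}, index=[0, 1, 0, 1])

# Comparing the whole index to 0 yields one boolean per row.
df['Information'] = df.index == 0

# For string labels instead of booleans, pass the full mask to np.where.
df['InformationText'] = np.where(df.index == 0, 'True', 'False')
```

The difference from the original loop is that `np.where` receives an array with one entry per row rather than a single scalar comparison.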
