Changing data in a pandas DataFrame column

Time:09-25

I have a pandas DataFrame with a column of integer values between 0 and 4. The value counts for that column are:

0    108286
1      2042
2       183
3        13
4         3

I want to replace all nonzero values with 1.
I tried the following code:

def to_bool(array):
    for index, value in enumerate(array):
        if value !=0:
            array[index] = 1
to_bool(df.handcap)

with expected output:

0    108286
1      2241

but after running the function I got:

0    107690
1      2817
2        16
3         2
4         1

What is the problem here, and how can I convert all of the values to either 0 or 1?

CodePudding user response:

Treating the counts as a DataFrame whose index holds the original values, convert every nonzero index label to 1 and sum each group (0 and 1):

>>> df
   handcap
0   108286
1     2042
2      183
3       13
4        3

>>> df.groupby(df.index.astype(bool).astype(int)).sum()
   handcap
0   108286
1     2241
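
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the counts table from the question is itself the DataFrame, with the original values 0 to 4 as the index; the names `groups` and `merged` are only illustrative.

```python
import pandas as pd

# Counts table copied from the question; the index holds the original values 0-4.
df = pd.DataFrame({"handcap": [108286, 2042, 183, 13, 3]}, index=[0, 1, 2, 3, 4])

# Label 0 stays 0, every other label becomes 1.
groups = (df.index != 0).astype(int)

# Sum the counts that now share a label.
merged = df.groupby(groups).sum()
print(merged)
```

Note that this merges rows of a counts table. It does not change the values stored in the original column.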

CodePudding user response:

Try this:

>>> df
    handcap
0    108286
1      2042
2       183
3        13
4         3

>>> df.groupby((df.index > 0).astype(int)).sum()

    handcap
0    108286
1      2241

CodePudding user response:

Please check this:

# range(1, 5) covers labels 1 through 4; range(1, 4) would skip 4
for i in range(1, 5):
    df.rename(index={i: 1}, inplace=True)
# sum the rows that now share an index label
df_new = df.groupby(level=0).sum()
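
If `handcap` holds the raw per-row values rather than the counts, a vectorized comparison avoids the loop entirely. One plausible cause of the unexpected result, which the question does not confirm, is that `enumerate` yields positions while `array[index] = 1` assigns by index label, so the two disagree whenever the index is not the default 0..n-1. The sketch below uses made-up sample data with a gapped index to show that the comparison does not depend on the labels.

```python
import pandas as pd

# Hypothetical sample data; the gapped index mimics a frame with dropped rows.
df = pd.DataFrame(
    {"handcap": [0, 1, 0, 2, 4, 0, 3]},
    index=[0, 2, 3, 5, 8, 9, 11],
)

# Compare every value with 0 at once, then cast True/False to 1/0.
df["handcap"] = (df["handcap"] != 0).astype(int)

# Count how many rows ended up as 0 and as 1.
counts = df["handcap"].value_counts().sort_index()
print(counts)
```

Assigning the result back through `df["handcap"] = ...` also avoids modifying a Series obtained through attribute access such as `df.handcap`.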