How to sum two consequent columns in pandas and retrieve one as result?-CodePudding

I have a large dataset of 670 columns and 2856 rows. The idea is to sum two consequent rows and retrieve a single column and value as result. It's important to not have replacement in the way the fist column second, then the third the fourth not the second third.

Index	ID1	ID2	ID3	ID4
First	0	1	0	1
Second	0	0	1	1

the result should be

Index	ID12	ID34
First	1	1
Second	0	2

The example dataframe:

df = pd.DataFrame({"ID1" : [0,0,0,1,1,1] , "ID2" :[1,1,1,0,0,0], "ID3" : [0,1,1,1,0,1]},"ID4" : [0,0,0,0,0,0])
result = pd.DataFrame({"ID1/2" : [1,1,1,0,0,0] , "ID3/4" :[0,1,1,1,0,1]})

I have tried:

res = []
for i in range(len(df)):
              for j in range(1,len(df.columns),2):
                            res.append(data.iloc[i,j] data.iloc[i,j-1])
result = pd.DataFrame(res)

In R the result is:

result <- matrix(nrow = nrow(df), ncol = ncol(df),)
for (i in seq(1,ncol(df),2)){
  result[,i] <- df[,i] df[,i 1]
}
#Erasing the NAs columns
result <- result [,-seq(2,ncol(result ),2)]

CodePudding user response：

N = len(df.columns)
new_df = df.groupby(np.arange(N) // 2, axis="columns").sum()
new_df.columns = [f"ID{j}{j 1}" for j in range(1, N, 2)]

groupby every-2 columns and sum
- floordivision of the range(len(columns)) by 2 gives the group numbers: 0, 0, 1, 1, 2, 2...
form new columns with stepping by 2 over range(N) to get 1, 3...

to get

>>> new_df

        ID12  ID34
Index
First      1     1
Second     0     2

CodePudding user response：

With magic slicing:

res = pd.DataFrame(df.values[:, ::2]   df.values[:, 1::2],
                   columns=df.columns[::2]   df.columns[1::2].str[2:], index=df.index)

         ID12  ID34
First       1     1
Second      0     2