Exclude a column in pandas

I have a dataframe that looks like this:

ENSG	3dir_S2_S23_L004_R1_001	7dir_S2_S25_L004_R1_001	i3dir_S2_S29_L004_R1_001	i7dir_S2_S31_L004_R1_001
ENSG00000000003.15	349.0	183.0	199.0	165.0
ENSG00000000419.13	133.0	82.0	190.0	168.0
ENSG00000000457.14	62.0	56.0	95.0	111.0
ENSG00000000460.17	191.0	122.0	300.0	285.0
ENSG00000001036.14	507.0	286.0	326.0	317.0
ENSG00000001084.13	205.0	192.0	310.0	320.0
ENSG00000001167.14	406.0	324.0	379.0	309.0
ENSG00000001460.18	93.0	78.0	146.0	120.0

I'm trying to apply a calculation to every value in every column except the ENSG column.

Something like this, where I divide each row value by the sum of the entire column:

df = df.transform(lambda x: x / x.sum())

How can I exclude the column ENSG from this calculation? Could I use iloc?

CodePudding user response:

Use set_index to move ENSG out of the columns and into the index, apply transform to the remaining columns, then call reset_index to turn ENSG back into a regular column:

out = df.set_index('ENSG').transform(lambda x: x / x.sum()).reset_index()
print(out)

# Output:
                 ENSG  3dir_S2_S23_L004_R1_001  7dir_S2_S25_L004_R1_001  i3dir_S2_S29_L004_R1_001  i7dir_S2_S31_L004_R1_001
0  ENSG00000000003.15                 0.179342                 0.138322                  0.102314                  0.091922
1  ENSG00000000419.13                 0.068345                 0.061980                  0.097686                  0.093593
2  ENSG00000000457.14                 0.031860                 0.042328                  0.048843                  0.061838
3  ENSG00000000460.17                 0.098150                 0.092215                  0.154242                  0.158774
4  ENSG00000001036.14                 0.260534                 0.216175                  0.167609                  0.176602
5  ENSG00000001084.13                 0.105344                 0.145125                  0.159383                  0.178273
6  ENSG00000001167.14                 0.208633                 0.244898                  0.194859                  0.172145
7  ENSG00000001460.18                 0.047790                 0.058957                  0.075064                  0.066852
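
To check the result, here is a small self-contained sketch of the same set_index approach. It assumes a reduced frame built from the first three rows of the question's data, and the short column names "a" and "b" are placeholders rather than the real sample names. After the transform, each numeric column should sum to 1.

```python
import pandas as pd

# Small frame built from the first three rows of the question's data;
# the column names "a" and "b" are placeholders for the sample names.
df = pd.DataFrame({
    "ENSG": ["ENSG00000000003.15", "ENSG00000000419.13", "ENSG00000000457.14"],
    "a": [349.0, 133.0, 62.0],
    "b": [183.0, 82.0, 56.0],
})

# Move ENSG into the index so only the numeric columns are transformed,
# then restore it as a regular column.
out = df.set_index("ENSG").transform(lambda x: x / x.sum()).reset_index()

# Each numeric column of the result should now sum to 1.
col_sums = out[["a", "b"]].sum()
print(col_sums)
```

Because ENSG sits in the index during the division, it never reaches the lambda, so no string values are involved in the arithmetic.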

CodePudding user response:

Assuming ENSG is the first column, yes, you can use iloc:

df.iloc[:, 1:] = df.iloc[:, 1:] / df.iloc[:, 1:].sum(axis=0)
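
The iloc approach depends on ENSG being the first column. If the column order is not guaranteed, one possible alternative (not from the answers above) is to pick the numeric columns by dtype with select_dtypes. The sketch below assumes a reduced frame with placeholder column names "a" and "b", and assumes ENSG is the only non-numeric column.

```python
import pandas as pd

# Reduced example frame; "a" and "b" are placeholder column names.
# ENSG is deliberately placed last to show that position does not matter.
df = pd.DataFrame({
    "a": [349.0, 133.0, 62.0],
    "b": [183.0, 82.0, 56.0],
    "ENSG": ["ENSG00000000003.15", "ENSG00000000419.13", "ENSG00000000457.14"],
})

# Select the numeric columns by dtype rather than by position.
num_cols = df.select_dtypes(include="number").columns

# Dividing a DataFrame by a Series of column sums aligns on column labels,
# so each column is divided by its own total.
df[num_cols] = df[num_cols] / df[num_cols].sum()

print(df)
```

This leaves ENSG untouched wherever it appears, at the cost of also normalizing any other numeric column that happens to be present.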