Consider the following data frame:
import pandas as pd
df = pd.DataFrame([[1, -2, 3, -5, 4, 2, 7, -8, 2], [2, -4, 6, 7, -8, 9, 5, 3, 2], [2, 4, 6, 7, 8, 9, 5, 3, 2], [1, 2, 3, 4, 5, 6, 7, 8, 9]]).transpose()
df.columns = ["A", "B", "C", "D"]
   A  B  C  D
0  1  2  2  1
1 -2 -4  4  2
2  3  6  6  3
3 -5  7  7  4
4  4 -8  8  5
5  2  9  9  6
6  7  5  5  7
7 -8  3  3  8
8  2  2  2  9
I want to append "pos" to the name of every column that contains only positive values. My attempt was:
pos_idx = df.loc[:, (df>0).all()].columns
df[pos_idx].columns = df[pos_idx].columns + "pos"
However, this does not seem to work: it raises no error, but the column names do not change. Interestingly, the following code:
df.columns = df.columns + "anything"
does append the word "anything" to every column name. Could you please explain why this happens (it works in the general case but not on the indexed selection), and how to do this correctly?
CodePudding user response:
You are assigning the new column names to a copy of the data frame. The statement below does not overwrite the column names of df, only those of the temporary object returned by df[pos_idx]:
df[pos_idx].columns = df[pos_idx].columns + "pos"
Your second code example accesses df directly, which is why that one works.
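To see why the assignment is lost, here is a minimal sketch on a smaller, made-up frame (the three-column data and the names first and second are only for illustration). It assumes nothing beyond standard pandas indexing: selecting with a list of labels builds a new DataFrame each time, so renaming its columns never reaches df.

```python
import pandas as pd

# Small illustrative frame (not the one from the question).
df = pd.DataFrame({"A": [1, -2, 3], "C": [2, 4, 6], "D": [1, 2, 3]})
pos_idx = df.loc[:, (df > 0).all()].columns

# Each df[pos_idx] call builds a separate DataFrame object.
first = df[pos_idx]
second = df[pos_idx]
print(first is second)  # False

# Renaming the columns of the temporary changes only that temporary.
first.columns = first.columns + "pos"
print(list(first.columns))  # ['Cpos', 'Dpos']
print(list(df.columns))     # ['A', 'C', 'D']
```

In the original one-liner the temporary is never bound to a name, so the renamed object is discarded as soon as the statement finishes.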
How to make it work? Build the full list of column names separately, then assign it to df directly.
How to build the full list? Append "_pos" as a suffix to every column that has no values <= 0.
my_col_list = [col + (count == 0) * "_pos" for col, count in (df <= 0).sum().to_dict().items()]
df.columns = my_col_list
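The comprehension above relies on Python treating bool as a subclass of int, so multiplying a string by True keeps it and multiplying by False yields an empty string. If that reads as too terse, a more explicit variant could look like the sketch below. It assumes the data contains no NaN values (for which "no value <= 0" and "all values > 0" would differ); the names all_positive and new_cols are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, -2, 3, -5, 4, 2, 7, -8, 2],
    "B": [2, -4, 6, 7, -8, 9, 5, 3, 2],
    "C": [2, 4, 6, 7, 8, 9, 5, 3, 2],
    "D": [1, 2, 3, 4, 5, 6, 7, 8, 9],
})

# Multiplying a string by a bool keeps it (True) or drops it (False).
print(repr(True * "_pos"))   # '_pos'
print(repr(False * "_pos"))  # ''

# Explicit equivalent: suffix a column only if all of its values are > 0.
all_positive = (df > 0).all()  # boolean Series indexed by column name
new_cols = [col + "_pos" if all_positive[col] else col for col in df.columns]
df.columns = new_cols
print(list(df.columns))  # ['A', 'B', 'C_pos', 'D_pos']
```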
CodePudding user response:
First of all, use the .rename() method to change column names.
To add 'pos' to the columns that contain only positive values, you can use this:
renamed_columns = {i: i + ' pos' for i in df.columns if df[i].min() > 0}
df.rename(columns=renamed_columns, inplace=True)
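If you would rather not modify df in place, rename also accepts a function for columns and returns a new frame. The following is only a sketch on a small made-up frame; the names all_positive and renamed are illustrative, and the suffix "pos" without a space is the one from the question.

```python
import pandas as pd

# Small illustrative frame (not the one from the question).
df = pd.DataFrame({"A": [1, -2, 3], "C": [2, 4, 6], "D": [1, 2, 3]})

# True for every column whose values are all strictly positive.
all_positive = (df > 0).all()

# The function is called once per column label; df itself is left unchanged.
renamed = df.rename(columns=lambda name: name + "pos" if all_positive[name] else name)
print(list(renamed.columns))  # ['A', 'Cpos', 'Dpos']
print(list(df.columns))       # ['A', 'C', 'D']
```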