rename columns according to list

Time:11-02

I have 3 lists of data frames, and I want to add a suffix to each column according to which list its data frame belongs to. Everything is in order, so the first item in the suffix list should be appended to the columns of the data frames in the first list, and so on. My attempt is below, but it's adding each item in the suffix list to each column.

In the expected output

  • all columns in dfs in cat_a need group1 appended
  • all columns in dfs in cat_b need group2 appended
  • all columns in dfs in cat_c need group3 appended

data and code are here

import numpy as np
import pandas as pd

df1, df2, df3, df4 = (pd.DataFrame(np.random.randint(0,10,size=(10, 2)), columns=('a', 'b')),
                      pd.DataFrame(np.random.randint(0,10,size=(10, 2)), columns=('c', 'd')),
                      pd.DataFrame(np.random.randint(0,10,size=(10, 2)), columns=('e', 'f')),
                      pd.DataFrame(np.random.randint(0,10,size=(10, 2)), columns=('g', 'h')))

cat_a = [df1, df2]
cat_b = [df3, df4, df2]
cat_c = [df1]

suffix =['group1', 'group2', 'group3']
dfs = [cat_a, cat_b, cat_c]

for x, y in enumerate(dfs):
    for i in y:
        suff=suffix
        i.columns = i.columns + '_' + suff[x]

thanks for taking a look!

CodePudding user response:

Assuming you want some dataframes to end up with multiple suffixes, I think this is what you want:

suffix_mapper = {
    'group1': [df1, df2],
    'group2': [df3, df4, df2],
    'group3': [df1]
}

for suffix, dfs in suffix_mapper.items():
    for df in dfs:
        df.columns = [f"{col}_{suffix}" for col in df.columns]
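
To make the "multiple suffixes" assumption concrete, here is a minimal sketch of what this loop does to frames that appear under more than one key. The zero-filled frames are placeholders chosen only to keep the output deterministic; they are not the data from the question.

```python
import numpy as np
import pandas as pd

# Placeholder frames (assumption: values do not matter, only column names do)
df1 = pd.DataFrame(np.zeros((2, 2), dtype=int), columns=["a", "b"])
df2 = pd.DataFrame(np.zeros((2, 2), dtype=int), columns=["c", "d"])
df3 = pd.DataFrame(np.zeros((2, 2), dtype=int), columns=["e", "f"])
df4 = pd.DataFrame(np.zeros((2, 2), dtype=int), columns=["g", "h"])

suffix_mapper = {
    "group1": [df1, df2],
    "group2": [df3, df4, df2],
    "group3": [df1],
}

# Each frame is renamed in place once per key it is listed under
for suffix, dfs in suffix_mapper.items():
    for df in dfs:
        df.columns = [f"{col}_{suffix}" for col in df.columns]

print(list(df1.columns))  # ['a_group1_group3', 'b_group1_group3']
print(list(df2.columns))  # ['c_group1_group2', 'd_group1_group2']
print(list(df3.columns))  # ['e_group2', 'f_group2']
```

Because df1 and df2 are each listed under two keys, they pick up two suffixes. If only one suffix per list is wanted, the frames would need to be copied first.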

CodePudding user response:

Brian Joseph's answer is great, but I'd like to point out that you were very close; you just weren't renaming the columns correctly. Your last line should be like this:

i.columns = [col + '_' + suff[x] for col in i.columns]

instead of this:

i.columns = i.columns + '_' + suff[x]
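
As a quick check of the two forms, the sketch below compares them on a small Index of string labels. It assumes string column names, as in the question, and a reasonably recent pandas. Under those assumptions the two appear to give the same labels, so this change alone may not remove the doubled suffixes.

```python
import pandas as pd

cols = pd.Index(["a", "b"])

# Vectorized form: Index + str concatenates element by element
vectorized = cols + "_" + "group1"

# List-comprehension form: concatenate each label explicitly
comprehension = [col + "_" + "group1" for col in cols]

print(list(vectorized))  # ['a_group1', 'b_group1']
print(comprehension)     # ['a_group1', 'b_group1']
```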

CodePudding user response:

I think the issue is that you're not copying the dataframes, so the same df object is referenced by more than one cat list and gets renamed once for each list it appears in.

Try:

cat_a = [df1.copy(), df2.copy()]
cat_b = [df3.copy(), df4.copy(), df2.copy()]
cat_c = [df1.copy()]
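
Putting the copies together with the loop from the question, a minimal end-to-end sketch might look like this. The `make` helper and the seeded generator are hypothetical conveniences, not part of the question; `zip` replaces `enumerate` only to pair each suffix with its list directly.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable


def make(cols):
    """Hypothetical helper: a 10x2 frame of random digits with the given columns."""
    return pd.DataFrame(rng.integers(0, 10, size=(10, 2)), columns=cols)


df1, df2, df3, df4 = make(["a", "b"]), make(["c", "d"]), make(["e", "f"]), make(["g", "h"])

# Independent copies, so renaming in one list does not touch another list
cat_a = [df1.copy(), df2.copy()]
cat_b = [df3.copy(), df4.copy(), df2.copy()]
cat_c = [df1.copy()]

suffix = ["group1", "group2", "group3"]
dfs = [cat_a, cat_b, cat_c]

for suff, frames in zip(suffix, dfs):
    for frame in frames:
        frame.columns = [f"{col}_{suff}" for col in frame.columns]

print(list(cat_a[0].columns))  # ['a_group1', 'b_group1']
print(list(cat_b[2].columns))  # ['c_group2', 'd_group2']
print(list(cat_c[0].columns))  # ['a_group3', 'b_group3']
print(list(df1.columns))       # ['a', 'b'] (the original is left untouched)
```

Another option would be `DataFrame.add_suffix`, which returns a renamed copy rather than modifying the frame, for example `[f.add_suffix("_group1") for f in [df1, df2]]`.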