How to rename column names according to value of columns

I need to arrange a Pandas DataFrame with values that aren't in the right columns. I would like to rearrange the values in the cells according to a prefix that I have, and push the 'unknown' columns with their values to the end of the dataframe.

I have the following dataframe:

The output I am looking for: the 'known' values get a header, while the unknown columns (5, 6) are moved to the end. The rule: if no cell in a column contains '|', the column name is left unchanged.

Any suggestions I could try would be really helpful.

CodePudding user response：

Try this:

import pandas as pd

rename_dict = {}  # maps old column name -> new column name
df = pd.DataFrame({'1': ['name | Steve', 'name | John'],
                   '2': [None, None],
                   '3': [None, 'age | 50']})
for col in df.columns:
    vals = df[col].values  # look at the values in each column
    vals = [x for x in vals if x]  # remove nulls
    vals = [x for x in vals if '|' in x]  # keep only values containing '|'
    if len(vals) > 0:
        new_col_name = vals[0].split('|')[0].strip()  # text before '|', without the trailing space
        rename_dict[col] = new_col_name  # add the column to the rename dictionary
df.rename(columns=rename_dict, inplace=True)  # rename the columns
df

           name     2       age
0  name | Steve  None      None
1   name | John  None  age | 50
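
The question also asks for the columns that keep their original name to be pushed to the end. A minimal sketch of that extra step, reusing the same sample data and assuming each prefix appears in only one column (the names known, unknown and result are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'1': ['name | Steve', 'name | John'],
                   '2': [None, None],
                   '3': [None, 'age | 50']})

# Build the rename mapping from the first cell in each column that contains '|'
rename_dict = {}
for col in df.columns:
    prefixed = [x for x in df[col] if isinstance(x, str) and '|' in x]
    if prefixed:
        rename_dict[col] = prefixed[0].split('|')[0].strip()

renamed = df.rename(columns=rename_dict)

# Renamed ("known") columns first, untouched ("unknown") columns last
known = [rename_dict[c] for c in df.columns if c in rename_dict]
unknown = [c for c in df.columns if c not in rename_dict]
result = renamed[known + unknown]
print(result)
```

If two columns end up with the same prefix, selecting by label would return both for each occurrence, so that case would need separate handling.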

CodePudding user response：

It looks a bit tricky and is not exactly what you expected, but it might give you an idea of how to solve your task:

df = pd.DataFrame([['email | [email protected]', 'name | name1', 'surname | surname1', '', '', ''],
                   ['email | [email protected]', '', 'name | name2', 'occupation | student', 'surname | surname2', 'abc | 123']])

# For each row, split every non-empty cell into (prefix, value) and build a Series keyed by prefix
out = df.apply(lambda x: pd.Series(dict(tuple(i.split(' | ')) for i in x.tolist() if i)), axis=1)

>>> out
'''
   abc       email   name occupation   surname
0  NaN  [email protected]  name1        NaN  surname1
1  123  [email protected]  name2    student  surname2
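
The one-liner above assumes every non-empty cell contains ' | '. A rough sketch of a variant that keeps cells without a prefix under their original column label and places those columns last; the sample data and the helper split_row are made up for illustration:

```python
import pandas as pd

# Hypothetical sample: prefixed cells plus one column (label 3) of unprefixed values
df = pd.DataFrame([['name | name1', 'surname | surname1', '', 'x'],
                   ['', 'name | name2', 'surname | surname2', 'y']])

def split_row(row):
    """Return (known, unknown) dicts for one row."""
    known, unknown = {}, {}
    for col, cell in row.items():
        if not isinstance(cell, str) or not cell:
            continue  # skip empty or missing cells
        if ' | ' in cell:
            key, value = cell.split(' | ', 1)
            known[key] = value      # keyed by the prefix
        else:
            unknown[col] = cell     # keyed by the original column label
    return known, unknown

pairs = [split_row(row) for _, row in df.iterrows()]
known_df = pd.DataFrame([k for k, _ in pairs], index=df.index)
unknown_df = pd.DataFrame([u for _, u in pairs], index=df.index)

# Prefixed columns first, leftover columns at the end
out = pd.concat([known_df, unknown_df], axis=1)
print(out)
```

Building the two frames separately keeps the column order explicit instead of relying on how apply combines rows with different keys.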