Pandas : How to use one value from a column (value repeats itself) as a header from another column,-CodePudding

I have a data with multiple inputs from a semi-structured csv and I am trying to use one single (first) value from a set of columns (more then 500) as a header for other set of columns containing similar headers (another 500 rows)

After reading it I got something like this

import pandas as pd, numpy as np

df = pd.DataFrame({'Service': np.arange(8),
               'Ticket': np.random.rand(8),
               'Var_1': np.random.rand(8), # values column
               'Var_1_View': 'temp temp temp temp temp temp temp temp'.split(), # header of values of column
               'Var_2': np.arange(8), 
               'Var_2_View': 'pres pres pres pres pres pres pres pres'.split(),
               'Var_3': np.arange(8) * 2,
               'Var_3_View': 'shift shift shift shift shift shift shift shift'.split(),
               'D': np.arange(8) * 5,
               'Mess_3': np.random.rand(8),
               'Mess_3_View': 'id id id id id id id id'.split(),
               'E': np.arange(8)})

Headers containing values end with up to a 3 Digit number _# up to _### (more then 500 to be precise). Headers with description about values end with text : _View

I've created two dfs, one containing and other not containing the expression _View

df_headers =df.iloc[:,df.columns.str.contains('View')] # wanted headers on columns containing values
df_values =df.iloc[:,~df.columns.str.contains('View')] # headers should be replaced here

My idea was to extract first values from the df_headers as a list and using df.replace or df.rename, change headers on the df_values containing the values.

I could do it manually, but i have a huge df with different prefixxes and suffixes, but always using the _View as description to nearest column containing values.

As a result I would have the df_dont with new headers and with columns were this rule doesnt apply (Ticket, D, E, etc).

Since its my first question I would be great to have feedback, about clearness, explanation, or any other positive comments are welcome.

CodePudding user response：

It's not completely clear to me what you want to achieve so this might be off:

view_cols = {col for col in df.columns if col.endswith("_View")}
rename_dict = {
    col.replace("_View", ""): df[col].iat[0] for col in view_cols
}
new_cols = [col for col in df.columns if col not in view_cols]
df_new = df[new_cols].rename(columns=rename_dict)

Result:

   Service    Ticket      temp  pres  shift   D        id  E
0        0  0.623941  0.934402     0      0   0  0.644999  0
1        1  0.122866  0.918892     1      2   5  0.675976  1
2        2  0.472081  0.790443     2      4  10  0.825020  2
3        3  0.914086  0.849609     3      6  15  0.357074  3
4        4  0.684477  0.729126     4      8  20  0.010928  4
5        5  0.132002  0.673680     5     10  25  0.884599  5
6        6  0.841921  0.224638     6     12  30  0.197387  6
7        7  0.721800  0.412439     7     14  35  0.875199  7