Home > Back-end >  Can the columns of a data frame be named based on the values of specific rows in python?
Can the columns of a data frame be named based on the values of specific rows in python?

Time:04-28

Can the columns of a data frame be named based on the values of specific rows in python? The figures attached show an example of how I would like to modify the data frame:

Starting data frame

Starting data frame

Desirable output

Desirable output

The column names should match the following pattern: [condition3condition2 condition1] and the first three rows reporting the condition values should be deleted from the data frame together with the top index.

CodePudding user response:

You can create a dataframe

data = [
   ...:     ["condition1", 29, 37, 45],
   ...:     ["condition2", "name1", "name2", "name3"],
   ...:     ["condition3", "A", "D", "R"],
   ...:     ["vial-A", 0.8, 1.7, 0.9],
   ...:     ["vial-B", 2.2, 4.5, 2.1],
   ...:     ["vial-C", 0.5, 6.1, 5.3],
   ...: ]
df = pd.DataFrame(data=data, columns=[1, 2, 3, 4], index=[1])
df_t = df.set_index(1).T

#output
1 condition1 condition2 condition3 vial-A vial-B vial-C
2         29      name1          A    0.8    2.2    0.5
3         37      name2          D    1.7    4.5    6.1
4         45      name3          R    0.9    2.1    5.3

# `col_name` will become the pivot
df_t["col_name"] = df_t[["condition1", "condition2", "condition3"]].apply(
   lambda x: f"{x[2]}{x[0]} {x[1]}", axis=1
)

output = (
    df_t[["vial-A", "vial-B", "vial-C", "col_name"]]
    .set_index("col_name")
    .T.rename_axis(index=None, columns=None
)
#output
       A29 name1 D37 name2 R45 name3
vial-A       0.8       1.7       0.9
vial-B       2.2       4.5       2.1
vial-C       0.5       6.1       5.3

CodePudding user response:

You can use .rename().

#assuming we already have our dataframe named df
columns= list(df.columns) # this will return a list of all the columns in the dataframe
for i in len(columns):
    old_name=columns[i]
    new_name=str(columns[i][0]) str(columns[i][1]) str(columns[i][2]) #concatenates the conditions
                                                                         #into a single variable
    df.rename(columns={old_name:new_name},inplace=True)
#After renaming the columns, we can now drop the unwanted rows
df.drop([0,1,2], axis=0, inplace=True)
  • Related