Can the columns of a data frame be named based on the values of specific rows in python? The figures attached show an example of how I would like to modify the data frame:
Starting data frame
Desirable output
The column names should match the following pattern: [condition3condition2 condition1] and the first three rows reporting the condition values should be deleted from the data frame together with the top index.
CodePudding user response:
You can create a dataframe
data = [
...: ["condition1", 29, 37, 45],
...: ["condition2", "name1", "name2", "name3"],
...: ["condition3", "A", "D", "R"],
...: ["vial-A", 0.8, 1.7, 0.9],
...: ["vial-B", 2.2, 4.5, 2.1],
...: ["vial-C", 0.5, 6.1, 5.3],
...: ]
df = pd.DataFrame(data=data, columns=[1, 2, 3, 4], index=[1])
df_t = df.set_index(1).T
#output
1 condition1 condition2 condition3 vial-A vial-B vial-C
2 29 name1 A 0.8 2.2 0.5
3 37 name2 D 1.7 4.5 6.1
4 45 name3 R 0.9 2.1 5.3
# `col_name` will become the pivot
df_t["col_name"] = df_t[["condition1", "condition2", "condition3"]].apply(
lambda x: f"{x[2]}{x[0]} {x[1]}", axis=1
)
output = (
df_t[["vial-A", "vial-B", "vial-C", "col_name"]]
.set_index("col_name")
.T.rename_axis(index=None, columns=None
)
#output
A29 name1 D37 name2 R45 name3
vial-A 0.8 1.7 0.9
vial-B 2.2 4.5 2.1
vial-C 0.5 6.1 5.3
CodePudding user response:
You can use .rename()
.
#assuming we already have our dataframe named df
columns= list(df.columns) # this will return a list of all the columns in the dataframe
for i in len(columns):
old_name=columns[i]
new_name=str(columns[i][0]) str(columns[i][1]) str(columns[i][2]) #concatenates the conditions
#into a single variable
df.rename(columns={old_name:new_name},inplace=True)
#After renaming the columns, we can now drop the unwanted rows
df.drop([0,1,2], axis=0, inplace=True)