I have extracted a dataframe of n columns, where the first column is the index column with no header, followed by pairs of "success" and "fail" columns. I managed to extract the success-only columns and put a header on the index column with this code:
df2 = df1.iloc[:,0::2]
df3 = df2
df3.reset_index(inplace=True)
df4 = df3.rename(columns = {'index':'out_date'})
df4
Output of the code can be found here
I would like to sort the "out_date" column in ascending order using sort_values but for that to work, the "success" columns need to be unique. I have this line of code that is able to rename the headers to "success1", "success2", "success3",..., but I can't figure out how to exclude the "out_date" column.
numcolumn = df4.shape[1]
df4.columns = ["success" + str(x) for x in range(1, numcolumn + 1)]
df4
Any help given will be appreciated. Thank you.
CodePudding user response:
I suggest setting the new column names before converting the index to the out_date column, using enumerate and f-strings:
df2 = df1.iloc[:,0::2]
df2.columns = [f"success{i}" for i, x in enumerate(df2.columns, 1)]
df4 = df2.rename_axis('out_date').reset_index()
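If you prefer to keep your own approach, you can prepend the first column name as a one-element list. Note that numcolumn already counts out_date, so the range stops at numcolumn rather than numcolumn + 1:
df4.columns = df4.columns[:1].tolist() + ["success" + str(x) for x in range(1, numcolumn)]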
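For reference, here is a minimal end-to-end sketch of the first approach, including the sort the question asks about. The contents of df1 (dates and values) are made up for illustration, and the .copy() call is my addition so the slice is an independent frame:

```python
import pandas as pd

# Hypothetical stand-in for df1: unnamed date index, alternating success/fail columns.
df1 = pd.DataFrame(
    [[1, 9, 2, 8], [3, 7, 4, 6], [5, 5, 6, 4]],
    index=["2021-03-01", "2021-01-01", "2021-02-01"],
    columns=["success", "fail", "success", "fail"],
)

# Keep every second column (the "success" ones) as an independent copy.
df2 = df1.iloc[:, 0::2].copy()

# Number the columns while the date is still the index, so it is not renamed.
df2.columns = [f"success{i}" for i, _ in enumerate(df2.columns, 1)]

# Name the index, turn it into a regular column, then sort by it.
df4 = df2.rename_axis("out_date").reset_index()
df4 = df4.sort_values("out_date", ignore_index=True)
print(df4)
```

The dates here are ISO-formatted strings, so a plain string sort happens to be chronological; with other formats you would probably want to convert the column with pd.to_datetime first.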
CodePudding user response:
I had posted a generic way for de-duplicating column names while leaving the first one untouched in this answer:
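def suffix():
    # First use of a name gets no suffix, later uses get _1, _2, ...
    yield ''
    i = 0
    while True:
        i += 1
        yield f'_{i}'

def dedup(df):
    from collections import defaultdict
    # One suffix generator per distinct column name
    d = defaultdict(suffix)
    df.columns = df.columns.map(lambda x: x + next(d[x]))

dedup(df)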
example:
import pandas as pd

df = pd.DataFrame([range(7)], columns=['out_date'] + ['success']*6)
#    out_date  success  success  success  success  success  success
# 0         0        1        2        3        4        5        6
dedup(df)
print(df)
output:
   out_date  success  success_1  success_2  success_3  success_4  success_5
0         0        1          2          3          4          5          6
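Once the names are unique, the sort from the question can follow directly. Below is a rough sketch of a variant that builds the new names as a list instead of modifying the frame inside the function. The helper name dedup_names and the sample values are hypothetical and only for illustration:

```python
from collections import defaultdict
from itertools import count

import pandas as pd


def dedup_names(names):
    """Return names with _1, _2, ... appended to repeats; first occurrence is unchanged."""
    counters = defaultdict(count)  # one counter per distinct name, starting at 0
    result = []
    for name in names:
        n = next(counters[name])
        result.append(name if n == 0 else f"{name}_{n}")
    return result


# Made-up data with duplicated "success" headers.
df = pd.DataFrame(
    [[3, 1, 2], [1, 5, 6], [2, 3, 4]],
    columns=["out_date", "success", "success"],
)

df.columns = dedup_names(df.columns)
df = df.sort_values("out_date", ignore_index=True)
print(df)
```

Returning a list leaves it up to the caller whether to assign the names back, which some may find easier to test than a function that changes the frame in place.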