I have a dataframe with a variety of columns, but the data I want to extract is in columns that are named with datetime values and hold floating point currency amounts.
For each row, I just want to find the max value across every column whose name is a date (e.g. 2021-01-15 00:00:00). I originally used list() to try to find any column with '-' in its name, but I'm guessing that, because of the format, I can't directly reference the datetime values?
Example df:
index, ID, Cost, 2021-01-01 00:00:00, 2021-01-08 00:00:00, 2021-01-15 00:00:00
0, 1, 4000, 40.50, 50.55, 60.99
0, 1, 500, 20.50, 80.55, 160.99
0, 1, 4000, 40.50, 530.55, 1660.99
0, 1, 5000, 40.50, 90.55, 18860.99
0, 1, 9000, 40.50, 590.55, 73760.99
CodePudding user response:
You can find the 'date' columns using a list comprehension that returns the columns whose names contain / (swap in '-' if your column names look like 2021-01-15 00:00:00). Then you can use max(axis=1) to create a column showing the highest value per row across your date-like columns:
date_cols = [c for c in list(df) if '/' in c]
df['max_per_row'] = df[date_cols].max(axis=1)
prints:
index ID Cost ... 08/01/2021 00:00 15/01/2021 00:00 max_per_row
0 0 1 4000 ... 50.55 60.99 60.99
1 0 1 500 ... 80.55 160.99 160.99
2 0 1 4000 ... 530.55 1660.99 1660.99
3 0 1 5000 ... 90.55 18860.99 18860.99
4 0 1 9000 ... 590.55 73760.99 73760.99
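If the column labels are actual datetime objects rather than strings, a test such as '/' in c raises a TypeError, which may be what the question ran into. A minimal sketch for that case follows; it assumes the date columns are pandas Timestamp labels and that index is the dataframe's index rather than a column. The frame below is a hypothetical, shortened copy of the example data.

```python
from datetime import datetime

import pandas as pd

# Hypothetical frame mirroring the question's example. Assumption: the
# date columns are real Timestamp labels, not strings.
dates = pd.to_datetime(["2021-01-01", "2021-01-08", "2021-01-15"])
df = pd.DataFrame(
    [
        [1, 4000, 40.50, 50.55, 60.99],
        [1, 500, 20.50, 80.55, 160.99],
        [1, 4000, 40.50, 530.55, 1660.99],
    ],
    columns=["ID", "Cost", *dates],
)

# Check the type of each label instead of searching for a character in it.
# pd.Timestamp subclasses datetime, so this matches both kinds of label.
date_cols = [c for c in df.columns if isinstance(c, datetime)]

# Row-wise maximum over the date columns only.
df["max_per_row"] = df[date_cols].max(axis=1)
print(df["max_per_row"])
```

Checking the label type avoids depending on how the dates happen to be rendered when printed.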
CodePudding user response:
Use DataFrame.iloc to select all columns except the first 2:
df['new'] = df.iloc[:, 2:].max(axis=1)
If you need to select the float columns, use DataFrame.select_dtypes:
df['new'] = df.select_dtypes('float').max(axis=1)
For columns with - in their names, use DataFrame.filter:
df['new'] = df.filter(like='-').max(axis=1)
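As a rough check, the three selections can be compared side by side. This sketch assumes string column names in the question's format, an integer Cost column, and index being the dataframe's index rather than a column; the frame is a hypothetical, shortened copy of the example data.

```python
import pandas as pd

# Hypothetical frame based on the question's example, with string labels.
df = pd.DataFrame(
    {
        "ID": [1, 1, 1],
        "Cost": [4000, 500, 4000],
        "2021-01-01 00:00:00": [40.50, 20.50, 40.50],
        "2021-01-08 00:00:00": [50.55, 80.55, 530.55],
        "2021-01-15 00:00:00": [60.99, 160.99, 1660.99],
    }
)

# Compute each result before adding any new column, so the new column
# cannot leak into a later selection.
by_position = df.iloc[:, 2:].max(axis=1)           # skip ID and Cost
by_dtype = df.select_dtypes("float").max(axis=1)   # Cost is int, so excluded
by_name = df.filter(like="-").max(axis=1)          # labels containing '-'

# On this frame all three selections pick the same columns.
assert by_position.equals(by_dtype) and by_dtype.equals(by_name)

df["new"] = by_name
print(df["new"])
```

The positional and dtype-based selections depend on the layout of the frame: an extra float column, including a previously added max column, would be picked up by both, so the name-based filter is likely the safest of the three here.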