Home > OS >  indexing a pandas data frame without integer number
indexing a pandas data frame without integer number

Time:01-26

If I have a dataframe

date
01.01.2003
02.01.2003
03.01.2003
05.01.2003
06.01.2003

And I apply this code

for i in (df['date']):
    if df['date'].iloc[i 1]-df['date'].iloc[i] == 1 :
        df['Max'] = df['date'].iloc[i 1]
    else :
        df['Max'] = ''

it will turn an error

Addition/subtraction of integers and integer-arrays with Timestamp is no longer supported.  Instead of adding/subtracting `n`, use `n * obj.freq`

but if I change i to timdelta(days=1) it will turn error says that cannnot indexing if not integer.

Then how the code supposed to like ? I want to define 'max' on every consecutive days.

this is the output I like

date         max
01.01.2003   
02.01.2003
03.01.2003   03.01.2003
05.01.2003
06.01.2003   06.01.2003

*Note that it only write the max one in consecutive, and leave other empty. from 03.01.2003 to 05.01.2003 is not consecutive so start new

CodePudding user response:

Convert column to datetimes and then get maximal value to new column by consecutive datetimes by Series.diff and compare by 1 day with GroupBy.transform and max:

df['date'] = pd.to_datetime(df['date'], dayfirst=True)

df['Max'] = df.groupby(df['date'].diff().dt.days.ne(1).cumsum())['date'].transform('max')

#thank you Corralien for alternative
df['Max'] = df.groupby(df['date'].diff().ne('1D').cumsum())['date'].transform('max')
print (df)
        date        Max
0 2003-01-01 2003-01-03
1 2003-01-02 2003-01-03
2 2003-01-03 2003-01-03

If need remove consecutive maximal datetimes use:

df['date'] = pd.to_datetime(df['date'], dayfirst=True)

s = df['date'].diff().dt.days.ne(1).cumsum()
df['Max'] = df.groupby(s)['date'].transform('max').mask(s.duplicated(keep='last'))

#thank you Corralien for alternative
s = df['date'].diff().ne('1D').cumsum()
df['Max'] = df.groupby(s)['date'].transform('max').mask(s.duplicated(keep='last'))
print (df)
        date        Max
0 2003-01-01        NaT
1 2003-01-02        NaT
2 2003-01-03 2003-01-03
3 2003-01-05        NaT
4 2003-01-06 2003-01-06
  • Related