Print 10 last rows of a dataframe for specific month excluding the last row

Time:11-04

I would like to print the last 10 rows of my dataframe for February 2007, not counting the very last row.

The dataframe has dates as its index and several columns of numerical values.

I have tried this approach so far:

df['2007-02'].iloc[:-1,:].last('10D')

Is there another way to print the rows while ignoring the last one, rather than removing it? Also, this approach gives me a FutureWarning. Are there other techniques I could use that won't be deprecated shortly?

CodePudding user response:

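The FutureWarning is raised because df['2007-02'] selects rows by a partial date string without loc. Use df.loc['2007-02'] instead.

Note that last('11D') selects the last 11 days, not the last 11 rows; with one row per day the two are the same. You can either take the last 11 days and drop the final row, or drop the final row first and then take the last 10 days: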

>>> df.loc['2007-02'].last('11D')[:-1]
# OR
>>> df.loc['2007-02'][:-1].last('10D') 

            A   B  C
2007-02-18  2   1  1
2007-02-19  6  10  7
2007-02-20  4   8  2
2007-02-21  2   3  3
2007-02-22  5   5  1
2007-02-23  7   2  5
2007-02-24  6   1  3
2007-02-25  5   5  6
2007-02-26  5   8  7
2007-02-27  5   8  7

Setup

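import numpy as np
import pandas as pd

# One row per day from 10 Feb to 10 Mar 2007; values are random, so yours will differ
dti = pd.date_range('2007-02-10', '2007-03-10', freq='D')
df = pd.DataFrame(np.random.randint(1, 10, (len(dti), 3)),
                  index=dti, columns=list('ABC'))
print(df)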

# Output:
            A   B  C
2007-02-10  4   5  6
2007-02-11  2   7  5
2007-02-12  1   2  2
2007-02-13  4   6  1
2007-02-14  6   2  6
2007-02-15  8   4  4
2007-02-16  1   7  7
2007-02-17  3   8  5
2007-02-18  2   1  1
2007-02-19  6  10  7
2007-02-20  4   8  2
2007-02-21  2   3  3
2007-02-22  5   5  1
2007-02-23  7   2  5
2007-02-24  6   1  3
2007-02-25  5   5  6
2007-02-26  5   8  7
2007-02-27  5   8  7
2007-02-28  2   6  4
2007-03-01  7   4  7
2007-03-02  4   1  6
2007-03-03  2   1  5
2007-03-04  4   8  6
2007-03-05  7   6  6
2007-03-06  3   2  1
2007-03-07  2   3  6
2007-03-08  7   7  3
2007-03-09  5   6  8
2007-03-10  3   4  7
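
Since the question asks for something that won't be deprecated: as far as I know, `.last` itself is deprecated in recent pandas releases (2.1 and later), so a purely positional slice may be the safer choice. The sketch below assumes a daily `DatetimeIndex` like the setup above, with made-up values in a single column `A`; `feb` and `result` are just illustrative names. It counts rows rather than days, so the two approaches only agree when there is exactly one row per day.

```python
import pandas as pd

# Assumed sample data: one row per day, same date range as the setup above.
dti = pd.date_range('2007-02-10', '2007-03-10', freq='D')
df = pd.DataFrame({'A': range(len(dti))}, index=dti)

# Partial-string selection of February 2007 via .loc (no FutureWarning).
feb = df.loc['2007-02']

# The 10 rows before the last one: start 11 from the end, stop before the last.
result = feb.iloc[-11:-1]
print(result)
```

Slicing returns a new object, so `df` itself still contains the last February row; nothing is removed from the original.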

CodePudding user response:

You can try something like this:

import random
import pandas as pd

random.seed(1)

df = pd.DataFrame({'Date':[random.choice(['2007-02','2007-03','2007-04','2007-05']) for x in range(100)],
                    'col1':[random.choice(["a", "b", "c", "d", "e", "f", "g"]) for x in range(100)],
                    'col2':[random.choice(range(10)) for x in range(100)]})

# Keep the February 2007 rows, reverse them, skip the last row (now first), take the next 10
df.loc[df.Date=="2007-02",:][::-1][1:11]

Output:

       Date col1  col2
95  2007-02    c     5
86  2007-02    d     8
81  2007-02    b     3
73  2007-02    b     9
67  2007-02    b     2
65  2007-02    g     1
60  2007-02    f     7
58  2007-02    a     5
50  2007-02    a     2
42  2007-02    f     9
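
The reversed slice above returns the rows newest-first. If the original order matters, a negative positional slice should give the same 10 rows without reversing. This is a minimal sketch on hypothetical data (a string `Date` column and a counter in `col2`), not the random frame from this answer:

```python
import pandas as pd

# Hypothetical sample: a string "Date" column holding year-month labels.
df = pd.DataFrame({
    "Date": ["2007-02"] * 15 + ["2007-03"] * 5,
    "col2": range(20),
})

# Boolean mask keeps only the February 2007 rows.
feb = df[df["Date"] == "2007-02"]

# The 10 rows before the last February row, still in original order.
result = feb.iloc[-11:-1]
print(result)
```

Here `result[::-1]` would match what `[::-1][1:11]` produces on the same frame.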