I have a continually growing dataframe and periodically I want to retrieve the last row.
# dbdf.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 6652 entries, 2022-10-23 17:15:00-04:00 to 2022-10-28 08:06:00-04:00
Freq: T
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 open 6592 non-null float64
1 high 6592 non-null float64
2 low 6592 non-null float64
3 close 6592 non-null float64
dtypes: float64(4)
memory usage: 259.8 KB
The dataframe doesn't occupy much memory, but I'd still like to understand the most efficient way to retrieve the last row so that I can then call .to_dict() on it.
I can certainly do something naive like:
bars = dbdf.to_dict(orient="records")
print(bars[-1])
In this particular case that would likely be fine given the small size of the dataframe. But if the dataframe were orders of magnitude larger, in both memory footprint and row count, is there a better way to achieve the same result that could be considered best practice regardless of the dataframe's size?
CodePudding user response:
First select the last row with DataFrame.iloc, then convert it to a dictionary with Series.to_dict:
d = df.iloc[-1].to_dict()
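To make the difference from the naive approach concrete, here is a minimal sketch. The small frame below is a hypothetical stand-in for the question's dbdf (same column names, made-up values); it only illustrates that selecting the row first gives the same dictionary without converting every row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dbdf: 5 one-minute bars with made-up values.
idx = pd.date_range("2022-10-23 17:15", periods=5, freq="min")
df = pd.DataFrame(
    np.arange(20, dtype="float64").reshape(5, 4),
    index=idx,
    columns=["open", "high", "low", "close"],
)

# Select the last row by position, then convert only that row.
last = df.iloc[-1].to_dict()

# Naive approach from the question: convert every row, keep the last one.
naive = df.to_dict(orient="records")[-1]

print(last)
print(last == naive)
```

The work done by df.iloc[-1].to_dict() does not depend on the number of rows, whereas to_dict(orient="records") builds one dictionary per row before the last one is picked out.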
CodePudding user response:
There are two ways:
- Use the tail function
DataFrame.tail returns the last rows of the dataframe. Passing 1 returns only the last row, as a one-row DataFrame.
df.tail(1)
- Use iloc
iloc is an integer-position-based indexer, so you pass an integer position to select a specific row or column. Passing -1 selects the last row and returns it as a Series.
df.iloc[-1]
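The two options return different types, which matters when converting to a dictionary. The sketch below uses a hypothetical two-row frame with made-up values to show both conversions side by side; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical two-row frame with the question's column names.
df = pd.DataFrame(
    {"open": [1.0, 2.0], "high": [1.5, 2.5], "low": [0.5, 1.5], "close": [1.2, 2.2]},
    index=pd.date_range("2022-10-28 08:05", periods=2, freq="min"),
)

last_frame = df.tail(1)    # one-row DataFrame
last_series = df.iloc[-1]  # Series indexed by column name

# A one-row DataFrame needs orient="records" and [0] to get a flat dict.
from_tail = last_frame.to_dict(orient="records")[0]

# A Series converts directly.
from_iloc = last_series.to_dict()

# Neither dict contains the row's timestamp; read it from the index if needed.
last_timestamp = df.index[-1]

print(type(last_frame).__name__, type(last_series).__name__)
print(from_tail == from_iloc, last_timestamp)
```

One design consideration: tail(1) keeps each column's own dtype, while iloc[-1] packs the row into a single Series with one common dtype. For an all-float64 frame like the one in the question, the resulting dictionaries are the same.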