I started learning data science, and I was taught that I could use matplotlib.pyplot
to plot my data. Here's a little preview of the data (OHLC prices on a 1-hour timeframe):
And here's the output of SOLUSDT_close.head().to_dict() for debugging purposes:
{'Close Price': {'2021-06-29 00:59:59.999': 33.65642566,
'2021-06-29 01:59:59.999': 33.309,
'2021-06-29 02:59:59.999': 32.76168975,
'2021-06-29 03:59:59.999': 33.118,
'2021-06-29 04:59:59.999': 33.267}}
So, I'm only interested in plotting the Close Price against the End Date, and I wrote the following code to do so:
import pandas as pd
import matplotlib.pyplot as plt
# Import the csv file, using its first column as the index
SOLUSDT = pd.read_csv('C:/Users/ResetStoreX/Downloads/Binance futures data/SolUSDT-Mark_Prices_Klines_1h_Timeframe/zip/SOLUSDT-1h-June-29-2021-February-13-2022.csv', index_col=0)
# Create a new df that only contains the date and close price from the previous imported df
SOLUSDT_close = SOLUSDT[['End Date','Close Price']]
# Set the 'End Date' column as the actual index
SOLUSDT_close.set_index('End Date', inplace=True)
# set the breadth and length of the plot as a good mix of values
plt.figure(figsize=(14,5))
# set a grid background to the plot
plt.grid(True)
# set the color of the trend as blue
plt.plot(SOLUSDT_close, 'b')
# give a title to the plot
plt.title('SOLUSDT close price from June 29 2021 to February 13 2022')
# give a label to the x axis
plt.xlabel('Date')
# give a label to the y axis
plt.ylabel('Close Price')
# plot it
plt.plot(SOLUSDT_close)
However, I ended up with an output which was just disappointing:
As can be seen, the grid lines merged into a grey background, and the chart also didn't show the End Date values that one would expect on the x axis.
If I delete the plt.grid(True) statement and run the code again, it ends up plotting the following chart:
What I would like to achieve, is to plot a chart like this one from
May I get some assistance here please?
CodePudding user response:
The only issue here is that you have too many dates, and each of them draws its own tick and grid line on the x-axis, which is why the grid lines merge into a grey background. One way is to use the keyword axis in the plt.grid call to control which axis gets grid lines. The other would be to select only certain times (dates) to put on the x-axis; as you can see now, there are too many of them. Also, the second plt.plot call is not necessary. This is a very quick and dirty example, but you get the idea:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('./sol-usd-max.csv', usecols=['snapped_at','price'])
df.set_index('snapped_at', inplace=True)
# set the breadth and length of the plot as a good mix of values
plt.figure(figsize=(14,5))
# set a grid background to the plot
plt.grid(True, axis='y')
plt.grid(True, axis='x', which='major')
# set the color of the trend as blue
plt.plot(df[::10], 'b')
# give a title to the plot
plt.title('SOLUSDT close price from June 29 2021 to February 13 2022')
# give a label to the x-axis
plt.xlabel('Date')
# give a label to the y axis
plt.ylabel('Close Price')
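As a variation on the example above that keeps every data point, here is a minimal sketch that plots all rows but labels only every Nth timestamp. The data is synthetic, and the names labels, prices, and step are placeholders chosen for illustration; they are not columns from the question's CSV.

```python
from datetime import datetime, timedelta

import matplotlib.pyplot as plt

# Synthetic stand-in for the hourly data: 240 string timestamps and
# made-up prices (illustration only, not real SOLUSDT values).
start = datetime(2021, 6, 29)
labels = [(start + timedelta(hours=h)).strftime("%Y-%m-%d %H:%M") for h in range(240)]
prices = [33.0 + 0.01 * h for h in range(240)]

# Hypothetical choice: put a tick on one timestamp per day (every 24th row).
step = 24
positions = list(range(0, len(labels), step))

fig, ax = plt.subplots(figsize=(14, 5))
# Plot against integer positions so that every point is drawn,
# then attach string labels to the selected positions only.
ax.plot(range(len(prices)), prices, "b")
ax.set_xticks(positions)
ax.set_xticklabels([labels[p] for p in positions], rotation=45, ha="right")
ax.grid(True)
ax.set_title("Close price (synthetic data)")
ax.set_xlabel("Date")
ax.set_ylabel("Close Price")
fig.tight_layout()
```

Call plt.show() (or fig.savefig with a file name) afterwards to view the result. Because only ten ticks exist, the grid has ten vertical lines instead of one per row.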
Update
As for the wrong datetime format, converting the column to datetime (either by loading it as a datetime or by converting it with df["snapped_at"] = pd.to_datetime(df["snapped_at"])) also fixes the issue:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('./sol-usd-max.csv', usecols=['snapped_at','price'], parse_dates=["snapped_at"])
# df.set_index('snapped_at', inplace=True)
# set the breadth and length of the plot as a good mix of values
plt.figure(figsize=(14,5))
# set a grid background to the plot
plt.grid(True)
# set the color of the trend as blue
plt.plot(df["snapped_at"], df["price"], c='b')
# give a title to the plot
plt.title('SOLUSDT close price from June 29 2021 to February 13 2022')
# give a label to the x axis
plt.xlabel('Date')
# give a label to the y axis
plt.ylabel('Close Price')
However, that was not the main concern of the question. The problem would not have happened if the dtypes were correct, but the dtype is not directly what makes the background grey.
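The same conversion can be tried on the question's own sample rows. The sketch below rebuilds the five rows from the SOLUSDT_close.head().to_dict() output, converts the string index with pd.to_datetime, and lets matplotlib pick the tick positions. The variable names sample and close, and the choice of AutoDateLocator with ConciseDateFormatter, are assumptions made for illustration; it also assumes the rest of the End Date column parses the same way these rows do.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# The five rows shown in the question's SOLUSDT_close.head().to_dict() output.
sample = {'Close Price': {'2021-06-29 00:59:59.999': 33.65642566,
                          '2021-06-29 01:59:59.999': 33.309,
                          '2021-06-29 02:59:59.999': 32.76168975,
                          '2021-06-29 03:59:59.999': 33.118,
                          '2021-06-29 04:59:59.999': 33.267}}
close = pd.DataFrame(sample)

# The index holds plain strings at this point; convert it to real timestamps
# so matplotlib treats the x axis as a time axis instead of categories.
close.index = pd.to_datetime(close.index)

fig, ax = plt.subplots(figsize=(14, 5))
ax.plot(close.index, close['Close Price'], 'b')

# Let matplotlib choose a sensible number of date ticks and label them compactly.
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

ax.grid(True)
ax.set_title('SOLUSDT close price (first five rows from the question)')
ax.set_xlabel('Date')
ax.set_ylabel('Close Price')
```

With a datetime axis the number of ticks no longer depends on the number of rows, so plt.grid(True) draws only a handful of vertical lines even for the full hourly data set.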