Want to understand matplotlib polar/radial plots

Time:10-13

I am using [image]

Now, I want to convert that into a radial or polar plot like this one (I used another tool to make it), where the angular axis is the time and the radial axis is the year and month.

[image]

My question is: how do I calculate r and theta to plot that using matplotlib? By "how" I mean the logic for converting Cartesian to polar coordinates, not the actual code. I am looking to understand how it works in general.

c = ax.scatter(theta, r)

I have seen a few examples online, but none of them explain the logic behind them. Thanks!

CodePudding user response:

I got it: most examples already start from radial coordinates, while in this case I need to convert the Cartesian coordinates to polar first.
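For reference, the general geometric conversion between the two systems can be sketched as below. This is a minimal illustration, not code from the thread: the helper names `cartesian_to_polar` and `polar_to_cartesian` are made up for this example, and only standard NumPy calls (`np.hypot`, `np.arctan2`) are used.

```python
import numpy as np

def cartesian_to_polar(x, y):
    """Return (r, theta) for Cartesian points; theta is in radians, in (-pi, pi]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.hypot(x, y)        # distance from the origin: sqrt(x**2 + y**2)
    theta = np.arctan2(y, x)  # angle from the positive x-axis, quadrant-aware
    return r, theta

def polar_to_cartesian(r, theta):
    """Inverse mapping: x = r*cos(theta), y = r*sin(theta)."""
    r = np.asarray(r, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return r * np.cos(theta), r * np.sin(theta)

# Two sample points: (1, 0) on the x-axis and (0, 2) on the y-axis
r, theta = cartesian_to_polar([1.0, 0.0], [0.0, 2.0])
x_back, y_back = polar_to_cartesian(r, theta)
```

Note that for a time-versus-date plot the goal is usually not this geometric conversion but a relabeling of axes: one variable is scaled onto the angle and the other becomes the radius.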

CodePudding user response:

Polar plotting in matplotlib can be challenging because of the coordinate conversion, as you mentioned, and more so when you add the date/time to the x/y axis like in your case. Here is a stab at it.

The first important step is to open the data and condition it. We can use the pandas library to open the CSV with your data. [figure: fig_tweets]

Okay, now on to the polar part. For polar coordinates you need r, the radial coordinate, and theta, an angle. [figure: fig_polar]
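The logic for this kind of plot can be sketched in a few lines: theta is the fraction of the 24-hour day that has elapsed, scaled to 2π radians, and r is some number that grows with the date. The sketch below is an illustration under assumptions, not the script from this answer: the helper names `time_to_theta` and `date_to_r`, the choice of "days since a reference date" for r, and the sample timestamps are all made up for the example.

```python
from datetime import datetime
import numpy as np

def time_to_theta(dt):
    """Fraction of the 24-hour day elapsed, scaled to radians in [0, 2*pi)."""
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second
    return seconds / 86400 * 2 * np.pi

def date_to_r(dt, origin):
    """Whole days elapsed since a reference date (the origin is an arbitrary choice)."""
    return (dt.date() - origin.date()).days

# Hypothetical sample data: one point at 06:00 and one at 18:00
origin = datetime(2016, 1, 1)
stamps = [datetime(2016, 1, 1, 6, 0, 0), datetime(2016, 7, 1, 18, 0, 0)]

theta = np.array([time_to_theta(s) for s in stamps])
r = np.array([date_to_r(s, origin) for s in stamps])
```

With these values, 06:00 lands a quarter of the way around the circle and 18:00 three quarters of the way, while later dates sit farther from the center. Passing `theta` and `r` to `ax.scatter` on a polar axes then places each point accordingly.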

As a sanity check, both graphs show a gap between the hours of 6 and 10, and a gap around September-October.

Hope this helps, matplotlib has a nice Polar plot example. I have to admit this was a challenging problem and I enjoyed finding an answer, thanks for the question. Cheers!

Here is a copy of the full script just in case:

from datetime import timezone, datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


df_csv = pd.read_csv('data-tweets.csv', encoding="ISO-8859-1", parse_dates=['created'])
# For some reason, the data is not in utf-8 so it needs encoding ISO-8859-1
print(f'Available columns: {df_csv.columns.values}')

# Filter for 2016 as an example
df = df_csv[(df_csv['created'] >= '2016-01-01 00:00:00') & (df_csv['created'] <= '2016-12-31 00:00:00')]
# Reset index with the 2016 data
df.reset_index(inplace=True)

# Extract year/month/day to plot as y-axis
date = pd.to_datetime(df['created']).dt.strftime('%Y/%m/%d').to_numpy()
# Extract hour:minute:second to plot as x-axis
time = pd.to_datetime(df['created']).dt.strftime('%H:%M:%S').to_numpy()

# Time and date are currently strings, we need them to be datetime (dt) objects so matplotlib can understand
date_dt = [datetime.strptime(x, '%Y/%m/%d').replace(tzinfo=timezone.utc) for x in date]
time_dt = [datetime.strptime(x, '%H:%M:%S').replace(tzinfo=timezone.utc) for x in time]

# Initiate figure
fig, ax = plt.subplots(figsize=(6, 6), layout='tight')
ax.plot(time_dt, date_dt, 'o')

# Format date/time for both x and y axes
x_fmt = mdates.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(x_fmt)
y_fmt = mdates.DateFormatter('%Y/%m/%d')
ax.yaxis.set_major_formatter(y_fmt)
ax.tick_params(axis='x', rotation=45)

# Label axes
ax.set_xlabel('Time [HH:MM:SS]')
ax.set_ylabel('Date [Y/m/d]')
fig.suptitle('Tweet timeline')

plt.show()


# We need date and time to be timestamps (i.e. a number, not a datetime object) so we can operate with them
# Use matplotlib timestamps directly so we can later format the axes like we did before
date_timestamp = mdates.date2num(date_dt)

# Now we need to make the time (hours/minutes/seconds) into matplotlib timestamps too
# However, matplotlib timestamps only work with full datetime timestamps (not just hours)
# Get the full datetime timestamps
dates = pd.to_datetime(df['created']).dt.strftime('%Y/%m/%d %H:%M:%S').to_numpy()
# Make a datetime object like previously
date_and_time_dt = np.array([datetime.strptime(x, '%Y/%m/%d %H:%M:%S').replace(tzinfo=timezone.utc) for x in dates])
# Make them into matplotlib timestamps
date_and_time_timestamp = mdates.date2num(date_and_time_dt)

# Take out the 'year/month/day' part so we can keep the 'hour/minute/second' part
# Now we have the hour information in matplotlib timestamps
time_timestamp = date_and_time_timestamp - date_timestamp

# Convert the cartesian x and y coordinates into polar coordinates
r = np.sqrt(time_timestamp ** 2 + date_timestamp ** 2)
# We need the percentages of 24 hours for theta
# Calculate what one day is in matplotlib timestamps
delta_one_day_plt = mdates.date2num(datetime(2016, 1, 2)) - mdates.date2num(datetime(2016, 1, 1))
# Divide each hour/minute/second by the max amount in a day, and transform it into radians with 2 * PI
theta = (time_timestamp / delta_one_day_plt) * 2 * np.pi

# Initiate polar figure
fig, ax = plt.subplots(figsize=(6, 6), subplot_kw={'projection': 'polar'}, layout='tight')
ax.scatter(theta, r, alpha=0.3)

# Make the labels go clockwise
ax.set_theta_direction(-1)
# Place Zero at Top
ax.set_theta_offset(np.pi/2)
# Set the circumference ticks
ax.set_xticks(np.linspace(0, 2 * np.pi, 24, endpoint=False))
# Set the label names
ticks = np.arange(0, 24, 1)
ax.set_xticklabels(ticks)
ax.set_xlabel('Date [Y/m/d] & Time [HH]')
# Set y lim so that it focuses on the dates in 2016
ax.set_ylim([mdates.date2num(datetime(2016, 1, 1)), mdates.date2num(datetime(2017, 1, 1))])
# Set y ticks so that it is in that 'gap' in the data and doesn't cover the points
ax.set_rlabel_position(140)
# Format y-axis for dates
y_fmt = mdates.DateFormatter('%Y/%m/%d')
ax.yaxis.set_major_formatter(y_fmt)

fig.suptitle('Tweet timeline')

plt.show()