I would like to plot a bar graph using a relevant library: matplotlib, seaborn, or pandas.
For every year, it should show two bars side by side: the total number of American Airline flights and the total number of American Eagle Airline flights.
It should not be a stacked bar plot.
My current df looks like this:
df = pd.DataFrame({'Date':['2005-07-01','2005-07-01','2005-07-01','2005-08-01',
'2007-08-01', '2007-22-04', '2008-07-06'],
'Flight Name':['American Airline','American Airline','American Airline','American Eagle Airline',
'American Eagle Airline','American Airline','American Eagle Airline'],
'GEO Summary':['Domestic','Domestic','Domestic','International',
'International','Domestic','International'],
'Flight Name Variable':[1,1,1,0,0,1,0]})
Date | Flight Name | GEO Summary | Flight Name Variable |
---|---|---|---|
2005-07-01 | American Airline | Domestic | 1 |
2005-07-01 | American Airline | Domestic | 1 |
2005-07-01 | American Airline | Domestic | 1 |
2006-08-01 | American Eagle Airline | International | 0 |
2007-08-01 | American Eagle Airline | International | 0 |
2007-22-04 | American Airline | Domestic | 1 |
2008-07-06 | American Eagle Airline | International | 0 |
This is what I have tried so far, and it is not working:
ax = df['Flight Name Variable'].value_counts().plot.bar(color=["SkyBlue","IndianRed"], rot=0, title="test")
plt.tight_layout()
plt.show()
I can't seem to find a way to get the years to display on the x-axis. Any suggestions?
CodePudding user response:
Are you looking for something like this? It uses matplotlib.pyplot and seaborn and shows the total count of each airline's flights on each date:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({'Date':['2005-07-01','2005-07-01','2005-07-01','2005-08-01','2005-07-01', '2005-08-01', '2005-08-01'],
'Flight Name':['American Airline','American Airline','American Airline','American Eagle Airline','American Eagle Airline','American Airline','American Eagle Airline'],
'GEO Summary':['Domestic','Domestic','Domestic','International','International','Domestic','International'],
'Flight Name Variable':[1,1,1,0,0,1,0]})
df.Date = pd.to_datetime(df.Date).dt.date # Parse the string dates, then keep only the date part (datetime.date objects)
grouped = df.groupby(by=["Flight Name", "Date"], as_index=False).count()
sns.barplot(data=grouped, x="Date", y="Flight Name Variable", hue="Flight Name")
plt.title("Flights")
plt.show()
Or with more grouping/variables for the hue setting:
df = pd.DataFrame({'Date':['2005-07-01','2005-07-01','2005-07-01','2005-08-01','2005-07-01', '2005-08-01', '2005-08-01'],
'Flight Name':['American Airline','American Airline','American Airline','American Eagle Airline','American Eagle Airline','American Airline','American Eagle Airline'],
'GEO Summary':['International','Domestic','Domestic','International','International','Domestic','International'],
'Flight Name Variable':[1,1,1,0,0,1,0]})
df.Date = pd.to_datetime(df.Date).dt.date # Parse the string dates, then keep only the date part (datetime.date objects)
grouped = df.groupby(by=["Flight Name", "Date", "GEO Summary"], as_index=False).count()
# Define your Hue to be both the Flight Name and the GEO Summary and place them in a nice looking format with lambda
hue = grouped[['Flight Name', 'GEO Summary']].apply(lambda row: f"{row['Flight Name']}, {row['GEO Summary']}", axis=1)
sns.barplot(data=grouped, x="Date", y="Flight Name Variable", hue=hue)
plt.title("Flights")
plt.legend(loc="best")
plt.show()
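The examples above group by full date, while the question asks for totals per year. A minimal pandas-only sketch of that variant follows. It assumes the dates parse as YYYY-MM-DD, so the question's '2007-22-04' is written here as '2007-04-22' (a guess at the intended value), and the Year column and the counts name are introduced only for illustration:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data modelled on the question; '2007-04-22' is an assumed
# correction of the unparseable '2007-22-04'
df = pd.DataFrame({
    "Date": ["2005-07-01", "2005-07-01", "2005-07-01", "2005-08-01",
             "2007-08-01", "2007-04-22", "2008-07-06"],
    "Flight Name": ["American Airline", "American Airline", "American Airline",
                    "American Eagle Airline", "American Eagle Airline",
                    "American Airline", "American Eagle Airline"],
})

# Extract the year from each date string
df["Year"] = pd.to_datetime(df["Date"]).dt.year

# Count flights per year and airline: rows are years, columns are airlines,
# and missing combinations are filled with 0
counts = pd.crosstab(df["Year"], df["Flight Name"])

# DataFrame.plot.bar draws one bar per column, side by side (not stacked)
ax = counts.plot.bar(color=["skyblue", "indianred"], rot=0, title="Flights per year")
ax.set_xlabel("Year")
ax.set_ylabel("Number of flights")
plt.tight_layout()
# Call plt.show() here to display the figure when running as a script
```

The x-axis then shows one tick per year, with the two airlines as adjacent bars. If you prefer seaborn, the same Year column should also work as x in sns.countplot with hue="Flight Name".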