Plotting count values over time for specific country names with pandas

Time:05-02

I have a dataframe, df, containing information about companies: the company name, the country each is located in, and the year it was founded. I now need to plot, as a line chart, how the number of companies founded per country develops over the years in the dataset (1995 to 2015). However, all I have managed to create is a pie chart of the total companies founded per country, which does not include the Year_founded information.

The data looks like this:

Company Country Year_founded
A USA 1996
B NLD 2004
C CAN 2014
D USA 2000
E NLD 1999
F CAN 2000
etc.

Ideally I would like to plot the number of companies founded per country in a line chart, with a separate line for each country.

Any ideas on how to approach this problem?

CodePudding user response:

IIUC, you can use crosstab and plot.line:

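import pandas as pd
from matplotlib.ticker import MaxNLocator

# Rows are years, columns are countries, cells count the companies founded
ax = pd.crosstab(df['Year_founded'], df['Country']).plot.line()
ax.set_ylabel('Number of founded companies')
# Show whole years on the x-axis rather than fractional ticks
ax.xaxis.set_major_locator(MaxNLocator(integer=True))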

crosstab:

Country       CAN  NLD  USA
Year_founded               
1996            0    0    1
1999            0    1    0
2000            1    0    1
2004            0    1    0
2014            1    0    0
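
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the six sample rows from the question and runs the crosstab approach on them. The variable name counts and the choice of the Agg backend are my own assumptions, added only so the snippet runs without a display; they are not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless

import pandas as pd
from matplotlib.ticker import MaxNLocator

# The six sample rows shown in the question
df = pd.DataFrame({
    "Company": ["A", "B", "C", "D", "E", "F"],
    "Country": ["USA", "NLD", "CAN", "USA", "NLD", "CAN"],
    "Year_founded": [1996, 2004, 2014, 2000, 1999, 2000],
})

# One row per year, one column per country, cells hold the number of companies
counts = pd.crosstab(df["Year_founded"], df["Country"])

# Each column becomes its own line in the chart
ax = counts.plot.line()
ax.set_ylabel("Number of founded companies")
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
```

Note that crosstab only produces rows for years that actually occur in the data, so years with no founded companies are skipped rather than drawn as zero.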

CodePudding user response:

You could use groupby and reindex so that all years from 1995-2015 are in your graph:

data = df.groupby(["Country", "Year_founded"])["Company"].count().unstack(0).reindex(range(1995,2016)).fillna(0)

>>> data.plot()

>>> data
Country       CAN  NLD  USA
Year_founded               
1995          0.0  0.0  0.0
1996          0.0  0.0  1.0
1997          0.0  0.0  0.0
1998          0.0  0.0  0.0
1999          0.0  1.0  0.0
2000          1.0  0.0  1.0
2001          0.0  0.0  0.0
2002          0.0  0.0  0.0
2003          0.0  0.0  0.0
2004          0.0  1.0  0.0
2005          0.0  0.0  0.0
2006          0.0  0.0  0.0
2007          0.0  0.0  0.0
2008          0.0  0.0  0.0
2009          0.0  0.0  0.0
2010          0.0  0.0  0.0
2011          0.0  0.0  0.0
2012          0.0  0.0  0.0
2013          0.0  0.0  0.0
2014          1.0  0.0  0.0
2015          0.0  0.0  0.0
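
The counts above come out as floats because the missing years are first filled with NaN and then replaced by fillna(0). If integer counts are preferred, one possible variation (a sketch, not what the answer above does) is to pass fill_value=0 to both unstack and reindex. The sample rows are rebuilt from the question, and the name running_total is just an illustrative choice of mine.

```python
import pandas as pd

# The six sample rows shown in the question
df = pd.DataFrame({
    "Company": ["A", "B", "C", "D", "E", "F"],
    "Country": ["USA", "NLD", "CAN", "USA", "NLD", "CAN"],
    "Year_founded": [1996, 2004, 2014, 2000, 1999, 2000],
})

data = (
    df.groupby(["Year_founded", "Country"])
      .size()                                    # companies per (year, country)
      .unstack("Country", fill_value=0)          # countries become columns
      .reindex(range(1995, 2016), fill_value=0)  # add the years with no companies
)

# Optional: running total per country, in case "development" is meant cumulatively
running_total = data.cumsum()
```

Calling data.plot() or running_total.plot() then draws one line per country, the same way as in the answer above.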