I have a dataset with athletes, their age-range and the year they competed:
import ipywidgets as widgets
from ipywidgets import interact
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
data = {'Name':['Tom', 'Nick', 'Chris', 'Jack'],
'Age':['20-25', '19<', '>75', '20-25'],
'Year':[2019,2019,2018,2018],
'Count': [1,1,1,1]}
df = pd.DataFrame(data)
Mind you, the real dataset is thousands of entries long, but I didn't want to use that much space, so I'm using a much simplified version.
I am trying to make an interactive bar plot with ipywidgets using that data.
age = ['20-25', '19<', '>75','20-25']
@interact(Age = age)
def f(Age):
    df2= df[df['age'].str.contains(Age)].groupby('Year').sum()
    df2['year'] = ['2018','2019']
    return sns.barplot(x="Year", y="Count", data=df2)
The x-axis is the year, the y-axis is the number of participants that year, and you can interactively filter via a scroll bar to see the number of participants for the age range of your choosing.
I didn't know how to do that, so I tried to 'brute force' it and added a count column of 1s, so that I would get the total number of participants even when the dataset got filtered. I also brute-forced the years, but some age ranges don't compete in all years, so it breaks and doesn't work.
age holds all the age-range options; I put them all in a list.
CodePudding user response:
- Fix the spelling issues, since you are mixing uppercase and lowercase column names. Also, none of the values in the dataframe are numeric (both issues were fixed in an edit to the OP).
- Create age with age = df.Age.unique()
- Reset the index instead of using df2['year'] = ['2018','2019']
age = df.Age.unique()
@interact(Age = age)
def f(Age):
    df2 = df[df['Age'].str.contains(Age)].groupby('Year').count().reset_index()
    return sns.barplot(x="Year", y="Count", data=df2)
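To see what the plotting call receives, here is a minimal pandas-only sketch of the same filter, group, and reset-index step. The helper name counts_by_year is hypothetical and only used for illustration; the data is the simplified sample from the question.

```python
import pandas as pd

# Simplified sample data from the question
df = pd.DataFrame({
    'Name': ['Tom', 'Nick', 'Chris', 'Jack'],
    'Age': ['20-25', '19<', '>75', '20-25'],
    'Year': [2019, 2019, 2018, 2018],
    'Count': [1, 1, 1, 1],
})

def counts_by_year(frame, age_range):
    # Hypothetical helper: the body of f() without the plotting call.
    selected = frame[frame['Age'].str.contains(age_range)]
    # count() counts the non-null values of every remaining column per year;
    # reset_index() turns the 'Year' index back into a regular column,
    # so it can be used as x="Year" in the plot.
    return selected.groupby('Year').count().reset_index()

df2 = counts_by_year(df, '20-25')
print(df2)
```

Only the years that actually occur for the selected age range appear as rows, so nothing has to be hard-coded.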
- Without adding a 'Count' column: there's no need to add a 'Count' column to the dataframe, because .groupby('Year').count() will provide the count for each existing column, in which case y= can be 'Name' or 'Age', but then the ylabel would need to be changed.
age = df.Age.unique()
@interact(Age = age)
def f(Age):
    df2 = df[df['Age'].str.contains(Age)].groupby('Year').count().reset_index()
    p = sns.barplot(x="Year", y="Age", data=df2)
    p.set_ylabel('Count')
    return p
- Use seaborn.countplot instead
age = df.Age.unique()
@interact(Age = age)
def f(Age):
    data = df[df['Age'].str.contains(Age)]
    return sns.countplot(data=data, x='Year')
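If years with no participants for the selected age range should still show up as empty bars, one possible approach is to reindex the per-year counts against every year in the full dataset. This is only a sketch under assumptions: the helper name counts_all_years and the variable all_years are hypothetical, and it swaps str.contains for an exact == match, since str.contains treats its argument as a regular expression and also matches substrings.

```python
import pandas as pd

# Simplified sample data from the question (no 'Count' column needed)
df = pd.DataFrame({
    'Name': ['Tom', 'Nick', 'Chris', 'Jack'],
    'Age': ['20-25', '19<', '>75', '20-25'],
    'Year': [2019, 2019, 2018, 2018],
})

# Every year present in the full dataset, in ascending order
all_years = sorted(df['Year'].unique())

def counts_all_years(age_range):
    # Hypothetical helper: exact match on the age range (assumption, see above)
    selected = df[df['Age'] == age_range]
    # Rows per year for the selected age range
    counts = selected.groupby('Year').size()
    # Add the years that are missing for this age range, with a count of 0
    counts = counts.reindex(all_years, fill_value=0)
    return counts.rename_axis('Year').reset_index(name='Count')

result = counts_all_years('19<')
print(result)
```

The returned frame has a 'Year' and a 'Count' column, so inside the interact function it could be passed on as sns.barplot(x="Year", y="Count", data=result).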