For example, I have a dataset of vacancies that includes salaries, and I want to plot a histogram showing how many vacancies fall into each salary range. I want the bins at regular steps, e.g. a new bin every 50000 units of currency. I can write the edges out directly, as in the documentation:
df['salary'].hist(bins=(50000, 100000, 150000, 200000, 250000))
and so on, but with dataset values up to 2000000 the line would get extremely long. I think it would be elegant to build the bins with something like a slice step ([::x]), but I have no idea how.
Humbly asking the community for their ideas and insights.
CodePudding user response:
What about bins=list(range(0, 2000000 + 50000, 50000))? Note that range() excludes its stop value, so the extra 50000 keeps 2000000 as the last bin edge.
To compute the upper limit from the data instead of hard-coding it, use
bins=list(range(0, df['salary'].max() + 50000, 50000))
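If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the name make_bin_edges and its parameters are made up for illustration, not part of pandas or matplotlib, and it assumes the maximum is a finite number (no NaN).

```python
import math

def make_bin_edges(max_value, step, start=0):
    """Return evenly spaced bin edges from start up to the first edge >= max_value.

    Hypothetical helper for illustration; not a pandas or matplotlib function.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    # Number of whole steps needed to reach or pass max_value (at least one bin).
    n_steps = max(1, math.ceil((max_value - start) / step))
    return [start + i * step for i in range(n_steps + 1)]

edges = make_bin_edges(2_000_000, 50_000)
print(edges[:3], edges[-1], len(edges))  # [0, 50000, 100000] 2000000 41
```

With the DataFrame from the question, the call would then look like df['salary'].hist(bins=make_bin_edges(df['salary'].max(), 50000)). Rounding up with math.ceil means the last edge is never below the largest salary, so the top value is not dropped from the plot.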