How to change color of data points on a scatter plot according to an age range?

Time:11-18

!(https://i.stack.imgur.com/FX1vB.png) !(https://i.stack.imgur.com/mGajr.png)

Hello everyone,

I am very new to Python so bear with me. I am sure this is an easy answer.

Above is my scatter plot, with GOLF Data from Kaggle. The X variable is Fairway Hit% and the Y variable is Average Driving Distance. I can see there is a slight negative correlation in the data.

Each red dot is a player. I want to make each dot a different color based on the age of the player. There is a whole series in my data set titled 'AGE' and it varies from 21 to 49. For example, I want to have players that are aged 20-29 be a blue dot, aged 30-39 be a red dot, and aged 40-49 be a yellow dot.

I have researched this without much success. I tried to write code like the third picture above, defining a subseries of 'AGE' as something like 'AGE' >= 20 <= 29.

I haven't had any luck and I'm sure this isn't too difficult, so any help would be appreciated. Thank you.

I tried to make each dot a different color that was representative of the age of the golfer.

CodePudding user response:

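import pandas as pd
# Example data; 'Age' holds integers
df = pd.DataFrame({'Age': [18, 22, 26, 36, 47, 78]})
# Boolean mask: keep only rows where 20 <= Age <= 29
YOUNG = df[(df['Age'] >= 20) & (df['Age'] <= 29)]
print(YOUNG)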

Or, if the Age column is stored as strings, convert it to integers first:

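import pandas as pd
# Example data; 'Age' holds strings
df = pd.DataFrame({'Age': ['18', '22', '26', '36', '47', '78']})
# Convert to integers so the numeric comparisons work
df['Age'] = df['Age'].astype('int64')
YOUNG = df[(df['Age'] >= 20) & (df['Age'] <= 29)]
print(YOUNG)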
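
The filtering above selects one age range but does not color the plot yet. One possible way to get from there to the colored scatter plot is to bin the ages with pd.cut and draw each bin in its own color. This is only a sketch: the sample rows are made up, and the column names 'Fairway Hit%' and 'Avg Distance' are assumptions standing in for whatever the Kaggle file actually uses (only 'AGE' is named in the question).

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows; replace with the real golf DataFrame.
# 'Fairway Hit%' and 'Avg Distance' are assumed column names.
df = pd.DataFrame({
    "AGE": [21, 25, 29, 30, 38, 40, 49],
    "Fairway Hit%": [55.0, 60.2, 63.1, 58.4, 66.0, 70.3, 68.9],
    "Avg Distance": [310.5, 305.1, 298.7, 301.2, 292.4, 285.0, 288.3],
})

# Bin ages into 20-29, 30-39, 40-49.
# With right=False each interval includes its lower edge and excludes its upper edge.
age_group = pd.cut(
    df["AGE"],
    bins=[20, 30, 40, 50],
    right=False,
    labels=["20-29", "30-39", "40-49"],
)

# Colors requested in the question.
group_colors = {"20-29": "blue", "30-39": "red", "40-49": "yellow"}

fig, ax = plt.subplots()
# One scatter call per age group, so each group gets its own legend entry.
for label, color in group_colors.items():
    subset = df[age_group == label]
    ax.scatter(
        subset["Fairway Hit%"],
        subset["Avg Distance"],
        c=color,
        edgecolors="black",
        label=label,
    )

ax.set_xlabel("Fairway Hit%")
ax.set_ylabel("Average Driving Distance")
ax.legend(title="Age")
fig.savefig("age_scatter.png")
```

Plotting one group at a time keeps the legend simple. If a single scatter call is preferred, age_group.map(group_colors) should give a per-row color that can be passed as c=.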