I am having a problem displaying a legend in a scatter plot. A simple example: each integer from 1 to 10 has its own color, and I want the legend to show each of the ten numbers next to its color.
I have all the values in a dataframe (the dataframe shown here is only an example; the real one consists of hundreds of rows).
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
x = np.array([1,2,3,4,5,6,7,8,9,10])
y = 2*x
df = pd.DataFrame()
palette = {1: "blue", 2:"orange", 3:"green", 4:"red", 5:"purple", 6:"brown", 7:"pink", 8:"gray", 9:"olive", 10:"cyan"}
df["first"] = x
df["second"] = y
df["third"] = df["first"].apply(lambda x: palette[x])
plt.scatter(df["first"], df["second"], c=df["third"])
plt.legend()
plt.show()
Adding a legend argument to the scatter call (legend = c=df2["third"]) does not help.
I cannot find a solution for this.
Thank you for any pointers.
CodePudding user response:
Each entry in a standard legend corresponds to one plotted element that has a label. You can draw the scatter plot one color at a time and assign the corresponding label to each call.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = 2 * x
df = pd.DataFrame()
palette = {1: "blue", 2: "orange", 3: "green", 4: "red", 5: "purple", 6: "brown", 7: "pink", 8: "gray", 9: "olive", 10: "cyan"}
df["first"] = x
df["second"] = y
df["third"] = df["first"].apply(lambda x: palette[x])
for number, color in palette.items():
    plt.scatter(x="first", y="second", c=color, data=df[df["first"] == number], label=number)
plt.legend()
plt.show()
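If you would rather keep a single scatter call (for example with hundreds of rows), another option is to build the legend entries by hand with "proxy" handles. This is only a sketch of that idea, not the only way to do it: it reuses the palette and column names from the question, and the `handles` and `legend_labels` names are introduced here for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.lines import Line2D

palette = {1: "blue", 2: "orange", 3: "green", 4: "red", 5: "purple",
           6: "brown", 7: "pink", 8: "gray", 9: "olive", 10: "cyan"}

df = pd.DataFrame({"first": np.arange(1, 11)})
df["second"] = 2 * df["first"]
df["third"] = df["first"].map(palette)

fig, ax = plt.subplots()
# One scatter call for all points, colored per row.
ax.scatter(df["first"], df["second"], c=df["third"])

# One empty "proxy" line per palette entry; it draws nothing on the axes
# and only supplies a marker, a color and a label for the legend.
handles = [
    Line2D([], [], marker="o", linestyle="", color=color, label=str(number))
    for number, color in palette.items()
]
legend = ax.legend(handles=handles, title="first")

# Collected only to check what the legend ended up containing.
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Call plt.show() afterwards to display the figure. The trade-off is that the legend is no longer tied to the plotted data, so the palette dictionary has to stay in sync with the values in the dataframe.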
The process can be simplified a lot by using the hue parameter in Seaborn:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = 2 * x
df = pd.DataFrame()
palette = {1: "blue", 2: "orange", 3: "green", 4: "red", 5: "purple", 6: "brown", 7: "pink", 8: "gray", 9: "olive", 10: "cyan"}
df["first"] = x
df["second"] = y
df["third"] = df["first"].apply(lambda x: palette[x])
sns.set()
sns.scatterplot(data=df, x="first", y="second", hue="first", palette=palette)
plt.show()
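Because the hue column is numeric, it can help to state explicitly that you want one legend entry per value. The sketch below assumes a reasonably recent Seaborn version and passes legend="full" to scatterplot; the `ax` and `labels` names are just for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

palette = {1: "blue", 2: "orange", 3: "green", 4: "red", 5: "purple",
           6: "brown", 7: "pink", 8: "gray", 9: "olive", 10: "cyan"}

df = pd.DataFrame({"first": np.arange(1, 11)})
df["second"] = 2 * df["first"]

# legend="full" asks for an entry for every hue value rather than a
# shortened set of evenly spaced values.
ax = sns.scatterplot(data=df, x="first", y="second", hue="first",
                     palette=palette, legend="full")

# Collected only to check which labels the legend contains.
labels = [text.get_text() for text in ax.get_legend().get_texts()]
```

As before, call plt.show() to display the result.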