Parallel Coordinates with custom colors based on type

Time:10-21

I am working with a data set on Pokemon and plotting it as parallel coordinates.

I would like the legend labels and line colors to correspond to my color dictionary (one color per type), but it looks like I currently have repeating colors. TIA!

CodePudding user response:

Here's how you can do this:

import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (10,10)
df = pd.read_csv("/kaggle/input/pokemon/Pokemon.csv")
COLORS = {'Normal' : '#AAAA77',
     'Fire': '#ff4422',
     'Water': '#3399ff',
     'Electric': '#ffcc33',
     'Grass': '#77cc55',
     'Ice': '#66ccff',
     'Fighting': '#bb5544',
     'Poison': '#aa5599',
     'Ground': '#ddbb55',
     'Flying': '#8899ff',
     'Psychic': '#ff5599',
     'Bug': '#aabb22',
     'Rock': '#bbaa66',
     'Ghost': '#6666bb',
     'Dragon': '#7766ee',
     'Dark': '#775544',
     'Steel': '#aaaabb',
     'Fairy': '#ee99ee'}
df = df.loc[df["Generation"] == 1]
colors = [COLORS.get(i) for i in df["Type 1"]]
df.index = colors
# In gen 1 special attack and special defense are the same
df.rename(columns={"Sp. Atk":"Special"}, inplace=True) 
ax = df[["HP", "Attack", "Defense", "Special", "Speed"]].T.plot(color=colors)
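# Remove duplicate legend labels, but maintain colors
legend = ax.get_legend()
# `legend_handles` exists on matplotlib >= 3.7; older releases only have `legendHandles`
handles = legend.legend_handles if hasattr(legend, "legend_handles") else legend.legendHandles
l, d = [], set()
for t, line in zip(legend.get_texts(), handles):
    if t.get_text() in d:
        continue
    d.add(t.get_text())
    l.append((t, line))
legend.remove()
# The legend texts are hex codes (the index was set to colors), so map them back to type names
flip_colors = {v:k for k, v in COLORS.items()}
plt.legend([j[1] for j in l], [flip_colors[j[0].get_text()] for j in l])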
plt.show()

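Another route, not taken from the answer above: pandas ships `pandas.plotting.parallel_coordinates`, which colors each line by a class column and adds one legend entry per class. Below is a minimal sketch, assuming a tiny made-up frame in place of Pokemon.csv (the rows and stat values are illustrative only) and a shortened COLORS dict. It relies on the colors passed in `color` being matched to classes in order of first appearance, which is how I understand pandas to behave with the default `sort_labels=False`.

```python
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

# Shortened copy of the COLORS dict from the answer above
COLORS = {"Grass": "#77cc55", "Fire": "#ff4422", "Water": "#3399ff"}

# Made-up rows standing in for the real CSV (values are illustrative only)
df = pd.DataFrame({
    "Type 1": ["Grass", "Fire", "Water", "Fire"],
    "HP": [45, 39, 44, 58],
    "Attack": [49, 52, 48, 64],
    "Defense": [49, 43, 65, 58],
    "Speed": [45, 65, 43, 80],
})

# One color per class, listed in order of first appearance in the class column
classes = df["Type 1"].drop_duplicates()
palette = [COLORS[t] for t in classes]

fig, ax = plt.subplots(figsize=(8, 5))
parallel_coordinates(df, "Type 1", cols=["HP", "Attack", "Defense", "Speed"],
                     color=palette, ax=ax)

# The legend should list each type once, labeled by name rather than hex code
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
print(legend_labels)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. If you pass `sort_labels=True`, build `palette` from the sorted class names instead, otherwise colors and labels may no longer line up.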

CodePudding user response:

  • The point of a visualization is to convey information clearly. A line plot connecting discrete categories suggests a trend between unrelated stats, so it is not an appropriate visualization for this data.
    • A bar plot should be used instead.
  • Convert the wide dataframe to a long dataframe with melt
  • Select the Generation 1 data with Boolean indexing
  • Sort the data
  • Plot the data with seaborn.catplot with kind='bar'.
    • Use colors as the palette for hue.
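
The reshaping steps listed above (melt, Boolean selection, sort) can be sketched on a tiny frame first. This is only an illustration: the three rows and the name `long_df` are my own, and the stat values should be treated as placeholders rather than as the contents of Pokemon.csv.

```python
import pandas as pd

# Made-up wide frame standing in for Pokemon.csv (values are illustrative only)
wide = pd.DataFrame({
    "Name": ["Charmander", "Bulbasaur", "Chikorita"],
    "Type 1": ["Fire", "Grass", "Grass"],
    "Generation": [1, 1, 2],
    "HP": [39, 45, 45],
    "Attack": [52, 49, 49],
})

# Wide to long: one row per (Name, stat) pair, stat name in 'Stats', number in 'value'
long_df = wide.melt(id_vars=["Name", "Type 1", "Generation"],
                    value_vars=["HP", "Attack"], var_name="Stats")

# Boolean indexing keeps Generation 1, then sort by type and name
g1 = long_df[long_df["Generation"].eq(1)].sort_values(["Type 1", "Name"])
print(g1)
```

The long form is what lets `row='Stats'` split the plot into one facet per stat. The code for the full data set:
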
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

# load the data
df = pd.read_csv('Pokemon.csv')

# specify colors
colors = {'Normal': '#AAAA77', 'Fire': '#ff4422', 'Water': '#3399ff', 'Electric': '#ffcc33', 'Grass': '#77cc55',
          'Ice': '#66ccff', 'Fighting': '#bb5544', 'Poison': '#aa5599', 'Ground': '#ddbb55', 'Flying': '#8899ff',
          'Psychic': '#ff5599', 'Bug': '#aabb22', 'Rock': '#bbaa66', 'Ghost': '#6666bb', 'Dragon': '#7766ee',
          'Dark': '#775544', 'Steel': '#aaaabb', 'Fairy': '#ee99ee'}

# create a list of the stats columns
cols = ['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed']

# melt the dataframe into a long form
sel = df.melt(id_vars=['Name', 'Type 1', 'Generation'], value_vars=cols, var_name='Stats')

# for this example select only generation 1 and sort by Type 1 and Name
g1 = sel[sel.Generation.eq(1)].sort_values(['Type 1', 'Name'])

# plot with seaborn, a high-level api for matplotlib
p = sns.catplot(kind='bar', data=g1, x='Name', y='value', hue='Type 1', palette=colors, row='Stats',
                height=4, aspect=5.25, dodge=False, sharey=False)

# add xtick labels for each row if desired
for ax in p.axes.flatten():
    ax.tick_params(labelbottom=True)

# rotate the xtick labels
p.set_xticklabels(rotation=90, size=8)

# fix the layout so the full xticklabel will show
p.figure.tight_layout()

# move the legend
sns.move_legend(p, "lower center", bbox_to_anchor=(.5, 1), ncol=3, frameon=False)

# save the figure
p.savefig("pokemon.png")

# show the plot
plt.show()

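
One thing worth checking before plotting: with a dict palette, every value in the hue column needs a matching key, and in the first answer `COLORS.get(i)` quietly returns None for a type it does not know. Here is a small sketch of that check on a made-up frame with a shortened palette. The fallback grey and the names `missing` and `palette` are my own choices, not part of either answer.

```python
import pandas as pd

# Shortened, hypothetical palette and data; "Ghost" is deliberately left out of the palette
colors = {"Fire": "#ff4422", "Water": "#3399ff"}
df = pd.DataFrame({"Name": ["Charmander", "Squirtle", "Gastly"],
                   "Type 1": ["Fire", "Water", "Ghost"]})

# Hue levels present in the data but absent from the palette
missing = sorted(set(df["Type 1"]) - set(colors))
print(missing)  # ['Ghost']

# Give the missing types a neutral fallback color (the grey is an arbitrary choice)
palette = {**colors, **{t: "#999999" for t in missing}}
```

An empty `missing` list means the dict covers every type in the data and can be passed as `palette=` unchanged.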