I have this toy dataframe:
import pandas as pd
data = {'Column 1': [1., 2., 3., 4.],
        'Column 2': [1.2, 2.2, 3.2, 4.2]}
df = pd.DataFrame(data, index=["Apples", "Oranges", "Puppies", "Ducks"])
How can I make a dot/scatter plot of the dataframe with the column names on the x-axis and a different dot shape for each index value?
CodePudding user response:
To create a dot/scatter plot of a dataframe with the column names on the x-axis, you can use the matplotlib.pyplot.scatter function. scatter requires x and y to have the same length, so the column names (2 entries) cannot be plotted directly against the index (4 entries). Instead, plot each row as its own series, with the cell values on the y-axis. Here is an example using the toy dataframe you provided:
import matplotlib.pyplot as plt
# Plot each row as its own series: column names on the x-axis,
# cell values on the y-axis
for item, row in df.iterrows():
    plt.scatter(df.columns, row.values, label=item)
# Label the axes and add a legend that identifies each index value
plt.xlabel("Columns")
plt.ylabel("Value")
plt.legend()
# Show the plot
plt.show()
This creates a scatter plot with the column names on the x-axis and the cell values on the y-axis, one series per index value. Every series uses the default marker of the scatter function, a circle, so the rows differ only by colour.
If you want a different marker shape for each index value, create a dictionary mapping the index values to the desired marker shapes. The marker parameter of the scatter function accepts a single marker style per call, not one per point, so call scatter once per row and look up that row's marker. Here is an example of how you could do this:
# Define a dictionary mapping the index values to the desired marker shapes
marker_shapes = {"Apples": "^", "Oranges": "s", "Puppies": "*", "Ducks": "o"}
# Draw one series per row, with the column names on the x-axis, the cell
# values on the y-axis, and the marker shape looked up in marker_shapes
for item, row in df.iterrows():
    plt.scatter(df.columns, row.values, marker=marker_shapes[item], label=item)
# Label the axes and add a legend that maps each marker to its index value
plt.xlabel("Columns")
plt.ylabel("Value")
plt.legend()
# Show the plot
plt.show()
This creates a scatter plot with the column names on the x-axis and the cell values on the y-axis, where the shape of each dot depends on its index value, as specified in the marker_shapes dictionary.
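If the index values are not known in advance, the marker dictionary can be built automatically. The following is only a sketch under stated assumptions: the helper name scatter_rows and its default marker list are hypothetical, not part of matplotlib, and it relies on calling scatter once per row because the marker argument takes one style per call.

```python
from itertools import cycle

import matplotlib.pyplot as plt
import pandas as pd


def scatter_rows(df, markers=None, ax=None):
    """Draw one scatter series per row: column names on x, cell values on y.

    scatter_rows is a hypothetical helper written for this example.
    """
    if ax is None:
        _, ax = plt.subplots()
    if markers is None:
        # Assumption: cycle through a small default set of marker shapes
        default = cycle(["^", "s", "*", "o", "D", "v"])
        markers = {item: next(default) for item in df.index}
    x = [str(c) for c in df.columns]
    for item, row in df.iterrows():
        # One scatter call per row, since marker accepts a single style
        ax.scatter(x, row.to_numpy(), marker=markers[item], label=str(item))
    ax.legend()
    return ax, markers


df = pd.DataFrame(
    {"Column 1": [1.0, 2.0, 3.0, 4.0], "Column 2": [1.2, 2.2, 3.2, 4.2]},
    index=["Apples", "Oranges", "Puppies", "Ducks"],
)
ax, used_markers = scatter_rows(df)
# plt.show()  # uncomment to display the figure interactively
```

Passing an explicit dictionary, for example scatter_rows(df, markers=marker_shapes), keeps full control over which shape each index value gets.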
CodePudding user response:
You could use seaborn's scatterplot, which supports marker styles out of the box. Seaborn works easiest with data in "long form", which can be created via pd.melt().
Here is an example:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
data = {'Column 1': [1., 2., 3., 4.],
        'Column 2': [1.2, 2.2, 3.2, 4.2]}
df = pd.DataFrame(data, index=["Apples", "Oranges", "Puppies", "Ducks"])
df.index.name = 'Item' # give the index a name, to be used as column name by reset_index()
df_long = df.reset_index().melt(id_vars='Item')
marker_styles = {"Apples": "^", "Oranges": "X", "Puppies": "*", "Ducks": "v"}
ax = sns.scatterplot(data=df_long, x='variable', y='value',
                     style='Item', markers=marker_styles, hue='Item', palette='dark')
sns.move_legend(ax, loc='upper left', bbox_to_anchor=(1.01, 1.01))
ax.set(xlabel='', ylabel='')
plt.tight_layout()
plt.show()
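To see what the "long form" reshaping produces, it may help to inspect the melted dataframe on its own. This is a small sketch using the same toy data; the column names variable and value are the defaults that melt assigns when var_name and value_name are not given.

```python
import pandas as pd

df = pd.DataFrame(
    {"Column 1": [1.0, 2.0, 3.0, 4.0], "Column 2": [1.2, 2.2, 3.2, 4.2]},
    index=["Apples", "Oranges", "Puppies", "Ducks"],
)
df.index.name = "Item"  # becomes a regular column after reset_index()

# Wide form: one row per item, one column per measurement.
# Long form: one row per (item, column name) pair.
df_long = df.reset_index().melt(id_vars="Item")

print(df_long.columns.tolist())  # ['Item', 'variable', 'value']
print(df_long.shape)             # (8, 3)
print(df_long)
```

Each original cell becomes one row, which is why scatterplot can map variable to the x-axis, value to the y-axis, and Item to the marker style.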