Treat x axis as categorical in plotly express violin plot

Time:11-11

I have a dataset df with columns [type, x, y] (96 rows x 3 columns). The x column has 4 unique values: 0, 0.322, 1.89, 3.46.

Here is a sample:

print (df.groupby('x').apply(lambda df: df.sample(4)))

OUT: 
         type      x     y
x                         
0.000 5     2  0.000  1123
      6     2  0.000  1178
      7     2  0.000   850
      3     1  0.000   515
0.322 72    1  0.322   174
      42    1  0.322   182
      79    2  0.322   450
      10    1  0.322   340
1.890 54    2  1.890   140
      71    2  1.890   126
      80    1  1.890    61
      19    1  1.890    60
3.460 30    2  3.460   120
      88    1  3.460    35
      26    1  3.460    40
      92    2  3.460    98

I am using Plotly Express to make a violin plot. However, since my x values aren't evenly spaced, the x axis is stretched a lot, which makes the plot hard to read:

fig = px.violin(df, y="y", x="x", color="type", box=True, violinmode='overlay',points="all", 
          hover_data=df.columns)
fig.show()


Is there a way in Plotly Express to treat the x axis as categorical, so that it is not stretched?

CodePudding user response:

I have found an OK workaround: appending a "-" character to each value in df["x"], so the values are no longer numeric:

df["x"] = df['x'].astype(str) + "-"
fig = px.violin(df, y="y", x="x", color="type", box=True, violinmode='overlay',points="all", 
              hover_data=df.columns)

fig.show()


Just appending a white-space " " instead didn't work.

If anyone has a better solution, please post it.

CodePudding user response:

You can set the x-axis type to categorical:

fig.update_xaxes(type='category')

For example, I created a DataFrame similar to yours, where the x column is also numerical. With the axis type set to 'category', Plotly interprets x as categorical when plotting, with even spacing between each unique x value:

import numpy as np
import pandas as pd
import plotly.express as px

## create some data similar to yours with 80 rows
np.random.seed(42)
values = np.random.randint(30,size=80)
df = pd.DataFrame({
    'x':[0,0.322,1.89,3.46]*20,
    'y':np.random.randint(30,size=80),
    'type':[1]*40 + [2]*40
 })

fig = px.violin(df, y="y", x="x", color="type", box=True, violinmode='overlay',points="all",hover_data=df.columns)
fig.update_xaxes(type='category')
fig.show()
