Matplotlib re-ordering y axis

Time:07-18

Attempting to make a simple 2D matplotlib plot, but it's re-ordering the y-axis labels to keep the graph linear. How can I avoid this?

Code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = np.array([
    ['Jun 1', 1.2, 0.2],
    ['Jun 2', 1.3, 1.2],
    ['Jun 3', 1.4, 0.9],
    ['Jun 4', 1.1, 0.5],
    ['Jun 5', 1.6, 1.2],
    ['Jun 6', 2.2, 0.2],
    ['Jun 7', 3.4, 1.6]
])
df = pd.DataFrame(data, columns=['date', 'income', 'spent'])
plt.plot(df['income'])


I'd just like a normal graph: X axis: index Y axis: df.income (where the visual plot range is auto-calculated)

The same issue occurs with a scatter plot.

CodePudding user response:

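It's because of your types. A NumPy array holds a single dtype, so when the input mixes strings and numbers, np.array converts every element to a string. Your income and spent columns therefore contain text, which pandas reports as object if you check the dtypes. Matplotlib treats strings as categories and places them on the axis in order of appearance, which is why the y-axis looks re-ordered. Convert the columns back to numbers before plotting: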

data = np.array([
    ['Jun 1', 1.2, 0.2],
    ['Jun 2', 1.3, 1.2],
    ['Jun 3', 1.4, 0.9],
    ['Jun 4', 1.1, 0.5],
    ['Jun 5', 1.6, 1.2],
    ['Jun 6', 2.2, 0.2],
    ['Jun 7', 3.4, 1.6]
])

df = pd.DataFrame(data, columns=['date', 'income', 'spent'])

for c in ['income', 'spent']:
    df[c] = df[c].astype(float)

plt.plot(df['income'])

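Or (better practice) skip the NumPy array and pass a plain list of lists, so pandas infers the type of each column separately: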

data = [
    ['Jun 1', 1.2, 0.2],
    ['Jun 2', 1.3, 1.2],
    ['Jun 3', 1.4, 0.9],
    ['Jun 4', 1.1, 0.5],
    ['Jun 5', 1.6, 1.2],
    ['Jun 6', 2.2, 0.2],
    ['Jun 7', 3.4, 1.6]
]

df = pd.DataFrame(data, columns=['date', 'income', 'spent'])

plt.plot(df['income'])
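
To see the conversion described above for yourself, here is a minimal sketch using a shortened copy of the question's data. The names rows, df_from_array and df_from_list are only for illustration:

```python
import numpy as np
import pandas as pd

rows = [
    ['Jun 1', 1.2, 0.2],
    ['Jun 2', 1.3, 1.2],
]

# NumPy picks one dtype for the whole array; with strings present it is a string dtype.
data = np.array(rows)
print(data.dtype.kind)    # kind 'U' means Unicode text
print(repr(data[0, 1]))   # the number is now stored as text

# Built from the array, the income column holds strings.
df_from_array = pd.DataFrame(data, columns=['date', 'income', 'spent'])
print(type(df_from_array['income'].iloc[0]))

# Built from the list of lists, pandas infers a dtype per column.
df_from_list = pd.DataFrame(rows, columns=['date', 'income', 'spent'])
print(df_from_list.dtypes)
```

The list-of-lists version never goes through a single-dtype array, so the numeric columns come out as float64 without any conversion step.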

CodePudding user response:

This is happening because the columns of your dataframe have the object data type, meaning the values are stored as strings rather than numbers. You can see the data types of your dataframe by using dataframe.dtypes.

df.dtypes

> date      object
> income    object
> spent     object
> dtype: object

What you need is for the 'income' column to be float64, so you can set its data type manually:

df['income'] = df['income'].astype('float64')
df.dtypes

> date      object
> income    float64
> spent     object
> dtype: object
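
If you want to convert both numeric columns in one go, a rough sketch (assuming a shortened copy of the question's data, and that every value in those columns parses as a number) is to pass a column-to-dtype mapping to astype. The pd.to_numeric variant is an alternative I'm adding here, not part of the approach above; with errors='coerce' it turns unparseable entries into NaN instead of raising. The names converted, messy and coerced are only for illustration:

```python
import numpy as np
import pandas as pd

data = np.array([
    ['Jun 1', 1.2, 0.2],
    ['Jun 2', 1.3, 1.2],
    ['Jun 3', 1.4, 0.9],
])
df = pd.DataFrame(data, columns=['date', 'income', 'spent'])

# Convert several columns at once with a column -> dtype mapping.
converted = df.astype({'income': 'float64', 'spent': 'float64'})
print(converted.dtypes)

# Alternative: to_numeric with errors='coerce' yields NaN for values it cannot parse.
messy = pd.Series(['1.2', 'n/a', '3.4'])
coerced = pd.to_numeric(messy, errors='coerce')
print(coerced)
```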

Now, if you plot the df, you get:

(image: df plot)

You can also set the date to be the x axis by setting your date column as an index of the dataframe:

df = df.set_index('date')

(image: df plot with date as index)
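
Putting the index step together with the plot call, a minimal sketch could look like the following. It assumes the data is built from a list of lists so the numeric columns are already floats; the Agg backend line is only there so the sketch runs without a display, and the names indexed, fig and ax are only for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

data = [
    ['Jun 1', 1.2, 0.2],
    ['Jun 2', 1.3, 1.2],
    ['Jun 3', 1.4, 0.9],
    ['Jun 4', 1.1, 0.5],
]
df = pd.DataFrame(data, columns=['date', 'income', 'spent'])

# With 'date' as the index, only the numeric columns remain to be plotted.
indexed = df.set_index('date')

fig, ax = plt.subplots()
indexed.plot(ax=ax)  # draws one line per numeric column, using the index for the x axis
# Call plt.show() or fig.savefig(...) here to view or save the figure.
```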

CodePudding user response:

I don't see an issue with the x-axis. For the y-axis, the issue is that you didn't specify the dtype when you created the np.array.

If you add dtype=object to the np.array call, it should work just fine.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = np.array([
    ['Jun 1', 1.2, 0.2],
    ['Jun 2', 1.3, 1.2],
    ['Jun 3', 1.4, 0.9],
    ['Jun 4', 1.1, 0.5],
    ['Jun 5', 1.6, 1.2],
    ['Jun 6', 2.2, 0.2],
    ['Jun 7', 3.4, 1.6]
], dtype=object)
df = pd.DataFrame(data, columns=['date', 'income', 'spent'])

df.plot(x='date', y='income', kind='line')
plt.show()

Without an explicit dtype, NumPy converted every value in the array to a string, so the income column was being treated as text.
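
To check what dtype=object actually preserves, here is a small sketch on a shortened copy of the data. The infer_objects() call is an extra step I'm adding for illustration, not something the code above relies on; it asks pandas to pick a more specific dtype for columns that may still be reported as object:

```python
import numpy as np
import pandas as pd

data = np.array([
    ['Jun 1', 1.2, 0.2],
    ['Jun 2', 1.3, 1.2],
], dtype=object)

# With dtype=object each element keeps its original Python type.
print(type(data[0, 0]), type(data[0, 1]))

df = pd.DataFrame(data, columns=['date', 'income', 'spent'])
print(df.dtypes)

# infer_objects() converts columns that hold only floats to float64.
inferred = df.infer_objects()
print(inferred.dtypes)
```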
