KeyError: u"None of [Index([..], dtype='object')] are in the [columns]"

Time:03-17

This is what my table looks like. I am trying to plot it as a scatter plot or bar graph, but I am getting a KeyError and don't know the reason.

def clean_data(data_path):
  df = change_table(data_path)
  df = (df.stack()
          .rename_axis(('Type','Gender'))
          .str.split(expand=True)
          .stack()
          .reset_index(name='Word'))
  df = df.assign(Type = df['Type'].str.split(',')).explode('Type')
  stop_words = set(stopwords.words('english'))
  df['Word'] = df['Word'].str.replace(r'[^\w\s]+', '', regex=True)
  df['Word'] = df['Word'].str.replace(r'[\d-]+', '', regex=True)
  df = df[~df['Word'].isin(stop_words)]
  df1 = pd.crosstab([df['Gender'], df['Word']], df['Type']).reset_index()
  df2 = pd.crosstab(df['Word'], df['Type'])
  df3 = df2.iloc[1:]
  return df3
output = clean_data(data_path)
Type      |Female   |Male   |None|
Word      |         |       |    |
----------|---------|-------|----|
A         |2        |12     |50  |
AN        |0        |0      |1   |
Aaron     |0        |0      |2   |
Abbey     |0        |0      |1   |
Abbotsford|0        |1      |0   |
# import plotly.express as px
x = output[["Female", "Male", "None"]][0:10]
y = output.index[0:10]
# fig = px.bar(x, y)
# fig.show()
df.plot(y, x, kind = 'scatter')
plt.show()  

While trying to plot, I get this error: KeyError: "None of [Index(['A', 'AN', 'Aaron', 'Abbey', 'Abbotsford', 'Abdul', 'Abdullah',\n 'AbdurRahman', 'Abell', 'Abells'],\n dtype='object', name='Word')] are in the [columns]"

CodePudding user response:

Did you mean something like this?

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

dataFrame = pd.DataFrame(
    {"Word": ["A", "An", "AAron"],
     "Female": [4, 5, 6],
     "Male": [7, 8, 9],
     "None": [10, 11, 12]})

# set width of bar
barWidth = 0.25
fig, ax = plt.subplots(figsize=(12, 8))

# set height of bar
Fe = dataFrame["Female"]
Ma = dataFrame["Male"]
No = dataFrame["None"]

# Set position of bar on X axis
br1 = np.arange(len(Fe))
br2 = [x + barWidth for x in br1]
br3 = [x + barWidth for x in br2]

# Make the plot
plt.bar(br1, Fe, color='r', width=barWidth,
        edgecolor='grey', label='Female')
plt.bar(br2, Ma, color='g', width=barWidth,
        edgecolor='grey', label='Male')
plt.bar(br3, No, color='b', width=barWidth,
        edgecolor='grey', label='None')

# Adding Xticks
plt.xlabel('Names', fontweight='bold', fontsize=15)
plt.ylabel('Numbers', fontweight='bold', fontsize=15)
plt.xticks([r + barWidth for r in range(len(Fe))],
           dataFrame["Word"])

plt.legend()
plt.show()

Picture of the bar graph

This method uses matplotlib.pyplot and pandas. I don't know why you get the error but thought that this alternative method might be useful.
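
As for the error itself, one likely explanation (an assumption based on the traceback, not something the asker confirmed): DataFrame.plot expects column labels for its x and y arguments, but the question passes the index values ('A', 'AN', ...) and a sub-DataFrame instead, so pandas looks for columns with those names and finds none. The sketch below rebuilds a small stand-in for the crosstab result using the five rows shown in the question, reproduces the same kind of KeyError with plain column selection, and then plots the table in two ways. The names long_df, bar_ax and scatter_ax are hypothetical and only used for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the crosstab returned by clean_data (values copied from the question's table)
output = pd.DataFrame(
    {"Female": [2, 0, 0, 0, 0], "Male": [12, 0, 0, 0, 1], "None": [50, 1, 2, 1, 0]},
    index=pd.Index(["A", "AN", "Aaron", "Abbey", "Abbotsford"], name="Word"),
)
output.columns.name = "Type"

# Index labels are not column labels, so selecting columns with them fails
try:
    output[output.index[:3]]
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# Option 1: grouped bar chart straight from the wide table (index becomes the x axis)
bar_ax = output.head(10).plot(kind="bar", figsize=(12, 8))
bar_ax.set_xlabel("Word")
bar_ax.set_ylabel("Count")

# Option 2: scatter plot; reshape to long format so every point has a Word, a Type and a Count
long_df = output.head(10).reset_index().melt(
    id_vars="Word", var_name="Type", value_name="Count"
)
fig, scatter_ax = plt.subplots(figsize=(12, 8))
for type_name, group in long_df.groupby("Type"):
    scatter_ax.scatter(group["Word"], group["Count"], label=type_name)
scatter_ax.set_xlabel("Word")
scatter_ax.set_ylabel("Count")
scatter_ax.legend(title="Type")
```

Call plt.show() afterwards to display both figures. With the real data, output would be the DataFrame returned by clean_data(data_path) rather than the stand-in built here.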
