Column Does not Show Up in Pandas?

Time:11-20

Here is the code we're working with; it takes data from multiple scraped datasets and then concatenates them.

import pandas as pd
import numpy as np # for numeric python functions
from pylab import * # for easy matplotlib plotting
from bs4 import BeautifulSoup
import requests
url1='http://openinsider.com/screener?s=&o=&pl=&ph=&ll=&lh=&fd=30&fdr=&td=0&tdr=&fdlyl=&fdlyh=&daysago=&xp=1&vl=&vh=&ocl=&och=&sic1=-1&sicl=100&sich=9999&grp=0&nfl=&nfh=&nil=&nih=&nol=&noh=&v2l=&v2h=&oc2l=&oc2h=&sortcol=0&cnt=100&page=1'
df1 = pd.read_html(url1)
table=df1[11]
#the table works - now lets make it look at change owned to find the largest value
#sorting
n = np.quantile(table['Qty'], [0.50])
print("50th percentile: ",n)
q=table.sort_values('Qty', ascending = False)
page = requests.get(url1)
name=q['Ticker'].str.replace(r'\d+', '', regex=True)
name1 = (table['Ticker'])
n = name1.count()
#Buyers for the company
All = []
url = 'http://openinsider.com/'
for entry in name1:
  table2 = pd.read_html(url + entry)
  dfn=table2[11]
  All.append(dfn)
All = pd.concat(All)
print(All.columns)#<- my sanity check
print(All['Insider Name'])#<- where the problem lies

Now if you look at the columns of the concatenated dataset, you'll see the "Insider Name" column. I want to isolate this column, but when I do, Python raises:


KeyError: 'Insider Name'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:
-> 3363                 raise KeyError(key) from err
   3364 
   3365         if is_scalar(key) and isna(key) and not self.hasnans:

KeyError: 'Insider Name'

So the column exists, but it also doesn't? Any tips would be greatly appreciated! Thanks in advance!

CodePudding user response:

The problem is that the character between "Insider" and "Name" is not a regular space but a non-breaking space (\xa0). Try:

print(All['Insider\xa0Name'])
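
One way to confirm which character is actually in the header is to look at the column names as a plain Python list, since the list's repr escapes the non-breaking space instead of rendering it. This is a minimal sketch on a made-up stand-in DataFrame (the data and the name df are assumptions for illustration, not the scraped table):

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: the header contains a
# non-breaking space (U+00A0) instead of a regular space.
df = pd.DataFrame({"Insider\xa0Name": ["Jane Doe"], "Qty": [100]})

# Converting the columns to a list makes Python show each name with repr(),
# which escapes the non-breaking space as \xa0.
column_names = list(df.columns)
print(column_names)

# The lookup with a regular space fails; the one with \xa0 succeeds.
print("Insider Name" in df.columns)
print("Insider\xa0Name" in df.columns)
```

The same check on the real data would be print(list(All.columns)).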

To fix the issue permanently, rename the column so that it uses a regular space:

All.rename(columns={"Insider\xa0Name": "Insider Name"}, inplace=True)
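
If other headers in the scraped table might contain the same character, it may be simpler to normalize every column name at once instead of renaming them one by one. The following is a rough sketch on a made-up DataFrame; the "Trade\xa0Date" column and the sample values are assumptions for illustration only:

```python
import pandas as pd

# Hypothetical stand-in for the concatenated table: two headers contain a
# non-breaking space (U+00A0), one is already clean.
df = pd.DataFrame({
    "Insider\xa0Name": ["Jane Doe"],
    "Trade\xa0Date": ["2021-11-19"],
    "Qty": [100],
})

# Replace every non-breaking space in the column names with a regular space.
# regex=False treats the pattern as a literal string.
df.columns = df.columns.str.replace("\xa0", " ", regex=False)

# The column can now be selected with an ordinary space.
print(df["Insider Name"])
```

Applied to the question's data, the same line would be All.columns = All.columns.str.replace("\xa0", " ", regex=False).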