How do I fix this Python error? Jupyter Notebooks Converting R to Python

Time:12-11

I keep getting an error in Jupyter Notebook while trying to convert R code to Python. I am VERY new to Python, and my code keeps throwing an error saying that hosp_info is not defined. My code is below. Anything helps. I am also posting the original R code that I am trying to convert.

R Code:

# Get the unique hospital names
hosp_names = hosp_info %>% 
  filter(`Hospital Type` == "Acute Care Hospitals") %>%
  filter(State == "CA") %>%
  pull(`Hospital Name`)

# Filter based on those hospital names
hosp_info_CA = 
hosp_info %>% 
  rename(Hospital = `Hospital Name`,
         Provider_ID = `Provider ID`,
         Safety = `Safety of care national comparison`,
         Effectiveness = `Effectiveness of care national comparison`
         ) %>%
  filter(Hospital %in% hosp_names, State == "CA") %>%
  mutate(Overall_Rating = as.numeric(`Hospital overall rating`)) %>%
  drop_na(Overall_Rating)

hosp_info_CA %>% 
  arrange(desc(Overall_Rating), Hospital) %>% 
  head(7)

hosp_info_CA %>% 
  group_by(Overall_Rating, Safety) %>% 
  count()

write_csv(hosp_info_CA, 'hosp_info_CA.csv')

Python code:

import pandas as pd #import panda

hosp_names = hosp_info
hosp_info = [(hosp_info[" Hospital Type "] == " Acute Care Hospitals ") & (hosp_info[" State "] == " CA ")].loc[:," Hospital Name "].unique()

hosp_info_CA = hosp_info.rename(columns={" Hospital Name ": " Hospital ", " Provider ID ": " Provider_ID ", " Safety of care national comparison ": " Safety ", " Effectiveness of care national comparison ": " Effectiveness "}).loc[hosp_info[" Hospital "].isin(hosp_names) & (hosp_info[" State "] == " CA ")].dropna(subset=[" Hospital overall rating "]).assign(Overall_Rating = lambda x: pd.to_numeric(x[" Hospital overall rating "]))

hosp_info_CA.sort_values(by=" Overall_Rating ", ascending=False).head(7)

hosp_info_CA.groupby([" Overall_Rating ", " Safety "]).count()

hosp_info_CA.to_csv(" hosp_info_CA.csv ")

Error I keep getting:

Error:

NameError                                 Traceback (most recent call last)
      3
      4 # Get the unique hospital names
----> 5 hosp_names = hosp_info
      6 hosp_info = [(hosp_info[" Hospital Type "] == " Acute Care Hospitals ") & (hosp_info[" State "] == " CA ")].loc[:," Hospital Name "].unique()
      7

NameError: name 'hosp_info' is not defined

CodePudding user response:

Have you imported the data file that contains the hospital info into Python? If you have not, and assuming it's a CSV file, you can load it with:

import pandas as pd

hosp_info = pd.read_csv(r'Path where the CSV file is stored\File name.csv')
print(hosp_info)
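
As a minimal sketch of that loading step, the snippet below reads a small in-memory CSV in place of the real file and checks that the expected columns are present before any filtering. The sample rows and the names csv_text, required and missing are made up for illustration; only the column names come from the question.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file; in the notebook,
# pass the path of your own file to pd.read_csv instead.
csv_text = """Hospital Name,Hospital Type,State,Hospital overall rating
Hospital A,Acute Care Hospitals,CA,4
Hospital B,Critical Access Hospitals,CA,3
Hospital C,Acute Care Hospitals,TX,Not Available
"""

hosp_info = pd.read_csv(io.StringIO(csv_text))

# Strip stray whitespace from the headers so that lookups such as
# hosp_info["Hospital Type"] do not fail with a KeyError.
hosp_info.columns = hosp_info.columns.str.strip()

# Check that the columns used later in the question are actually present.
required = ["Hospital Name", "Hospital Type", "State"]
missing = [col for col in required if col not in hosp_info.columns]
print("Missing columns:", missing)
```

Once hosp_info exists in the notebook session, the NameError from the question can no longer occur on that name, although the cell that defines it has to be run before the cells that use it.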

CodePudding user response:

Look at the error:

Error:

NameError                                 Traceback (most recent call last)
      3
      4 # Get the unique hospital names
----> 5 hosp_names = hosp_info
      6 hosp_info = [(hosp_info[" Hospital Type "] == " Acute Care Hospitals ") & (hosp_info[" State "] == " CA ")].loc[:," Hospital Name "].unique()
      7

NameError: name 'hosp_info' is not defined

It says:

NameError: name 'hosp_info' is not defined

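because line 5 of your code reads hosp_info before anything has been assigned to that name, and line 6 does the same. Reversing the assignment (hosp_info = hosp_names) would not help, because hosp_names is not defined at that point either. First load the data into hosp_info, for example with pd.read_csv, and then merge lines 5 and 6 into a single expression so that the filter is applied to the DataFrame itself. The line is written here on the assumption that the column names have no surrounding spaces, as in the R code:

hosp_names = hosp_info[(hosp_info["Hospital Type"] == "Acute Care Hospitals") & (hosp_info["State"] == "CA")]["Hospital Name"].unique()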
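
For reference, a fuller pandas translation of the R pipeline from the question might look like the sketch below. It is only one possible rendering. The sample rows are hypothetical and stand in for a DataFrame that would normally come from pd.read_csv, and the column names are assumed to match the R code exactly, with no padding spaces. Ratings that cannot be parsed as numbers are turned into missing values, which is roughly what as.numeric followed by drop_na does in R.

```python
import pandas as pd

# Hypothetical sample rows; in the notebook hosp_info would be loaded from the CSV.
hosp_info = pd.DataFrame({
    "Provider ID": [1, 2, 3, 4],
    "Hospital Name": ["Alpha", "Beta", "Gamma", "Delta"],
    "Hospital Type": ["Acute Care Hospitals", "Acute Care Hospitals",
                      "Critical Access Hospitals", "Acute Care Hospitals"],
    "State": ["CA", "CA", "CA", "TX"],
    "Hospital overall rating": ["5", "Not Available", "3", "4"],
    "Safety of care national comparison": ["Above", "Same", "Below", "Same"],
    "Effectiveness of care national comparison": ["Same", "Same", "Same", "Above"],
})

# Get the unique hospital names (filter + pull in the R code).
is_ca_acute = (hosp_info["Hospital Type"] == "Acute Care Hospitals") & (hosp_info["State"] == "CA")
hosp_names = hosp_info.loc[is_ca_acute, "Hospital Name"].unique()

# Rename first, then filter on the new column names, as the R pipeline does.
hosp_info_CA = (
    hosp_info
    .rename(columns={
        "Hospital Name": "Hospital",
        "Provider ID": "Provider_ID",
        "Safety of care national comparison": "Safety",
        "Effectiveness of care national comparison": "Effectiveness",
    })
    .loc[lambda df: df["Hospital"].isin(hosp_names) & (df["State"] == "CA")]
    # Non-numeric ratings become NaN, then those rows are dropped.
    .assign(Overall_Rating=lambda df: pd.to_numeric(df["Hospital overall rating"], errors="coerce"))
    .dropna(subset=["Overall_Rating"])
)

# arrange(desc(Overall_Rating), Hospital) %>% head(7)
top7 = hosp_info_CA.sort_values(["Overall_Rating", "Hospital"], ascending=[False, True]).head(7)

# group_by(Overall_Rating, Safety) %>% count()
counts = hosp_info_CA.groupby(["Overall_Rating", "Safety"]).size().reset_index(name="n")

# Without a path, to_csv returns the CSV text; passing a filename such as
# "hosp_info_CA.csv" writes it to disk instead.
csv_output = hosp_info_CA.to_csv(index=False)
print(top7)
print(counts)
```

Two differences from the Python code in the question are worth noting: the filter runs after the rename, so the "Hospital" column exists when isin is called, and groupby(...).size() counts rows per group, whereas .count() would return a non-null count for every remaining column.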