I want to make a scatterplot of earnings vs education but it shows an error

Time:10-09

I am trying to create a scatter plot of earnings vs education for my statistical models class, but Python reports "invalid character in identifier". When I check the txt file, the columns "earnings" and "education" are both present. Could you help me, please?

mod = smf.ols(formula=’education~earnings, data=mydata)
res = mod.fit()
res.summary()
beta=res.params 
matplotlib.pyplot.scatter(mydata["education"],mydata["earnings"],color="black") 
matplotlib.pyplot.plot(mydata["education"], res.fittedvalues, "r") 
matplotlib.pyplot.ylabel("earnings")
matplotlib.pyplot.xlabel("education") 
matplotlib.pyplot.title("Scatterplot earnings versus education") 
matplotlib.pyplot.show()

CodePudding user response:

I think the issue is the quotation mark after the = on this line:

mod = smf.ols(formula=’education~earnings, data=mydata)

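The character after the = is a typographic (curly) quotation mark ’ rather than a straight quote ' or ". Python does not treat it as a string delimiter, so it tries to read it as part of a name, and that is not a valid variable name, hence "invalid character in identifier". The closing quote after earnings is also missing. The formula should be passed as a string, with an opening and a closing straight single or double quote: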

mod = smf.ols(formula='education~earnings', data=mydata)

Perhaps something got mixed up when copy-pasting it?
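
If it helps to confirm this, here is a minimal sketch that uses only the standard library to check whether a line parses and to locate any curly quotes in it. The helper names parses and find_curly_quotes are made up for this illustration, and the two lines are only parsed, never executed, so smf and mydata do not need to exist.

```python
import ast

# The line from the question, with the curly quote written as an escape (U+2019),
# and the corrected line using straight quotes.
bad_line = "mod = smf.ols(formula=\u2019education~earnings, data=mydata)"
good_line = "mod = smf.ols(formula='education~earnings', data=mydata)"

# Typographic quotes that editors and word processors often substitute
# for the straight ' and " characters.
CURLY_QUOTES = "\u2018\u2019\u201c\u201d"


def parses(source):
    """Return True if Python can parse the source, False on a SyntaxError."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def find_curly_quotes(source):
    """Return (index, character) pairs for every curly quote in the source."""
    return [(i, ch) for i, ch in enumerate(source) if ch in CURLY_QUOTES]


print(parses(bad_line))             # False
print(parses(good_line))            # True
print(find_curly_quotes(bad_line))  # position and character of each curly quote
```

Running find_curly_quotes over each line of a script is a quick way to spot other quotes that were changed during copy-pasting.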
