Here is a short, simple snippet that reproduces the issue I have:
import pandas as pd
df = pd.DataFrame(dict(a=[1,2,3]))
df = df.eval('x=2')  # this one is OK
df.eval('y="num"')  # this one fails
The error that I get is:
ValueError: data type must provide an itemsize
What is the issue? How can I make it work?
It wasn't like this in older pandas versions...
I know that I can replace it with:
df['y']="num"
# or
df.assign(y='num')
But this is not the answer that I need...
I also tried replacing "num" with:
np.str_("num")
which does have .itemsize, but it didn't help...
Note that df.query, with a different expression, gives me the same error, which is what I'm actually trying to solve here. I'm just assuming it's the same issue.
CodePudding user response:
Use the engine='python' parameter:
print(df.eval('y="num"', engine='python'))
   a  x    y
0  1  2  num
1  2  2  num
2  3  2  num
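As a minimal sketch of this workaround, the snippet below passes engine='python' to both eval and query. The frame mirrors the one in the question; the filter a > 1 is only an illustrative assumption, since the question does not show the actual query expression.

```python
import pandas as pd

# Same data as in the question, with x already present.
df = pd.DataFrame(dict(a=[1, 2, 3], x=[2, 2, 2]))

# Evaluate the string assignment with the pure-Python engine.
with_y = df.eval('y="num"', engine='python')

# query forwards extra keyword arguments to eval, so engine= can be passed here too.
# The filter expression is a made-up example.
filtered = with_y.query('a > 1 and y == "num"', engine='python')

print(with_y)
print(filtered)
```

Since eval is not called with inplace=True, it returns a new DataFrame, so df itself stays unchanged.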
CodePudding user response:
You should probably prefer assign here:
df.assign(y='num')
With multiple assignments, passing the column names as strings:
df.assign(**{'x': 2, 'y': 'num'})
output:
   a  x    y
0  1  2  num
1  2  2  num
2  3  2  num
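One detail worth keeping in mind: assign returns a new DataFrame and leaves the original untouched, so the result has to be captured. A small sketch, reusing the frame from the question (the name out is just for illustration):

```python
import pandas as pd

df = pd.DataFrame(dict(a=[1, 2, 3]))

# Column names are given as dictionary keys, so they can come from variables.
new_columns = {'x': 2, 'y': 'num'}
out = df.assign(**new_columns)

print(out)
print(list(df.columns))  # the original frame still has only column a
```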