pandas df.eval gives ValueError: data type must provide an itemsize

Time:09-09

Here is a short, simple snippet that reproduces the issue I'm having:

import pandas as pd

df = pd.DataFrame(dict(a=[1, 2, 3]))
df = df.eval('x=2')  # this one is ok
df.eval('y="num"')  # this one fails

The error that I get is:

ValueError: data type must provide an itemsize

What is the issue, and how can I make it work?
It wasn't like this in older pandas versions...


I know that I can replace it with:

df['y']="num"
# or
df.assign(y='num')

But this is not the answer that I need...

I also tried replacing "num" with:

np.str_("num")

That does have an .itemsize attribute, but it didn't help...

Note that df.query, with a different expression, gives me the same error, and that is what I'm actually trying to solve here. I'm just assuming it's the same issue.

CodePudding user response:

Use the engine='python' parameter:

print(df.eval('y="num"', engine='python'))
   a  x    y
0  1  2  num
1  2  2  num
2  3  2  num
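
A minimal, self-contained sketch of the workaround above. It assumes the failure comes from the default evaluation engine rejecting the bare string constant, which this thread does not confirm. Since df.query forwards extra keyword arguments to the same evaluator, engine='python' can be passed there as well; the filter expression below is a made-up example, not the asker's actual query.

```python
import pandas as pd

df = pd.DataFrame(dict(a=[1, 2, 3]))
df = df.eval('x=2')

# engine='python' evaluates the expression with plain Python operations
result = df.eval('y="num"', engine='python')

# df.query accepts the same keyword; this filter is hypothetical,
# chosen only to show a string comparison inside a query
matches = result.query('y == "num" and a > 1', engine='python')
print(matches)
```

If the query still fails with engine='python', the two errors probably have different causes and the query expression itself would need to be posted.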

CodePudding user response:

You should probably prefer assign here:

df.assign(y='num')

For multiple assignments, or when the column names are held as strings, unpack a dict:

df.assign(**{'x': 2, 'y': 'num'})

output:

   a  x    y
0  1  2  num
1  2  2  num
2  3  2  num
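
One thing worth noting about the assign approach: it returns a new DataFrame and leaves the original untouched, so the result has to be bound to a name. A small sketch (the variable name out is just for illustration):

```python
import pandas as pd

df = pd.DataFrame(dict(a=[1, 2, 3]))

# assign does not modify df in place; keep the returned frame
out = df.assign(**{'x': 2, 'y': 'num'})

print(list(df.columns))   # original columns only
print(list(out.columns))  # original columns plus the new ones
```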