My question is similar to this one, but I'm using polars.
Environment: Python 3.8, polars >=0.13.24
I have a CSV file to parse every 500 ms, but it may be reset by another program. When it is reset by being reopened, polars throws exceptions.NoDataError: empty csv and the program exits.
What I've tried is to wrap my read_csv call in a try-except block:
# This code is inside a function body
try:
    result = pl.read_csv(result_file_name)
    # do some transformation with the dataframe
    return result
except:
    # return an empty dataframe
    return pl.DataFrame(
        None, ["column", "names", "as", "it", "exist"]
    )
But it still throws the exception. I'm not very familiar with Python, so I don't know how to make it fall into the except branch and return the empty dataframe.
Update (more detail):
The above code is in a function named parse_result, which parses the CSV file into a polars.DataFrame. It is called in the calculate method of a class named UpdateData:
class UpdateData:
    def __init__(
        self, ax: Axes, trace_file_name: str, result_file_name: str, title: str
    ):
        self.trace = parse_trace(trace_file_name)
        self.result_file_name = result_file_name
        self.ax = ax
        self.lines = []
        for i in range(2):
            self.lines.append(self.ax.plot([], [], label=f"{i}")[0])
        # plot parameters
        # set ax parameters
        # ...

    def __call__(self, frame):
        x, y = self.calculate()
        self.lines[0].set_data(x, y[1])
        self.lines[1].set_data(x, y[2])
        return self.lines

    def calculate(self):
        result = parse_result(self.result_file_name)
        # calculate x and y from result
        # not important here
        return x, y


# argparse code (not important)
if __name__ == "__main__":
    args = parser.parse_args()
    fig, ax = plt.subplots()
    update_data = UpdateData(ax, args.trace, args.result, args.title)
    anim = FuncAnimation(fig, update_data, interval=500)
    plt.show()
I defined the function parse_result pretty much like cbilot's answer, and it works well on its own. But when I use it in UpdateData to draw an animation via matplotlib, the error occurs:
  File "/home/duskmoon/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 907, in _start
    self._init_draw()
  File "/home/duskmoon/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 1696, in _init_draw
    self._draw_frame(frame_data)
  File "/home/duskmoon/.local/lib/python3.8/site-packages/matplotlib/animation.py", line 1718, in _draw_frame
    self._drawn_artists = self._func(framedata, *self._args)
  File "./liveshow.py", line 97, in __call__
    x, y = self.calculate()
  File "./liveshow.py", line 107, in calculate
    result = parse_result(self.result_file_name)
  File "./liveshow.py", line 72, in parse_result
    result = pl.read_csv(result_file_name)
  File "/home/duskmoon/.local/lib/python3.8/site-packages/polars/io.py", line 333, in read_csv
    df = DataFrame._read_csv(
  File "/home/duskmoon/.local/lib/python3.8/site-packages/polars/internals/frame.py", line 587, in _read_csv
    self._df = PyDataFrame.read_csv(
exceptions.NoDataError: empty csv
CodePudding user response:
You can try:
result = pl.read_csv(result_file_name, ignore_errors=True, columns=["column", "names", "as", "it", "exist"])
Reference: https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.read_csv.html
CodePudding user response:
You can check the length of the dataframe:
result = pl.read_csv(result_file_name)
if result.height == 0:
    quit()
    # or anything else you want to do
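Note that a length check like this only runs if the read itself succeeds. If the file is completely empty at the moment of reading, a check on the file size before parsing may be more useful. The following is a minimal sketch under that assumption; `file_has_data` and `read_or_default` are hypothetical helper names, and the reader is passed in as an argument (for example `pl.read_csv`) so the sketch runs without polars installed.

```python
import os


def file_has_data(path):
    """Return True if `path` exists and contains at least one byte."""
    try:
        return os.path.getsize(path) > 0
    except OSError:
        # Missing or momentarily inaccessible file: treat it as "no data".
        return False


def read_or_default(path, reader, make_default):
    """Call reader(path) only when the file has data, else build the default.

    `reader` and `make_default` are placeholders (assumptions), e.g.
    `pl.read_csv` and a function returning an empty DataFrame with the
    expected columns.
    """
    if not file_has_data(path):
        return make_default()
    return reader(path)
```

Because the other program can truncate the file between the size check and the read, this guard reduces the number of failures but probably does not replace a try-except around the read.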
CodePudding user response:
Does the error look like this?
SyntaxError: 'return' outside function
If so, it means that you are trying to use a return statement that is not part of a function definition.
The return statement can only be used in a function definition, such as:
import polars as pl


def my_function(result_file_name):
    try:
        result = pl.read_csv(result_file_name)
        # do some transformation with the dataframe
        return result
    except:
        # return an empty dataframe
        result = pl.DataFrame(
            None, ["column", "names", "as", "it", "exist"]
        )
        return result


my_function("/tmp/tmp.csv")
>>> my_function("/tmp/tmp.csv")
shape: (0, 5)
┌────────┬───────┬─────┬─────┬───────┐
│ column ┆ names ┆ as  ┆ it  ┆ exist │
│ ---    ┆ ---   ┆ --- ┆ --- ┆ ---   │
│ f32    ┆ f32   ┆ f32 ┆ f32 ┆ f32   │
╞════════╪═══════╪═════╪═════╪═══════╡
└────────┴───────┴─────┴─────┴───────┘
However, if you are not in a function definition, you can just assign the result variable, without a return statement:
try:
    result = pl.read_csv(result_file_name)
    # do some transformation with the dataframe
except:
    # fall back to an empty dataframe
    result = pl.DataFrame(
        None, ["column", "names", "as", "it", "exist"]
    )
print(result)
shape: (0, 5)
┌────────┬───────┬─────┬─────┬───────┐
│ column ┆ names ┆ as  ┆ it  ┆ exist │
│ ---    ┆ ---   ┆ --- ┆ --- ┆ ---   │
│ f32    ┆ f32   ┆ f32 ┆ f32 ┆ f32   │
╞════════╪═══════╪═════╪═════╪═══════╡
└────────┴───────┴─────┴─────┴───────┘
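For the animation in the question, a frame that arrives while the file is being reset could also reuse the data from the previous frame instead of an empty dataframe. The sketch below illustrates that idea; `LastGoodReader` is a hypothetical class, not part of polars or matplotlib, and the reader is injected (for example `pl.read_csv`) so the example runs without polars installed. It catches `Exception` rather than using a bare `except:`, so that Ctrl+C still stops the program.

```python
class LastGoodReader:
    """Wrap a reader so a failed read falls back to the last successful result."""

    def __init__(self, reader, initial):
        self._reader = reader  # assumption: e.g. pl.read_csv
        self._last = initial   # returned until the first successful read

    def __call__(self, path):
        try:
            self._last = self._reader(path)
        except Exception as exc:
            # Keep the previous data and report what went wrong.
            print(f"read failed, reusing previous data: {exc!r}")
        return self._last
```

In the question's code, an instance could be created once (with an empty dataframe as `initial`) and called from parse_result in place of the direct `pl.read_csv` call, so every 500 ms tick either refreshes the data or redraws the previous frame.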