I want to read my CSV file first: https://github.com/hamzaal014/file/blob/main/file.csv
The .csv file contains two columns, X and Y. Here is my script:
import numpy as np
from pandas import DataFrame as df
import csv
origin_data = open("file.csv", "r")
dato = list(csv.reader(origin_data, delimiter=","))
print(dato)
rowcount = 0
#iterating through the whole file
for row in dato:
    rowcount += 1
#printing the result
#_ print("Number of lines present:-", rowcount)
print(rowcount)
dati = df(dato, columns=['x', 'y'])
window = 6
roll_avg = dati.rolling(window).mean()
roll_avg_cumulative = dati['y'].cumsum()/np.arange(1, 25)
print(roll_avg_cumulative)
But my script is not working.
Error:
Traceback (most recent call last):
File "/home/haz/miniconda39/lib/python3.9/site-packages/pandas/core/ops/array_ops.py", line 163, in _na_arithmetic_op
result = func(left, right)
File "/home/haz/miniconda39/lib/python3.9/site-packages/pandas/core/computation/expressions.py", line 239, in evaluate
return _evaluate(op, op_str, a, b) # type: ignore[misc]
File "/home/haz/miniconda39/lib/python3.9/site-packages/pandas/core/computation/expressions.py", line 128, in _evaluate_numexpr
result = _evaluate_standard(op, op_str, a, b)
File "/home/haz/miniconda39/lib/python3.9/site-packages/pandas/core/computation/expressions.py", line 69, in _evaluate_standard
return op(a, b)
TypeError: unsupported operand type(s) for /: 'str' and 'int'
CodePudding user response:
csv.reader returns every field as a string, and your script never converts those strings into numbers, so the division in the cumulative average fails. You can fix it by passing a dtype when you build the DataFrame:
dati = df(dato, columns=['x', 'y'], dtype=float)
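To illustrate the point about strings, here is a minimal sketch that uses a small in-memory CSV in place of file.csv (the sample values are made up for illustration). It does the conversion after the frame is built, with astype(float), which has the same effect as asking for floats at construction time:

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for file.csv: two columns, no header row.
sample = io.StringIO("1,2.0\n2,4.0\n3,9.0\n")
rows = list(csv.reader(sample, delimiter=","))

# csv.reader yields every field as a str, so this frame holds text.
as_text = pd.DataFrame(rows, columns=["x", "y"])
print(type(rows[0][0]))

# Convert the numeric strings to floats before doing any arithmetic.
as_float = as_text.astype(float)
print(as_float.dtypes)
print(as_float["y"].cumsum())
```

Once the columns are numeric, cumsum() and the division behave as expected.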
If it is helpful, I would also like to point out a few things that may improve your code:
- You are using pandas as your container for data, so I would suggest using the pandas function pandas.read_csv to load the CSV file into a DataFrame instead of doing it manually.
- The row count can be obtained with the built-in len function, without iterating over all rows.
- Please stick to the widely used import aliases (import pandas as pd) instead of creating your own. This makes your code more readable to everyone else.
So your code can become:
import numpy as np
import pandas as pd
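# Read both columns as floats; names= supplies the column labels.
dati = pd.read_csv("file.csv", sep=",", dtype=float, names=["x", "y"])
rowcount = len(dati)
window = 6
# Rolling mean over a 6-row window.
roll_avg = dati.rolling(window).mean()
# Running mean of y: cumulative sum divided by 1, 2, ..., rowcount.
roll_avg_cumulative = dati["y"].cumsum() / np.arange(1, rowcount + 1)
print(roll_avg_cumulative)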
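A divisor such as np.arange(1, 25) only matches a file with exactly 24 rows. As a sketch, assuming the goal is the running mean of y, the divisor can be built from the actual row count, and pandas' expanding().mean() computes the same quantity directly. The sample values below are made up for illustration:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for file.csv: two numeric columns, no header row.
sample = io.StringIO("1,2.0\n2,4.0\n3,9.0\n4,1.0\n")
dati = pd.read_csv(sample, names=["x", "y"], dtype=float)

# Divisor 1, 2, ..., n built from the actual number of rows.
n = len(dati)
manual = dati["y"].cumsum() / np.arange(1, n + 1)

# Built-in equivalent: mean of all values seen so far.
builtin = dati["y"].expanding().mean()

print(manual)
print(builtin)
```

Both series should agree, so either form can be used; the expanding() version avoids managing the divisor by hand.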
CodePudding user response:
What went wrong in your code:
- All values are loaded as str.
A simple way:
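import numpy as np
import pandas as pd
# header=None: the first line is data, and the columns are labelled 0 and 1.
dati = pd.read_csv('file.csv', header=None)
window = 6
roll_avg = dati.rolling(window).mean()
print(dati[1].cumsum())
# Divide by 1, 2, ..., n instead of hard-coding the row count.
roll_avg_cumulative = dati[1].cumsum() / np.arange(1, len(dati) + 1)
print(roll_avg_cumulative)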
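With header=None, pandas labels the columns with integers, which is why the code above indexes dati[1] rather than dati['y']. A small sketch with made-up in-memory data (standing in for file.csv) shows the labels and the inferred numeric types:

```python
import io

import pandas as pd

# Hypothetical stand-in for file.csv: two columns, no header row.
sample = io.StringIO("1,2.0\n2,4.0\n3,9.0\n")
dati = pd.read_csv(sample, header=None)

# Column labels are the integers 0 and 1.
print(dati.columns.tolist())

# read_csv infers numeric dtypes, so no manual conversion is needed.
print(dati.dtypes)
```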