I am learning about data handling in Python and am trying to process weather data for each day of October. The data comes from local CSV files. I iterate over each day of the month and, inside that loop, over each hour. A class object handles the data for each day and is initialized at the start of each day's iteration. The issue is that this object doesn't get re-initialized on each iteration. I have made some test objects below it. The one difference between the test objects and the real object is that the real object's class lives in a subfolder containing all my data handling. The object contains a list of data container class objects, whose class is defined in the same directory.
-- Main class --
`
import os
import csv
from Data.VejrData import *

#Test class
class Ekstra:
    streng: str = ""

class Vejr:
    currentDirectory: str = os.getcwd()
    dataDirectory: str = 'Data/vejrdata/'
    folder = os.listdir(currentDirectory + "/" + dataDirectory)
    days: list[VejrData] = []

    def __init__(self):
        self.main()

    def fetchData(self):
        for file in self.folder:
            vejrData = VejrData()
            #Error!
            #For each file vejrData should be reset, but doesn't
            #For each file (31 iterations), all rows from the 32 iterations are being added (25 each * 31 = 775 sets of data per day)
            #Testing
            string = ""
            string = "New day"
            print(string)
            #Testing if the same happens with an empty class object.
            ekstra = Ekstra()
            ekstra.streng = "New day"
            print(ekstra.streng)
            #This variable from class object does get reset.
            #VejrData doesn't.
            #Opens datafile from a specific date.
            with open(f"{self.dataDirectory}{file}", 'r') as data:
                csvreader = csv.reader(data)
                number = 0
                for row in csvreader:
                    number += 1
                    index: int = 0
                    #Iterates over each element of data in a string.
                    for textData in row[0].split(";"):
                        vejrData.constructData(index, textData)
                        index += 1
            self.days.append(vejrData)

    def main(self):
        self.fetchData()
        print(len(self.days))
        for day in self.days:
            print(len(day.timeData))

vejr = Vejr()
`
Printing from the main function results in: len(self.days) = 32. The length of the set of hourly data in each day is 775 for all of the 32 days. 775 / 25 (24 + 1 header) = 31.
-- Data Handling --
`
from .TimeData import TimeData
#https://docs.python.org/3/tutorial/modules.html#intra-package-references
class VejrData:
    #Lists containing hourly data
    timeData: list[TimeData] = []
    #Variable retrieving data before being added to list.
    DataBuilder: None

    #Distributing data from index: index 0 = time data, index 1 = prec data ...
    def constructData(self, index: int, data):
        #https://www.freecodecamp.org/news/python-switch-statement-switch-case-example/
        match index:
            case 0:
                self.DataBuilder = TimeData()
                self.DataBuilder.tid = data
            case 1:
                self.DataBuilder.prec = data
            case 2:
                self.DataBuilder.metp = data
            case 3:
                self.DataBuilder.megrtp = data
            case 4:
                self.DataBuilder.mesotp10 = data
            case 5:
                self.DataBuilder.meanwv = data
                self.timeData.append(self.DataBuilder)
                self.DataBuilder = None
            case _:
                print(f"Error - Index not at index: {index}, is out of range.")
`
-- Container class --
`
class TimeData:
    tid: int
    prec: float
    metp: float
    megrtp: float
    mesotp10: float
    meanwv: float
`
The structure is as follows: /Vejr.py, /Data/VejrData and /Data/TimeData. No path-related errors occur. I could just assign it a new variable at the end of each loop, but that seems off, since it would be doing what the loop is supposed to do.
I have tried testing whether re-initialization is simply not meant to happen in for loops. I created some objects to see whether they would be affected. I started with a string variable. As the string variable was re-initialized, I tried another class object defined in the same file, changed a variable inside it, and saw it re-initialize as well.
So variables and class objects do appear to be re-initialized in each iteration.
CodePudding user response:
...are intended to be re-initialized in each iteration
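VejrData.timeData is a class attribute, and because its default is a mutable list, every instance of VejrData points to the exact same list. A new VejrData object really is created on each iteration, but appending through any instance mutates that one shared list.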
>>> v = VejrData()
>>> w = VejrData()
>>> v.timeData.append('x')
>>> v.timeData
['x']
>>> w.timeData
['x']
>>> v.timeData is w.timeData
True
>>>
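This is also likely why the Ekstra test in the question seemed to "reset": assigning to an attribute through an instance creates a new instance attribute, while calling append mutates the object stored on the class. A minimal sketch of the difference, where `Shared` is a hypothetical stand-in for VejrData and not part of the original code:

```python
class Ekstra:
    streng: str = ""      # class attribute holding an immutable str


class Shared:             # hypothetical stand-in for VejrData
    items: list = []      # class attribute holding a mutable list


a, b = Ekstra(), Ekstra()
a.streng = "New day"      # assignment: creates an instance attribute on a only
print(b.streng == "")     # True, b still reads the class-level ""

c, d = Shared(), Shared()
c.items.append("x")       # mutation: changes the single list stored on the class
print(d.items)            # ['x'], d reads that same list
```

So the test with Ekstra could not reproduce the problem, because it only ever rebinds a name and never mutates a shared object.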
Make timeData an instance attribute.
class VejrData:
    #Lists containing hourly data
    # timeData: list[TimeData] = []
    #Variable retrieving data before being added to list.
    DataBuilder: None

    def __init__(self):
        self.timeData: list[TimeData] = []
>>> v = VejrData()
>>> w = VejrData()
>>> v.timeData.append('x')
>>> v.timeData
['x']
>>> w.timeData
[]
>>> v.timeData is w.timeData
False
>>>
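As an alternative to writing `__init__` by hand, the standard library's dataclasses module can build a fresh list per instance via `field(default_factory=list)`. This is only a sketch and not part of the original answer; `VejrDataSketch` is a hypothetical name used so it does not clash with the real class:

```python
from dataclasses import dataclass, field


@dataclass
class VejrDataSketch:  # hypothetical name, stands in for VejrData
    # default_factory is called once per instance, so each object gets its own list
    timeData: list = field(default_factory=list)


v = VejrDataSketch()
w = VejrDataSketch()
v.timeData.append("x")
print(v.timeData)                 # ['x']
print(w.timeData)                 # []
print(v.timeData is w.timeData)   # False
```

A dataclass also refuses a bare `timeData: list = []` default with a ValueError when the class is defined, which catches this mistake early.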
CodePudding user response:
Thanks for the quick response to my question. I was indeed running into mutable defaults. After setting the initial value to None and assigning the list later, I got 32 unique sets of data instead of one shared set.
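The exact code for that change isn't shown, so the following is only a guess at what "None first, list later" might look like. `VejrDataSketch` and `add` are hypothetical names used for illustration:

```python
from typing import Optional


class VejrDataSketch:  # hypothetical stand-in for VejrData
    # None is immutable, so sharing it between instances is harmless
    timeData: Optional[list] = None

    def add(self, item):  # hypothetical helper
        if self.timeData is None:
            # Assignment through self creates a per-instance list
            self.timeData = []
        self.timeData.append(item)


v = VejrDataSketch()
w = VejrDataSketch()
v.add("x")
print(v.timeData)   # ['x']
print(w.timeData)   # None
```

This works for the same reason as the `__init__` version: the list is created by an assignment on the instance instead of living on the class.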