Python list size changes once it gets out of for loop

Time:05-02

MFCC_coeffs = 12
train_data = []
current_block = []
MAX_ROWS = 29
row_counter = 0

for line in f:
    element = line.split(' ')
    if(len(element) == MFCC_coeffs + 1):
        row_counter = row_counter + 1
        element = element[:-1]
        element = [float(i) for i in element]
        current_block.append(element)
        # print("HERE")
        # print(f"element = {element}, length = {len(element)}")

    elif(len(element) == 1):
        if row_counter<MAX_ROWS:
            padding = MAX_ROWS-row_counter
            while(padding):
                pad_row = [0]*MFCC_coeffs
                current_block.append(pad_row)
                padding = padding-1
            
        row_counter = 0    
        # print(f"element = {element}, length = {len(element)}")
        # print(f"current_block = {current_block}, shape = {np.shape(current_block)}")
        train_data.append(current_block)
        # print(f"train_data shape = {np.shape(train_data)}") ## PRINTS CORRECT SIZE AT THE END OF THE FILE. E.G. (370,29,12)
        current_block.clear()
        continue
    else:
        assert False, "Wrong Data"  # assert("Wrong Data") never fails: a non-empty string is truthy
    
print(f"train_data = {train_data}, shape = {np.shape(train_data)}")    ## SIZE TO (370,0)

In the block of code above, I read a text file and store its contents in the train_data variable. As I go through the file line by line, train_data is appended to and reaches shape (370, 29, 12) by the end, which is correct. However, once I leave the file-reading loop, the final shape of train_data becomes (370, 0). I have marked in capital-letter comments where the output is correct and where it is wrong.

CodePudding user response:

The issue is that you're appending a reference to the same current_block list to train_data every time, not a new list. Consider this example:

current_block = []
train_data = []

current_block = [1]
train_data.append(current_block)  # train_data = [[1]]
current_block.clear()  # train_data = [[]]

current_block.append(2) # train_data = [[2]]
train_data.append(current_block)  # train_data = [[2], [2]]
current_block.clear()  # train_data = [[], []]

As you can see, current_block.clear() empties the list in place, and because every entry in train_data refers to that same list object, all of those entries become empty as well.
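To make the shared reference visible, here is a small sketch that checks object identity with the `is` operator. The variable `same_object` is introduced only for this illustration and is not part of the original code.

```python
current_block = []
train_data = []

current_block.append(1)
train_data.append(current_block)   # stores a reference, not a copy
current_block.clear()              # empties the one shared list

current_block.append(2)
train_data.append(current_block)   # stores the same reference again

# Both entries in train_data are the very same object as current_block.
same_object = train_data[0] is train_data[1] is current_block
print(same_object)  # True
print(train_data)   # [[2], [2]]
```

Since there is only one list object involved, any in-place change to it (append, clear) shows up through every reference.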

The solution is to append a copy to train_data:

train_data.append(current_block[:])

That way, the next current_block.clear() won't clear any data already inside train_data.
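For reference, here is a minimal sketch of how the reading loop could look with the fix applied. It assumes the same line format as in the question (values separated by single spaces with a trailing token, and a one-token line separating blocks). The function name `read_blocks`, its parameters, and the `sample` input are made up for this illustration. Instead of copying and then clearing, this variant rebinds current_block to a fresh list, which has the same effect of leaving the stored block untouched.

```python
def read_blocks(lines, n_coeffs=12, max_rows=29):
    """Group rows into zero-padded blocks; a one-token line ends a block."""
    train_data = []
    current_block = []
    for line in lines:
        element = line.split(' ')
        if len(element) == n_coeffs + 1:
            # Data row: drop the trailing token and convert the rest to float.
            current_block.append([float(x) for x in element[:-1]])
        elif len(element) == 1:
            # Separator: pad the block up to max_rows, then store it.
            while len(current_block) < max_rows:
                current_block.append([0.0] * n_coeffs)
            train_data.append(current_block)
            # Rebind to a NEW list so the stored block is not modified later.
            current_block = []
        else:
            raise ValueError(f"Wrong data: {line!r}")
    return train_data


# Tiny hypothetical input: 2 coefficients per row, blocks padded to 3 rows.
sample = ["1 2 \n", "3 4 \n", "\n", "5 6 \n", "\n"]
blocks = read_blocks(sample, n_coeffs=2, max_rows=3)
print(blocks)
```

With the real file you would pass the open file object as `lines` and keep the defaults of 12 coefficients and 29 rows.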
