MFCC_coeffs = 12
train_data = []
current_block = []
MAX_ROWS = 29
row_counter = 0
for line in f:
    element = line.split(' ')
    if(len(element) == MFCC_coeffs + 1):
        row_counter = row_counter + 1
        element = element[:-1]
        element = [float(i) for i in element]
        current_block.append(element)
        # print("HERE")
        # print(f"element = {element}, length = {len(element)}")
    elif(len(element) == 1):
        if row_counter < MAX_ROWS:
            padding = MAX_ROWS - row_counter
            while(padding):
                pad_row = [0]*MFCC_coeffs
                current_block.append(pad_row)
                padding = padding - 1
        row_counter = 0
        # print(f"element = {element}, length = {len(element)}")
        # print(f"current_block = {current_block}, shape = {np.shape(current_block)}")
        train_data.append(current_block)
        # print(f"train_data shape = {np.shape(train_data)}") ## PRINTS CORRECT SIZE AT THE END OF THE FILE, E.G. (370, 29, 12)
        current_block.clear()
        continue
    else:
        assert("Wrong Data")
print(f"train_data = {train_data}, shape = {np.shape(train_data)}") ## SHAPE RESETS TO (370, 0)
In the previous block of code, I am reading a text file and storing the result in a train_data variable. As I go through the text file line by line, train_data is appended to and reaches shape (370, 29, 12) by the end, which is correct. However, once I get out of the file-reading block of code, the final shape of train_data is reset to (370, 0). I have commented in block letters the part where the output is correct and where it is wrong.
CodePudding user response:
The issue is that you keep appending the same reference, current_block, to train_data. Consider this example:
current_block = []
train_data = []
current_block = [1]
train_data.append(current_block) # train_data = [[1]]
current_block.clear() # train_data = [[]]
current_block.append(2) # train_data = [[2]]
train_data.append(current_block) # train_data = [[2], [2]]
current_block.clear() # train_data = [[], []]
As you can see, current_block.clear() empties the list in place — and with it every reference to that list already stored in train_data.
The solution is to append a copy to train_data instead:
train_data.append(current_block[:])
That way, the next current_block.clear() won't clear any data already inside train_data.