How to call values by column name after reading in CSV

Time:12-21

I have read in a CSV file, assigning each element of a row to a variable named after its column.

For example:

data = []

with open('sample.txt', 'r') as file:
    for line in file.readlines():
        col1, col2, col3 = line.split('\t')
        data.append([col1, col2, col3])

Now, if I want to perform some operation on a specific column in data, such as col1, how could I do that?

The code below does not work:

for line in data:
    print(col1)

Instead, I would need to hardcode the index of the element in question, like so:

for line in data:
    print(line[0])

Because I defined col1, col2, and col3 while reading the CSV, I thought I should be able to refer to the elements of my list by these column names.

Is it possible to do this? I do not want to use imported modules/packages (like pandas).

CodePudding user response:

The names col1, col2, and col3 are ordinary variables that get reassigned on every pass through the for loop. The list data stores only their values, not the names, so once the loop finishes the variables hold just the last row's values and are not linked to the rows in data. Here are two suggestions I have:

  1. The quickest way to manipulate a column might be to insert a statement before you call data.append(). In other words, if you wanted to add 5 to column 2, you could do something like this:
data = []

with open('sample.txt', 'r') as file:
    for line in file.readlines():
        col1, col2, col3 = line.split('\t')

        col2 = int(col2) + 5  # split() returns strings, so convert before adding; modify the column before appending

        data.append([col1, col2, col3])
  2. If you need all of the data to be collected first and want to modify it afterward in a separate step, you can start another for loop. Keep in mind that this loops over the data twice instead of once, which adds a second pass over every row. You can use a Python feature called "list unpacking" to get your column variables back, like so:
data = []

with open('sample.txt', 'r') as file:
    for line in file.readlines():
        col1, col2, col3 = line.split('\t')
        data.append([col1, col2, col3])

modified_data = []

for row in data:
    col1, col2, col3 = row  # This is list unpacking
    
    ...  # (do something with the columns here)

    modified_data.append([col1, col2, col3])
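
If you would rather look values up by name than by position, one possible alternative is to store each row as a dictionary keyed by column name. This is only a sketch: it assumes every line has exactly three tab-separated fields, and the names COLUMN_NAMES, parse_rows, and sample_lines are made up for illustration (the inline sample lines stand in for the contents of sample.txt). It needs no imports.

```python
# Hypothetical column names, chosen to match the variables in the question.
COLUMN_NAMES = ['col1', 'col2', 'col3']


def parse_rows(lines, names=COLUMN_NAMES):
    """Turn tab-separated lines into a list of dicts keyed by column name."""
    rows = []
    for line in lines:
        # Strip the trailing newline so it does not end up in the last column.
        values = line.rstrip('\n').split('\t')
        # Pair each name with the value in the same position.
        rows.append(dict(zip(names, values)))
    return rows


# Inline sample data standing in for the lines of sample.txt (an assumption).
sample_lines = ['a\t1\tx\n', 'b\t2\ty\n']
data = parse_rows(sample_lines)

for row in data:
    print(row['col1'])  # access by name instead of row[0]

# Collect one whole column by name.
col2_values = [row['col2'] for row in data]
```

With a real file, you could pass the open file object instead of the sample list, for example data = parse_rows(file) inside the with open('sample.txt', 'r') as file: block, since iterating over a file yields its lines. All values are still strings, so convert them (for example with int()) before doing arithmetic.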