I have read in a CSV file by unpacking each element of a row into variables named after the columns.
For example:
data = []
with open('sample.txt', 'r') as file:
    for line in file.readlines():
        col1, col2, col3 = line.split('\t')
        data.append([col1, col2, col3])
Now, if I want to perform some operation on a specific column in data, such as col1, how could I do that?
The code below does not work:
for line in data:
    print(col1)
Instead, I would need to hardcode the index of the element in question, like so:
for line in data:
    print(line[0])
I thought that because I read the CSV in a way that defined col1, col2, and col3, I should be able to refer to the elements of my list by these column names.
Is it possible to do this? I do not want to use imported modules/packages (like pandas).
CodePudding user response:
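You've defined the variables col1, col2, and col3 inside the for loop, and they are reassigned on every iteration. They are not attached to the lists you append to data, so once the loop has finished they only hold the values from the last line of the file (Python's for loop does not create its own scope, so the names still exist, but they say nothing about the other rows). Here are two suggestions I have: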
- The quickest way to perform a manipulation of your column data might be to insert a statement before you do data.append(). In other words, if you wanted to add 5 to column 2, you could do something like this:
data = []
with open('sample.txt', 'r') as file:
    for line in file.readlines():
        col1, col2, col3 = line.split('\t')
        # Modify column before appending (assumes col2 holds an integer;
        # split() returns strings, so convert before adding)
        col2 = int(col2) + 5
        data.append([col1, col2, col3])
- If you need all of the data to be collected first, and then you'd like to modify it afterward in a separate step, you can start another for loop. Keep in mind that this loops over the data twice instead of once, which adds some run time. You can use a Python feature called "list unpacking" (more generally, sequence unpacking) to get your column variables back, like so:
data = []
with open('sample.txt', 'r') as file:
    for line in file.readlines():
        col1, col2, col3 = line.split('\t')
        data.append([col1, col2, col3])

modified_data = []
for row in data:
    col1, col2, col3 = row  # This is list unpacking
    ...  # (do something with columns here)
    modified_data.append([col1, col2, col3])