Home > Enterprise >  How to read specific columns in the csv file?
How to read specific columns in the csv file?

Time:04-21

I have lots of live data coming from sensor. Currently, I stored the data in a csv file as following:

0  2  1  437   464  385  171    0:44:4   dog.jpg
1  1  3  452   254  444  525    0:56:2   cat.jpg
2  3  2  552   525  785  522    0:52:8   car.jpg
3  8  4  552   525  233  555    0:52:8   car.jpg
4  7  5  552   525  433  522    1:52:8   phone.jpg
5  9  3  552   525  555  522    1:52:8   car.jpg
6  6  6  444   392  111  232    1:43:4   dog.jpg
7  1  1  234   322  191  112    1:43:4   dog.jpg
.
.

. .

Third column has numbers between 1 to 6. I want to read information of columns #4 and #5 for all the rows that have number 2 and 5 in the third columns. I also want to write them in another csv file line by line every 2 second, one line at the time. I do so because I have another code which would go through the data and read the data from there. I was wondering how could I write the information for the lines that have 3 and 5 in their 3rd column? Please advise!

for example:

2  552   525  
5  552   525  
......
......
.....

.

import csv

with open('newfilename.csv', 'w') as f2:
    with open('mydata.csv', mode='r') as infile:
        reader = csv.reader(infile)  # no conversion to list
        header = next(reader)  # get first line
        for row in reader:  # continue to read one line per loop

            if row[5] == 2 & 5:

CodePudding user response:

I think you'd just need to change your if statement to be able to get the rows you want.

for example:

import csv

with open('newfilename.csv', 'w') as f2:
    with open('mydata.csv', mode='r') as infile:
        reader = csv.reader(infile)  # no conversion to list
        header = next(reader)  # get first line
        for row in reader:  # continue to read one line per loop

            if row[5] in [2,5]:

inside the if, you'll get the rows that have 2 or 5

CodePudding user response:

The third column has index 2 so you should be checking if row[2] is one of '2' or '5'. I have done this by defining the set select = {'2', '5'} and checking if row[2] in select.

I don't see what you are using header for but I assume you have more code that processes header somewhere. If you don't need header and just want to skip the first line, just do next(reader) without assigning it to header but I have kept header in my code under the assumption you use it later.

We can use time.sleep(2) from the time module to help us write a row every 2 seconds.

Below, "in.txt" is the csv file containing the sample input you provided and "out.txt" is the file we write to.

Code

import csv
import time

select = {'2', '5'}
with open("in.txt") as f_in, open("out.txt", "w") as f_out:
    reader = csv.reader(f_in)
    writer = csv.writer(f_out)
    header = next(reader)    
    for row in reader:
        if row[2] in select:
            print(f"Writing {row[2:5]} at {time.time()}")
            writer.writerow(row[2:5])
            time.sleep(2)

Output

Writing ['2', '552', '525'] at 1650526118.9760585
Writing ['5', '552', '525'] at 1650526120.9763758

"out.txt"

2,552,525
5,552,525

Input

"in.txt"

0,2,1,437,464,385,171,0:44:4,dog.jpg
1,1,3,452,254,444,525,0:56:2,cat.jpg
2,3,2,552,525,785,522,0:52:8,car.jpg
3,8,4,552,525,233,555,0:52:8,car.jpg
4,7,5,552,525,433,522,1:52:8,phone.jpg
5,9,3,552,525,555,522,1:52:8,car.jpg
6,6,6,444,392,111,232,1:43:4,dog.jpg
7,1,1,234,322,191,112,1:43:4,dog.jpg
  • Related