Can you read a CSV file as one column?

Time:09-07

I know this sounds silly, but is it possible to read a CSV file containing multiple columns and combine all the data into one column? Let's say I have a CSV file with 6 columns and they have different delimiters. Is it possible to read these files, but spit out the first 100 rows into one column, without specifying a delimiter? My understanding is that this isn't possible if using pandas.

I don't know if this helps, but to add context to my question, I'm trying to use Treeview from Tkinter to display the first 100 rows of a CSV file. The Treeview window should display this data as 1 column if a delimiter isn't specified. Otherwise, it will automatically split the data based on a delimiter from the user input.

This is the data I have:

[image not available]

This should be the result:

[image not available]

CodePudding user response:

You can use

with open('file.csv') as f: data = f.readlines()

to read the file line by line. Each element of data is one complete, unsplit line of the file.
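Since the question only needs the first 100 rows, the same idea can stop early instead of loading the whole file. This is a minimal sketch, not part of the original answer: the function name read_first_lines and its parameters are made up for illustration, and UTF-8 encoding is an assumption.

```python
from itertools import islice


def read_first_lines(path, limit=100, encoding="utf-8"):
    """Return up to `limit` lines from `path`, each as one unsplit string."""
    with open(path, "rt", encoding=encoding) as f:
        # islice stops after `limit` lines, so the rest of the file is never read.
        return [line.rstrip("\n") for line in islice(f, limit)]
```

Each returned string is a whole line, whatever delimiters it contains, so it can be shown as a single column.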

CodePudding user response:

As other answers have explained, there are various ways to read the first n lines of text from a file. But if you insist on using pandas, there is a trick you can use.

Find a character that never appears in your text and pass it as a dummy delimiter to read_csv(), so that each line is read as a single column. Use the nrows parameter to control the number of rows to read. Note that by default the first line becomes the column name; pass header=None if you want it treated as data:

import pandas as pd
df = pd.read_csv("myfile.csv", sep="~", nrows=100)

CodePudding user response:

Pandas isn't the only way to read a CSV file. There is also the built-in csv module in the Python standard library, as well as the basic built-in function open, which will work just as well. Both of these methods can produce one row of data at a time, as your question describes.

Using the open function:

filepath = "/path/to/file.csv"
with open(filepath, "rt", encoding="utf-8") as fd:
    header = next(fd)  # first line of the file
    for row in fd:
        # row is a string holding all the data for a single line, for example:
        # "Information,44775.4541667,MicrosoftWindowsSecurity,16384..."
        # Break out of the loop whenever you want to stop reading.
        print(row.rstrip("\n"))

or using the csv module:

import csv

with open("/path/to/file.csv", "rt", encoding="utf-8", newline="") as fd:
    reader = csv.reader(fd, delimiter=",")
    header = next(reader)
    for row in reader:
        # this time the row will be a list split by the delimiter, which
        # by default is a comma but you can change it in the call to the reader
        print(row)
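The two approaches can be combined into one helper that matches the behaviour described in the question: one column when no delimiter is given, split columns otherwise. This is only a sketch under assumptions. The name load_rows and its parameters are hypothetical, the header line is returned as an ordinary row, and UTF-8 encoding is assumed.

```python
import csv
from itertools import islice


def load_rows(path, delimiter=None, limit=100, encoding="utf-8"):
    """Return up to `limit` rows, each as a list of strings.

    With delimiter=None every row is a one-element list holding the raw line;
    otherwise each line is split by the csv module using that delimiter.
    """
    # newline="" is what the csv module expects for files it reads.
    with open(path, "rt", encoding=encoding, newline="") as f:
        if delimiter is None:
            # Keep each line whole and strip only the line ending.
            return [[line.rstrip("\r\n")] for line in islice(f, limit)]
        return list(islice(csv.reader(f, delimiter=delimiter), limit))
```

Because every row is a list in both cases, each one can be handed to a Treeview as the values of an item without special-casing the single-column display.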