How to separate .csv data into different columns

I have a text file with data which looks like this:

NCP_341_1834_0022.png 2 0 130 512 429

I would like to split the data into different columns with names like this:

['filename','class','xmin','ymin','xmax','ymax']

I have done this:

test_txt = pd.read_csv(r"../input/covidxct/train_COVIDx_CT-3A.txt")
test_txt.to_csv(r"../working/test/train.csv",index=None, sep='\t')
train = pd.read_csv("../working/test/train.csv")

However, when I download the .csv file, each line of data ends up in a single column instead of 6 columns. How can I fix this?

CodePudding user response：

Just set the right separator (',' by default):

test_txt = pd.read_csv(r"../input/covidxct/train_COVIDx_CT-3A.txt", sep=' ', header=None)

This applies if you are using train_COVIDx_CT-3A.txt from Kaggle.

Don't forget to set header=None, since the file has no header row. You can also pass names=['image', 'col1', 'col2', ...] to replace the default column names (0, 1, 2, ...).
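As a rough sketch of this approach, the snippet below reads two sample lines quoted in this thread from an in-memory buffer rather than the Kaggle file. The io.StringIO stand-in and the variable names are assumptions for illustration only; with the real file you would pass its path instead.

```python
import io

import pandas as pd

# Two sample lines in the same space-separated layout as the question's file.
# io.StringIO stands in for the real .txt path (an assumption for this demo).
sample = io.StringIO(
    "NCP_341_1834_0022.png 2 0 130 512 429\n"
    "NCP_96_1328_0032.png 2 9 94 512 405\n"
)

columns = ["filename", "class", "xmin", "ymin", "xmax", "ymax"]

# sep=" " splits each line on single spaces, header=None keeps the first
# line as data, and names= supplies the column labels.
df = pd.read_csv(sample, sep=" ", header=None, names=columns)

print(df.shape)
print(df.columns.tolist())
```

Because the separator is given at read time, the numeric columns are parsed as numbers directly and no later string splitting is needed.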

CodePudding user response：

To answer my own question: you can use str.split to split the single column into separate columns. I split it into 6 columns, one for each of my 6 labels. (The column name below is the first line of my file, which read_csv treated as the header.)

train[['filename', 'class','xmin','ymin','xmax','ymax']] = train['NCP_96_1328_0032.png 2 9 94 512 405'].str.split(' ', n=5, expand=True)
train.head()

Then just drop the column you don't need:

train = train.drop(train.columns[[0]], axis=1)
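For reference, here is a self-contained sketch of the str.split route, using two sample lines from this thread in an in-memory buffer. The column name "raw" and the io.StringIO stand-in are hypothetical, chosen for the demo. It reads with header=None so the first line stays in the data instead of becoming the column name.

```python
import io

import pandas as pd

# Sample lines standing in for the real file (an assumption for this demo).
sample = io.StringIO(
    "NCP_341_1834_0022.png 2 0 130 512 429\n"
    "NCP_96_1328_0032.png 2 9 94 512 405\n"
)

# The lines contain no commas, so each one lands in a single column.
# header=None keeps the first line as data; "raw" is a hypothetical name.
train = pd.read_csv(sample, header=None, names=["raw"])

columns = ["filename", "class", "xmin", "ymin", "xmax", "ymax"]

# Split on spaces; n and expand are passed as keywords, which recent
# pandas versions require.
train[columns] = train["raw"].str.split(" ", n=5, expand=True)

# drop returns a new DataFrame, so assign the result back.
train = train.drop(columns="raw")

# str.split yields strings, so convert the numeric columns explicitly.
numeric = columns[1:]
train[numeric] = train[numeric].astype(int)

print(train.head())
```

Compared with passing the separator to read_csv, this route needs the extra type conversion at the end, since every piece produced by str.split is a string.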