I have a text file with data which looks like this:
NCP_341_1834_0022.png 2 0 130 512 429
I would like to split the data into different columns with names like this:
['filename','class','xmin','ymin','xmax','ymax']
I have done this:
test_txt = pd.read_csv(r"../input/covidxct/train_COVIDx_CT-3A.txt")
test_txt.to_csv(r"../working/test/train.csv",index=None, sep='\t')
train = pd.read_csv("../working/test/train.csv")
However, when I download the .csv file, each data line ends up in a single column instead of 6 columns. How can I fix this?
CodePudding user response:
Just set the right separator. Your file is space-separated, but read_csv uses ',' by default:
test_txt = pd.read_csv(r"../input/covidxct/train_COVIDx_CT-3A.txt", sep=' ', header=None)
if you are using test_COVIDx_CT-3A.txt from Kaggle.
Don't forget to set header=None
since the file has no header row. You can also pass names=['image', 'col1', 'col2', ...]
to replace the default column names (0, 1, 2, ...).
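As a minimal sketch of the above, here is the same call applied to the two sample rows quoted in this thread. An in-memory buffer stands in for the real .txt file (an assumption made so the snippet runs on its own), and the column names are the ones from the question:

```python
import io

import pandas as pd

# The two sample rows quoted in this thread; the StringIO buffer is a
# stand-in for the real space-separated .txt file.
sample = io.StringIO(
    "NCP_341_1834_0022.png 2 0 130 512 429\n"
    "NCP_96_1328_0032.png 2 9 94 512 405\n"
)

columns = ['filename', 'class', 'xmin', 'ymin', 'xmax', 'ymax']

# sep=' ' splits on single spaces, header=None keeps the first row as data,
# and names= assigns the column labels.
train = pd.read_csv(sample, sep=' ', header=None, names=columns)
print(train.head())
```

If you save the result with to_csv and a non-default sep (such as '\t'), read it back with the same sep, otherwise everything lands in one column again.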
CodePudding user response:
Just to answer my own question: you can use str.split to split the single column into separate columns. I split it into 6 columns, one for each of my 6 labels. (The long column name is the first line of the file, which pandas read as the header.)
train[['filename', 'class', 'xmin', 'ymin', 'xmax', 'ymax']] = train['NCP_96_1328_0032.png 2 9 94 512 405'].str.split(' ', expand=True)
train.head()
Then just drop the column you don't need. drop returns a new DataFrame, so assign the result back:
train = train.drop(train.columns[[0]], axis=1)
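A self-contained sketch of this split-and-drop approach, using the two sample rows from this thread. The column name 'line' is a hypothetical placeholder for whatever name the single combined column has in your DataFrame:

```python
import pandas as pd

# One-column frame, like the result of reading the space-separated file
# with the default ',' separator. 'line' is a placeholder column name.
raw = pd.DataFrame({
    'line': [
        "NCP_341_1834_0022.png 2 0 130 512 429",
        "NCP_96_1328_0032.png 2 9 94 512 405",
    ]
})

columns = ['filename', 'class', 'xmin', 'ymin', 'xmax', 'ymax']

# Split each row on spaces into separate columns, then label them.
train = raw['line'].str.split(' ', expand=True)
train.columns = columns

# str.split yields strings, so convert the numeric columns explicitly.
numeric = columns[1:]
train[numeric] = train[numeric].astype(int)
print(train.head())
```

Building the new frame directly from the split result means there is no leftover combined column to drop afterwards.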