This script reads a txt file into a DataFrame, but I want the 'sep' argument to handle values that may be separated by one or more spaces. When I run the code below, I get many extra columns filled with NaN.
code:
df = pd.read_csv(data_file, header=None, sep=' ')
Example txt file:
blah blahh bl
blah3 blahhe ble
I want there to be just 3 columns, so that I get:
Col_a col_b col_c
blah blahh bl
blah3 blahhe ble
CodePudding user response:
You can use regex as the delimiter:
pd.read_csv(data_file, header=None, delimiter=r"\s+", names='Col_a Col_b Col_c'.split(' '))
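Or you can use the delim_whitespace=True argument, which treats any run of whitespace as a single delimiter. Note that it is deprecated since pandas 2.2 in favor of sep=r"\s+", and pandas special-cases "\s+" rather than handling it as a general regex, so it does not force the slower Python parsing engine: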
pd.read_csv(data_file, header=None, delim_whitespace=True, names='Col_a Col_b Col_c'.split(' '))
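To check the behaviour without a file on disk, here is a minimal self-contained sketch. The sample string is made up to mimic the txt file in the question (fields separated by one or more spaces), and io.StringIO stands in for data_file; both are assumptions for illustration only.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the txt file: fields are separated
# by a varying number of spaces.
sample = "blah   blahh bl\nblah3 blahhe    ble\n"

df = pd.read_csv(
    io.StringIO(sample),   # replace with the path to the real file
    header=None,           # the file has no header row
    sep=r"\s+",            # one or more whitespace characters
    names=["Col_a", "Col_b", "Col_c"],
)

print(df)
```

With sep=' ' every single space counts as a delimiter, so consecutive spaces produce empty fields that show up as NaN columns; sep=r"\s+" collapses each run of whitespace into one delimiter.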
Reference: How to read file with space separated values in pandas