I want to read a file into a pandas DataFrame. The file looks like:
label1 text more text
label2 text some more text
The resulting DataFrame should look like:
label text
label1 text more text
label2 text some more text
My text column is not quoted, so I want to specify the first space as the separator and ignore all later spaces in the text field.
I tried a regular expression as the separator, after testing it on https://regex101.com:
query_data = pd.read_csv('data.txt', sep='^.*?(?=\ )', engine='python',\
names = ['label', 'text'])
But this gives NaN for every label. I then realized that the pattern matches the whole string before the first space, so the label itself is consumed as the delimiter. How do I specify only the first space as the delimiter?
CodePudding user response:
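You can read each line of the file into a single column and split it afterwards. With n=1, str.split splits only on the first space, so the rest of the line stays together. Assuming the raw lines are in a column named col:
df = df['col'].str.split(' ', n=1, expand=True)
df.columns = ['label', 'text']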
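For completeness, here is a minimal end-to-end sketch of the same idea. It loads the lines with plain Python rather than read_csv, which is my own substitution to sidestep separator and quoting questions. The helper name read_first_space_delimited is hypothetical, and an in-memory io.StringIO stands in for data.txt. It also assumes at least one line contains a space, since otherwise the split yields only one column.

```python
import io

import pandas as pd


def read_first_space_delimited(source, names=("label", "text")):
    """Hypothetical helper: split each line of `source` on its first space only."""
    # Load every line as one string; no CSV parsing is involved at this stage.
    lines = pd.Series(source.read().splitlines())
    # Skip blank lines so they do not turn into empty rows.
    lines = lines[lines.str.strip() != ""]
    # n=1 limits the split to the first space; later spaces stay in the text.
    parts = lines.str.split(" ", n=1, expand=True)
    parts.columns = list(names)
    return parts.reset_index(drop=True)


# Stand-in for open('data.txt'); the contents mirror the question's sample.
sample = io.StringIO("label1 text more text\nlabel2 text some more text\n")
query_data = read_first_space_delimited(sample)
print(query_data)
```

For a real file, the same function could be called as read_first_space_delimited(open('data.txt')), ideally inside a with block so the file gets closed.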