How can I separate characters inside a string in Python?

Time:06-29

I have data in a txt file and I need to separate a sentence from a value. Every line of the file has the form <Sentence> <number>. I need to read the sentence and the value into two different columns, but the sentences can contain numbers, dots, and any other characters, since they are arbitrary sentences. The numeric value in question is always at the end of the line. For example:

This coffee is bad. -1

How can I do this in Python?

CodePudding user response:

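If each line always follows the format <sentence><space><number><end>, then something like this should work. rpartition splits at the last space only, and num is still a string at this point: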

sent, _, num = input_str.rpartition(' ')
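
A minimal sketch of how this could be applied to several lines, assuming every line ends with a space followed by an integer. The helper name split_line and the second sample line are illustrative and not part of the question:

```python
def split_line(line):
    """Split 'sentence number' at the last space and return (sentence, int)."""
    # rstrip() drops a trailing newline; rpartition(' ') splits at the last space.
    sentence, _, number = line.rstrip().rpartition(' ')
    return sentence, int(number)

lines = ["This coffee is bad. -1", "Room 101 was fine. 3"]
rows = [split_line(line) for line in lines]
print(rows)
```

int() raises ValueError if the last token is not an integer, which makes malformed lines easy to spot.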

CodePudding user response:

Here is a solution using pandas to load the file as a DataFrame with a regex separator:

import pandas as pd

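# Split on the last whitespace of each line: the lookahead requires that
# only non-whitespace characters follow it up to the end of the line.
df = pd.read_csv('file.csv', sep=r'\s(?=\S+$)', engine='python',
                 header=None, names=['sentence', 'value'])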

Output:

              sentence  value
0  This coffee is bad.     -1
1        other example    123

You can then easily convert to lists:

df.to_dict('list')

Output:

{'sentence': ['This coffee is bad.', 'other example'],
 'value': [-1, 123]}

Used text input:

This coffee is bad. -1
other example 123
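
To see what the separator pattern in the read_csv call does on its own, here is a small sketch that uses the same regex with the standard re module instead of pandas. The name LAST_SPACE is illustrative, and the values stay strings here:

```python
import re

# A whitespace character followed only by non-whitespace up to the end of the
# line, i.e. the last whitespace on the line.
LAST_SPACE = re.compile(r'\s(?=\S+$)')

lines = ["This coffee is bad. -1", "other example 123"]
pairs = [LAST_SPACE.split(line) for line in lines]
print(pairs)
```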

CodePudding user response:

There are many ways to do it.

The simple/dirty solution is as follows:

  • Run a regex pattern to extract the groups of digits, then take the last group as the second column.
  • Remove that last group from the line and use what remains as the first column.

This code should give you an idea. Note that the pattern matches digits only, so a leading minus sign stays in the first column.

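import re

sample = "This coffee 5656 is bad. -134 -454"

# Find every run of digits; the last one is the value at the end of the line.
result = re.findall(r'[0-9]+', sample)

# Cut the line just before the last occurrence of that run, so identical
# digits earlier in the sentence are left alone.
first_column = sample[:sample.rfind(result[-1])]
second_column = result[-1]

print(f'First Column: {first_column}')
print(f'Second Column: {second_column}')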

Output

First Column: This coffee 5656 is bad. -134 -
Second Column: 454
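
If the sign should stay with the value, one possible variation is to anchor a single pattern at the end of the line. This is a sketch, assuming the value is always an optionally signed integer; the name TRAILING_INT is illustrative:

```python
import re

# Greedy sentence, whitespace, then an optionally signed integer at the very end.
TRAILING_INT = re.compile(r'^(.*)\s+(-?\d+)\s*$')

sample = "This coffee 5656 is bad. -134 -454"
match = TRAILING_INT.match(sample)
if match:
    first_column = match.group(1)
    second_column = int(match.group(2))
    print(f'First Column: {first_column}')
    print(f'Second Column: {second_column}')
```

Lines that do not end in an integer produce no match, so they can be skipped or reported instead of being split incorrectly.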