I have a multiline string (and not a text file) like this:
x = '''
Index Value Max Min State
0 10 nan nan nan
1 20 nan nan nan
2 15 nan nan nan
3 25 20 10 1
4 15 25 15 2
5 10 25 15 4
6 15 20 10 3
'''
The whitespace between the columns is uneven. I want to replace each run of whitespace with a comma, but keep the end-of-line characters.
So the result would look like this:
Index,Value,Max,Min,State
0,10,nan,nan,nan
1,20,nan,nan,nan
2,15,nan,nan,nan
3,25,20,10,1
4,15,25,15,2
5,10,25,15,4
6,15,20,10,3
...or alternatively as a pandas DataFrame.
What I have tried:
- I can use replace('') with different numbers of spaces, but then I need to count the whitespace.
- I can use the re module (from the re.sub question), but it converts the whole string to one line, whereas I need to keep the separate lines (end-of-line).
CodePudding user response:
Try with StringIO
from io import StringIO
import pandas as pd
x = '''
Index Value Max Min State
0 10 nan nan nan
1 20 nan nan nan
2 15 nan nan nan
3 25 20 10 1
4 15 25 15 2
5 10 25 15 4
6 15 20 10 3
'''
# sep is a regex: one or more whitespace characters between columns
df = pd.read_csv(StringIO(x), sep=r'\s+', engine='python')
print(df)
Index Value Max Min State
0 0 10 NaN NaN NaN
1 1 20 NaN NaN NaN
2 2 15 NaN NaN NaN
3 3 25 20.0 10.0 1.0
4 4 15 25.0 15.0 2.0
5 5 10 25.0 15.0 4.0
6 6 15 20.0 10.0 3.0
CodePudding user response:
Since you tagged pandas, you can try:
import pandas as pd
out = '\n'.join(pd.Series(x.split('\n')).str.strip().str.replace(r'\s+', ',', regex=True))
Output (note that there are leading and trailing blank lines because your x does):
Index,Value,Max,Min,State
0,10,nan,nan,nan
1,20,nan,nan,nan
2,15,nan,nan,nan
3,25,20,10,1
4,15,25,15,2
5,10,25,15,4
6,15,20,10,3
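If you would rather not use pandas, here is a minimal sketch using only the re module. The question's re.sub attempt most likely collapsed everything onto one line because \s also matches the newline character, so the line breaks were replaced along with the spaces. The helper names below are made up for illustration, and the sample x is shortened and assumes columns separated by spaces or tabs.

```python
import re

# Shortened sample with uneven column spacing (assumed layout)
x = '''
Index   Value  Max   Min  State
0       10     nan   nan  nan
1       20     nan   nan  nan
3       25     20    10   1
'''

def whitespace_to_commas(text):
    # Hypothetical helper: handle one line at a time, so the
    # substitution never sees (and never replaces) a newline.
    lines = text.strip().splitlines()
    return '\n'.join(re.sub(r'\s+', ',', line.strip()) for line in lines)

def whitespace_to_commas_single_pass(text):
    # Hypothetical alternative: [^\S\n]+ matches runs of whitespace
    # that are not newlines. Trailing spaces on a line would turn
    # into a trailing comma here, so the per-line version is safer.
    return re.sub(r'[^\S\n]+', ',', text.strip())

out = whitespace_to_commas(x)
print(out)
```

Both helpers strip the leading and trailing blank lines of x first, so out starts directly with the header row.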