I have the data in the below variable ohlc = 1664788799|38444.9|38569.2|38327.85|38412.35|0|0|,1664789099|38408.35|38587.3|38394.05|38586.15|0|0|,1664789399|38589.55|38641.6|38420.05|38422.35|0|0|
The coulums are timestamp, open, high, low, close, volume,OI
How do we convert this pythion variable to dataframe. Any simple way to to do this?
CodePudding user response:
Split the string into lists using the two separators and then pass the list of lists to the dataframe constructor:
pd.DataFrame(
[row.split('|')[:-1] for row in ohlc.split(',')],
columns=['timestamp', 'open', 'high', 'low', 'close', 'volume', 'OI']
)
Output:
timestamp open high low close volume OI
0 1664788799 38444.9 38569.2 38327.85 38412.35 0 0
1 1664789099 38408.35 38587.3 38394.05 38586.15 0 0
2 1664789399 38589.55 38641.6 38420.05 38422.35 0 0
P.S. You can add df = df.apply(pd.to_numeric, errors='coerce')
to convert all columns to numeric values (integers/floats).
CodePudding user response:
could you please give a demo of output, in which way you want to (like table)
CodePudding user response:
Since your data is stored in a single string with "|" and "," used as separators, you can use Python's str.split
method to get a list of values. In the data there's also an addiitonal sep character at the end which needs to be removed. To do this we can use str.replace
using tab characters and str.strip
to remove the trailing separator.
For converting the string into a CSV like object Pandas can read, we can use StringIO
class from the io
builtin module.
import io
ohlc = "1664788799|38444.9|38569.2|38327.85|38412.35|0|0|,1664789099|38408.35|38587.3|38394.05|38586.15|0|0|,1664789399|38589.55|38641.6|38420.05|38422.35|0|0|"
rows = [row.replace('|', '\t').strip() for row in ohlc.split(',')]
rows_tsv = '\n'.join(rows)
header = ['timestamp', 'open', 'high', 'low', 'close', 'volume', 'OI']
df = pd.read_csv(io.StringIO(rows_tsv), sep='\t', header=None))
df.columns = header # to use the header names as required