I have this type of dataframe with two main columns, each column contains two sub-columns divided by commas. This dataframe comes from an online json. I would need 4 columns with the headings bid price, bid, ask price, and ask. Can you advise me how to write the code? I get an empty dataframe, I don't understand why. here an image of what i get: https://ibb.co/bvfPkXg
import pandas as pd
import requests
r = requests.get("https://api.binance.com/api/v3/depth?limit=20&symbol=BTCUSDT")
df = pd.read_json(r.text)
data = pd.DataFrame(df)
data
dp = data.drop('lastUpdateId', axis=1)
# split the text on the comma
dp['bids'] = dp['bids'].str.split(',')
dp['asks'] = dp['asks'].str.split(',')
# store in new the new columns
dp['bid price'] = dp['bids'][0]
dp['bid vol'] = dp['bids'][1]
dp['ask price'] = dp['asks'][0]
dp['ask vol'] = dp['asks'][1]
#show results
dp = dp.iloc[:,2:]
dp
CodePudding user response:
Demo DataFrame
df = pd.DataFrame({'column1': ['bid price1, bid1', 'bid price2, bid2', 'bid price3, bid3'],
'column2': ['ask price1, ask1', 'ask price2, ask2', 'ask price3, ask3']})
column1 column2
0 bid price1, bid1 ask price1, ask1
1 bid price2, bid2 ask price2, ask2
2 bid price3, bid3 ask price3, ask3
Split the text
# split the text on the comma
df['column1'] = df['column1'].str.split(',')
df['column2'] = df['column2'].str.split(',')
column1 column2
0 [bid price1, bid1] [ask price1, ask1]
1 [bid price2, bid2] [ask price2, ask2]
2 [bid price3, bid3] [ask price3, ask3]
Take the First and second value of each list
# store in new the new columns
df['bid price'] = df['column1'].str[0]
df['bid'] = df['column1'].str[1]
df['ask price'] = df['column2'].str[0]
df['ask'] = df['column2'].str[1]
Show results
df = df.iloc[:,2:]
print(df)
bid price bid ask price ask
0 bid price1 bid1 ask price1 ask1
1 bid price2 bid2 ask price2 ask2
2 bid price3 bid3 ask price3 ask3
CodePudding user response:
Specific result, after updating the qustion:
import pandas as pd
import requests
r = requests.get("https://api.binance.com/api/v3/depth?limit=20&symbol=BTCUSDT")
df = pd.read_json(r.text)
data = pd.DataFrame(df)
dp = data.drop('lastUpdateId', axis=1)
# store in data into the new columns
dp['bid price'] = dp['bids'].str[0].astype(float)
dp['bid vol'] = dp['bids'].str[1].astype(float)
dp['ask price'] = dp['asks'].str[0].astype(float)
dp['ask vol'] = dp['asks'].str[1].astype(float)
dp = dp.drop(['bids', 'asks'], axis=1)
bid price bid vol ask price ask vol
0 38933.43 2.51460 38933.44 5.94267
1 38933.41 0.12812 38933.46 0.01073
2 38933.36 0.00200 38934.09 0.00965
3 38932.41 0.00037 38934.34 1.14567
4 38932.24 0.00440 38934.42 0.06000
5 38932.12 0.00266 38934.76 0.17284
6 38931.73 0.00596 38935.36 0.34796
7 38931.70 0.00166 38935.42 0.12840
8 38931.69 0.12843 38935.87 0.38523
9 38931.48 0.04000 38936.17 0.00027