Home > Blockchain >  Strange values with Pandas DataFrames
Strange values with Pandas DataFrames

Time:01-17

Thats my code and I get some strange values for "TradePrice" Does anyone hava clue how to solve this.

import pandas as pd

df = pd.read_csv("TradesSEK.csv")
df["IBCommission"] = abs(df["IBCommission"])

sum_df = df.groupby(["Symbol", "Buy/Sell"]).sum()[["Quantity", "TradePrice", "IBCommission"]].reset_index()

#skriv ut den nya dataframe
print(sum_df)
#skriv till csv fil
sum_df.to_csv("SumTrades.csv", index=False)

Here is a short example for the data I try to import and Sum

Symbol,Buy/Sell,Quantity,TradePrice,IBCommission,CurrencyPrimary,TradeDate
FDXS MAR 22,SELL,-1,164561.432,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164561.432,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164561.432,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164684.576,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164674.314,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164674.314,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164684.576,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164684.576,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164684.576,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164571.69400000002,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164571.69400000002,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164571.69400000002,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164469.074,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164469.074,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164469.074,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164366.454,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164571.69400000002,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164571.69400000002,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164171.476,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164171.476,-3.89956,SEK,2022-01-03
FDXS MAR 22,BUY,1,164171.476,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164458.812,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164684.576,-3.89956,SEK,2022-01-03
FDXS MAR 22,SELL,-1,164499.86000000002,-3.89956,SEK,2022-01-03
MESH2,BUY,1,43729.370899999994,-4.729712,SEK,2022-01-04
MESH2,BUY,1,43729.370899999994,-4.729712,SEK,2022-01-04
MESH2,SELL,-1,43688.4407,-4.729712,SEK,2022-01-04
MESH2,SELL,-1,43688.4407,-4.729712,SEK,2022-01-04
MESH2,SELL,-1,43702.08409999999,-4.729712,SEK,2022-01-04
MESH2,SELL,-1,43702.08409999999,-4.729712,SEK,2022-01-04
MESH2,BUY,1,43665.7017,-4.729712,SEK,2022-01-04
MESH2,BUY,1,43711.17969999999,-4.729712,SEK,2022-01-04
MESH2,BUY,1,43297.79,-4.715152000000001,SEK,2022-01-05
MESH2,BUY,1,43297.79,-4.715152000000001,SEK,2022-01-05
MESH2,SELL,-1,43334.0604,-4.715152000000001,SEK,2022-01-05
MESH2,SELL,-1,43340.8611,-4.715152000000001,SEK,2022-01-05

Here is the output

         Symbol Buy/Sell  Quantity    TradePrice  IBCommission
0       EUR.USD     SELL -617.1651  3.196508e 01     18.271867
1   FDXS DEC 22      BUY  110.0000  1.561461e 07    455.386870
2   FDXS DEC 22     SELL -110.0000  1.529547e 07    455.386870
3   FDXS JUN 22      BUY   24.0000  3.523879e 06     94.577592
4   FDXS JUN 22     SELL  -24.0000  3.528523e 06     94.577592
5   FDXS MAR 22      BUY  172.0000  2.690855e 07    682.416426
6   FDXS MAR 22     SELL -172.0000  2.662888e 07    682.416426
7   FDXS MAR 23      BUY    3.0000  4.688854e 05     12.682500
8   FDXS MAR 23     SELL   -3.0000  4.684181e 05     12.682500
9   FDXS SEP 22      BUY   47.0000  6.480417e 06    189.812090
11  FESX MAR 22      BUY    9.0000  3.750484e 05    119.088396
12  FESX MAR 22     SELL   -9.0000  3.748532e 05    119.088396
13  FSXE MAR 22      BUY    5.0000  1.178318e 05     18.331270
14  FSXE MAR 22     SELL   -5.0000  1.175731e 05     18.331270
15        M2KH2      BUY    3.0000  5.665107e 04     15.401632
16        M2KH2     SELL   -3.0000  5.653878e 04     15.401632
17        MESH2      BUY   63.0000  2.597867e 06    301.419600
18        MESH2     SELL  -63.0000  2.512345e 06    301.419600

And please Dont hate totaly new to Python

I try to import data to a DataFrame and then sum all the values. It works for all columns but not "TradePrice"

CodePudding user response:

In practical terms, if the desired format is five digits after the decimal point, you can use these two commands:

  • pd.set_option('display.float_format', lambda x: '%.5f' % x) - this will set your display format of floats to five digits.
  • When you're ready to save your DataFrame to CSV, you can use df.to_csv('SumTrades.csv', float_format='%.5f') to tell Pandas to use the same float format.

For further reference: Styler, options, style

CodePudding user response:

The reason you are getting strange numbers for TradePrice is because your statement sum_df = df.groupby(["Symbol", "Buy/Sell"]).sum()[["Quantity", "TradePrice", "IBCommission"]].reset_index() groups your dataframe rows by Symbol and Buy/Sell. It then adds ALL the numbers in the rows. You do not want the Sum of TradePrice, You would want the average so you will need another statement to generate that:

sum_df = df.groupby(["Symbol", "Buy/Sell"]).sum()[["Quantity", "IBCommission"]].reset_index()
sum_df['TradePrice'] = df.groupby(["Symbol", "Buy/Sell"]).mean()[["TradePrice"]]

That should work but I have not tested it. At any rate you should get the idea. I would also reset the output format as others have already suggested in the other answers

  • Related