I have the following transaction:
let insertPositionsAsync accountId timestamp (positions: PositionInfo list) : Async<Result<int, ExchangeError>> =
async {
try
use connection = getConnection location
do! connection.OpenAsync()
use writer =
connection.BeginBinaryImport(
$"COPY {accountId}.{tablePositionsName} (ts,instrument,average_price,leverage,unrealized_pnl,side,initial_margin,maintenance_margin,position_initial_margin,open_order_initial_margin,quantity,max_notional)
FROM STDIN (FORMAT BINARY)"
)
for t in positions do
do! writer.StartRowAsync() |> Async.AwaitTask
do! writer.WriteAsync(timestamp, NpgsqlDbType.Timestamp) |> Async.AwaitTask
do! writer.WriteAsync(t.Instrument.Ticker, NpgsqlDbType.Varchar) |> Async.AwaitTask
do! writer.WriteAsync(t.AveragePrice, NpgsqlDbType.Double) |> Async.AwaitTask
do! writer.WriteAsync(t.Leverage, NpgsqlDbType.Integer) |> Async.AwaitTask
do! writer.WriteAsync(t.UnrealizedPnl, NpgsqlDbType.Double) |> Async.AwaitTask
do! writer.WriteAsync(t.Side.ToString().ToLower(), NpgsqlDbType.Varchar) |> Async.AwaitTask
do! writer.WriteAsync(t.InitialMargin, NpgsqlDbType.Double) |> Async.AwaitTask
do! writer.WriteAsync(t.MaintenanceMargin, NpgsqlDbType.Double) |> Async.AwaitTask
do! writer.WriteAsync(t.PositionInitialMargin, NpgsqlDbType.Double) |> Async.AwaitTask
do! writer.WriteAsync(t.OpenOrderInitialMargin, NpgsqlDbType.Double) |> Async.AwaitTask
do! writer.WriteAsync(t.Quantity, NpgsqlDbType.Double) |> Async.AwaitTask
do! writer.WriteAsync(t.MaxNotional, NpgsqlDbType.Double) |> Async.AwaitTask
let! c = writer.CompleteAsync()
return Ok (int c)
with ex ->
error $"insertPositionsAsync {ex.Humanize()}"
return Error (ServiceException ex)
}
my understanding is that the loop:
for t in positions do
do! writer.StartRowAsync() |> Async.AwaitTask
do! writer.WriteAsync(timestamp, NpgsqlDbType.Timestamp) |> Async.AwaitTask
...
let! c = writer.CompleteAsync()
is happening in the driver and it's just collecting data in some local storage. Does it make sense to have all these async blocks then? (performance wise).
But the async API must exist for some reason. What could I be missing?
CodePudding user response:
As with any performance related question, the only meaningful answer is measure it and see for yourself!
More generally, async makes it possible to do the communication with the database without blocking an operating system thread. This may or may not have performance implications - but it depends on what else is happening in your program.
If you just have a single logical process communicating with the database, it makes no difference whether you do this in a synchronous or asynchronous bit of code.
If you want to write data to a database, you could do this from multiple concurrent writers. In this case, async would make a difference as you could create more concurrent writters. But I think it would likely not help much - because the database will probably not be able to write faster when you have excessive number of writer threads.
A more interesting case is when you are doing something else in your program. For example, if you have a web server than needs to handle many concurrent requests, it makes sense to use asynchronous communication with a database - because it will not block threads that you need to handle other web requests.
The summary is, you should measure the performance - but it will likely depend on what else is going on in your program. I do not think changing a single threaded writer script to async would make any difference (aside form adding some small overhead).
CodePudding user response:
There's a discussion of this API here. It starts with someone saying:
It would be helpful to have asynchronous versions of BeginBinaryImport and BeginBinaryExport so they can be used from e.g. api endpoints without blocking the server.
For Import, given the write happens on Close/Dispose I'm guessing the write methods would not become async
However, one one of the developers then comments that:
Write can also block. So you need WriteAsync too.