I'm trying to improve performance of a reporting solution I'm building. The tech I'm using is:
- Custom build Python tool extracting data and storing it in
- Azure Hosted PostgreSQL running GP compute with 8vCores
- PowerBI Desktop -> intention is to publish on app.powerbi.com
Data is extracted in JSON format and stored in two column tables (id::bigint, document::json). Then transformed into multi-column using views which have 1-to-many relationships between them based on ids.
Problem is one of the tables has ~4.5M rows and every time I do an action in Power BI it takes 5 minutes for the database to process the request. By action I'm doing really simple operations: adding a column to a report using a single view (with 4M entries) as data source.
I've tried the obvious of adding more compute to my database, but got minimal improvement. Beforehand I tried using ElasticSearch for storing data and found it difficult to integrate with Power BI and operate large sets of data.
Any suggestions what I should be looking at?
Thanks, Alex
CodePudding user response:
Currently, we have over 15 million records in a few tables on PostgreSQL instances and have been seeing performance issues when doing Direct Query in PowerBI.
We have created an index on a few columns that had a "where" clause in the query but it did not help at all. We are wondering if there are any other things we can try to improve the performance on PostgreSQL for PowerBI Direct Query usage.
Our database administrator is also looking into other possible solutions and wanted to see if anyone here has any suggestions as well.
CodePudding user response:
You face the biggest performance problem of PostGreSQL whose COUNT type queries are extremely slow compared to other RDBMSs.
Read the article I wrote about it which shows a gap at most 1500 times slower than Microsoft SQL Server and on average 114 times...
PostGreSQL vs Microsoft SQL Server – Comparison part 2 : COUNT performances
The only way to improve performances of such queries is to create materialized views... PostGreSQL does not have vertical indexes (aka columnstore), nor data compression that are usually used for BI