I'm a data visualization noob and have been using Plotly Dash to make dashboards.
While the features are pretty good for a small number of plots, inputs and relationships, the dashboard becomes very laggy when the dataset is big or the backend callbacks contain long, complex logic.
I've found plotly.js to be noticeably faster, but the frontend doesn't offer the easy-to-use pandas functionality that the Python backend does.
This is why I want to learn how to build a high-performance frontend and backend, like TradingView's, which has a great UI/UX.
So, here's what I am thinking:
Build a Flask backend that deals with the large database, uses pandas-like libraries to process the data, and sends back the visualization, effectively doing the hard work.
Have an AJAX frontend that calls the backend with the input values and updates the Plotly plot with the response.
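A minimal sketch of that split, assuming Flask and pandas are installed. The `/figure` route, the `minutes` parameter, the `build_figure` helper and the synthetic `PRICES` data are all made up for illustration; the point is only that the backend aggregates the data and returns a small Plotly-shaped JSON payload instead of the raw rows.

```python
import numpy as np
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in for the large database: one synthetic price per minute for 30 days.
_index = pd.date_range("2024-01-01", periods=30 * 24 * 60, freq="min")
_rng = np.random.default_rng(0)
PRICES = pd.DataFrame(
    {"price": 100 + _rng.standard_normal(len(_index)).cumsum()},
    index=_index,
)


def build_figure(minutes: int) -> dict:
    """Aggregate on the server and return a small Plotly-style figure dict."""
    resampled = PRICES["price"].resample(f"{minutes}min").mean()
    return {
        "data": [
            {
                "type": "scatter",
                "mode": "lines",
                "x": [ts.isoformat() for ts in resampled.index],
                "y": resampled.round(2).tolist(),
            }
        ],
        "layout": {"title": {"text": f"Mean price per {minutes} min"}},
    }


@app.route("/figure")
def figure():
    # The input widget (e.g. a slider) supplies the bucket size in minutes.
    minutes = request.args.get("minutes", default=60, type=int)
    minutes = max(1, min(minutes, 24 * 60))  # clamp to a sane range
    return jsonify(build_figure(minutes))
```

On the frontend, the slider handler would request something like `/figure?minutes=60` with `fetch` and pass the returned `data` and `layout` to `Plotly.react`, so only the aggregated points ever cross the network.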
I am obviously reinventing the wheel, but Plotly Dash is just not feasible for large dashboards.
What other frontend packages are there for the input widgets, e.g. sliders and dropdowns?
CodePudding user response:
On the frontend side, if you plan to use React, you could consider using Recoil to solve re-rendering issues. I guess you will have to store big arrays of data, which can be managed with atomFamily in Recoil.
The library is developed by Facebook itself. It has been marked as experimental for a long time now, but it looks pretty mature. You should take a look at this thread, and especially this response.
I am using it for a few small projects for now, and I am having a lot of fun with it.
CodePudding user response:
Interesting question. I believe Dash can still be a powerful tool for visualising large datasets if you manage the data efficiently.
Some thoughts:
Pandas is good at handling big data, and its built-in aggregation/querying/preprocessing functions are better optimised than looping over the rows and applying a function to each one. Making sure that operations aren't repeated unnecessarily and preprocessing your dataframes with Pandas functions (rather than hand-written loops) will go a long way toward making your dashboard more efficient.
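To make the loop-versus-built-in point concrete, here is a small sketch that computes the same per-symbol total two ways. The column names and the synthetic data are made up for illustration, and no timings are claimed; run it with `timeit` on your own data to see the difference.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the columns are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "symbol": rng.choice(["AAA", "BBB", "CCC"], size=10_000),
        "price": rng.uniform(10, 100, size=10_000),
        "volume": rng.integers(1, 1_000, size=10_000),
    }
)


def turnover_loop(frame: pd.DataFrame) -> pd.Series:
    """Row-by-row version: one Python-level iteration per row."""
    totals = {}
    for _, row in frame.iterrows():
        key = row["symbol"]
        totals[key] = totals.get(key, 0.0) + row["price"] * row["volume"]
    return pd.Series(totals).sort_index()


def turnover_vectorized(frame: pd.DataFrame) -> pd.Series:
    """Same result using column arithmetic and a single groupby."""
    turnover = frame["price"] * frame["volume"]
    return turnover.groupby(frame["symbol"]).sum().sort_index()
```

Both functions return the same totals; the second keeps the work inside Pandas instead of the Python interpreter, which is the kind of change that tends to matter inside a dashboard callback.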
Alternatives to Pandas, such as Vaex and Dask, may also be beneficial for visualisations. They can help because your callbacks return Plotly graph objects, and those graphs (as well as display tables) can be built from data processed with these libraries just as well as from Pandas dataframes. Here's an article on these alternatives and other options.
Consider caching data/dataframes that you fetch frequently, as bottlenecks can exist in API retrievals or in data collection from different sources. Dash has some advice on how to handle performance well for large datasets.
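As a rough illustration of the caching idea, here is a sketch using only the standard library's `functools.lru_cache`. The `load_rows` function, the `CALLS` counter and the sleep that simulates a slow source are hypothetical. Note that this cache lives inside a single process; an app served by several workers would need a shared cache instead, such as the Flask-Caching based memoization the Dash documentation describes.

```python
import time
from functools import lru_cache

# Counts how often the slow source is actually hit (for demonstration only).
CALLS = {"count": 0}


@lru_cache(maxsize=32)
def load_rows(symbol: str) -> tuple:
    """Pretend to fetch rows for one symbol from a slow API or database."""
    CALLS["count"] += 1
    time.sleep(0.05)  # simulated network/database latency
    # Return an immutable tuple, since every caller shares the cached object.
    return tuple((day, float(len(symbol) * day)) for day in range(1, 6))


rows_first = load_rows("AAA")   # slow: goes to the simulated source
rows_second = load_rows("AAA")  # fast: served from the in-process cache
```

If the cached value were a dataframe, callers should copy it before modifying it, because the cache hands back the same object every time.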