I use excel powerquery to clean structured data regularly. Sometimes the dataset row count exceeds the limit of what can be loaded in Excel cells (~1M).
Currently, in my powerquery, I split the dataset based on a categorical column into two separate queries (in order to meet excel row limits). But, I would ideally only like to split the data if the row count of the full data set exceeds 1 million rows.
Is there a Powerquery way to do something like
if table.row_count > 1 000 000 then Create new query containing rows where row number > 1 000 000
?
CodePudding user response:
That's not possible, I'm afraid. Instead, you could write multiple queries where the last step of each is filtered according to the row count you want. e.g. query 1 always loads first 1 million rows, query 2 is exactly the same but filters from 1,000,001 to 2m and loads those rows. If query 2 has no data, then nothing will load to the sheet.
Do you need the granular data in your worksheets? You could load the full dataset to the data model and then create pivot tables or cube functions based on the data model.