Does a fat table/more columns affect performance in sql


In the data that I have, there are around 1M rows, each with around 60-70 columns. However, only a few rows (20-30) have any of the columns beyond 30 filled, i.e. the table is sparse. The columns beyond 30 are also rarely queried.

Does "number of columns" impact performance?

Should I make two tables: one with the first 30 columns, and the second being the original table? (A possible split is sketched after the schema below.)

Or should I keep the original structure?

Table schema:

    Table entity_table (
        entity_id int,
        tag_1 text,
        tag_2 text,
        . . .
        tag_30 text,   -- up to column 30 the table is dense
        tag_31 text,
        . . .
        tag_70 text    -- sparse columns
    );
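
For example, the split could look something like this (only a sketch; the entity_dense and entity_sparse names are placeholders), keeping the dense columns in one table and the sparse ones in another:

    CREATE TABLE entity_dense (
        entity_id int PRIMARY KEY,
        tag_1     text,
        tag_2     text,
        -- ... up to tag_30, the densely populated columns
        tag_30    text
    );

    CREATE TABLE entity_sparse (
        entity_id int PRIMARY KEY REFERENCES entity_dense (entity_id),
        tag_31    text,
        -- ... up to tag_70, the rarely filled columns
        tag_70    text
    );

    -- The rare queries that need the sparse columns join the two tables:
    SELECT d.tag_1, s.tag_31
    FROM entity_dense d
    LEFT JOIN entity_sparse s USING (entity_id);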

Also, does the type of these columns affect performance? And does Postgres index NULL values; how can I prevent that?

CodePudding user response:

Does "number of columns" impact performance? Short answer is "Yes, but don't worry about it."

More precisely, extra columns eat space, and that space has to travel to and from disk, fill the cache, etc., all of which costs resources. The exact amount of space depends on the column type and is listed alongside each data type in the Postgres documentation: https://www.postgresql.org/docs/14/datatype.html
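
If you want to see what the width actually costs for your data, Postgres can report it directly; the following is a sketch using the entity_table name from the question:

    -- Total on-disk size of the table, including indexes and TOAST:
    SELECT pg_size_pretty(pg_total_relation_size('entity_table'));

    -- Average stored (possibly compressed) size of a whole row:
    SELECT avg(pg_column_size(t.*)) AS avg_row_bytes
    FROM entity_table AS t;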

As Frank Heikens commented, a million rows isn't a lot these days. At 70 columns and 8 bytes per column for a million rows, you'd be looking at roughly 560 MB, which will happily fit in memory on a Raspberry Pi, so it shouldn't be that big of a deal.

However, when you get to billions or trillions of rows, all those little bytes really start adding up. Hence you might look at:
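
Partial indexes, for example. Postgres b-tree indexes do contain entries for NULLs, which also answers the question above about indexing NULL values; a WHERE ... IS NOT NULL predicate is the usual way to keep the mostly-empty sparse columns out of an index. A sketch, using tag_31 from the question's schema:

    -- Index only the rows where tag_31 is actually filled in;
    -- the ~1M rows with NULL tag_31 add nothing to the index.
    CREATE INDEX entity_table_tag_31_idx
        ON entity_table (tag_31)
        WHERE tag_31 IS NOT NULL;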
