Understanding of PostgreSQL performance-CodePudding

I have a question regarding the exact understanding of how PostgreSQL executes a query. Since it's row based it shouldn't matter whether you have a query like Select * from lineitem or Select l_orderkey from lineitem performance wise? Further why is there a difference in measured time between Select count(*) from lineitem and Select * from lineitem since the full table has to be scanned in both cases? I am measuring the performance like that:

long starttime=System.CurrenMillis();
Statement a = conn.createStatement();
a.setFetchsize(10000);
Resultset rs = a.executeQuery(Query);
while(rs.next()){}
long endTime=System.currentMillis();
System.out.println((endTime-starttime));

I have to set a fetchsize otherwise there will be a Java Heap Space Error, since the table are quite large. In Addition i was trying to measure the difference to horizontal partitioned table. For that i splitted the table order by orderstatus='O', resulting two tables (ordersO, ordersF) which have both the same size. When comparing the queries Select * from ordersO and Select * from orders where o_orderstatus='O', i was expecting that the first query on the partition only needs half of the time, since it contains only half of the tupel? But it wasnt, the measured time was more or less equal.

Thank you in Advance !

CodePudding user response：

There are several questions here.

Will `SELECT col FROM tab` perform better than `SELECT * FROM tab`?

Yes, it will, for two reasons:

PostgreSQL doesn't have to extract all columns from the row. If the column is early in the list of columns, that is an obvious win. But we also win, because we don't have to “de-toast” oversized attributes that are stored out of line in a TOAST table.
You have to transfer less data to the client, and the client has to process less data.

Why is `SELCT count() FROM tab` faster than `SELECT FROM tab`?

Mostly because it does not have to transfer a lot of data to the client. In addition, it has to do less processing on each table row on the server.

Why are `SELECT * FROM tab_42` and `SELECT * FROM tab WHERE partkey = 42` equally fast?

That's because of partition pruning. PostgreSQL will restrict scans to the appropriate table partitions if the partitioning key is in the WHERE clause.

Will SELECT col FROM tab perform better than SELECT * FROM tab?

Why is SELCT count(*) FROM tab faster than SELECT * FROM tab?

Why are SELECT * FROM tab_42 and SELECT * FROM tab WHERE partkey = 42 equally fast?

Will `SELECT col FROM tab` perform better than `SELECT * FROM tab`?

Why is `SELCT count() FROM tab` faster than `SELECT FROM tab`?

Why are `SELECT * FROM tab_42` and `SELECT * FROM tab WHERE partkey = 42` equally fast?