DEV Community

PostgreSQL: prepared out of order order by

Recently, a colleague asked me a question about a PostgreSQL construction for which they found that it would silently ignore part of SQL, and whether it's truly doing that.

This is a mini setup that shows the issue:

create table test( id int primary key, f1 text );
insert into test select id, chr(ascii('a')+id) from generate_series(25,0,-1) id;;
Enter fullscreen mode Exit fullscreen mode

This creates a table 'test' with two fields, and fills the fields with a number and a character from 'z' to 'a'.

If I execute the following SQL:

select * from test order by f1 limit 1;
Enter fullscreen mode Exit fullscreen mode

I get:

postgres=# select * from test order by f1 limit 1;
 id | f1
----+----
  0 | a
(1 row)
Enter fullscreen mode Exit fullscreen mode

Here's the issue:

prepare prep(text, int) as select * from test order by $1 limit $2;
Enter fullscreen mode Exit fullscreen mode

And then execute the prepared statement:

postgres=# execute prep('f1',1);
 id | f1
----+----
 25 | z
(1 row)
Enter fullscreen mode Exit fullscreen mode

These are two different results for seemingly the same statement. I actually didn't need to do anything, they also found the issue:

For the statement, the execution plan is:

postgres=# explain select * from test order by f1 limit 1;
                           QUERY PLAN
-----------------------------------------------------------------
 Limit  (cost=1.39..1.39 rows=1 width=6)
   ->  Sort  (cost=1.39..1.46 rows=26 width=6)
         Sort Key: f1
         ->  Seq Scan on test  (cost=0.00..1.26 rows=26 width=6)
(4 rows)
Enter fullscreen mode Exit fullscreen mode
  • The select performs an unfiltered scan of a field for which is no index, and thus it performs a seq scan.
  • Because the seq scan does not hold ordering guarantees that an ascending or descending index can, it has to perform a sort for the order by.
  • The limit rowsource then takes one value because of limit 1.

For the prepared statement the execution plan is:

postgres=# explain execute prep('f1',1);
                         QUERY PLAN
------------------------------------------------------------
 Limit  (cost=0.00..0.15 rows=3 width=38)
   ->  Seq Scan on test  (cost=0.00..1.26 rows=26 width=38)
(2 rows)
Enter fullscreen mode Exit fullscreen mode
  • The select performs an unfiltered scan of a field for which there is no index, and thus it performs a seq scan.
  • Limit does take the first value.
  • The sort step is missing, and thus the first value that it found from the seq scan is shown, which is 'z', but could be anything, because the seq scan does not guarantee ordering.

I also found some sources on the internet explaining that this is because of the bind parameters, probably the best one being from the PostgreSQL mail list: https://www.postgresql.org/message-id/1421875206968-5834948.post@n5.nabble.com

The essence of that explanation is that for the order by the parameter is used as an identifier, whilst parameters are values, and that therefore this doesn't work. However, what I am puzzled about is why this succeeds in creating a prepared statement, and why it all works, but silently skips the order by.

I hope to hear any opinions or reasons for ignoring it, or from things missing from this small investigation.

Top comments (1)

Collapse
 
rednikotin profile image
Valentin Nikotin

What would you expect to see when executing the following statement?

prepare prep(text) as select $1 as val from test limit 1;
execute prep('f1');
Enter fullscreen mode Exit fullscreen mode