Select distinct generate_series postgresql -
i know meaning of distinct , generate series. when execute query, question marks flying around head.
select distinct generate_series(0,8)
the result weird.
can please me explain happening?
a select
query no order by
clause has no defined order, return relevant rows in whatever order happens convenient executing dbms.
in case of "real" table, might in order of primary key, in order inserted table, or in order of particular index used in execution plan.
in example, "table" created generated_series()
starts off in order 0, 1, 2, 3, etc. however, in order check distinct
constraint put on query, postgres has check if items appear more once. (there no way know generate_series()
function always provide distinct values.)
an efficient way of doing (in general) build "hash map" of values want check uniqueness. rather checking each new value against every existing value, calculate "hash bucket" fall into; if bucket empty, value unique; if not, need compare against other values in bucket.
running explain select distinct generate_series(0,8)
show query plan postgres has selected; me (and presumably you) looks this:
hashaggregate (cost=0.02..0.03 rows=1 width=0) -> result (cost=0.00..0.01 rows=1 width=0)
as expected, there's hashaggregate
operation there, running on result of generate_series()
in order check uniqueness. (exactly how operation works don't know, , isn't important, name suggests it's using hash map work).
at end of hashing operation, postgres can read out values hash map, rather going original list, so. result, no longer in original order, ordered according "hash buckets" fell into.
the moral of story is: always use order by
clause!
Comments
Post a Comment