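The leftmost ORDER BY items cannot disagree with the items of the DISTINCT ON clause. The manual's section on DISTINCT states that the DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s); any further ORDER BY expressions then decide which row is kept within each DISTINCT ON group.
Try: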
SELECT *
FROM  (
   SELECT DISTINCT ON (c.cluster_id, feed_id)
          c.cluster_id, num_docs, feed_id, url_time
   FROM   url_info u
   JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
   WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
   AND    num_docs > 5
   AND    url_time > '2012-04-16'
   ORDER  BY c.cluster_id, feed_id, num_docs, url_time
          -- first columns match DISTINCT
          -- the rest to pick certain values for dupes
          -- or did you want to pick random values for dupes?
   ) x
ORDER  BY num_docs DESC;
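To see the rule in isolation, here is a minimal sketch for PostgreSQL. The table `demo` and its columns `grp` and `val` are hypothetical and are not part of the question's schema. The commented-out query is the kind that PostgreSQL rejects, with an error along the lines of "SELECT DISTINCT ON expressions must match initial ORDER BY expressions".

```sql
-- Hypothetical demo table, only for illustration
CREATE TEMP TABLE demo (grp int, val int);
INSERT INTO demo VALUES (1, 10), (1, 20), (2, 5);

-- Rejected: the leftmost ORDER BY expression (val) does not match DISTINCT ON (grp)
-- SELECT DISTINCT ON (grp) grp, val FROM demo ORDER BY val DESC;

-- Accepted: ORDER BY starts with grp; val DESC then decides
-- which row is kept for each value of grp.
-- The outer query applies the final sort order.
SELECT *
FROM  (
   SELECT DISTINCT ON (grp) grp, val
   FROM   demo
   ORDER  BY grp, val DESC
   ) sub
ORDER  BY val DESC;
```

With these three rows, the inner query keeps the row with the highest `val` per `grp`, and the outer `ORDER BY` is free to sort the result any way you like.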
Or use GROUP BY:
SELECT c.cluster_id
     , num_docs
     , feed_id
     , url_time
FROM   url_info u
JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id
ORDER  BY num_docs DESC;
If c.cluster_id, feed_id are the primary key columns of all tables (both, in this case) from which you include columns in the SELECT list, then this just works in PostgreSQL 9.1 or later, because columns that are functionally dependent on a grouped primary key no longer have to be listed in GROUP BY.
Otherwise you need to add the rest of the columns to GROUP BY, aggregate them, or provide more information.
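If the primary key condition does not hold, the aggregate variant might look like the sketch below. Using `max()` for both columns is an assumption made for illustration, as are the aliases `max_num_docs` and `last_url_time`; pick whichever aggregate matches what you actually want per group.

```sql
-- Sketch: aggregate every selected column that is not in GROUP BY.
-- max() is an assumed choice, not taken from the question.
SELECT c.cluster_id
     , feed_id
     , max(num_docs) AS max_num_docs
     , max(url_time) AS last_url_time
FROM   url_info u
JOIN   cluster_info c ON c.cluster_id = u.cluster_id
WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id
ORDER  BY max_num_docs DESC;
```

Note that the two aggregates are computed independently, so `max_num_docs` and `last_url_time` can come from different rows of the same group. If the values must come from one and the same row, the DISTINCT ON query is the better fit.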