sql >> Database >  >> RDS >> Mysql

SQL-query over meerdere rijen

Een andere benadering zou zijn -

SELECT housing_id
FROM mytable
WHERE facility_id IN (4,7)
GROUP BY housing_id
HAVING COUNT(DISTINCT facility_id) = 2

UPDATE - geïnspireerd door de opmerking van Josvic besloot ik nog wat meer testen te doen en dacht ik dat ik mijn bevindingen zou opnemen.

Een van de voordelen van het gebruik van deze query is dat deze eenvoudig kan worden gewijzigd om meer facility_ids op te nemen. Als je alle housing_ids wilt vinden die facility_ids 1, 3, 4 &7 hebben, doe dan gewoon -

SELECT housing_id
FROM mytable
WHERE facility_id IN (1,3,4,7)
GROUP BY housing_id
HAVING COUNT(DISTINCT facility_id) = 4

De prestaties van alle drie deze zoekopdrachten variëren enorm, afhankelijk van de gebruikte indexeringsstrategie. Ik kon op mijn testdataset geen redelijke prestaties krijgen van de afhankelijke subqueryversie, ongeacht de gebruikte indexering.

De self-join-oplossing van Tim presteert zeer goed gezien de afzonderlijke indices van één kolom op de twee kolommen, maar presteert niet zo goed naarmate het aantal criteria toeneemt.

Hier zijn enkele basisstatistieken op mijn testtabel - 500k rijen - 147963 housing_ids met potentiële waarden voor facility_id tussen 1 en 9.

Dit zijn de indices die worden gebruikt voor het uitvoeren van al deze tests -

SHOW INDEXES FROM mytable;
+---------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+
| Table   | Non_unique | Key_name            | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type |
+---------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+
| mytable |          0 | UQ_housing_facility |            1 | housing_id  | A         |      500537 |     NULL | NULL   |      | BTREE      |
| mytable |          0 | UQ_housing_facility |            2 | facility_id | A         |      500537 |     NULL | NULL   |      | BTREE      |
| mytable |          0 | UQ_facility_housing |            1 | facility_id | A         |          12 |     NULL | NULL   |      | BTREE      |
| mytable |          0 | UQ_facility_housing |            2 | housing_id  | A         |      500537 |     NULL | NULL   |      | BTREE      |
| mytable |          1 | IX_housing          |            1 | housing_id  | A         |      500537 |     NULL | NULL   |      | BTREE      |
| mytable |          1 | IX_facility         |            1 | facility_id | A         |          12 |     NULL | NULL   |      | BTREE      |
+---------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+

De eerste geteste zoekopdracht is de afhankelijke subquery -

SELECT SQL_NO_CACHE DISTINCT housing_id
FROM mytable
WHERE housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=4)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=7);

17321 rows in set (9.15 sec)

+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
| id | select_type        | table   | type            | possible_keys                                                  | key                 | key_len | ref        | rows   | Extra                                 |
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
|  1 | PRIMARY            | mytable | range           | NULL                                                           | IX_housing          | 4       | NULL       | 500538 | Using where; Using index for group-by |
|  3 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | func,const |      1 | Using index; Using where              |
|  2 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | func,const |      1 | Using index; Using where              |
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+

SELECT SQL_NO_CACHE DISTINCT housing_id
FROM mytable
WHERE housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=1)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=3)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=4)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=7);

567 rows in set (9.30 sec)

+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
| id | select_type        | table   | type            | possible_keys                                                  | key                 | key_len | ref        | rows   | Extra                                 |
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
|  1 | PRIMARY            | mytable | range           | NULL                                                           | IX_housing          | 4       | NULL       | 500538 | Using where; Using index for group-by |
|  5 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | func,const |      1 | Using index; Using where              |
|  4 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | func,const |      1 | Using index; Using where              |
|  3 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | func,const |      1 | Using index; Using where              |
|  2 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | func,const |      1 | Using index; Using where              |
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+

De volgende is mijn versie met behulp van de GROUP BY ... HAVING COUNT ...

SELECT SQL_NO_CACHE housing_id
FROM mytable
WHERE facility_id IN (4,7)
GROUP BY housing_id
HAVING COUNT(DISTINCT facility_id) = 2;

17321 rows in set (0.79 sec)

+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
| id | select_type | table   | type  | possible_keys                   | key         | key_len | ref  | rows   | Extra                                    |
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
|  1 | SIMPLE      | mytable | range | UQ_facility_housing,IX_facility | IX_facility | 4       | NULL | 198646 | Using where; Using index; Using filesort |
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+

SELECT SQL_NO_CACHE housing_id
FROM mytable
WHERE facility_id IN (1,3,4,7)
GROUP BY housing_id
HAVING COUNT(DISTINCT facility_id) = 4;

567 rows in set (1.25 sec)

+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
| id | select_type | table   | type  | possible_keys                   | key         | key_len | ref  | rows   | Extra                                    |
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
|  1 | SIMPLE      | mytable | range | UQ_facility_housing,IX_facility | IX_facility | 4       | NULL | 407160 | Using where; Using index; Using filesort |
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+

En last but not least de zelf-join -

SELECT SQL_NO_CACHE a.housing_id
FROM mytable a
INNER JOIN mytable b
    ON a.housing_id = b.housing_id
WHERE a.facility_id = 4 AND b.facility_id = 7;

17321 rows in set (1.37 sec)

+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+-------------+
| id | select_type | table | type   | possible_keys                                                  | key                 | key_len | ref                     | rows  | Extra       |
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+-------------+
|  1 | SIMPLE      | b     | ref    | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | IX_facility         | 4       | const                   | 94598 | Using index |
|  1 | SIMPLE      | a     | eq_ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | test.b.housing_id,const |     1 | Using index |
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+-------------+

SELECT SQL_NO_CACHE a.housing_id
FROM mytable a
INNER JOIN mytable b
    ON a.housing_id = b.housing_id
INNER JOIN mytable c
    ON a.housing_id = c.housing_id
INNER JOIN mytable d
    ON a.housing_id = d.housing_id
WHERE a.facility_id = 1
AND b.facility_id = 3
AND c.facility_id = 4
AND d.facility_id = 7;

567 rows in set (1.64 sec)

+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+--------------------------+
| id | select_type | table | type   | possible_keys                                                  | key                 | key_len | ref                     | rows  | Extra                    |
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+--------------------------+
|  1 | SIMPLE      | b     | ref    | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | IX_facility         | 4       | const                   | 93782 | Using index              |
|  1 | SIMPLE      | d     | eq_ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | test.b.housing_id,const |     1 | Using index              |
|  1 | SIMPLE      | c     | eq_ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | test.b.housing_id,const |     1 | Using index              |
|  1 | SIMPLE      | a     | eq_ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8       | test.d.housing_id,const |     1 | Using where; Using index |
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+--------------------------+


  1. 5 interessante feiten over databasebeheersystemen

  2. Werk gegevens bij in de mysql-database door op een link te klikken

  3. SQL hoe groeperen op fiscaal kwartaal en jaar met een datumveld

  4. MySQL GEBRUIKER MAKEN met een variabele?