Here is a solution based on nested subqueries. First, I added a few more rows to cover some additional cases. For example, transaction 10 should not be cancelled by transaction 12, because transaction 11 comes in between.
> select * from transactions order by date_time;
+----+---------+------+---------------------+--------+
| id | account | type | date_time | amount |
+----+---------+------+---------------------+--------+
| 1 | 1 | R | 2012-01-01 10:01:00 | 1000 |
| 2 | 3 | R | 2012-01-02 12:53:10 | 1500 |
| 3 | 3 | A | 2012-01-03 13:10:01 | -1500 |
| 4 | 2 | R | 2012-01-03 17:56:00 | 2000 |
| 5 | 1 | R | 2012-01-04 12:30:01 | 1000 |
| 6 | 2 | A | 2012-01-04 13:23:01 | -2000 |
| 7 | 3 | R | 2012-01-04 15:13:10 | 3000 |
| 8 | 3 | R | 2012-01-05 12:12:00 | 1250 |
| 9 | 3 | A | 2012-01-06 17:24:01 | -1250 |
| 10 | 3 | R | 2012-01-07 00:00:00 | 1250 |
| 11 | 3 | R | 2012-01-07 05:00:00 | 4000 |
| 12 | 3 | A | 2012-01-08 00:00:00 | -1250 |
| 14 | 2 | R | 2012-01-09 00:00:00 | 2000 |
| 13 | 3 | A | 2012-01-10 00:00:00 | -1500 |
| 15 | 2 | A | 2012-01-11 04:00:00 | -2000 |
| 16 | 2 | R | 2012-01-12 00:00:00 | 5000 |
+----+---------+------+---------------------+--------+
16 rows in set (0.00 sec)
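To reproduce the sample data, a setup script along these lines should work. The column names and values are taken from the output above; the column types and the primary key on id are assumptions, since the original table definition is not shown.

```sql
-- Hypothetical schema: only the column names come from the output above,
-- the types and the PRIMARY KEY are assumptions.
CREATE TABLE transactions (
    id        INT PRIMARY KEY,
    account   INT NOT NULL,
    type      CHAR(1) NOT NULL,   -- 'R' or 'A' in the sample data
    date_time DATETIME NOT NULL,
    amount    INT NOT NULL
);

-- The 16 sample rows, in the same order as the listing above.
INSERT INTO transactions (id, account, type, date_time, amount) VALUES
    ( 1, 1, 'R', '2012-01-01 10:01:00',  1000),
    ( 2, 3, 'R', '2012-01-02 12:53:10',  1500),
    ( 3, 3, 'A', '2012-01-03 13:10:01', -1500),
    ( 4, 2, 'R', '2012-01-03 17:56:00',  2000),
    ( 5, 1, 'R', '2012-01-04 12:30:01',  1000),
    ( 6, 2, 'A', '2012-01-04 13:23:01', -2000),
    ( 7, 3, 'R', '2012-01-04 15:13:10',  3000),
    ( 8, 3, 'R', '2012-01-05 12:12:00',  1250),
    ( 9, 3, 'A', '2012-01-06 17:24:01', -1250),
    (10, 3, 'R', '2012-01-07 00:00:00',  1250),
    (11, 3, 'R', '2012-01-07 05:00:00',  4000),
    (12, 3, 'A', '2012-01-08 00:00:00', -1250),
    (14, 2, 'R', '2012-01-09 00:00:00',  2000),
    (13, 3, 'A', '2012-01-10 00:00:00', -1500),
    (15, 2, 'A', '2012-01-11 04:00:00', -2000),
    (16, 2, 'R', '2012-01-12 00:00:00',  5000);
```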
First, build a query that fetches, for each transaction, "the date of the most recent earlier transaction on the same account":
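SELECT t2.*,
       MAX(t1.date_time) AS prev_date
FROM transactions t1
JOIN transactions t2
  ON (t1.account = t2.account
      AND t2.date_time > t1.date_time)
-- Grouping by id (assumed to be the primary key) keeps SELECT t2.* valid
-- when ONLY_FULL_GROUP_BY is enabled; the result is the same.
GROUP BY t2.id
ORDER BY t2.date_time;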
+----+---------+------+---------------------+--------+---------------------+
| id | account | type | date_time | amount | prev_date |
+----+---------+------+---------------------+--------+---------------------+
| 3 | 3 | A | 2012-01-03 13:10:01 | -1500 | 2012-01-02 12:53:10 |
| 5 | 1 | R | 2012-01-04 12:30:01 | 1000 | 2012-01-01 10:01:00 |
| 6 | 2 | A | 2012-01-04 13:23:01 | -2000 | 2012-01-03 17:56:00 |
| 7 | 3 | R | 2012-01-04 15:13:10 | 3000 | 2012-01-03 13:10:01 |
| 8 | 3 | R | 2012-01-05 12:12:00 | 1250 | 2012-01-04 15:13:10 |
| 9 | 3 | A | 2012-01-06 17:24:01 | -1250 | 2012-01-05 12:12:00 |
| 10 | 3 | R | 2012-01-07 00:00:00 | 1250 | 2012-01-06 17:24:01 |
| 11 | 3 | R | 2012-01-07 05:00:00 | 4000 | 2012-01-07 00:00:00 |
| 12 | 3 | A | 2012-01-08 00:00:00 | -1250 | 2012-01-07 05:00:00 |
| 14 | 2 | R | 2012-01-09 00:00:00 | 2000 | 2012-01-04 13:23:01 |
| 13 | 3 | A | 2012-01-10 00:00:00 | -1500 | 2012-01-08 00:00:00 |
| 15 | 2 | A | 2012-01-11 04:00:00 | -2000 | 2012-01-09 00:00:00 |
| 16 | 2 | R | 2012-01-12 00:00:00 | 5000 | 2012-01-11 04:00:00 |
+----+---------+------+---------------------+--------+---------------------+
13 rows in set (0.00 sec)
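If window functions are available (MySQL 8.0 or later, which is an assumption about the server and not something the self-join needs), the same prev_date column can probably be obtained with LAG. A minimal sketch:

```sql
-- Sketch for MySQL 8.0+: LAG looks one row back within each account,
-- ordered by date_time, and returns NULL when there is no earlier row.
SELECT t.*,
       LAG(t.date_time) OVER (PARTITION BY t.account
                              ORDER BY t.date_time) AS prev_date
FROM transactions t
ORDER BY t.date_time;
```

Unlike the self-join, this version returns all 16 rows: the first transaction on each account appears with prev_date set to NULL instead of being dropped.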
Use that as a subquery to get each transaction and its predecessor on the same row. Then filter down to the transactions we are interested in, namely 'A' transactions whose predecessor is an 'R' transaction that they cancel exactly:
SELECT
    t3.*, transactions.*
FROM
    transactions
JOIN
    (SELECT t2.*,
            MAX(t1.date_time) AS prev_date
     FROM transactions t1
     JOIN transactions t2
       ON (t1.account = t2.account
           AND t2.date_time > t1.date_time)
     -- Grouping by id (assumed to be the primary key) keeps SELECT t2.*
     -- valid when ONLY_FULL_GROUP_BY is enabled.
     GROUP BY t2.id) t3
  ON t3.account = transactions.account
 AND t3.prev_date = transactions.date_time
 AND t3.type = 'A'
 AND transactions.type = 'R'
 AND t3.amount + transactions.amount = 0
ORDER BY t3.date_time;
+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
| id | account | type | date_time | amount | prev_date | id | account | type | date_time | amount |
+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
| 3 | 3 | A | 2012-01-03 13:10:01 | -1500 | 2012-01-02 12:53:10 | 2 | 3 | R | 2012-01-02 12:53:10 | 1500 |
| 6 | 2 | A | 2012-01-04 13:23:01 | -2000 | 2012-01-03 17:56:00 | 4 | 2 | R | 2012-01-03 17:56:00 | 2000 |
| 9 | 3 | A | 2012-01-06 17:24:01 | -1250 | 2012-01-05 12:12:00 | 8 | 3 | R | 2012-01-05 12:12:00 | 1250 |
| 15 | 2 | A | 2012-01-11 04:00:00 | -2000 | 2012-01-09 00:00:00 | 14 | 2 | R | 2012-01-09 00:00:00 | 2000 |
+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
4 rows in set (0.00 sec)
The result above shows that we are almost there: we have identified the unwanted transactions. Using a LEFT JOIN, we can filter these out of the full set of transactions:
SELECT
    transactions.*
FROM
    transactions
LEFT JOIN
    -- t4: ids of the 'R' transactions that are cancelled by the next 'A'
    (SELECT
         transactions.id
     FROM
         transactions
     JOIN
         (SELECT t2.*,
                 MAX(t1.date_time) AS prev_date
          FROM transactions t1
          JOIN transactions t2
            ON (t1.account = t2.account
                AND t2.date_time > t1.date_time)
          -- Grouping by id (assumed to be the primary key) keeps SELECT t2.*
          -- valid when ONLY_FULL_GROUP_BY is enabled.
          GROUP BY t2.id) t3
       ON t3.account = transactions.account
      AND t3.prev_date = transactions.date_time
      AND t3.type = 'A'
      AND transactions.type = 'R'
      AND t3.amount + transactions.amount = 0) t4
USING(id)
-- Keep only the 'R' rows that have no match in t4
WHERE t4.id IS NULL
  AND transactions.type = 'R'
ORDER BY transactions.date_time;
+----+---------+------+---------------------+--------+
| id | account | type | date_time | amount |
+----+---------+------+---------------------+--------+
| 1 | 1 | R | 2012-01-01 10:01:00 | 1000 |
| 5 | 1 | R | 2012-01-04 12:30:01 | 1000 |
| 7 | 3 | R | 2012-01-04 15:13:10 | 3000 |
| 10 | 3 | R | 2012-01-07 00:00:00 | 1250 |
| 11 | 3 | R | 2012-01-07 05:00:00 | 4000 |
| 16 | 2 | R | 2012-01-12 00:00:00 | 5000 |
+----+---------+------+---------------------+--------+
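As a cross-check, the same filter can probably be written more compactly with window functions. This is a sketch that assumes MySQL 8.0 or later, not a replacement for the query above: instead of looking back from each 'A' row, it looks forward from each 'R' row with LEAD and drops the row when the next transaction on the same account is an 'A' with the opposite amount. The aliases x, w, next_type and next_amount are names introduced here for illustration.

```sql
-- Sketch for MySQL 8.0+: keep 'R' rows unless the very next transaction
-- on the same account is an 'A' that cancels them exactly.
SELECT x.id, x.account, x.type, x.date_time, x.amount
FROM (
    SELECT t.*,
           LEAD(t.type)   OVER w AS next_type,
           LEAD(t.amount) OVER w AS next_amount
    FROM transactions t
    WINDOW w AS (PARTITION BY t.account ORDER BY t.date_time)
) x
WHERE x.type = 'R'
  -- <=> is MySQL's NULL-safe equality, so the last row of each account
  -- (where next_type and next_amount are NULL) is kept rather than lost.
  AND NOT (x.next_type <=> 'A' AND x.next_amount <=> -x.amount)
ORDER BY x.date_time;
```

On the sample data this should return the same six rows as the LEFT JOIN version (ids 1, 5, 7, 10, 11 and 16). Transaction 10 is kept because the next transaction on account 3 is transaction 11, not the cancelling transaction 12.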