This may be obvious to most people who read this blog but I’m going to post it anyway.
Let’s suppose you have a requirement to match organ donors to potential recipients. Each recipient has a complex set of requirements. For recipient 555 it has to be a liver, and the blood type must be ABNeg and the donor must be less than 60 years old or could be less than 70 if he was a non drinker. You decide to express the criteria as a table of predicates and operators which can be constructed into a WHERE clause. For example, WHERE organ = ‘liver’ and blood_type = ‘AB-” and (age < 60 or (age < 70 and drinker = ‘No’ ))
When a new organ comes in, an organ record is written to the t_organ table and assigned a unique sequential numeric key.
Your program iterates through the the recipients table and tests each recipient record against the organ record to see if there is a match. The recipient record has 2 columns, recipient_id and criteria_nested_table. The criteria_nested_table column contains Boolean criteria and operators that will allow you to construct a where clause at run time.
Your pseudo code looks something like:
- Get new donor record
- For each recipient record
- construct the where_clause using the recipient’s criteria records
- Using dynamic SQL, execute recipient.where_clause against donor record.
- If matched, celebrate else try next recipient
A month later, your program is working perfectly and donors are being matched with recipients. Then, one day a doctor asks you, “Why didn’t recipient 123 match with donor record number 789″? You don’t know exactly which criteria cause the no-match situation. So, now you need to enhance the system.
You think about it for a while and conclude that, if you need to determine which criteria did not match, you will need to test each criteria independently. That means executing a where clause for each predicate.
Instead of SELECT ‘MATCHED’ from t_organ where key_id = 8373 and organ = ‘liver’ and blood_type = ‘AB-‘;
You will now need to generate and execute
SELECT ‘MATCHED’ from t_organ where key_id = 8373 and organ = ‘liver';
SELECT ‘MATCHED’ from t_organ where key_id = 8373 and blood_type = ‘AB-‘;
If all of the generated SELECT statements match, then the recipient matches the donor and you will be able to record which criteria, if any, failed.
But, the criteria is actually much more complex and sometimes involves 20 predicates. That would mean 20 SELECT statements to get the job done. Performance is already a problem, so clearly, this is never going to work. Then it hits you like a ton of lead bricks …. the CASE statement! What if I move my where_clause into a CASE statement.
The following example shows where I am going with this.
drop table t;
create table t (thekey integer,a integer,b integer,c integer, d varchar2(20));
insert into t values (10,1,2,3,’this’);
insert into t values (20,1,2,4,’that’);
insert into t values (30,1,3,5,’same’);
insert into t values (40,2,2,3,’other’);
— THE OLD WAY (do the same for thekey =20,30,40 also)
from t where thekey = 10 and a = 1 and b = 2 and (c = 3 or d = ‘same’);
if rec_not_found then no match
— THE NEW WAY (do the same for thekey =20,30,40 also)
case when a = 1 then ‘TRUE’ else ‘FALSE’ end a,
case when b = 2 then ‘TRUE’ else ‘FALSE’ end b,
case when c = 3 or d = ‘same’ then ‘TRUE’ else ‘FALSE’ end c_or_d,
case when a = 1 and b = 2 and (c = 3 or d = ‘same’) then ‘TRUE’ else ‘FALSE’ end overall
where thekey = 10;
If overall = ‘FALSE’ then no match.
So, now I can easily record which predicates were true and which were false without making multiple trips to the database. Note that I don’t try to split out the OR’d criteria, but treat it as a single predicate. Fortunately, this is good enough.