CASE #1: Predicates on the non-preserved table attributesCREATE TABLE Loans (
loan_nbr INT NOT NULL,
customer_nbr INT NOT NULL,
loan_date DATETIME NOT NULL,
loan_amount DECIMAL(15, 2) NOT NULL,
loan_type CHAR(1) NOT NULL,
CONSTRAINT ck_loan_type
CHECK (loan_type IN ('P', 'B')), -- P=Personal; B=Business
CONSTRAINT pk_loans
PRIMARY KEY (loan_nbr));
INSERT INTO Loans
(loan_nbr, customer_nbr, loan_date, loan_amount, loan_type)
VALUES(1, 1, '20080101', 1500.00, 'P');
INSERT INTO Loans
(loan_nbr, customer_nbr, loan_date, loan_amount, loan_type)
VALUES(2, 2, '20080215', 1000.00, 'P');
INSERT INTO Loans
(loan_nbr, customer_nbr, loan_date, loan_amount, loan_type)
VALUES(3, 1, '20080311', 5000.00, 'B');
INSERT INTO Loans
(loan_nbr, customer_nbr, loan_date, loan_amount, loan_type)
VALUES(4, 3, '20080312', 2000.00, 'P');
INSERT INTO Loans
(loan_nbr, customer_nbr, loan_date, loan_amount, loan_type)
VALUES(5, 4, '20080325', 1200.00, 'P');
INSERT INTO Loans
(loan_nbr, customer_nbr, loan_date, loan_amount, loan_type)
VALUES(6, 3, '20080327', 4000.00, 'B');
INSERT INTO Loans
(loan_nbr, customer_nbr, loan_date, loan_amount, loan_type)
VALUES(7, 5, '20080410', 3500.00, 'B');
INSERT INTO Loans
(loan_nbr, customer_nbr, loan_date, loan_amount, loan_type)
VALUES(8, 2, '20080412', 2000.00, 'P');
CREATE TABLE Customers (
customer_nbr INT NOT NULL,
customer_name VARCHAR(35),
PRIMARY KEY (customer_nbr));
INSERT INTO Customers (customer_nbr, customer_name)
VALUES(1, 'Peter Brown');
INSERT INTO Customers (customer_nbr, customer_name)
VALUES(2, 'Jeff Davis');
INSERT INTO Customers (customer_nbr, customer_name)
VALUES(3, 'Kevin Fox');
INSERT INTO Customers (customer_nbr, customer_name)
VALUES(4, 'Donna Louis');
INSERT INTO Customers (customer_nbr, customer_name)
VALUES(5, 'Greg Barrow');
INSERT INTO Customers (customer_nbr, customer_name)
VALUES(6, 'Richard Douglas');
-- Add foreign key for Loans
ALTER TABLE Loans
ADD CONSTRAINT fk_loans_customers
FOREIGN KEY (customer_nbr)
REFERENCES Customers(customer_nbr);
However, the results do not seem correct:SELECT C.customer_name,
SUM(COALESCE(L.loan_amount, 0)) AS total
FROM Customers AS C
LEFT OUTER JOIN Loans AS L
ON C.customer_nbr = L.customer_nbr
WHERE L.loan_type = 'P'
GROUP BY C.customer_name
ORDER BY customer_name;
Customers Greg and Richard are missing from the output. To understand the problems let's run this step by step:customer_name total
---------------- --------
Donna Louis 1200.00
Jeff Davis 3000.00
Kevin Fox 2000.00
Peter Brown 1500.00
This returns Cartesian product (every possible combination of a row from the Customers table and a row from the Loans table).SELECT C.customer_name, L.loan_amount, L.loan_type
FROM Customers AS C
CROSS JOIN Loans AS L;
This results in the following output:SELECT C.customer_name, L.loan_amount, L.loan_type
FROM Customers AS C
INNER JOIN Loans AS L
ON C.customer_nbr = L.customer_nbr;
At this stage only rows that match based on the predicate for customer match are included.customer_name loan_amount loan_type
--------------- ------------ ---------
Peter Brown 1500.00 P
Jeff Davis 1000.00 P
Peter Brown 5000.00 B
Kevin Fox 2000.00 P
Donna Louis 1200.00 P
Kevin Fox 4000.00 B
Greg Barrow 3500.00 B
Jeff Davis 2000.00 P
The query returns this result set:SELECT C.customer_name, L.loan_amount, L.loan_type
FROM Customers AS C
LEFT OUTER JOIN Loans AS L
ON C.customer_nbr = L.customer_nbr;
customer_name loan_amount loan_type
--------------- ------------- ---------
Peter Brown 1500.00 P
Peter Brown 5000.00 B
Jeff Davis 1000.00 P
Jeff Davis 2000.00 P
Kevin Fox 2000.00 P
Kevin Fox 4000.00 B
Donna Louis 1200.00 P
Greg Barrow 3500.00 B
Richard Douglas NULL NULL
This result set is the base for the results in the first query attempt:SELECT C.customer_name, L.loan_amount, L.loan_type
FROM Customers AS C
LEFT OUTER JOIN Loans AS L
ON C.customer_nbr = L.customer_nbr
WHERE L.loan_type = 'P';
Now it is very clear that the predicate in the WHERE clause filters the NULL for loan type for customer Richard, and customer Greg is excluded because he has only business loans. Incorrectly placing the predicate in the WHERE clause turns the outer join to inner join.customer_name loan_amount loan_type
--------------- ------------ ---------
Peter Brown 1500.00 P
Jeff Davis 1000.00 P
Kevin Fox 2000.00 P
Donna Louis 1200.00 P
Jeff Davis 2000.00 P
SELECT C.customer_name,
SUM(COALESCE(L.loan_amount, 0)) AS total
FROM Customers AS C
LEFT OUTER JOIN Loans AS L
ON C.customer_nbr = L.customer_nbr
AND L.loan_type = 'P'
GROUP BY C.customer_name
ORDER BY customer_name;
customer_name total
---------------- --------
Donna Louis 1200.00
Greg Barrow 0.00
Jeff Davis 3000.00
Kevin Fox 2000.00
Peter Brown 1500.00
Richard Douglas 0.00
SELECT C.customer_name,
SUM(COALESCE(L.loan_amount, 0)) AS total
FROM Customers AS C
LEFT OUTER JOIN Loans AS L
ON C.customer_nbr = L.customer_nbr
AND C.customer_name = 'Richard Douglas'
GROUP BY C.customer_name
ORDER BY customer_name;
customer_name total
--------------- -----
Donna Louis 0.00
Greg Barrow 0.00
Jeff Davis 0.00
Kevin Fox 0.00
Peter Brown 0.00
Richard Douglas 0.00
SELECT C.customer_name,
SUM(COALESCE(L.loan_amount, 0)) AS total
FROM Customers AS C
LEFT OUTER JOIN Loans AS L
ON C.customer_nbr = L.customer_nbr
WHERE C.customer_name = 'Richard Douglas'
GROUP BY C.customer_name
ORDER BY customer_name;
customer_name total
---------------- -----
Richard Douglas 0.00
Labels: joins, t-sql programming