Friday, February 10, 2012

Why is my left outer join acting like an inner join?

The short answer is that when you specify a column from TableB in the where clause this can happen because of how nulls are handled differently between the on and where clauses. To better understand the issue, let’s look at an example.

Let’s assume you have two tables that you are joining as shown below.

TableA
ID
PartNumber
TableB
ID
PartNumber
DateCreated

 

Assume TableA has 3 rows of data and TableB has 5.

TableA rows:

ID PartNumber
1 P1234
2 P5678
3 P9012

 

TableB rows:

ID PartNumber DateCreated
1 P1234 2012-02-10
2 P1234 2012-02-09
3 P5678 2012-02-09
4 P5678 2012-02-10
5 P9876 2012-02-10

 

select * from TableA as A left outer join TableB as B on (A.PartNumber= B.PartNumber)

This would give the result

A.ID A.PartNumber B.ID B.PartNumber B.DateCreated
1 P1234 1 P1234 2012-02-10
1 P1234 2 P1234 2012-02-09
2 P5678 3 P5678 2012-02-09
2 P5678 4 P5678 2012-02-10
3 P9012 null null null

 

No problems so far. Everything looks good and it straight forward.

Now we decide that actually we should only be considering records in TableB that have a DateCreated = ‘2012-02-10’.

This is where it is easy to make a mistake.

 

The WRONG Way

select * from TableA as A left outer join TableB as B on (A.PartNumber= B.PartNumber)

where DateCreated = ‘2012-02-10’

This will work just like this inner join

select * from TableA as A inner join TableB as B on (A.PartNumber= B.PartNumber)

where DateCreated = ‘2012-02-10’

Both return:

A.ID A.PartNumber B.ID B.PartNumber B.DateCreated
1 P1234 1 P1234 2012-02-10
2 P5678 4 P5678 2012-02-10

Why?

Think about what you have asked SQL Server to do. You have said select all rows from TableA and any matching ones in TableB, but you have also said only show records in TableB that have the DateCreated = ‘2012-02-10’. This means that you have told it to remove any rows where DateCreated is null. Note that one row has a null for DateCreated because there is no matching record in TableB for part number P9012.

 

 

The RIGHT Ways

There are at least three ways to solve this problem.

Solution 1: Move where clause into the on clause

select * from TableA as A left outer join TableB as B on (A.PartNumber= B.PartNumber and DateCreated = ‘2012-02-10’)

I like this one because it is easy to read, though the implications of moving DateCreated to the where clause isn’t quite as clear IMHO.

 

Solution 2: Use an embedded select statement

select * from TableA as A left outer join (select * from TableB where DateCreated = ‘2012-02-10’) as B on (A.PartNumber= B.PartNumber)

I like this one because it is clear what is being selected in TableB, but it is a bit messier reading IMHO.

 

Solution 3: Add check for null in where clause

select * from TableA as A inner join TableB as B on (A.PartNumber= B.PartNumber)

where (DateCreated = ‘2012-02-10’ or DateCreated is null)

This is my least favorite because it uses an OR in the where clause which can have a big negative impact on your performance and there is no need for it. This is particularly true if it was something like DateCreated = (select max(DateCreated from TableB)).

 

Conclusion

I suspect I may have created this very issue many times over the years and not even realized it when doing quick adhoc queries that I didn’t really validate my data very well. Interestingly the results may seem reasonable if you are looking at a lot of rows and are expecting most cases to have matching rows in TableA. The easiest way to tell if you have this problem with your query is to check the number of rows you are getting is the same as the table on the left (TableA in this case). If you have less rows then you may have this problem (or some other problem).

1 comment:

Zachs said...

Hi, I have a more complex issue :

===================================================================================
FIRST QUERY
===================================================================================
SELECT
TAB1.[CDNO], TAB1.[IDTNO], IIF(ISNULL([TAB1.ECONIV3]), [TAB1.ECONIV2], 'none') AS [CONTRIB]
FROM TAB1 WHERE ( TAB1.[UNNN] <> "WATER" AND TAB1.[CCCCPP] <> "SALT")
;

===================================================================================
SECOND QUERY
===================================================================================

SELECT
TAB1.[CDNO], TAB1.[IDTNO], IIF(ISNULL([TAB1.ECONIV3]), [TAB1.ECONIV2], 'none') AS [CONTRIB]
, IIF(ISNULL([TAB2.ENTCOM3]), [TAB1.ENTCOM2],'none') AS [CONCOM] , IIF(ISNULL([TAB2.ENTSTI3]), [TAB1.ENTSTI2],'none') AS [CONSTI]
FROM
TAB1
LEFT OUTER JOIN TAB2
ON (TAB1.[IDTNO] = TAB2.[PTDINO] AND (IIF(ISNULL([TAB1.ECONIV3]), [TAB1.ECONIV2], 'none') = IIF(ISNULL([TAB2.ENTSTI3]), [TAB1.ENTSTI2],'none'))
WHERE ( TAB1.[UNNN] <> "WATER" AND TAB1.[CCCCPP] <> "SALT")
;
===================================================================================

TAB1 was has 8 times the number of rows of TAB2.
There is something wrong with the second query has it returns less rows than the first query.
So my issue is in the second query where I'm trying to have all the rows of the selected column of TAB1 with in addition the two columns from
TAB2 with a match on the 2 OUTER JOIN conditions.

Do you have any clue ?
Thanks in advance.