Monday, June 22, 2009

Tricky Updates: Weird, Wacky, and Wonderful UPDATE Syntax Variations in SQL Server

If you’ve been working with SQL Server for any length of time, you’ve probably seen a number of different variations of the UPDATE command. From the miraculous to the mundane, the UPDATE command has as many variants in TSQL as….something with a lot of varieties of stuff has things. Damn, it’s late.

Anyway, today’s blog post is more like a public service announcement. It seems that all too many developers are still dangerously unaware of the ability to create very ambiguous UPDATE statements in TSQL.

What am I blithering on about? Well, here’s a couple of simple update statements (using AdventureWorks2008):

--Update the comment field on ALL orders

UPDATE Sales.SalesOrderHeader

SET Comment = 'Unqualified Update'



--Update the comment field on all orders worth more than 10k

UPDATE Sales.SalesOrderHeader

SET Comment = 'Qualified Update - High Dollar Orders'

WHERE TotalDue > 10000.00

Straightforward, right? The first one sets the “Comment” field to “Unqualified Update” for all rows. That’s a bit scary in and of itself, and more than one person has suggested a compiler option to warn about performing unqualified updates. Unqualified updates are today’s topic though. The second script performs a qualified update, setting Comment to a new value for all orders worth more than 10k.

Now how about this?

--Update the comment for orders without a PO Number

UPDATE Sales.SalesOrderHeader

SET Comment = 'This order is for Territory 1'

FROM Sales.SalesOrderHeader

JOIN Sales.Customer ON Sales.SalesOrderHeader.CustomerID = Sales.Customer.CustomerID

WHERE Sales.Customer.TerritoryID = 1

Now we’ve added in a FROM clause, and we’re joining in another table (Sales.Customer) to facilitate updating the comment for customers from a single territory only. Still no rocket science, right? Good.

Howsabout this one?

--Update the comment for orders that contain red products

UPDATE Sales.SalesOrderHeader

SET Comment = 'This order contains red products!'

FROM Sales.SalesOrderDetail sod

JOIN Production.Product prod ON sod.ProductID = prod.ProductID

WHERE prod.Color = 'Red'

Same thing, right? We’re joining to the SalesOrderDetail table, to update the comments for orders containing red products. Or are we? If you’re following along at home, try running the update. Note how many rows were updated (I get 31465 rows affected). Now see how many records are in Sales.SalesOrderHeader. Same number, right? Maybe red is really popular.

Maybe not. I have about 6k orders with red products on them. So what happened? Notice that I didn’t reference Sales.SalesOrderHeader in the FROM clause? Ruh-roh. We effectively coded an all-or-nothing update – if ANY of the order had red products on them, they ALL got updated. And that’s where we get into trouble. An UPDATE doesn’t really demand anything of the FROM clause that a SELECT statement doesn’t demand. Technically, you don’t need to specify the table being updated in the FROM clause, as I just demonstrated.

There are a great many examples of this kind of ambiguous UPDATE syntax. Sometimes the unrestricted UPDATE syntax can be a boon, but all too often it’s a source of error, when developers don’t realize that they’ve made an omission. MVPs Adam Machanic, Steve Kass, Itzik Ben-Gan, and Erland Sommarskog seem to be particularly skilled at finding these kinds of disasters in the making – each of them has blogged or commented about them at great length in the past.

One of the aforementioned fine gentlemen (and I honestly can’t remember who – I think it was Adam) suggested a method for preventing this kind of issue – to always use aliases in the update statement. Here’s an example:

--Update the comment for orders that contain red products

UPDATE soh

SET soh.Comment = 'This order contains red products!'

FROM Sales.SalesOrderHeader soh

JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID

JOIN Production.Product prod ON sod.ProductID = prod.ProductID

WHERE prod.Color = 'Red'

By aliasing Sales.SalesOrderHeader, I can “update the alias”, rather than “updating the table”. In “updating the alias”, I make it impossible to omit the table being updated from the FROM clause.

I now use the alias syntax exclusively, when coding an UPDATE with a FROM clause. Unless there is a very good reason not to do so, I recommend that you do the same – unless of course, you really like red products ;-)

4 comments:

Brad Schulz said...

Thanks for that, Aaron... I was surprised that SQL would let you get away with UPDATEing a table that had nothing to do with the FROM clause. Luckily I've never been bitten by this. I think your exclusive use of the alias syntax is a good strategy which I will make a mental note to adopt.

Uri Dimant said...

Aaron
How is about using CTE?
WITH cte
AS
(
SELECT columns
Sales.SalesOrderDetail sodJOIN Production.Product prod ON sod.ProductID = prod.ProductIDWHERE prod.Color = 'Red'
)UPDATE cte SET Comment = 'This order contains red products!'

Pinal Dave said...

Very good post Aron!

I have been using CTE method which URI demonstrated below.

SQL said...

here is one more for you

CREATE TABLE #temp(id INT)
INSERT #temp VALUES(1)
INSERT #temp VALUES(2)
INSERT #temp VALUES(3)

DECLARE @d INT
SELECT @d = 5

UPDATE #temp
SET @d = id
WHERE id %2 =~-1
AND 2|2-2&2-2^2=~3

SELECT @d --what will this return?