Consider the following set of SQL data:
Id | Value |
---|---|
10 | 1142 |
11 | 1142 |
12 | 1142 |
13 | 1142 |
14 | 1185 |
15 | 1185 |
16 | 1185 |
Column Id
is an auto-increment primary key. It is used to uniquely identify records as well
as to specify row order. Column Value
is used to group together consecutive records.
The data above contain two groups:
(10, 1142), (11, 1142), (12, 1142), (13, 1142)
is the first group(14, 1185), (15, 1185), (16, 1185)
is the second group
Our aim is to update Value
field to match the Id
value of the first record of the same group.
Hence, after the UPDATE
takes place the data will look like this:
Id | Value |
---|---|
10 | 10 |
11 | 10 |
12 | 10 |
13 | 10 |
14 | 14 |
15 | 14 |
16 | 14 |
So, we essentially want to select the first value of each group and use this value to do the update. SQL Server (starting with version 2012) implements a window function that does exactly this. It is called FIRST_VALUE. Using it is as simple as:
SELECT Id, Value,
FIRST_VALUE(Id) OVER (PARTITION BY Val ORDER BY Id) AS newValue
FROM mytable
The query above returns in newValue
the required new value for Value
column.
Wrapping the above query in a
Common Table Expression
(CTE) gives us easily the UPDATE
:
;WITH ToUpdate AS (
SELECT Id, Val,
FIRST_VALUE(Id) OVER (PARTITION BY Val ORDER BY Id) AS newValue
FROM mytable
)
UPDATE ToUpdate
SET Val = newValue
What happens here is that the UPDATE
is performed on the CTE, but in reality it is propagated
to the actual records in the database.
Unfortunately FIRST_VALUE
isn't available in SQL Server versions 2008/2005. In this case we can
use the following query:
;WITH CTE_Rn AS (
SELECT Id, Val,
ROW_NUMBER() OVER (PARTITION BY Val ORDER BY Id) AS rn
FROM mytable
), ToUpdate AS (
SELECT t1.Id, t2.Val, t2.Id AS newValue
FROM mytable AS t1
JOIN CTE_Rn AS t2 ON t1.Val = t2.Val AND t2.rn = 1
)
UPDATE ToUpdate
SET Val = newValue