UPSERT and More with MERGE

One very common requirement is to merge source data to a target table. For example, merge the daily changes to accounts data from branch office table to central accounts table. On SQL Server version 2005 and prior this was accomplished using separate INSERT and UPDATE statements. This involved checking if key column exists to perform update and insert if not, or attempt an update first and then if not rows were affected perform insert. Not to mention if we have to check if account is missing from the branch office data and needs to be deleted from the central accounts table. That way we end up with multiple (sometimes complex) statements to implement one transaction, accessing both source and target tables multiple times.

SQL Server 2008 offers a lot more elegant way using the MERGE statement (MERGE is supported by ANSI SQL). It makes data merging very simple and elegant, as well as efficient.

Here is the base syntax for MERGE:

[ WITH <common_table_expression> [,...n] ]

MERGE

[ TOP ( expression ) [ PERCENT ] ]

[ INTO ] target_table [ [ AS ] table_alias ]

        [ WITH ( <merge_hint> ) ]

USING <table_source>

ON <search_condition>

[ WHEN MATCHED [ AND <search_condition> ]

        THEN <merge_matched> ]

[ WHEN [TARGET] NOT MATCHED [ AND <search_condition> ]

        THEN <merge_not_matched> ]

[ WHEN SOURCE NOT MATCHED [ AND <search_condition> ]

        THEN <merge_matched> ]

<output_clause>

[ OPTION ( <query_hint> [ ,...n ] ) ]   

;



Below is the initial setup for the source and target tables.

-- Static table as target for merging data

CREATE TABLE CentralOfficeAccounts (

 account_nbr INT PRIMARY KEY,

 company_name VARCHAR(35),

 primary_contact VARCHAR(35),

 contact_phone VARCHAR(12));

 

-- Dynamic table with daily updates to be merged

CREATE TABLE BranchOfficeAccounts (

 account_nbr INT PRIMARY KEY,

 company_name VARCHAR(35),

 primary_contact VARCHAR(35),

 contact_phone VARCHAR(12));

 

-- Sample centrall office static data

INSERT INTO CentralOfficeAccounts

VALUES (1, 'Bayside Motors', 'Jim Smith', '902-203-1234'),

       (2, 'Dallas Industries', 'Joe Doe', '301-663-9134'),

       (3, 'Sky Computer Systems', 'Jane Brown', '201-943-6053');

 

-- Daily updated branch office data

INSERT INTO BranchOfficeAccounts

VALUES (2, 'Dallas Industries, Inc.', 'Rick Gross', '301-663-9134'), -- changed

       (3, 'Sky Computer Systems', 'Jane Brown', '201-943-6053'),    -- same

       (4, 'Auto Insurance Co.', 'Chris Jefferson', '313-601-6201'); -- new



Here is how MERGE can be used to perform an update on existing accounts and insert of new accounts.

-- Update existing and add missing

MERGE INTO CentralOfficeAccounts AS C    -- Target

USING BranchOfficeAccounts AS B          -- Source

   ON C.account_nbr = B.account_nbr

WHEN MATCHED THEN                        -- On match update

    UPDATE SET C.company_name = B.company_name,

                C.primary_contact = B.primary_contact,

                C.contact_phone = B.contact_phone

WHEN NOT MATCHED THEN                    -- Add missing

    INSERT (account_nbr, company_name, primary_contact, contact_phone)

    VALUES (B.account_nbr, B.company_name, B.primary_contact, B.contact_phone);

 

SELECT account_nbr, company_name, primary_contact, contact_phone

FROM CentralOfficeAccounts;

 

/*

 

-- Results:

 

account_nbr company_name             primary_contact contact_phone

----------- ------------------------ ---------------- -------------

1           Bayside Motors           Jim Smith        902-203-1234

2           Dallas Industries, Inc. Rick Gross       301-663-9134

3           Sky Computer Systems     Jane Brown       201-943-6053

4           Auto Insurance Co.       Chris Jefferson 313-601-6201

 

*/



Next step is to enhance the statement including predicates to check and update only accounts that have changed. The only difference here is the additional conditions in the MATCHED clause, results will be the same as before.

-- Update existing that changed and add missing

-- Use of predicates

MERGE INTO CentralOfficeAccounts AS C   -- Target

USING BranchOfficeAccounts AS B         -- Source

   ON C.account_nbr = B.account_nbr

WHEN MATCHED                            -- On match update

 AND (C.company_name <> B.company_name  -- Additional search conditions

   OR C.primary_contact <> B.primary_contact

   OR C.contact_phone <> B.contact_phone) THEN                       

    UPDATE SET C.company_name = B.company_name,

                C.primary_contact = B.primary_contact,

                C.contact_phone = B.contact_phone

WHEN NOT MATCHED THEN                   -- Add missing

    INSERT (account_nbr, company_name, primary_contact, contact_phone)

    VALUES (B.account_nbr, B.company_name, B.primary_contact, B.contact_phone);



Going one step further, let's add the option to delete accounts that do not exist in the source table from the target table.

--Update existing that changed and add missing, delete missing in source

MERGE INTO CentralOfficeAccounts AS C   -- Target

USING BranchOfficeAccounts AS B         -- Source

   ON C.account_nbr = B.account_nbr

WHEN MATCHED                            -- On match update

 AND (C.company_name <> B.company_name  -- Additional search conditions

   OR C.primary_contact <> B.primary_contact

   OR C.contact_phone <> B.contact_phone) THEN                       

    UPDATE SET C.company_name = B.company_name,

                C.primary_contact = B.primary_contact,

                C.contact_phone = B.contact_phone

WHEN NOT MATCHED THEN                   -- Add missing

    INSERT (account_nbr, company_name, primary_contact, contact_phone)

    VALUES (B.account_nbr, B.company_name, B.primary_contact, B.contact_phone)

WHEN SOURCE NOT MATCHED THEN            -- Delete missing from source

    DELETE;

 

SELECT account_nbr, company_name, primary_contact, contact_phone

FROM CentralOfficeAccounts;

 

/*

 

-- Results:

 

account_nbr company_name             primary_contact contact_phone

----------- ------------------------ ---------------- -------------

2           Dallas Industries, Inc. Rick Gross       301-663-9134

3           Sky Computer Systems     Jane Brown       201-943-6053

4           Auto Insurance Co.       Chris Jefferson 313-601-6201

 

*/



There is a lot more to MERGE, but this just shows the tremendous power it provides for merging data.

Note:

This code has been tested with SQL Server 2008 CTP 6 (February 2008). As of SQL Server 2008 Release Candidate 0 (June 2008) the clause "WHEN SOURCE NOT MATCHED" has been replaced with "WHEN NOT MATCHED BY SOURCE". This makes the wording clearer. It is good to note that this clause of MERGE is not standard.

References:

MERGE (Transact-SQL)
http://technet.microsoft.com/en-us/library/bb510625(SQL.100).aspx

Labels: ,