Copying / archiving data

Copying data from one table to another is not always as simple as

INSERT INTO destination_database.schema.table (

columns

)

SELECT

columns

FROM source_database.schema.table WITH (NOLOCK)

This works fine for a small to medium number of rows but what if there a hundred million rows or more or if the rows are wide? And what is you have already copied part of the source table or if the source table is still being updated?

You can consider an incremental update in a loop…

For this you need something to identify the current position so its best if you have a unique primary key but a complex one would do too. For my example I am going to use a simple integer based primary key.

Here’s the script…

BEGIN TRY

BEGIN TRANSACTION

— get the max sequential id from the target table.

— This indicates what data has already been copied

— Therefore we can remove this data from the source database

DECLARE @MaxID int

SELECT @MaxID = MAX(ID) FROM source.schema.table

— If there is nothing in the target then this will return NULL so we account for this by setting to 0

IF @MaxID IS NULL

SET @MaxID = 0

— If the source table ID is an identity column don’t make the column an identity at the target

INSERT INTO target.schema.table

(column list)

SELECT TOP 10000

column list

FROM source.schema.table

WHERE ID > @MaxID

ORDER BY ID

COMMIT TRANSACTION

END TRY

BEGIN CATCH

IF @@TRANCOUNT > 0

BEGIN

ROLLBACK TRANSACTION

END

print ‘error message’

END CATCH

Now put the above in to a stored procedure and then call it repeatedly either in a loop or from a scheduled job. Each time the script runs it will start by removing data it has already copied. DO NOT do this the other way round as only the above method is fault tolerant! i.e. If the data does not copy then the data will not be removed from the source. You do not want the situation where a mistake in the copy command results in the data not being copied and then data is removed from the source without a check.

Advertisements

About davidbridge

I am a contractor for David Bridge Technology Limited specialising in database design and Web development using Microsoft technologies such as c#, MVC .net and SQL Server (full stack)
This entry was posted in SQL Stuff, Uncategorized and tagged , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s