Copying data from one table to another is not always as simple as

INSERT INTO destination_database.schema.table
SELECT * FROM source_database.schema.table WITH (NOLOCK)
This works fine for a small to medium number of rows, but what if there are a hundred million rows or more, or the rows are wide? And what if you have already copied part of the source table, or the source table is still being updated?
You can consider an incremental update in a loop…
For this you need something to identify the current position, so it's best if you have a unique primary key; a composite one would do too. For my example I am going to use a simple integer-based primary key.
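As an aside, if you only have a composite key, the position can still be tracked; it just becomes a tuple of columns rather than a single value. A minimal sketch, assuming a hypothetical two-column key (OrderID, LineID) with both columns starting above 0:

```sql
-- Hypothetical composite key (OrderID, LineID): track both columns.
DECLARE @MaxOrderID int, @MaxLineID int

SELECT TOP (1) @MaxOrderID = OrderID, @MaxLineID = LineID
FROM target.schema.table
ORDER BY OrderID DESC, LineID DESC

-- Empty target returns NULL; default to 0 (assumes real IDs start above 0)
SET @MaxOrderID = ISNULL(@MaxOrderID, 0)
SET @MaxLineID  = ISNULL(@MaxLineID, 0)

-- Fetch the next batch strictly after that position
SELECT TOP (10000) *
FROM source.schema.table
WHERE OrderID > @MaxOrderID
   OR (OrderID = @MaxOrderID AND LineID > @MaxLineID)
ORDER BY OrderID, LineID
```

The WHERE clause is the standard keyset-pagination predicate: it selects rows whose key tuple sorts strictly after the last copied position.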
Here’s the script…
-- Get the max sequential ID from the target table.
-- This indicates what data has already been copied,
-- so we can safely remove that data from the source table.
DECLARE @MaxID int

SELECT @MaxID = MAX(ID) FROM target.schema.table

-- If there is nothing in the target this returns NULL, so default to 0
IF @MaxID IS NULL
    SET @MaxID = 0

-- Remove the rows that have already been copied from the source
DELETE FROM source.schema.table WHERE ID <= @MaxID

-- If the source table ID is an identity column,
-- don't make the column an identity at the target
INSERT INTO target.schema.table
SELECT TOP (10000) *
FROM source.schema.table
WHERE ID > @MaxID
ORDER BY ID

IF @@ERROR <> 0
    PRINT 'error message'
Now put the above into a stored procedure and call it repeatedly, either in a loop or from a scheduled job. Each time the script runs it starts by removing the data it has already copied. Do NOT do this the other way round, as only the above order is fault tolerant: if the copy fails, the data is not removed from the source. You do not want a situation where a mistake in the copy command means the data is never copied, yet is still deleted from the source without a check.
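The wrapper described above might look like the following sketch. The procedure name, database names, and batch size are illustrative assumptions, not from any real system:

```sql
-- Hypothetical wrapper around the script above.
CREATE PROCEDURE dbo.CopyNextBatch
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @MaxID int;

    -- Find the high-water mark already present in the target
    SELECT @MaxID = ISNULL(MAX(ID), 0) FROM target.schema.table;

    -- Remove rows that are already safely in the target
    DELETE FROM source.schema.table WHERE ID <= @MaxID;

    -- Copy the next batch in key order
    INSERT INTO target.schema.table
    SELECT TOP (10000) *
    FROM source.schema.table
    WHERE ID > @MaxID
    ORDER BY ID;

    -- Report how many rows were copied so the caller knows when to stop
    RETURN @@ROWCOUNT;
END
```

A caller can then loop until a batch copies nothing:

```sql
DECLARE @Copied int = 1;
WHILE @Copied > 0
    EXEC @Copied = dbo.CopyNextBatch;
```

A scheduled job works the same way, except each run executes one batch (or one short loop) and the high-water mark in the target carries the position between runs.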