This week I had a need to copy some data between SQL Server database tables. A client of ours has three tables containing users – subscribers, seminar users, and weekly report users. He wanted his subscribers and seminar users to be placed into the weekly report table, along with modifying the subscriber and seminar signups to also insert those users into the weekly report users table.

Copying data between tables in SQL is pretty straightforward:

INSERT INTO reportUsers (fname, lname, email)
SELECT fname, lname, email FROM seminarUsers

The problem with this is that some of the subscriber and seminar users already exist in the weekly report table. We can exclude them with a subquery:

INSERT INTO reportUsers (fname, lname, email)
SELECT fname, lname, email FROM seminarUsers
WHERE email NOT IN (select email FROM reportUsers)

But, the seminarUsers table contains signups for every seminar. That means a given user could have signed for several seminars, and would be imported into the reportUsers table more than once. To solve that we need to eliminate the duplicate records coming from the select statement. But for a given email address, which of the several record should we choose to import? On the second or third time the user signed up for a seminar, they may have given us an updated name or address. Only the name is relevant in this example, but in actuality I was copying more fields than just the first and last name. So lets grab the most recent signup. We can do that by just using the record with the highest userid, since it’s an identity field – the more recent records will have a higher user id number.

INSERT INTO reportUsers (fname, lname, email)
SELECT fname, lname, email FROM seminarUsers
WHERE email NOT IN (select email FROM reportUsers)
AND userid IN
  (
  select DISTINCT su1.userid
  from seminarUsers as su1, seminarUsers as su2
  where su1.userid > su2.userid and su1.email = su2.email
  AND su1.userid  =
	(
	SELECT TOP 1 su3.userid FROM seminarusers su3 WHERE
	su3.email = su1.email
	order by su3.userid DESC
	)
  )

It’s pretty complicated, but it gets the job done. Thanks to my co-worker Mike Klostermeyer who worked with me on this, he is an SQL Wizard.

One Comment

  1. Michael says:

    Any idea on how to handle a table with a more complex where clause?

    I have a table that has three columns that act like a unique field.

    Kind of like this: WHERE email, ssn, dob NOT IN (SELECT email, ssn, dob FROM reportUsers)

    Ideas?