Archive for the ‘SQL’ Category

My SQL savvy coworker Jason Troy has done some work recently on finding and deleting duplicate records in a database. One of the things he had to do was compare on a text field (that is, over 8000 characters and not a varchar). Very interesting.

http://phillyun.blogspot.com/2008/10/power-of-sql-find-and-delete-duplicates.html

I’m setting up mysql locally on my laptop this weekend, and ran into a snag that cost me at least an hour. To mimic a web app I am running locally, I needed to connect to mysql with a specific username and password. But no matter what I did, I always got this error:

MySQL: access denied for user: username (Using password: YES)

I got this when connecting via the command line client and via ColdFusion. Finally after some googling I came across this blog post suggesting to change the privilege details to allow connecting from ‘localhost’ instead of ‘%’. The % indicates a wildcard and should allow connections from any host. But for some reason this was not working for me. Changing it to ‘localhost’ fixed the problem and allowed me to connect. I think it may have had something to do with the fact that I checked a box when installing that would only allow connections from the localhost.

I used this SQL to fix the host setting after connecting to the ‘mysql’ database:

update user set Host=’localhost’ where User=’myUserName’

I’ve used this trick several times when needing to sort data into groups. For example say you need to sort car classified listings by price. Some of the dealers enter a price of “0” for their listings. When sorting by lowest price, you want the listings that actually have a low price to come first, rather than than the cars costing $0 (hey, I’d like to buy your free car, thank you).
Continue reading ‘SQL CASE statement with NULLs’ »

I needed to clean out some orphaned records from a MySQL database today. I thought this would be as simple as

DELETE FROM subTable WHERE ID IN (
  SELECT subTable.ID from subTable
  LEFT OUTER JOIN parentTable ON subTable.parentID = parentTable.ID
  WHERE  parentTable.ID IS NULL
  )

But this resulted in an error:
Continue reading ‘Cleaning up orphaned records in MySQL’ »

We had a need to do a case sensitive SQL comparison this week. As part of a login process a user is prompted for an answer to their security question. Now I really don’t think forcing users to enter their employer’s name in the exact case does anything to keep out hackers, but the client wanted it done this way.

Of course we could pull the values into ColdFusion and do a case sensitive comparison with Compare(). But the code was already setup to do the comparison at the SQL level, which is usually a better idea anyway if it can be done.

There are several ways to tell MS SQL to do a case sensitive comparison. Continue reading ‘Case sensitive comparisons in SQL’ »

Did you know that in MS SQL and MySQL, a UNION query will automatically filter out duplicate records between the tables that are being unioned?

I ran into this on an email tool I was working on. The client wanted to email users that had recently signed up for either of two different services. So I did a UNION between the two tables to pull all the users into one query that I could loop through. But for some reason, my test user that was in both tables (but with the same name and email address) was only showing up once. I had expected to need to filter out duplicates myself, and I was puzzled as to why both users weren’t showing up. Here is the query:
Continue reading ‘SQL ‘UNION’ queries automatically filter out duplicates’ »

This week I had a need to copy some data between SQL Server database tables. A client of ours has three tables containing users – subscribers, seminar users, and weekly report users. He wanted his subscribers and seminar users to be placed into the weekly report table, along with modifying the subscriber and seminar signups to also insert those users into the weekly report users table.

Copying data between tables in SQL is pretty straightforward:

INSERT INTO reportUsers (fname, lname, email)
SELECT fname, lname, email FROM seminarUsers

The problem with this is that some of the subscriber and seminar users already exist in the weekly report table. We can exclude them with a subquery:

INSERT INTO reportUsers (fname, lname, email)
SELECT fname, lname, email FROM seminarUsers
WHERE email NOT IN (select email FROM reportUsers)

But, the seminarUsers table contains signups for every seminar. That means a given user could have signed for several seminars, and would be imported into the reportUsers table more than once. To solve that we need to eliminate the duplicate records coming from the select statement. But for a given email address, which of the several record should we choose to import? On the second or third time the user signed up for a seminar, they may have given us an updated name or address. Only the name is relevant in this example, but in actuality I was copying more fields than just the first and last name. So lets grab the most recent signup. We can do that by just using the record with the highest userid, since it’s an identity field – the more recent records will have a higher user id number. Continue reading ‘Copying data into other tables while eliminating duplicates’ »

Sometimes you have the need in an application to compare dates without the time component. For example, a user signs up for a free 30-day trial of a client’s site. The user’s ‘expire_date’ will be set for 30 days from today, probably using something like DateAdd(“d”,30,getdate()). At some point you need to compare the expire_date against the current date. Maybe its a report of current free trial users, or a query to see if the user is allowed to login. Your query will look something like

... WHERE expire_date >= getdate()

The problem with this is that both values contain a time component. What you are really comparing is something like

... WHERE 2007-1-20 10:44:22.717 >= 2007-1-20 10:30:28.287

So a user may be considered still in their free trial at 10:30am, but when they try to login again 20 minutes later, they may be told their trial has expired. In reality the free trial should continue throughout the last day of the trial period. If you signed up for a free two week gym membership, would you expect to be allowed in before lunch on the last day but not after because of what time you initially signed up?

This is not a problem in MySQL, as it has a simple DATE type (in addition to the common DATETIME type) that does not store any time data. So what we need to do in SQL Server is strip off the time component. I typically do this when I store the date, but you could do it at compare time, too.

SQL Server datetimes are stored internally as floating point numbers. You can see this by CASTing the value of getdate to a float.

SELECT CAST( getdate() AS float)

When I ran this just now I got 39100.964272067904. That indicates 39100 days after 1/1/1900 (SQL Server’s base date), and 96/100ths of a fractional day (it’s almost midnight). We can use this to modify the float value and then CAST it back as a datetime type. The FLOOR function will strip off any decimal portion of a number. Thus:

SELECT CAST(FLOOR(CAST( getdate() AS float)) AS datetime)

returns “2007-01-20 00:00:00.000” – the date without the time.

There are other ways to do this, using the day, month and year functions or the CONVERT function. I find the above method to be more straight forward (although only by a little – its a pain no matter how you do it). You could add that code to a trigger so whenever that field is set or updated the time portion is automatically stripped off.

Now when we compare the field, we’ll need to do the same conversion to the current day:

... WHERE expire_date >= CAST(FLOOR(CAST( getdate() AS float)) AS datetime)

Keep in mind that queries with functions in the search condition are expensive. If you are going to run this often or on a lot of data, you may want to pass in the date from your application language:

... WHERE expire_date >= '2007-01-20'