Share on Facebook Share on Twitter Share on Digg Share on Stumble Upon Share via e-mail Print

Useful SQL Techiques

by Craig S. Mullins

SQL is the lingua franca for modifying and reading database data and any DBA worth his (or her) paycheck should be proficient in writing SQL queries. But SQL is a flexible and feature-rich language, so there are always things that can be learned – even by senior technicians. As such, this month’s column discusses several interesting SQL queries that you can put in your bag of tricks for future use.

Let’s start out with a simple one. Here is a technique you can deploy when you need to return a count of specific characters in a column using only SQL:

 

   select length('A,B,C,D') - length(replace('A,B,C,D', ',', ''))

   from    table;


The length() function returns the length of the character string supplied as the argument. Knowing this, we can use the replace() function to change the character we wish to count (in this case a comma) with an empty string. By subtracting the original length from the length of the modified string we get the number of characters removed. This can be adapted to count any character you wish simply by changing the ‘,’ to whatever character you desire.

Let’s move on to a little more complex one. We all can relate to dealing with systems that have data integrity problems. But some data integrity problems can be cleaned up using a dash of SQL. Consider the common data entry problem of extraneous spaces inserted into a name field. Not only is it annoying, sometimes it can cause the system to ignore relationships between data elements because the names do not match. For example, “Craig   Mullins” is not equivalent to “Craig Mullins”; the first one has three spaces between the first and last name whereas the second one only has one.

You can write an update statement to clean up these type problems, if you know how to use the replace() function. As we learned in the first technique, replace()does what it sounds like it would do: it reviews a source string and replaces all occurrences of a one string with another. For example, to replace all occurrences of Z with A in the string BZNZNZ you would code:

 

   replace('BZNZNZ', 'Z', 'A')


And the result would be BANANA. So, let’s code some SQL using the replace() function to get rid of any unwanted spaces in the NAME column of our EMPLOYEE table. Keep in mind that we have no idea how many extra spaces there may be in the NAME columns. One may have two extra spaces, another fifteen extra, and another only one. So the SQL has to be flexible. Consider this:


   update EMPLOYEE

     set NAME = replace(

                 replace(

                  replace(NAME, space(1), '<>')

                  '><', space(0))

                 '<>', space(1));


 Wait-a-minute, I hear you saying. What are all of those left and right carats and why do I need them? Well, let’s go from the inside out. The inside replace() function takes the NAME column and converts every occurrence of a single space into a left/right carat. The next replace() function  (working outward), takes the string we just created, and removes every occurrence of a right/left carat combination by replacing it with a zero length string. The final replace() function takes that string and replaces any left/right carats with a single space. The reversal of the carats is the key to removing all spaces except one – remember, we want to retain a single space anywhere there was a single space as well as anywhere that had multiple spaces. Try it, it works.

Of course, if you don’t like the carats you can use any two characters you like. But the left and right carat characters work well visually. Be sure that you do not choose to use characters that occur naturally in the string that you are acting upon. (I cannot think of anyone with a carat in their name, can you?)

Finally, I used the space() function for clarity. You could have used strings encased in single quotes (as in the first example), but the space() function is easier to read. It simply returns a string of spaces the length of which is specified as the integer argument. For example, space(11) would return a string of eleven spaces.

Keep this technique in your bag of tricks for when you need to clean up dirty data.

As you can see from these two examples, SQL is indeed a very, flexible and powerful language. It is a good idea for DBAs to learn all they can about the features, power, and intricacies of SQL. Frequently, the only thing you’ll need to accomplish many data-related tasks is a well-formulated SQL statement…







From Database Trends and Applications, November 2010.

© 2012 Craig S. Mullins,  

November 2010

DBA Corner