Archive for the ‘database’ Category

Select random value from a range of values

Earlier I blogged about creating random numbers using tsql functions.  Here are two techniques for selecting a random value from a pre-defined range of values in a tsql script.  The first technique uses a table variable ( MSSQL 2000 + ), and the second uses a Common Table Expression or CTE ( MSSQL 2005+ ).

Select a random value using a table variable


-- var to hold random integer
declare @field_val int

-- create table var to hold value range [ 0, 512, 1024, 2048, 4096 ]
-- inserting the first value sets the structure for the table variable
SELECT 0 AS 'num'
INTO #temp

-- insert data into table var
INSERT INTO #temp VALUES ( 512 )
INSERT INTO #temp VALUES ( 1024 )
INSERT INTO #temp VALUES ( 2048 )
INSERT INTO #temp VALUES ( 4096 )

-- assign random value
SELECT TOP 1 @field_val = num FROM #temp ORDER BY NEWID()

-- show value
SELECT @field_val

-- drop the table variable
DROP TABLE #temp

Select a random value using a CTE

-- define our data table
WITH data( car )
AS
(
	-- UNION together our range of values
	SELECT 'audi' AS 'car'
	UNION
	SELECT 'bmw' AS 'car'
	UNION
	SELECT 'infinity' AS 'car'
	UNION
	SELECT 'lexus' AS 'car'
	UNION
	SELECT 'porsche' AS 'car'
)
-- select a random value
SELECT TOP 1 car FROM data
ORDER BY NEWID()

Both of these techniques can be used with numbers or text. Just be sure to mind your quotes, and variable datatypes.  Being able to pick a random value in data generation scripts has proven very useful.  I hope this helps somebody else out as well.

I wonder if using Coldfusion9′s ORM would make this easier?

Rolling my own SQL solution

Recently I was working on a medical website and was tasked with building a doctor directory page that had to pull data from three related tables.  For the viewers at home, here is a simplified view of my tables.

Database tables : Doctors, Practices, and Locations

Doctors, Practices, and Locations oh my

The first version of the page was easy to whip up because we were only showing primary location address which lives in the practices table.  ( NOTE : The diagram above was dumbed down to support this post ).  When the request came in to show all locations for each doctor’s practice, that’s when I had to put on my wizard hat and get tricky.

After some thinking I came up with two options for adding all locations for each doctor row :

  1. Build a second locations query for each doctor in the first query, then use code to merge the data together
  2. Update the first query to include all necessary data, then use code to merge the data into a loopable structure

Being a SQL junky, I decided to take the second route so I’m still only hitting the database once.  In the first version of this page, we always had one row per doctor and didn’t have to do any post processing of the data prior to sending to the page.  However, when the locations table gets added to our sql query, all of the doctor data gets duplicated for each location they are associated with, making the dataset look like this.

doctors, practice, and locations all in a single query

When joining the third table, the first two get duplicated

Now that we’ve got all the data, you’ll notice that the doctor information gets duplicated.  This is where I built a cffunction to convert this query into a structure of individual doctors, each having an array of locations.  I used the doctor’s full name and id as the structure keys.  Great, now I can just loop through my new structure of doctors, then loop through each doctor.location array to draw the updated html table.  That is the goal here, but we’re not ready just yet.

If you’ve ever tried to loop over a Structure in Coldfusion, you’ve probably noticed controlling the order of your data coming back can be tricky.  Here’s an example of what I’m talking about when looping through a structure.

Looping through CF Structures, doesn't always loop in the order you'd expect

Looping through a coldfusion structure can be a challenge

To get around this random order of looping through a structure, I did two steps prior to building the updated html table.

	// UPDATE : extract key list ( DR full names ) so
	// you can sort the physicians my last name
	dr_list = StructKeyList( providers_struct );
	dr_list = ListSort( dr_list, "textnocase" );

This gives me an alphabetized list of all doctors inside my structure, which I convert to an array then loop through.  I know have a sorted loop that can pick all necessary doctor data out of my structure on command.  This solution took a little bit of time to figure, but wasn’t too bad.  After thinking about all the work needed to make this update, I wondered if using Coldfusion9′s Hibernate ORM would have made this update any easier.

Porting SQL to ORM

The first step in setting up  Coldfusion9′s ORM for this code is to enable the ORM in the application.

At the top of my Application.cfc I added the following lines.

	<cfscript>
		// enable ORM
		this.ormenabled = true;

		// my datasource name, setup in the CFAdmin
		this.datasource = "mysql-test";

		// the folder in my app where the cfc mapping files are location
		this.ormsettings.cfclocation = "mappings";

		// this wasn't setup automatically for me
		this.ormsettings.dialect = "MySQL";
	</cfscript>

If you haven’t setup ORM in your Coldfusion9 application before, be sure to bookmark this livedoc page on configuring ORM.  It will give you all the lowdown on all the fancy ORM bells and whistles.

After ORM is enabled, it was time to create my Hibernate mapping files for each of the database tables involved in this code.  For convenience sake, I went the CFC route, and only mapped the columns required for this example.  There are a number of ways to define ORM mappings, refer to this livedocs page to find out more. Here are my three hibernate mapping cfcs.

doctor.cfc

<cfcomponent persistent="true" entityname="Doctor" table="providers">
    <cfproperty name="id" column="provider_id" generator="increment">

    <!--- one to many relationship Doctor.practice = Practice --->
    <cfproperty name="practice" fieldtype="one-to-one" cfc="PRACTICE" fkcolumn="practice_id">

    <cfproperty name="PracticeId" column="practice_id" insert="false" update="false">

    <cfproperty name="FirstName" column="first_name">
    <cfproperty name="LastName" column="last_name">
    <cfproperty name="status" column="status">
    <cfproperty name="is_midlevel" column="is_midlevel">
</cfcomponent>

practice.cfc

<cfcomponent persistent="true" entityname="Practice" table="practices">
	<!--- tie this column back to Doctor.practice_id --->
	<cfproperty name="id" column="practice_id" fieldtype="id" generator="foreign" params="{property='Doctor'}">

	<!--- Relate locations to this practice via Locations.practice_id and Locations.active = 1 --->
        <cfproperty name="locations" type="array" fieldtype="one-to-many" cfc="Location" fkcolumn="practice_id" params="{property='Location'}" where="active = 1">

    <cfproperty name="name" column="practice_name">
    <cfproperty name="active" column="active">
</cfcomponent>

location.cfc

<cfcomponent persistent="true" entityname="Location" table="locations">
	<cfproperty name="id" column="location_id" fieldtype="id">

	<cfproperty name="practice_id" column="practice_id">

    <cfproperty name="address1" column="address1">
    <cfproperty name="address2" column="address2">
    <cfproperty name="city" column="city">
    <cfproperty name="state" column="state">
    <cfproperty name="zipcode" column="zipcode">
    <cfproperty name="phone" column="phone">
    <cfproperty name="active" column="active">
</cfcomponent>

At this point we have three ORM entities that are mapped to the three database tables. If you look closely, you’ll notice that relationships have also been defined for doctors to practice, and practice to locations.  Setting up the relationships at the mapping level takes care of physically joining the three tables together via sql.  Also, while it’s pretty straight forward reading a blog with working code, I’m not going to pretend I got these relationships setup first try.  It’s simple relating two tables together, but finding the right combination for doctror > practice > locations was tricky.  I did a lot of this to figure this part out.

<cfscript>
// load doctors
dr_list = ORMExecuteQuery( "FROM Doctor WHERE status = 'Active' AND provider_id = 1" );
// load a practice
// practice = ORMExecuteQuery( "FROM Practice WHERE practice_id = 1" );
// load a location
// location = ORMExecuteQuery( "FROM Location" );
</cfscript>

<cfdump var="#dr_list#" />
<cfabort />

You can find out more about defining ORM relationships on this livedocs page.

With the hard stuff out of the way, we’re down to two steps.  First it’s time to port my cfquery that grabs the data, and the cffunction that converts the query into our loopable doctor structure.  Are you ready for this?

dr_list = ORMExecuteQuery( "from Doctor WHERE status = 'Active' AND is_midlevel = 0 ORDER BY last_name, first_name" );

To give you the full idea of how much code was shrunk down to a single line, here are the two cffunctions used to query and massage the data into a useable structure.

	<!--- Retrieve listing of all active providers in the app  --->
    <cffunction name="getAllActiveProviders" access="public" returntype="struct">

		<cfquery name="providers_qry" DATASOURCE="#request.dsn#" USERNAME="#request.dbuser#" PASSWORD="#request.dbpswd#">
            SELECT	DISTINCT

					p.provider_id,
            		CONCAT( p.first_name, ' ', p.last_name ) AS 'full_name',
                    p.first_name,
                    p.last_name,

    	            p2.practice_name,

					l.address1,
                    l.address2,
                    l.city,
                    l.state,
                    l.zipcode,
                    l.phone

            FROM	providers p, practices p2

            LEFT OUTER JOIN locations l
            ON p2.practice_id = l.practice_id AND l.active = 1

            WHERE	p.status = 'Active'
            AND		p.practice_id = p2.practice_id
            AND		is_midlevel = 0

            ORDER BY	p.last_name, p.first_name
        </cfquery>

		<!--- return a translated structure --->
		<cfreturn providersQryToStruct( providers_qry ) />

    </cffunction>

	<!--- Translates the providers query into a Struct that has a row per doctor, each containing an Array for location(s) --->
	<cffunction name="providersQryToStruct" access="public" returntype="struct">

    	<cfargument name="providers_qry" type="query" required="yes">

		<cfscript>
		this_dr = "";
		last_dr = "";
		// array of provider arrays
		providers_struct = StructNew();

		// loop through qry and flatten into single row with Location = Array
		// The key to this qry is provider_name+provider_id
		for( xx = 1; xx <= providers_qry.RecordCount; xx++ )
		{
			// set key to this row [ LAST_NAME+FIRST_NAME+PROVIDER_ID ]
			this_dr = providers_qry.LAST_NAME[ xx ] & providers_qry.FIRST_NAME[ xx ] & providers_qry.PROVIDER_ID[ xx ];

			// clean up DR name
			this_dr = Replace( this_dr, " ", "" );
			this_dr = Replace( this_dr, ",", "" );
			this_dr = Replace( this_dr, ".", "" );

			// got same doctor?
			if( this_dr != last_dr )
			{
				// append to array
				provider = StructNew();
				// FILL UP PROVIDER
				provider['full_name'] = providers_qry["full_name"][ xx ];
				provider['practice_name'] = providers_qry['practice_name'][ xx ];

				// location array
				provider['location'] = ArrayNew(1);

				// add first location
				loc = StructNew();
				loc['address1']	 = providers_qry['address1'][ xx ];
				loc['address2']	 = providers_qry['address2'][ xx ];
				loc['city']		 = providers_qry['city'][ xx ];
				loc['state']	 = providers_qry['state'][ xx ];
				loc['zipcode']	 = providers_qry['zipcode'][ xx ];
				loc['phone']	 = providers_qry['phone'][ xx ];

				// add location struct to array
				ArrayAppend( provider['location'], loc );

				// Add to master struct
				providers_struct[ this_dr ] = provider;
			}
			else
			{
				// add first location
				loc = StructNew();
				loc['address1']	 = providers_qry['address1'][ xx ];
				loc['address2']	 = providers_qry['address2'][ xx ];
				loc['city']		 = providers_qry['city'][ xx ];
				loc['state']	 = providers_qry['state'][ xx ];
				loc['zipcode']	 = providers_qry['zipcode'][ xx ];
				loc['phone']	 = providers_qry['phone'][ xx ];

				// Append to this DR's location array
				ArrayAppend( providers_struct[ this_dr ]['location'], loc );
			}

			// store this_dr in last_dr for next iteration
			last_dr = this_dr;
		}

		return providers_struct;
        </cfscript>

    </cffunction>

After going through the brain fry of figuring out ORM relationships, replacing over one hundred lines of code with only a single line really made my day. Now it was time for the final step, porting the cfml that draws the html table into using the ORM collection instead.
The porting of the cfml that draws the doctor table was a piece of cake. The updates were so easy it was almost not worth showing, but I’m including just to complete the picture.

The original sql loop code

    <!--- use dr_list as sorted list. use each DR key to pick out data from providers_struct --->
    <cfloop list="#dr_list#" index="xx">
    <tr>

        <td>#providers_struct[ xx ]['full_name']#</td>

        <td>#providers_struct[ xx ]['practice_name']#</td>

        <td>
        <!--- loop through array of location structures --->
        <cfloop from="1" to="#ArrayLen( providers_struct[ xx ]['location'] )#" index="yy">
            <address>
            <cfset loc = #providers_struct[ xx ]['location'][ yy ]#>
            <li>#loc.address1#, #loc.address2#
            #loc.city#, #loc.state# #loc.zipcode#
            <br />
            <b>#loc.phone#</b>
            </li>
            </address>
        </cfloop>
        </td>

    </tr>
    </cfloop>

The new ORM loop code

	<!--- loop through parent query --->
    <cfloop array="#dr_list#" index="xx">
    <tr>

    	<!--- retrieve properties using ORM's getProperty() syntax --->
        <td>#xx.getFirstName()# #xx.getLastName()#</td>

        <td>#xx.getPractice().getName()#</td>

        <td>
		<!--- loop through locations collection Doctor > Practice > Location(s)  --->
        <cfloop array="#xx.getPractice().getLocations()#" index="loc">

            <address>
                <li>#loc.getAddress1()#, #loc.getAddress2()#
                #loc.getCity()#, #loc.getState()# #loc.getZipcode()#
                <br />
                <b>#loc.getPhone()#</b>
                </li>
            </address>

        </cfloop>
        </td>

    </tr>
    </cfloop>

A few things I’d like to point out in the new ORM loop code

  1. ORMExecuteQuery returned a data collection that I can loop through without post-processing
  2. When retrieving object properties, use the getPropertyName() syntax.  EX : doctor.getFirstName() instead of doctor.FirstName
  3. Practice.locations was declared as an Array collection, but you could use Structure as well.
  4. To get the ORMExecuteQuery record count, use ArrayLen.

Did ORM make this easier?

I’d say yes it did.  Mapping my tables via code, including the relationships, really saved time writing SQL.  Funny thing, while I was porting this over, I actually found a few data and code bugs in the home grown solution that I didn’t catch before.  Once the dust had settled I still had one difference in the home grown versus ORM code, and that turned out to be duplicated location rows.

The other thing that I noticed thanks to our good friend cfdump, is all the extra functionality that comes along with the objects returned from ORM.  On top of having a populated data object, you also have magical getters, setters, and other useful methods to work with your data objects.  To see what I mean, fire an ORMExecuteQuery and run your variable through CFDUMP.

To sum up, I’m really starting to buy into using ORMs.  Since I know and love SQL, I’ve been anti ORM for too long.  Having said that, I’m not fully advocating going ORM for everything.  Pretty much every project I work on these days is powered by some datasource, but I don’t really need all the ORM baggage for simple sites.  Now for large applications, or any site large enough to need at least two people, I’d give ORM a chance.  I still have lots of ins and outs to learn about using Hibernate, but it is definitely simple enough to get jamming right away.

Generate random integers using tsql UDFs

Ever need to generate random numbers from the integer family?  I had this need on a project so I whipped up these four tsql User Defined Functions to help with this task.  There are four functions in all, one for tinyint, smallint, int, and bigint.  Additionally, you will need to create one VIEW since you can not fire the tsql function RAND() inside of a udf.

With these functions, you can generate random integers in their native range.

 SELECT dbo.getRandomInt( NULL, NULL ) 

Or you can restrict your random integers to a range of your liking.

 SELECT dbo.getRandomInt( 1000, 1000000000) 

Just as a reminder, here are the native ranges for these four integer types as supported by MS SQL Server 2005

bigint -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)
int -2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647)
smallint -2^15 (-32,768) to 2^15-1 (32,767)
tinyint 0 to 255

Each of these functions have the same structure and primarily differ only by the integer type’s native range.  Here is the guts of one of the UDFs in case you want just the facts.

/******************************************************************************
Generate a random int
-------------------------------------------------------------------------------
USAGE :
 -- Get random int in the default range -2,147,483,648 to 2,147,483,647
 SELECT dbo.getRandomInt( NULL, NULL )

 -- Get random tinyint within a specific range
 SELECT dbo.getRandomInt( 1000, 30000 )

REQUIREMENT : Since you can't call RAND() inside of a UDF,
this function is dependant on the following VIEW vRand :

-- BEGIN VIEW
 -- This is only a helper VIEW since currently you can not use RAND() in a UDF
 -- DROP VIEW vRand
 CREATE VIEW [dbo].[vRand]
 AS
 SELECT RAND() AS 'number'
-- END VIEW

******************************************************************************/

USE SmartEarth
GO

IF OBJECT_ID (N'getRandomInt') IS NOT NULL
 DROP FUNCTION getRandomInt
GO

CREATE FUNCTION getRandomInt( @min_in int, @max_in int )
RETURNS int
WITH EXECUTE AS CALLER
AS
BEGIN
-------------------------------------------------------------------------------
 DECLARE @max int,
 @min int,
 @rand NUMERIC( 18,10 ),
 @max_big NUMERIC( 38, 0 ),
 @rand_num NUMERIC( 38, 0 ),
 @out int;

 -- define this datatype's natural range
 SET @min = -2147483648    -- -2,147,483,648
 SET @max = 2147483647    -- 2,147,483,647

 -- Check to see if a range has been passed in.
 -- Otherwise, set to default tinyint range
 IF( @min_in is not null AND @min_in > @min )
 SET @min = @min_in

 IF( @max_in is not null AND @max_in < @max )
 SET @max = @max_in
 -- end range check

 -- get RAND() from VIEW since we can't use it in UDF
 SELECT @rand = number FROM vRand

 -- CAST @max so the number generation doesn't overflow
 SET @max_big = CAST( @max AS NUMERIC(38,0) )

 -- make the number
 SELECT @rand_num = ( (@max_big + 1) - @min ) * @rand + @min;

 -- validate rand
 IF( @rand_num > @max )
 -- too big
 SET @out = @max
 ELSE IF ( @rand_num < @min )
 -- too small
 SET @out = @min
 ELSE
 -- just right, CAST it
 SET @out = CAST( @rand_num AS int )

 -- debug
 -- SELECT @min_in AS 'min_in', @max_in AS 'max_in', @min AS 'min', @max AS 'max', @rand, @rand_num AS 'rand_num', @out AS 'out'

 -- return appropriate
 RETURN @out;

-------------------------------------------------------------------------------

END;
GO

So where do you get the code?

You can view all functions and view online at the following gist.github urls:

Or you can just download all the source code in one zip file here.

Hopefully this will help somebody out.  If you’re a DBA or just a tsql wizard, let me know what you think.  Can I do these functions a better way?  Is this already built into SQL2005 and I just didn’t know it?  All of this tsql was written against SQL Server 2005, but I’m pretty sure it would work on SQL2000 and SQL2008 as well.

Store your SQL database in the same location as live

I do a lot of moving of databases between development and production servers.  If I’m lucky, the production server I’m working with gives me access to DTS services, or even the Database publishing wizard.  More often than not however, the SQLServer I’m pushing to is locked down in a way that I am required to Remote Desktop into the server, then update the database via restore.

So here’s the tip.  When you’re only means of updating a remote SQLServer database is by physically restoring the database, do your best to mirror the product server’s database location on your development machine.  Check out this screenshot, and you’ll see what I mean

MSSQL : Restore Files and Filegroups

MSSQL : Restore Files and Filegroups

The background of this image shows that my production server houses all of it’s databases at the path C:\DB\database.mdf.

The lower right box shows where I keep my databases on my development server. Since I do not store any vital data on my C: drive, I’ve changed the path to D:\DB.

While this isn’t an exact path match, this trick saves me a little bit of time and frustration when restoring a database remotely.  Especially when I have to do it more than a handful of times in the same day.

It’s assumed you know how to the following  :

  • Back up a SQLServer database
  • Copy the BAK file to a remote server
  • Connect remotely to your SQLServer
  • Restore a Database file from a file ( BAK file )