<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Eric Fickes &#187; cte</title>
	<atom:link href="http://ericfickes.com/tag/cte/feed/" rel="self" type="application/rss+xml" />
	<link>http://ericfickes.com</link>
	<description>Design minded Internet Programmer</description>
	<lastBuildDate>Fri, 28 Oct 2011 04:14:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Selecting random ids using TOP and a CTE</title>
		<link>http://ericfickes.com/2010/09/selecting-random-ids-using-top-and-a-cte/</link>
		<comments>http://ericfickes.com/2010/09/selecting-random-ids-using-top-and-a-cte/#comments</comments>
		<pubDate>Thu, 30 Sep 2010 05:43:49 +0000</pubDate>
		<dc:creator>Eric Fickes</dc:creator>
				<category><![CDATA[database]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[tips and tricks]]></category>
		<category><![CDATA[tsql]]></category>
		<category><![CDATA[common table expression]]></category>
		<category><![CDATA[cte]]></category>
		<category><![CDATA[mssqlserver]]></category>
		<category><![CDATA[sql server]]></category>
		<category><![CDATA[SQLSERVER]]></category>
		<category><![CDATA[top]]></category>

		<guid isPermaLink="false">http://ericfickes.com/?p=1675</guid>
		<description><![CDATA[While testing visualizations in a Flex application, I needed to do some underlying data cleanup in SQL Server.  One of my tasks was to manually update an entity table and set the status column to one of three possibilities.  Status &#8230; <a href="http://ericfickes.com/2010/09/selecting-random-ids-using-top-and-a-cte/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>While testing visualizations in a Flex application, I needed to do some underlying data cleanup in SQL Server.  One of my tasks was to manually update an entity table and set the status column to one of three possibilities.  Status groups A and B each needed to be roughly 20% of my table&#8217;s total record count, and status group C would be the remaining rows that weren&#8217;t touched by status A or status B.  Oh, and there&#8217;s one more thing: the ids in each status group cannot be in sequential order, they have to be random.</p>
<p>At first I thought no sweat.  My dataset is still small ( only 2000 rows ), so if we want uber control I could do the math and generate my id lists by hand.  Yes, hand crafting is possible, and under a deadline that kind of logic almost makes sense.  However, I already know the table I&#8217;m working with will grow in the future, and I&#8217;ll probably have to do this data update again, so why not do this right?  While playing around with different select statements I had a &#8220;EUREKA!&#8221; moment.  <a title="TSQL's TOP operator on MSDN" href="http://msdn.microsoft.com/en-us/library/ms189463.aspx" target="_blank">SQL Server&#8217;s TOP operator</a> supports PERCENT, not just a fixed number of rows.  I couldn&#8217;t believe it.  I use TOP at least once a week and I always forget about TOP PERCENT.  Since I already know how to select random rows via <a title="SQL Server's Common Table Expressions are super helpful" href="http://msdn.microsoft.com/en-us/library/ms190766.aspx" target="_blank">CTE</a>, it was time to put it all together.</p>
<p>Before giving you the final SQL, here are the important parts to be familiar with.  Also, for the sake of example I&#8217;m using the <a title="Download the AdventureWorks database from Codeplex" href="http://msftdbprodsamples.codeplex.com/releases/view/37109" target="_blank">AdventureWorks database</a> so you can play along at home.</p>
<h2><a title="MSSQL's TOP operator on MSDN" href="http://msdn.microsoft.com/en-us/library/ms189463.aspx" target="_blank">TOP PERCENT</a></h2>
<p>If you just need 50% of the rows in a table and you don&#8217;t need them shuffled, you can fire this query.  It returns the first half of the ProductIDs in sequential order.</p>
<pre class="brush: sql; title: ; notranslate">
SELECT TOP 50 PERCENT ProductID
FROM Production.Product
ORDER BY ProductID
</pre>
<p>Which will look something like this.</p>
<p><a href="http://ericfickes.com/wp-content/uploads/2010/09/SELECT-TOP-20-PERCENT.png" rel="lightbox[1675]"><img class="aligncenter size-full wp-image-1682" title="SELECT TOP 20 PERCENT" src="http://ericfickes.com/wp-content/uploads/2010/09/SELECT-TOP-20-PERCENT.png" alt="SQL's TOP operator returns rows sequentially" width="274" height="255" /></a></p>
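<p>The percentage does not have to be hard coded.  As a minimal sketch, assuming SQL Server 2005 or later, TOP also accepts an expression in parentheses, so the size can come from a variable.  The @pct variable is a name made up for this example.</p>
<pre class="brush: sql; title: ; notranslate">
-- @pct is a hypothetical variable name; TOP ( expression ) needs SQL Server 2005+
DECLARE @pct float
SET @pct = 20

-- when the percentage works out to a fractional row count, it is rounded up to the next whole row
SELECT TOP ( @pct ) PERCENT ProductID
FROM Production.Product
ORDER BY ProductID
</pre>
<p>Because of that rounding, a percent sized result can be one row larger than the plain arithmetic suggests.</p>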
<h2><a title="A Common Table Expression can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query..." href="http://msdn.microsoft.com/en-us/library/ms190766.aspx" target="_blank">COMMON TABLE EXPRESSION</a></h2>
<p>Now let&#8217;s say you want to pull all rows from a table in random order.  This can be achieved using this CTE, with ORDER BY NEWID() doing the shuffling.</p>
<pre class="brush: sql; title: ; notranslate">
WITH data( ProductID ) AS (
	SELECT	ProductID
	FROM	Production.Product
)
SELECT	ProductID
FROM	data
ORDER BY NEWID()
</pre>
<p>Which will look like this.</p>
<p><a href="http://ericfickes.com/wp-content/uploads/2010/09/SELECT-RANDOM-CTE.png" rel="lightbox[1675]"><img class="aligncenter size-full wp-image-1689" title="SELECT RANDOM DATA using CTE" src="http://ericfickes.com/wp-content/uploads/2010/09/SELECT-RANDOM-CTE.png" alt="Common Table Expressions in SQLSERVER are super helpful" width="271" height="381" /></a></p>
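<p>It helps to know which part produces the random order.  NEWID() generates a fresh GUID for every row, and sorting on those GUIDs is what shuffles the result; the CTE only names the set of rows being shuffled.  As a minimal sketch, the same ordering trick applied directly to the base table looks like this.</p>
<pre class="brush: sql; title: ; notranslate">
-- same shuffle without the CTE wrapper
-- every row gets its own GUID, and the sort on those GUIDs randomizes the order
SELECT	ProductID
FROM	Production.Product
ORDER BY NEWID()
</pre>
<p>The CTE becomes more useful once the row set needs to be filtered or joined before it is shuffled, because that logic stays in one named block.</p>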
<p>If you&#8217;re looking to <a title="Select random value from preset list - tSQL, CTE" href="http://ericfickes.posterous.com/tsql-select-a-random-value-using-cte" target="_blank">randomly select values from a pre-determined list, see my CTE sample here</a>.</p>
<p>So now that you&#8217;ve seen TOP PERCENT and CTE in action, it&#8217;s time to put these together and solve my initial task of creating randomly selected groups of ids, each sized as a percentage of the table.</p>
<h2>RANDOMLY SELECT TOP PERCENT</h2>
<p>Putting it all together, here is the query I used to create my first status group.</p>
<pre class="brush: sql; title: ; notranslate">
WITH data( ProductID ) AS (
	SELECT	ProductID
	FROM	Production.Product
)
SELECT TOP 20 PERCENT ProductID
FROM	data
ORDER BY NEWID()
</pre>
<p>Which gives me a dataset that is 20% of all rows in Production.Product, and the ids are in random order.</p>
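<p>For the status task described at the top of this post, the selected ids still have to be written back to the table.  Here is a rough sketch of one way to do that, not the exact script that was run.  It assumes a hypothetical dbo.Entity table with an integer EntityID key and a one character Status column; all of those names are made up for illustration.  The picked ids are stored in a table variable first so the random selection is evaluated once before the UPDATE uses it.  Group B takes 25 percent of the rows still in group C, because after group A is removed the remaining 80% of the table contains 20 / 80 = 25% worth of rows for B.</p>
<pre class="brush: sql; title: ; notranslate">
-- hypothetical schema: dbo.Entity( EntityID int, Status char(1) )
-- holds the ids picked for the current group
DECLARE @picked TABLE ( EntityID int PRIMARY KEY )

-- everything starts out in group C
UPDATE dbo.Entity SET Status = 'C'

-- group A: a random 20 percent of all rows
INSERT INTO @picked ( EntityID )
SELECT TOP 20 PERCENT EntityID
FROM dbo.Entity
ORDER BY NEWID()

UPDATE dbo.Entity
SET Status = 'A'
WHERE EntityID IN ( SELECT EntityID FROM @picked )

-- group B: 25 percent of the rows still in C, which is roughly 20 percent of the table
DELETE FROM @picked

INSERT INTO @picked ( EntityID )
SELECT TOP 25 PERCENT EntityID
FROM dbo.Entity
WHERE Status = 'C'
ORDER BY NEWID()

UPDATE dbo.Entity
SET Status = 'B'
WHERE EntityID IN ( SELECT EntityID FROM @picked )
</pre>
<p>Since TOP PERCENT rounds up, the two groups can differ in size by a row, which is fine when the target is only roughly 20%.</p>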
<p>And there you have it.  Randomly selecting a percent sized data set from a table in SQL Server.  The SQL here is really pretty simple, but for some reason I always forget TOP PERCENT.  I&#8217;m hoping this post will help me remember TOP PERCENT, and maybe even help somebody else with some TSQL.</p>
]]></content:encoded>
			<wfw:commentRss>http://ericfickes.com/2010/09/selecting-random-ids-using-top-and-a-cte/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Select random value from a range of values</title>
		<link>http://ericfickes.com/2010/01/select-random-value-from-a-range-of-values/</link>
		<comments>http://ericfickes.com/2010/01/select-random-value-from-a-range-of-values/#comments</comments>
		<pubDate>Sun, 03 Jan 2010 01:08:52 +0000</pubDate>
		<dc:creator>Eric Fickes</dc:creator>
				<category><![CDATA[database]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[tips and tricks]]></category>
		<category><![CDATA[tsql]]></category>
		<category><![CDATA[#table]]></category>
		<category><![CDATA[cte]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[mssql]]></category>
		<category><![CDATA[mssql2000]]></category>
		<category><![CDATA[mssql2005]]></category>
		<category><![CDATA[random]]></category>
		<category><![CDATA[table variable]]></category>

		<guid isPermaLink="false">http://ericfickes.com/?p=1020</guid>
		<description><![CDATA[Earlier I blogged about creating random numbers using tsql functions.  Here are two techniques for selecting a random value from a pre-defined range of values in a tsql script.  The first technique uses a table variable ( MSSQL 2000 + &#8230; <a href="http://ericfickes.com/2010/01/select-random-value-from-a-range-of-values/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Earlier I blogged about <a title="TSQL UDFs for generating random numbers" href="http://ericfickes.com/2009/09/generate-random-integers-using-tsql-udfs/" target="_blank">creating random numbers using tsql functions</a>.  Here are two techniques for selecting a random value from a pre-defined range of values in a tsql script.  The first technique uses a <a title="MSSQL 2000 lets you create table variables" href="http://msdn.microsoft.com/en-us/library/aa260638%28SQL.80%29.aspx" target="_blank">table variable</a> ( MSSQL 2000 + ), and the second uses a <a title="CTEs in MSSQL 2005 let you build queries a little differently" href="http://technet.microsoft.com/en-us/library/ms175972%28SQL.90%29.aspx" target="_blank">Common Table Expression</a> or CTE ( MSSQL 2005+ ).</p>
<h3>Select a random value using a table variable</h3>
<pre class="brush: sql; title: ; notranslate">

-- variable to hold the randomly chosen integer
DECLARE @field_val int

-- table variable to hold the value range [ 0, 512, 1024, 2048, 4096 ]
-- ( a #temp table created with SELECT INTO is a temporary table, not a table variable )
DECLARE @temp TABLE ( num int )

-- insert data into the table variable
INSERT INTO @temp VALUES ( 0 )
INSERT INTO @temp VALUES ( 512 )
INSERT INTO @temp VALUES ( 1024 )
INSERT INTO @temp VALUES ( 2048 )
INSERT INTO @temp VALUES ( 4096 )

-- assign random value
SELECT TOP 1 @field_val = num FROM @temp ORDER BY NEWID()

-- show value
SELECT @field_val

-- no DROP needed: the table variable goes out of scope when the batch ends
</pre>
<h3>Select a random value using a CTE</h3>
<pre class="brush: sql; title: ; notranslate">
-- define our data table
WITH data( car )
AS
(
	-- UNION together our range of values
	SELECT 'audi' AS 'car'
	UNION
	SELECT 'bmw' AS 'car'
	UNION
	SELECT 'infinity' AS 'car'
	UNION
	SELECT 'lexus' AS 'car'
	UNION
	SELECT 'porsche' AS 'car'
)
-- select a random value
SELECT TOP 1 car FROM data
ORDER BY NEWID()
</pre>
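<p>On SQL Server 2008 or later, the list of values can also be written with a table value constructor instead of a chain of UNIONs.  This is a minimal sketch assuming that version, and it also stores the result in a variable the way the table variable example does.  The @car variable and the v alias are names made up for this example.</p>
<pre class="brush: sql; title: ; notranslate">
-- variable to hold the randomly chosen value ( hypothetical name )
-- the semicolon matters: the statement before WITH has to be terminated
DECLARE @car varchar(20);

-- VALUES builds the whole list of candidates in one derived table ( SQL Server 2008+ )
WITH data( car ) AS (
	SELECT	car
	FROM	( VALUES ( 'audi' ), ( 'bmw' ), ( 'infinity' ), ( 'lexus' ), ( 'porsche' ) ) AS v( car )
)
-- select a random value into the variable
SELECT TOP 1 @car = car
FROM data
ORDER BY NEWID()

-- show value
SELECT @car
</pre>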
<p>Both of these techniques can be used with numbers or text.  Just be sure to mind your quotes and variable datatypes.  Being able to pick a random value in data generation scripts has proven very useful.  I hope this helps somebody else out as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://ericfickes.com/2010/01/select-random-value-from-a-range-of-values/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

