I have a requirement to find the maximum number of consecutive zeros in a row in a 32 bit binary number. For example 1111 1111 1111 1000 1111 1111 0000 1111 = 4 1111 1110 1111 1110 1111 1111 1110 1111 = 1 I am actually storing the number in a BIGINT for simplicity. I can do this in a udf with a look fairly easily but was wondering if there is something more sql that will do the job. Cheers, Michael

0 |

3/18/2010 11:28:35 PM

That's an interesting question. Does it include leading zeroes? 0000 0000 0111 1000 1111 1111 0000 1111 = 9? -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ---------------- "Michael C" <mike@nospam.com> wrote in message news:%23jNl8KvxKHA.5132@TK2MSFTNGP05.phx.gbl... >I have a requirement to find the maximum number of consecutive zeros in a >row in a 32 bit binary number. For example > 1111 1111 1111 1000 1111 1111 0000 1111 = 4 > 1111 1110 1111 1110 1111 1111 1110 1111 = 1 > > I am actually storing the number in a BIGINT for simplicity. I can do this > in a udf with a look fairly easily but was wondering if there is something > more sql that will do the job. > > Cheers, > Michael >

0 |

3/19/2010 3:37:39 AM

One set based solution (not necessarily faster than a loop) Declare @TestTable Table (TestValue bigint); Insert @TestTable (TestValue) Select 0x07060504 Union All Select 0xFFF8FF0F Union All Select 0xFEFEFFEF; With Nbrs_3( n ) As ( Select 1 Union All Select 0 ), Nbrs_2( n ) As ( Select 1 From Nbrs_3 n1 Cross Join Nbrs_3 n2 ), Nbrs_1( n ) As ( Select 1 From Nbrs_2 n1 Cross Join Nbrs_2 n2 ), Nbrs_0 ( n ) As ( Select 1 From Nbrs_1 n1 Cross Join Nbrs_1 n2 ), Nbrs As (Select n-1 As n From ( Select Row_Number() Over (Order By n) From Nbrs_0 ) D ( n ) Where n <= 32), BitValues As (Select t.TestValue, n, Case When t.TestValue&Power(Cast (2 As bigint), n) > 0 Then 1 Else 0 End As BitValue From Nbrs n Cross Join @TestTable t Union All Select -1, -1, 1), BitSequences As (Select b1.TestValue, b1.n, b1.BitValue, b1.n - (Select Max(b2.n) From BitValues b2 Where b1.n >= b2.n And b2.BitValue = 1 And b1.TestValue = b2.TestValue) As Sequence From BitValues b1 Where b1.BitValue = 0) Select TestValue, Cast (TestValue As varbinary(4)) As TestValueBinary, Max(Sequence) As MaxNbrOfSeqZeroes From BitSequences Group By TestValue; Tom "Michael C" <mike@nospam.com> wrote in message news:%23jNl8KvxKHA.5132@TK2MSFTNGP05.phx.gbl... >I have a requirement to find the maximum number of consecutive zeros in a >row in a 32 bit binary number. For example > 1111 1111 1111 1000 1111 1111 0000 1111 = 4 > 1111 1110 1111 1110 1111 1111 1110 1111 = 1 > > I am actually storing the number in a BIGINT for simplicity. I can do this > in a udf with a look fairly easily but was wondering if there is something > more sql that will do the job. > > Cheers, > Michael >

0 |

3/19/2010 4:44:37 AM

On 2010-03-19 0:28, Michael C wrote: > I have a requirement to find the maximum number of consecutive zeros in a > row in a 32 bit binary number. For example > 1111 1111 1111 1000 1111 1111 0000 1111 = 4 > 1111 1110 1111 1110 1111 1111 1110 1111 = 1 > > I am actually storing the number in a BIGINT for simplicity. I can do this > in a udf with a look fairly easily but was wondering if there is something > more sql that will do the job. > If you're going to do this a lot, writing a CLR function for this is probably a good idea, because no SQL-based solution is going to have very good performance or maintainability. There is no simple arithmetic manipulation for this operation. Set-based solutions are certainly possible, as Tom Cooper has demonstrated, but they're pretty strained. -- J.

0 |

3/19/2010 5:54:20 AM

Here is another set based solution: CREATE TABLE Foo ( bin BIGINT NOT NULL PRIMARY KEY); INSERT INTO Foo VALUES(4294508303); --11111111111110001111111100001111 INSERT INTO Foo VALUES(4278124527); --11111110111111101111111111101111 WITH N1 (n) AS (SELECT 1 UNION ALL SELECT 1), N2 (n) AS (SELECT 1 FROM N1 AS X, N1 AS Y), N3 (n) AS (SELECT 1 FROM N2 AS X, N2 AS Y), N4 (n) AS (SELECT 1 FROM N3 AS X, N3 AS Y), Nums (n) AS (SELECT ROW_NUMBER() OVER(ORDER BY n) - 1 FROM N4), Bin AS ( SELECT bin, CAST('1' + bin_string + '1' AS CHAR(34)) AS bin_string, CAST(bin_string AS CHAR(32)) AS bin_string_org FROM Foo AS A CROSS APPLY (SELECT CASE WHEN B.bin & POWER(CAST(2 AS BIGINT), n) > 0 THEN '1' ELSE '0' END FROM Foo AS B CROSS JOIN Nums WHERE B.bin = A.bin AND n < 32 ORDER BY n DESC FOR XML PATH('')) AS C(bin_string)), Parsed AS ( SELECT B.bin, bin_string_org, SUBSTRING(bin_string, CHARINDEX('10', bin_string, n) + 1, CHARINDEX('01', bin_string, n) - CHARINDEX('10', bin_string, n)) AS bin_section, n FROM Bin AS B JOIN Nums ON n BETWEEN 1 AND LEN(B.bin_string) AND CHARINDEX('10', bin_string, n) < CHARINDEX('01', bin_string, n) AND CHARINDEX('10', bin_string, n) > 0) SELECT bin, bin_string_org, MAX(LEN(bin_section)) AS max_len_zero FROM Parsed GROUP BY bin, bin_string_org; /* bin bin_string_org max_len_zero -------------------- -------------------------------- ------------ 4278124527 11111110111111101111111111101111 1 4294508303 11111111111110001111111100001111 4 */ -- Plamen Ratchev http://www.SQLStudio.com

0 |

3/19/2010 4:56:10 PM

Interesting question. Here is another solution. -- auxiliary table CREATE TABLE Numbers(n int) INSERT INTO Numbers VALUES (1) INSERT INTO Numbers SELECT n+1 FROM Numbers INSERT INTO Numbers SELECT n+2 FROM Numbers INSERT INTO Numbers SELECT n+4 FROM Numbers INSERT INTO Numbers SELECT n+8 FROM Numbers INSERT INTO Numbers SELECT n+16 FROM Numbers INSERT INTO Numbers VALUES (0) -- data table CREATE TABLE Foo ( bin BIGINT NOT NULL PRIMARY KEY); INSERT INTO Foo VALUES(4294508303); --11111111111110001111111100001111 INSERT INTO Foo VALUES(4278124527); --11111110111111101111111111101111 insert into foo values(4294967295); -- all 1s insert into foo values(0); -- all 0s ;WITH zeros AS ( SELECT bin,n,CASE WHEN bin & POWER(CAST(2 as bigint),n) = 0 THEN 1 ELSE -32 END AS zero FROM Foo CROSS JOIN Numbers WHERE n BETWEEN 0 AND 31 ), zeros2 AS ( SELECT Z3.bin, Z1.n AS nmin, Z2.n AS nmax, SUM(Z3.zero) AS score FROM zeros Z1 JOIN zeros Z2 ON Z1.bin = Z2.bin AND Z1.n <= Z2.n JOIN zeros Z3 ON Z3.bin = Z1.bin AND Z3.n BETWEEN Z1.n AND Z2.n GROUP BY Z3.bin, Z1.n, Z2.n ) SELECT bin, COALESCE(NULLIF(MAX(score),-32),0) FROM zeros2 GROUP BY bin -- cleanup DROP TABLE Foo DROP TABLE Numbers -- Gert-Jan Michael C wrote: > > I have a requirement to find the maximum number of consecutive zeros in a > row in a 32 bit binary number. For example > 1111 1111 1111 1000 1111 1111 0000 1111 = 4 > 1111 1110 1111 1110 1111 1111 1110 1111 = 1 > > I am actually storing the number in a BIGINT for simplicity. I can do this > in a udf with a look fairly easily but was wondering if there is something > more sql that will do the job. > > Cheers, > Michael

0 |

3/19/2010 7:05:00 PM

Gert-Jan, Very clever! -- Plamen Ratchev http://www.SQLStudio.com

0 |

3/19/2010 8:26:04 PM

Thanks. I'm wondering who I borrowed this from, but I can't remember. The original problem for which this was a solution was the problem of people getting on and off the bus, one row for each transition. The query can be used to determine how many people are on the bus at any moment. Each "on" would have the value 1, each "off" the value -1. The sum will give the current number of people. Similarly this type of query allows you to get the maximum number; the maximum number of people on the bus, the maximum number of concurrent users, etc. In this case, the maximum number of consecutive zeros. -- Gert-Jan Plamen Ratchev wrote: > > Gert-Jan, > > Very clever! > > -- > Plamen Ratchev > http://www.SQLStudio.com

0 |

3/19/2010 9:03:57 PM

Here's another possibility. For this one I treated it as a gaps/islands problem. Each consecutive string of 1 bits is treated as an island, then the distances between them (the zero bits) are treated as gaps. I had to add some logic to shift the number left one bit (*2) and set the 1st and 33rd bits to 1 ("| 8589934593") to force leading and trailing zeroes to be treated as gaps. DECLARE @num BIGINT = 4294508303; WITH Powers AS ( SELECT 0 AS id, CAST(1 AS BIGINT) AS pwr UNION ALL SELECT p.id + 1, POWER(CAST(2 AS BIGINT), p.id + 1) FROM Powers p WHERE id < 33 ), Islands AS ( SELECT ROW_NUMBER() OVER (ORDER BY id) AS num, p.id, p.pwr FROM Powers p WHERE p.pwr & ((@num * 2 ) | CAST(8589934593 AS BIGINT)) <> 0 ) SELECT MAX(n.id - c.id - 1) AS zerobitstringlength FROM Islands AS c INNER JOIN Islands AS n ON n.num = c.num + 1 WHERE n.id - c.id > 1; -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ---------------- "Michael Coles" <admin@geocodenet.com> wrote in message news:%23aBLqWxxKHA.5364@TK2MSFTNGP05.phx.gbl... > That's an interesting question. Does it include leading zeroes? > > 0000 0000 0111 1000 1111 1111 0000 1111 = 9? > > -- > Thanks > > Michael Coles > SQL Server MVP > Author, "Expert SQL Server 2008 Encryption" > (http://www.apress.com/book/view/1430224649) > ---------------- > > "Michael C" <mike@nospam.com> wrote in message > news:%23jNl8KvxKHA.5132@TK2MSFTNGP05.phx.gbl... >>I have a requirement to find the maximum number of consecutive zeros in a >>row in a 32 bit binary number. For example >> 1111 1111 1111 1000 1111 1111 0000 1111 = 4 >> 1111 1110 1111 1110 1111 1111 1110 1111 = 1 >> >> I am actually storing the number in a BIGINT for simplicity. I can do >> this in a udf with a look fairly easily but was wondering if there is >> something more sql that will do the job. >> >> Cheers, >> Michael >> >

0 |

3/21/2010 2:33:54 AM

I think this puzzle might have been in Bentley's PROGRAMMING PEARLS books and/or columns, in the form of an array in a procedural language. I did it as blocks of sequential theater seats with a status of "available" rather than "sold" years ago. CREATE TABLE Theater (seat_seq INTEGER NOT NULL PRIMARY KEY, seat_status CHAR(1) DEFAULT 'A' NOT NULL CHECK (seat_status IN ('A', 'S')); SQL is not really a language for low-level bit fiddling.

0 |

3/21/2010 11:41:28 PM

"Michael C" <mike@nospam.com> wrote in message news:%23jNl8KvxKHA.5132@TK2MSFTNGP05.phx.gbl... >I have a requirement to find the maximum number of consecutive zeros in a >row in a 32 bit binary number. For example > 1111 1111 1111 1000 1111 1111 0000 1111 = 4 > 1111 1110 1111 1110 1111 1111 1110 1111 = 1 > > I am actually storing the number in a BIGINT for simplicity. I can do this > in a udf with a look fairly easily but was wondering if there is something > more sql that will do the job. Hi Guys, Thanks for all the replies and sorry for the delays, it took me a while to get around to looking at each solution. Mike Coles, I have to say I like your solution the best as you had the simplest method of generating powers of 2 and everything was nicely encapsulated. Although other solutions did probably have a speed advantage working on multiple values in the one statement. The only thing I would say is that you can avoid using the POWER function by just substituting pwr * 2 into the second part of your CTE. Gert, what the heck was yours doing? I couldn't work it out. :-) It looks like any sort of set based solution isn't going to be quick (yep, I know I asked for it). The looping code executed a loop of 10,000 numbers instantly where the fastest set based solution was 16 seconds. The award goes to Mike Coles for the fastest code which is suprising (I wrote the above before doing the speed testing). Michael

0 |

3/24/2010 3:20:50 AM

Yet another solution, based on bit shifting and LOG2. Declare @num bigint Set @num = 4294508303 ;WITH permutations AS ( SELECT 0 AS id, @num & 0xFFFFFFFF AS test UNION ALL SELECT p.id + 1, (p.test * 2 + 1) & 0xFFFFFFFF FROM permutations p WHERE p.id + 1 < 32 ) SELECT 32 - COALESCE(CEILING(LOG(NULLIF(MIN(test),0))/LOG(2)),0) FROM permutations I would expect this to run very efficiently! If you need more bits, then alter "32" and "0xFFFFFFFF". I needed to add the whole COALESCE(... NULLIF() ) stuff because it is illegal to execute LOG(0). I am sure that this time, you will be able to figure out what it does :-) -- Gert-Jan Michael C wrote: > > "Michael C" <mike@nospam.com> wrote in message > news:%23jNl8KvxKHA.5132@TK2MSFTNGP05.phx.gbl... > >I have a requirement to find the maximum number of consecutive zeros in a > >row in a 32 bit binary number. For example > > 1111 1111 1111 1000 1111 1111 0000 1111 = 4 > > 1111 1110 1111 1110 1111 1111 1110 1111 = 1 > > > > I am actually storing the number in a BIGINT for simplicity. I can do this > > in a udf with a look fairly easily but was wondering if there is something > > more sql that will do the job. > > Hi Guys, > > Thanks for all the replies and sorry for the delays, it took me a while to > get around to looking at each solution. Mike Coles, I have to say I like > your solution the best as you had the simplest method of generating powers > of 2 and everything was nicely encapsulated. Although other solutions did > probably have a speed advantage working on multiple values in the one > statement. The only thing I would say is that you can avoid using the POWER > function by just substituting pwr * 2 into the second part of your CTE. > > Gert, what the heck was yours doing? I couldn't work it out. :-) > > It looks like any sort of set based solution isn't going to be quick (yep, I > know I asked for it). The looping code executed a loop of 10,000 numbers > instantly where the fastest set based solution was 16 seconds. The award > goes to Mike Coles for the fastest code which is suprising (I wrote the > above before doing the speed testing). > > Michael

0 |

3/24/2010 7:16:42 PM

Oh you could always replace the CTE with a lookup table and avoid the calculation altogether :) I was originally playing around with pre-calculating the longest string of bits for each number and shoving it in a table, but 4 billion rows is a little ugggh. I was considering splitting it up into 8 bit bytes and treating each one separately, but that's overly complex. Just play around with it, and have fun :) -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ---------------- "Michael C" <mike@nospam.com> wrote in message news:OwQbCEwyKHA.1236@TK2MSFTNGP06.phx.gbl... > "Michael C" <mike@nospam.com> wrote in message > news:%23jNl8KvxKHA.5132@TK2MSFTNGP05.phx.gbl... >>I have a requirement to find the maximum number of consecutive zeros in a >>row in a 32 bit binary number. For example >> 1111 1111 1111 1000 1111 1111 0000 1111 = 4 >> 1111 1110 1111 1110 1111 1111 1110 1111 = 1 >> >> I am actually storing the number in a BIGINT for simplicity. I can do >> this in a udf with a look fairly easily but was wondering if there is >> something more sql that will do the job. > > Hi Guys, > > Thanks for all the replies and sorry for the delays, it took me a while to > get around to looking at each solution. Mike Coles, I have to say I like > your solution the best as you had the simplest method of generating powers > of 2 and everything was nicely encapsulated. Although other solutions did > probably have a speed advantage working on multiple values in the one > statement. The only thing I would say is that you can avoid using the > POWER function by just substituting pwr * 2 into the second part of your > CTE. > > Gert, what the heck was yours doing? I couldn't work it out. :-) > > It looks like any sort of set based solution isn't going to be quick (yep, > I know I asked for it). The looping code executed a loop of 10,000 numbers > instantly where the fastest set based solution was 16 seconds. The award > goes to Mike Coles for the fastest code which is suprising (I wrote the > above before doing the speed testing). > > Michael >

0 |

3/25/2010 12:24:48 AM

"Michael Coles" <admin@geocodenet.com> wrote in message news:F933C5EA-C148-4006-A810-9F1E794F65AC@microsoft.com... > Oh you could always replace the CTE with a lookup table and avoid the > calculation altogether :) Yeah but that would likely be slower. When I replaced the power function with pwr * 2 it didn't appear to make much difference but somehow I think such functions should be avoided when doing bit fiddling. It just doesn't seem right to me to use a complex function when trying to do something so simple. Everything should be possible with & | + - and bit shifts. > I was originally playing around with pre-calculating the longest string of > bits for each number and shoving it in a table, but 4 billion rows is a > little ugggh. I was considering splitting it up into 8 bit bytes and > treating each one separately, but that's overly complex. Just play around > with it, and have fun :) I've had my fun and the procedural solution has been implemented. Thanks for your replies. Michael

0 |

3/25/2010 3:44:31 AM

"Gert-Jan Strik" <sorrytoomuchspamalready@xs4all.nl> wrote in message news:4BAA651A.20DDD1C1@xs4all.nl... > Yet another solution, based on bit shifting and LOG2. > > Declare @num bigint > Set @num = 4294508303 > > ;WITH permutations > AS ( > SELECT 0 AS id, @num & 0xFFFFFFFF AS test > > UNION ALL > > SELECT p.id + 1, (p.test * 2 + 1) & 0xFFFFFFFF > FROM permutations p > WHERE p.id + 1 < 32 > ) > SELECT 32 - COALESCE(CEILING(LOG(NULLIF(MIN(test),0))/LOG(2)),0) > FROM permutations > > > I would expect this to run very efficiently! > > If you need more bits, then alter "32" and "0xFFFFFFFF". I needed to add > the whole COALESCE(... NULLIF() ) stuff because it is illegal to execute > LOG(0). > > I am sure that this time, you will be able to figure out what it does > :-) Thanks Gert but you didn't explain your first solution. I like your little dig at me asking but I'm confident enough in my abilities to ask how something works. Michael

0 |

3/25/2010 3:51:18 AM

> Yeah but that would likely be slower. When I replaced the power function > with pwr * 2 it didn't appear to make much difference but somehow I think > such functions should be avoided when doing bit fiddling. It just doesn't > seem right to me to use a complex function when trying to do something so > simple. SQL is optimized for querying, not bit fiddling (hence no << and >> operators). C is optimized for bit fiddling and loops. When you try to do procedural bit fiddling in a declarative language with minimal purpose-built operators the most efficient solution in that language is not necessarily going to be a loop. > Everything should be possible with & | + - and bit shifts. ....and variables ...and assignment statements ...and a loop ...and etc.... Is this a computation you have to perform for large sets of numbers? Or is this a one-off deal? -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ----------------

0 |

3/25/2010 4:15:16 AM

"Michael Coles" <admin@geocodenet.com> wrote in message news:uKsPbH9yKHA.5936@TK2MSFTNGP04.phx.gbl... > SQL is optimized for querying, not bit fiddling (hence no << and >> > operators). C is optimized for bit fiddling and loops. When you try to > do procedural bit fiddling in a declarative language with minimal > purpose-built operators the most efficient solution in that language is > not necessarily going to be a loop. The procedural method was 1000s of times faster than any of the set based solutions. > ...and variables ...and assignment statements ...and a loop ...and etc.... Yep, but they are all things that translate directly to cpu instructions. Power is possibly hundreds of instructions assuming it's not optimised for powers of 2. I know it didn't make any speed difference but it just feels wrong to me. It just doesn't seem right to base a simpler function on a more complex instruction. Power isn't that big a deal, using floating point stuff like log would be much worse. > Is this a computation you have to perform for large sets of numbers? Or > is this a one-off deal? This is for a large frame relay network where customers can purchase timeslots from 1 to 32 on a line. If we are moving a customer from one location to another then we need to know that they will fit. As an example, a customer with 5 timeslots will need 5 *consecutive* timeslots in the new location. We need to know for any of the moves coming up if we've got a problem. This could be a few thousand rows where I am currently using it, but there is potential for 100,000+ rows if my function was reused in other reports that worked across the network. As an example, someone might ask which ports have less than 4 consecutive timeslots. Michael

0 |

3/25/2010 5:10:19 AM

"Gert-Jan Strik" <sorrytoomuchspamalready@xs4all.nl> wrote in message news:4BAA651A.20DDD1C1@xs4all.nl... > I would expect this to run very efficiently! But not accurately. :-) It fails on all values with a single bit in them up to 32768 and then at 2147483648

0 |

3/25/2010 5:21:27 AM

Michael C wrote: > > "Gert-Jan Strik" <sorrytoomuchspamalready@xs4all.nl> wrote in message > news:4BAA651A.20DDD1C1@xs4all.nl... > > I would expect this to run very efficiently! > > But not accurately. :-) It fails on all values with a single bit in them up > to 32768 and then at 2147483648 I see. Yes, there was a logical error / bug. Sorry about that. Here is a correct version. Also, it is set based. CREATE TABLE Foo (bin BIGINT NOT NULL PRIMARY KEY); INSERT INTO Foo VALUES(4294508303); --11111111111110001111111100001111 INSERT INTO Foo VALUES(4278124527); --11111110111111101111111111101111 INSERT INTO Foo VALUES(4294967295); -- all 1s INSERT INTO Foo VALUES(0); -- all 0s INSERT INTO Foo VALUES(32768) INSERT INTO Foo VALUES(2147483648) INSERT INTO Foo VALUES(0x7FFFFFFF) ;WITH permutations AS ( SELECT bin, 0 AS id, bin AS test FROM Foo UNION ALL SELECT bin, p.id + 1, (p.test * 2 + 1) FROM permutations p WHERE p.id + 1 < 32 ) SELECT bin , CAST(bin AS binary(4)) AS HexRepresentation , 32 - COALESCE(FLOOR(LOG(NULLIF(MIN(test & 0xFFFFFFFF),0))/LOG(2)) + 1,0) AS MaxSequenceOf0Bits FROM permutations GROUP BY bin DROP TABLE Foo It works as follows: 1. The CTE will take a sample for the original value, and each bit shift to the left. After each bit shift to the left (*2) the least significant bit is set to 1 (+1). 2. Bits 32 and higher are filtered out (& 0xFFFFFFFF). After that, the minimum (MIN(test)) indicates the sample with the most leading zeros, which is the one we want. 3. The Log2 value (LOG(...)/LOG(2)) returns an indication of the bit number of the most significant 1-bit. The integer part indicates the bit number. The remainder is irrelevant, and therefore removed (FLOOR). Then the bit number (zero-based) is converted to a bit count (+1). The rest of the 32 bits must be the maximum sequence of 0 bits (32 - ...) 4. Handling the value 0 (sequence of 32 0-bits) requires special attention. NULLIF will "escape" the value to NULL. The COALESCE will specify the correct result for that situation, which is no 1-bits (0). -- Gert-Jan

0 |

3/25/2010 7:11:56 PM

Michael, Here a short explanation of this solution. This is just FYI, because I think the other solution that I posted is much better. 1. The CTE "zeros" determines a very specific value for each individual bit in the "bin". A 0-bit gets the number 1, a 1-bit gets the number -32. This number -32 is the number of (relevant) bits in "bin". 2. In CTE "zeros2", the cross join between Z1 and Z2 will generate all combinations of "from" (nmin) and "to" (nmax) bits in a "bin". 3. In CTE "zeros3", the Z3 cross join will "traverse" between the each "from" and "to" and SUM the values of step 1. The SUM indicates the number of 0-bits. If there is a 1-bit in the range, its value of -32 will immediately result in a negative value. 4. The MAX(score) selects the range with the most 0-bits and no 1-bits. 5. If there is no such value in step 4, so if all bits are 1, then the "MAX" score will be -32. -- Gert-Jan Gert-Jan Strik wrote: > > Interesting question. > > Here is another solution. > > -- auxiliary table > CREATE TABLE Numbers(n int) > INSERT INTO Numbers VALUES (1) > INSERT INTO Numbers SELECT n+1 FROM Numbers > INSERT INTO Numbers SELECT n+2 FROM Numbers > INSERT INTO Numbers SELECT n+4 FROM Numbers > INSERT INTO Numbers SELECT n+8 FROM Numbers > INSERT INTO Numbers SELECT n+16 FROM Numbers > INSERT INTO Numbers VALUES (0) > > -- data table > CREATE TABLE Foo ( > bin BIGINT NOT NULL PRIMARY KEY); > > INSERT INTO Foo VALUES(4294508303); --11111111111110001111111100001111 > INSERT INTO Foo VALUES(4278124527); --11111110111111101111111111101111 > insert into foo values(4294967295); -- all 1s > insert into foo values(0); -- all 0s > > ;WITH zeros AS ( > SELECT bin,n,CASE WHEN bin & POWER(CAST(2 as bigint),n) = 0 THEN 1 > ELSE -32 END AS zero > FROM Foo > CROSS JOIN Numbers > WHERE n BETWEEN 0 AND 31 > ), zeros2 AS ( > SELECT Z3.bin, Z1.n AS nmin, Z2.n AS nmax, SUM(Z3.zero) AS score > FROM zeros Z1 > JOIN zeros Z2 > ON Z1.bin = Z2.bin > AND Z1.n <= Z2.n > JOIN zeros Z3 > ON Z3.bin = Z1.bin > AND Z3.n BETWEEN Z1.n AND Z2.n > GROUP BY Z3.bin, Z1.n, Z2.n > ) > SELECT bin, COALESCE(NULLIF(MAX(score),-32),0) > FROM zeros2 > GROUP BY bin > > -- cleanup > DROP TABLE Foo > DROP TABLE Numbers > > -- > Gert-Jan > > Michael C wrote: > > > > I have a requirement to find the maximum number of consecutive zeros in a > > row in a 32 bit binary number. For example > > 1111 1111 1111 1000 1111 1111 0000 1111 = 4 > > 1111 1110 1111 1110 1111 1111 1110 1111 = 1 > > > > I am actually storing the number in a BIGINT for simplicity. I can do this > > in a udf with a look fairly easily but was wondering if there is something > > more sql that will do the job. > > > > Cheers, > > Michael

0 |

3/25/2010 7:21:13 PM

"Michael C" <mike@nospam.com> wrote in message news:%23STq4l9yKHA.928@TK2MSFTNGP05.phx.gbl... > "Michael Coles" <admin@geocodenet.com> wrote in message > news:uKsPbH9yKHA.5936@TK2MSFTNGP04.phx.gbl... >> SQL is optimized for querying, not bit fiddling (hence no << and >> >> operators). C is optimized for bit fiddling and loops. When you try to >> do procedural bit fiddling in a declarative language with minimal >> purpose-built operators the most efficient solution in that language is >> not necessarily going to be a loop. > > The procedural method was 1000s of times faster than any of the set based > solutions. Personally I was just getting started experimenting with set-based methods when you called game over. >> ...and variables ...and assignment statements ...and a loop ...and >> etc.... > > Yep, but they are all things that translate directly to cpu instructions. > Power is possibly hundreds of instructions assuming it's not optimised for > powers of 2. I know it didn't make any speed difference but it just feels > wrong to me. It just doesn't seem right to base a simpler function on a > more complex instruction. Power isn't that big a deal, using floating > point stuff like log would be much worse. > >> Is this a computation you have to perform for large sets of numbers? Or >> is this a one-off deal? > > This is for a large frame relay network where customers can purchase > timeslots from 1 to 32 on a line. If we are moving a customer from one > location to another then we need to know that they will fit. As an > example, a customer with 5 timeslots will need 5 *consecutive* timeslots > in the new location. We need to know for any of the moves coming up if > we've got a problem. This could be a few thousand rows where I am > currently using it, but there is potential for 100,000+ rows if my > function was reused in other reports that worked across the network. As an > example, someone might ask which ports have less than 4 consecutive > timeslots. That being the case, if performance is critical, I'll throw my vote in with the folks who think you should do it in a procedural language that's optimized for bit fiddling. -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ----------------

0 |

3/26/2010 1:50:51 AM

"Michael Coles" <admin@geocodenet.com> wrote in message news:B5EB3741-4323-4F4D-89E7-A15C11A06BF8@microsoft.com... > Personally I was just getting started experimenting with set-based methods > when you called game over. Sorry, I certainly wasn't calling game over, just stating the facts. It was me who asked for a set based solution. > That being the case, if performance is critical, I'll throw my vote in > with the folks who think you should do it in a procedural language that's > optimized for bit fiddling. It didn't seem like it was necessary to me. Procedural sql was fast enough. Michael

0 |

3/26/2010 5:39:19 AM

"Michael C" <mike@nospam.com> wrote in message news:%23JirxaKzKHA.4156@TK2MSFTNGP06.phx.gbl... > "Michael Coles" <admin@geocodenet.com> wrote in message > news:B5EB3741-4323-4F4D-89E7-A15C11A06BF8@microsoft.com... >> Personally I was just getting started experimenting with set-based >> methods when you called game over. > > Sorry, I certainly wasn't calling game over, just stating the facts. It > was me who asked for a set based solution. > >> That being the case, if performance is critical, I'll throw my vote in >> with the folks who think you should do it in a procedural language that's >> optimized for bit fiddling. > > It didn't seem like it was necessary to me. Procedural sql was fast > enough. At any rate my initial thought was to create a lookup table, which would probably be considerably faster than the procedural alternative. Of course a lookup table that covers 4 billion entries (32 bits) will take a long time to create, eat up a whole bunch of storage space, and probably just won't be the most efficient method in general. My idea to counter this was to look for simple patterns that would make calculation of the number of contiguous zero bits simple. Consider the following for 32 bit numbers from 0 to 65535: First Number - Last Number, Number of Consecutive Zero Bits, Number of Entries in Range 0 - 0 , 32 , 1 1 - 1 , 31 , 1 2 - 3 , 30 , 2 4 - 7 , 29 , 4 8 - 15 , 28 , 8 16 - 31 , 27 , 16 32 - 63 , 26 , 32 64 - 127 , 25 , 64 128 - 255 , 24 , 128 256 - 511 , 23 , 256 512 - 1023 , 22 , 512 1024 - 2047 , 21 , 1024 2048 - 4095 , 20 , 2048 4096 - 8191 , 19 , 4096 8192 - 16383 , 18 , 8192 16384 - 32767 , 17 , 16384 32768 - 65535 , 16 , 32768 Of course these rely on the fact that all the bits to the left are leading zero bits, which makes for a very simple pattern. The complexity comes in when you have numbers with larger runs of zero bit that lie between 1 bits. My thinking was that there might be a useful pattern that could be derived for all 32 bit numbers to create a small and efficient reference table or a simple formula that would not require looping and bit manipulations. Is it possible? Who knows. But I'd bet you dollars to doughnuts that a lookup against a properly indexed reference table would outperform procedural code in T-SQL. -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ----------------

0 |

3/27/2010 7:30:37 PM

Michael Coles wrote: [snipped] > But I'd bet you dollars to doughnuts that a lookup > against a properly indexed reference table would outperform procedural code > in T-SQL. If I were a betting man, I would take that bet. IMO, you greatly underestimate the cost of lookups. They incur locking, lots of memory management and physical I/O (unless you have 40 GB of RAM dedicated to this lookup table). The alternative is just a few mathematical operations. From a performance perspective, the lookup table will not even come close. -- Gert-Jan

0 |

3/27/2010 10:01:44 PM

"Gert-Jan Strik" <sorrytoomuchspamalready@xs4all.nl> wrote in message news:4BAE8048.897B7EC8@xs4all.nl... > Michael Coles wrote: [snipped] >> But I'd bet you dollars to doughnuts that a lookup >> against a properly indexed reference table would outperform procedural >> code >> in T-SQL. > > If I were a betting man, I would take that bet. IMO, you greatly > underestimate the cost of lookups. They incur locking, lots of memory > management and physical I/O (unless you have 40 GB of RAM dedicated to > this lookup table). The alternative is just a few mathematical > operations. From a performance perspective, the lookup table will not > even come close. You're on. 40 GB of RAM? Maybe... if you declare all your columns varbinary(max) and pad them with excessive zeroes. Just for reference how many GB of RAM do you think need to be dedicated to the 17 row lookup table in my previous post? This particular lookup table would not ever need to be updated ever after it is initially built, so you can avoid data locks with the NOLOCK query hint against this table. The "few mathematical operations" you mention are going to be performed against large numbers of rows, loops nested within loops (O(n^2)). Of course all of this is just conjecture until you do an actual performance comparison. -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ----------------

0 |

3/28/2010 4:05:24 PM

Oops, that's O(m*n) not O(n^2) -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ---------------- "Michael Coles" <admin@geocodenet.com> wrote in message news:7D733D71-6A8F-497C-9BBD-A7A2E8908F67@microsoft.com... > "Gert-Jan Strik" <sorrytoomuchspamalready@xs4all.nl> wrote in message > news:4BAE8048.897B7EC8@xs4all.nl... >> Michael Coles wrote: [snipped] >>> But I'd bet you dollars to doughnuts that a lookup >>> against a properly indexed reference table would outperform procedural >>> code >>> in T-SQL. >> >> If I were a betting man, I would take that bet. IMO, you greatly >> underestimate the cost of lookups. They incur locking, lots of memory >> management and physical I/O (unless you have 40 GB of RAM dedicated to >> this lookup table). The alternative is just a few mathematical >> operations. From a performance perspective, the lookup table will not >> even come close. > > You're on. 40 GB of RAM? Maybe... if you declare all your columns > varbinary(max) and pad them with excessive zeroes. Just for reference how > many GB of RAM do you think need to be dedicated to the 17 row lookup > table in my previous post? > > This particular lookup table would not ever need to be updated ever after > it is initially built, so you can avoid data locks with the NOLOCK query > hint against this table. > > The "few mathematical operations" you mention are going to be performed > against large numbers of rows, loops nested within loops (O(n^2)). > > Of course all of this is just conjecture until you do an actual > performance comparison. > > -- > Thanks > > Michael Coles > SQL Server MVP > Author, "Expert SQL Server 2008 Encryption" > (http://www.apress.com/book/view/1430224649) > ---------------- >

0 |

3/28/2010 4:57:28 PM

Michael, You can't have it both ways. Either you have 4 billion row lookup table (which is what I assumed with my 40 GB RAM remark), or you have some kind of solution (that either does or does not require an auxiliary table). As long as that solution is not there, we can't discuss it, can we? As for your remark about "nested loops within loops": if that is a problem (in perception or performance), then it is easy to rewrite my solution to a UDF, or to write it as a scalar subquery as below. CREATE TABLE Foo (bin BIGINT NOT NULL PRIMARY KEY); INSERT INTO Foo VALUES(4294508303); --11111111111110001111111100001111 INSERT INTO Foo VALUES(4278124527); --11111110111111101111111111101111 INSERT INTO Foo VALUES(4294967295); -- all 1s INSERT INTO Foo VALUES(0); -- all 0s INSERT INTO Foo VALUES(32768) INSERT INTO Foo VALUES(2147483648) INSERT INTO Foo VALUES(0x7FFFFFFF) INSERT INTO Foo VALUES(0xFFFFFFFE) SELECT bin , CAST(bin AS binary(4)) AS HexRepresentation ,( SELECT 32 - COALESCE(FLOOR(LOG(NULLIF(MIN(test),0))/LOG(2)) + 1,0) FROM ( SELECT bin AS test UNION ALL SELECT bin*0x00000002 & 0xFFFFFFFF | 0x00000001 UNION ALL SELECT bin*0x00000004 & 0xFFFFFFFF | 0x00000003 UNION ALL SELECT bin*0x00000008 & 0xFFFFFFFF | 0x00000007 UNION ALL SELECT bin*0x00000010 & 0xFFFFFFFF | 0x0000000F UNION ALL SELECT bin*0x00000020 & 0xFFFFFFFF | 0x0000001F UNION ALL SELECT bin*0x00000040 & 0xFFFFFFFF | 0x0000003F UNION ALL SELECT bin*0x00000080 & 0xFFFFFFFF | 0x0000007F UNION ALL SELECT bin*0x00000100 & 0xFFFFFFFF | 0x000000FF UNION ALL SELECT bin*0x00000200 & 0xFFFFFFFF | 0x000001FF UNION ALL SELECT bin*0x00000400 & 0xFFFFFFFF | 0x000003FF UNION ALL SELECT bin*0x00000800 & 0xFFFFFFFF | 0x000007FF UNION ALL SELECT bin*0x00001000 & 0xFFFFFFFF | 0x00000FFF UNION ALL SELECT bin*0x00002000 & 0xFFFFFFFF | 0x00001FFF UNION ALL SELECT bin*0x00004000 & 0xFFFFFFFF | 0x00003FFF UNION ALL SELECT bin*0x00008000 & 0xFFFFFFFF | 0x00007FFF UNION ALL SELECT bin*0x00010000 & 0xFFFFFFFF | 0x0000FFFF UNION ALL SELECT bin*0x00020000 & 0xFFFFFFFF | 0x0001FFFF UNION ALL SELECT bin*0x00040000 & 0xFFFFFFFF | 0x0003FFFF UNION ALL SELECT bin*0x00080000 & 0xFFFFFFFF | 0x0007FFFF UNION ALL SELECT bin*0x00100000 & 0xFFFFFFFF | 0x000FFFFF UNION ALL SELECT bin*0x00200000 & 0xFFFFFFFF | 0x001FFFFF UNION ALL SELECT bin*0x00400000 & 0xFFFFFFFF | 0x003FFFFF UNION ALL SELECT bin*0x00800000 & 0xFFFFFFFF | 0x007FFFFF UNION ALL SELECT bin*0x01000000 & 0xFFFFFFFF | 0x00FFFFFF UNION ALL SELECT bin*0x02000000 & 0xFFFFFFFF | 0x01FFFFFF UNION ALL SELECT bin*0x04000000 & 0xFFFFFFFF | 0x03FFFFFF UNION ALL SELECT bin*0x08000000 & 0xFFFFFFFF | 0x07FFFFFF UNION ALL SELECT bin*0x10000000 & 0xFFFFFFFF | 0x0FFFFFFF UNION ALL SELECT bin*0x20000000 & 0xFFFFFFFF | 0x1FFFFFFF UNION ALL SELECT bin*0x40000000 & 0xFFFFFFFF | 0x3FFFFFFF UNION ALL SELECT bin*0x80000000 & 0xFFFFFFFF | 0x7FFFFFFF ) T ) AS MaxSequenceOf0Bits FROM Foo DROP TABLE Foo -- Gert-Jan Michael Coles wrote: > > "Gert-Jan Strik" <sorrytoomuchspamalready@xs4all.nl> wrote in message > news:4BAE8048.897B7EC8@xs4all.nl... > > Michael Coles wrote: [snipped] > >> But I'd bet you dollars to doughnuts that a lookup > >> against a properly indexed reference table would outperform procedural > >> code > >> in T-SQL. > > > > If I were a betting man, I would take that bet. IMO, you greatly > > underestimate the cost of lookups. They incur locking, lots of memory > > management and physical I/O (unless you have 40 GB of RAM dedicated to > > this lookup table). The alternative is just a few mathematical > > operations. From a performance perspective, the lookup table will not > > even come close. > > You're on. 40 GB of RAM? Maybe... if you declare all your columns > varbinary(max) and pad them with excessive zeroes. Just for reference how > many GB of RAM do you think need to be dedicated to the 17 row lookup table > in my previous post? > > This particular lookup table would not ever need to be updated ever after it > is initially built, so you can avoid data locks with the NOLOCK query hint > against this table. > > The "few mathematical operations" you mention are going to be performed > against large numbers of rows, loops nested within loops (O(n^2)). > > Of course all of this is just conjecture until you do an actual performance > comparison. > > -- > Thanks > > Michael Coles > SQL Server MVP > Author, "Expert SQL Server 2008 Encryption" > (http://www.apress.com/book/view/1430224649) > ----------------

0 |

3/28/2010 8:17:30 PM

Gert, If you read my previous message then you understand where I am going. I posted contents of a lookup table with 17 rows to cover the first set of combinations. As I also stated, the key is to determine if there is some pattern here. Don't worry, we will discuss it soon enough. I'm not overly concerned with the performance of your solution. The "nested loops within loops" refers to a procedural looping solution. For instance, a solution might use a de Bruijn sequence to count the number of right-most zero bits, which can be done in a loop combined with bit shifts to answer the OP's question. Shove in the next number. Wash. Rinse. Repeat. That's a loop within a loop. -- Thanks Michael Coles SQL Server MVP Author, "Expert SQL Server 2008 Encryption" (http://www.apress.com/book/view/1430224649) ---------------- "Gert-Jan Strik" <sorrytoomuchspamalready@xs4all.nl> wrote in message news:4BAFB95A.536A5639@xs4all.nl... > Michael, > > You can't have it both ways. Either you have 4 billion row lookup table > (which is what I assumed with my 40 GB RAM remark), or you have some > kind of solution (that either does or does not require an auxiliary > table). As long as that solution is not there, we can't discuss it, can > we? > > As for your remark about "nested loops within loops": if that is a > problem (in perception or performance), then it is easy to rewrite my > solution to a UDF, or to write it as a scalar subquery as below. > > CREATE TABLE Foo (bin BIGINT NOT NULL PRIMARY KEY); > > INSERT INTO Foo VALUES(4294508303); --11111111111110001111111100001111 > INSERT INTO Foo VALUES(4278124527); --11111110111111101111111111101111 > INSERT INTO Foo VALUES(4294967295); -- all 1s > INSERT INTO Foo VALUES(0); -- all 0s > INSERT INTO Foo VALUES(32768) > INSERT INTO Foo VALUES(2147483648) > INSERT INTO Foo VALUES(0x7FFFFFFF) > INSERT INTO Foo VALUES(0xFFFFFFFE) > > SELECT bin > , CAST(bin AS binary(4)) AS HexRepresentation > ,( SELECT 32 - COALESCE(FLOOR(LOG(NULLIF(MIN(test),0))/LOG(2)) + 1,0) > FROM ( SELECT bin AS test > UNION ALL SELECT bin*0x00000002 & 0xFFFFFFFF | 0x00000001 > UNION ALL SELECT bin*0x00000004 & 0xFFFFFFFF | 0x00000003 > UNION ALL SELECT bin*0x00000008 & 0xFFFFFFFF | 0x00000007 > UNION ALL SELECT bin*0x00000010 & 0xFFFFFFFF | 0x0000000F > UNION ALL SELECT bin*0x00000020 & 0xFFFFFFFF | 0x0000001F > UNION ALL SELECT bin*0x00000040 & 0xFFFFFFFF | 0x0000003F > UNION ALL SELECT bin*0x00000080 & 0xFFFFFFFF | 0x0000007F > UNION ALL SELECT bin*0x00000100 & 0xFFFFFFFF | 0x000000FF > UNION ALL SELECT bin*0x00000200 & 0xFFFFFFFF | 0x000001FF > UNION ALL SELECT bin*0x00000400 & 0xFFFFFFFF | 0x000003FF > UNION ALL SELECT bin*0x00000800 & 0xFFFFFFFF | 0x000007FF > UNION ALL SELECT bin*0x00001000 & 0xFFFFFFFF | 0x00000FFF > UNION ALL SELECT bin*0x00002000 & 0xFFFFFFFF | 0x00001FFF > UNION ALL SELECT bin*0x00004000 & 0xFFFFFFFF | 0x00003FFF > UNION ALL SELECT bin*0x00008000 & 0xFFFFFFFF | 0x00007FFF > UNION ALL SELECT bin*0x00010000 & 0xFFFFFFFF | 0x0000FFFF > UNION ALL SELECT bin*0x00020000 & 0xFFFFFFFF | 0x0001FFFF > UNION ALL SELECT bin*0x00040000 & 0xFFFFFFFF | 0x0003FFFF > UNION ALL SELECT bin*0x00080000 & 0xFFFFFFFF | 0x0007FFFF > UNION ALL SELECT bin*0x00100000 & 0xFFFFFFFF | 0x000FFFFF > UNION ALL SELECT bin*0x00200000 & 0xFFFFFFFF | 0x001FFFFF > UNION ALL SELECT bin*0x00400000 & 0xFFFFFFFF | 0x003FFFFF > UNION ALL SELECT bin*0x00800000 & 0xFFFFFFFF | 0x007FFFFF > UNION ALL SELECT bin*0x01000000 & 0xFFFFFFFF | 0x00FFFFFF > UNION ALL SELECT bin*0x02000000 & 0xFFFFFFFF | 0x01FFFFFF > UNION ALL SELECT bin*0x04000000 & 0xFFFFFFFF | 0x03FFFFFF > UNION ALL SELECT bin*0x08000000 & 0xFFFFFFFF | 0x07FFFFFF > UNION ALL SELECT bin*0x10000000 & 0xFFFFFFFF | 0x0FFFFFFF > UNION ALL SELECT bin*0x20000000 & 0xFFFFFFFF | 0x1FFFFFFF > UNION ALL SELECT bin*0x40000000 & 0xFFFFFFFF | 0x3FFFFFFF > UNION ALL SELECT bin*0x80000000 & 0xFFFFFFFF | 0x7FFFFFFF > ) T ) AS MaxSequenceOf0Bits > FROM Foo > > DROP TABLE Foo > > -- > Gert-Jan > > > Michael Coles wrote: >> >> "Gert-Jan Strik" <sorrytoomuchspamalready@xs4all.nl> wrote in message >> news:4BAE8048.897B7EC8@xs4all.nl... >> > Michael Coles wrote: [snipped] >> >> But I'd bet you dollars to doughnuts that a lookup >> >> against a properly indexed reference table would outperform procedural >> >> code >> >> in T-SQL. >> > >> > If I were a betting man, I would take that bet. IMO, you greatly >> > underestimate the cost of lookups. They incur locking, lots of memory >> > management and physical I/O (unless you have 40 GB of RAM dedicated to >> > this lookup table). The alternative is just a few mathematical >> > operations. From a performance perspective, the lookup table will not >> > even come close. >> >> You're on. 40 GB of RAM? Maybe... if you declare all your columns >> varbinary(max) and pad them with excessive zeroes. Just for reference >> how >> many GB of RAM do you think need to be dedicated to the 17 row lookup >> table >> in my previous post? >> >> This particular lookup table would not ever need to be updated ever after >> it >> is initially built, so you can avoid data locks with the NOLOCK query >> hint >> against this table. >> >> The "few mathematical operations" you mention are going to be performed >> against large numbers of rows, loops nested within loops (O(n^2)). >> >> Of course all of this is just conjecture until you do an actual >> performance >> comparison. >> >> -- >> Thanks >> >> Michael Coles >> SQL Server MVP >> Author, "Expert SQL Server 2008 Encryption" >> (http://www.apress.com/book/view/1430224649) >> ----------------

0 |

3/28/2010 10:05:07 PM

"Michael Coles" <admin@geocodenet.com> wrote in message news:77267149-94D0-4BD0-9DD8-0A36C484EE1A@microsoft.com... > Gert, > > If you read my previous message then you understand where I am going. I > posted contents of a lookup table with 17 rows to cover the first set of > combinations. As I also stated, the key is to determine if there is some > pattern here. Don't worry, we will discuss it soon enough. > > I'm not overly concerned with the performance of your solution. The > "nested loops within loops" refers to a procedural looping solution. For > instance, a solution might use a de Bruijn sequence to count the number of > right-most zero bits, which can be done in a loop combined with bit shifts > to answer the OP's question. Shove in the next number. Wash. Rinse. > Repeat. That's a loop within a loop. Hi Michael and Gert, I think we can take another piece of code I wrote as an example here. As part of this problem I also needed something to find the number of bits turned on in a 32 bit number. I tested both a set based and procedural based method. Set based method used a lookup table with 32 rows with the values of 1,2,4,8,16,32 etc. I wrote a query to simply select count(*) from this lookup table where the bitmask and my number was not zero. The procedural method just incremented a value if my number & 1 was 1 and then divided the number by 2 until it was zero. This was placed in a udf. This wasn't particularly smart in any way and could have been optimised further. The result was that the unoptimised procedural method was 6 times faster than the set based method. Also, I timed every solution provided in this thread and compared it to my procedural method with similar results. The procedural method run 100,000 iterations in 9 seconds where the quickest set based method took 58 seconds. This isn't as big a difference as I expected but still quite significant. The thing was when I optimised the procedural method to remove the loop then it executed 10 times faster again making it 60 times faster than the set based method. This was the function I used to count the number of bits: ALTER FUNCTION [dbo].[udfBitCount32](@Value BIGINT) RETURNS INT AS BEGIN SET @Value = @Value - ((@Value / 2) & 1431655765) SET @Value = (@Value & 858993459) + ((@Value / 4) & 858993459) RETURN (((@Value + (@Value / 16) & 252645135) * 16843009) & CAST(4294967295 AS BIGINT)) / 16777216 END Interesting, C# did around 150,000,000 bitcounts per second so was around 1350 times faster.

0 |

3/29/2010 1:00:48 AM