duplicate records

  • Follow


Hi, using sql2005. I want to select unique records from a table. 
Unfortunitely the field that is used to sort the output cannot be in the 
select distinct statement. Any ideas?

Many thanks,
Jonathan


0
Reply Utf 12/5/2009 11:51:01 AM

> Hi, using sql2005. I want to select unique records from a table.
> Unfortunitely the field that is used to sort the output cannot be in the
> select distinct statement. Any ideas?

If I understand correctly, you have a scenario like:

CREATE TABLE dbo.FooBar(
    Foo int NOT NULL,
    Bar int NOT NULL
    );

INSERT INTO dbo.FooBar VALUES
    (1,1),
    (1,3),
    (2,2);

SELECT DISTINCT Foo
FROM dbo.FooBar
ORDER BY Bar;

Of course, the above SELECT will not work because the ORDER BY expression 
must be in the SELECT list when you use DISTINCT.  The desired sequence 
would otherwise be ambiguous.

Can you elaborate on your expected results?  Maybe you could use a GROUP BY 
instead of DISTINCT so that a well-defined aggregate value can be specified 
in the ORDER BY:

SELECT Foo
FROM dbo.FooBar
GROUP BY Foo
ORDER BY MIN(Bar);

-- 
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/


"Jonathan" <Jonathan@discussions.microsoft.com> wrote in message 
news:0E3E7D65-9770-46E2-93BE-7C006A8566A9@microsoft.com...
> Hi, using sql2005. I want to select unique records from a table.
> Unfortunitely the field that is used to sort the output cannot be in the
> select distinct statement. Any ideas?
>
> Many thanks,
> Jonathan
>
> 
0
Reply Dan 12/5/2009 3:06:15 PM


> INSERT INTO dbo.FooBar VALUES
>    (1,1),
>    (1,3),
>    (2,2);

By the way, the INSERT script I posted uses SQL 2008 syntax but my 
suggestion applies to SQL 2005 as well.

-- 
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/


 

0
Reply Dan 12/5/2009 3:13:10 PM

Jonathan (Jonathan@discussions.microsoft.com) writes:
> Hi, using sql2005. I want to select unique records from a table.
> Unfortunitely the field that is used to sort the output cannot be in the
> select distinct statement. Any ideas?

In a general perspective your request does not make sense. Consider this
data:

   id   City
    1   London
    2   Brussels
    3   C�rdoba
    4   Birmingham
    5   London
    6   Copenhagen
    7   C�rdoba
    8   Copenhagen
    9   London

So you want:

   SELECT DISTINCT City FROM tbl ORDER  BY id

But what would this mean? Should London sort before or after Brussels?

What could make sense is to sort the cities by the smallest id (or
largest, sum of ids etc), which can be expressed as:

   SELECT City
   FROM   (SELECT City, minid = MIN(id)
           FROM   cities
           GROUP  BY City) AS x
   ORDER BY minid

Now, it may be that in your case there is a constraint of some sort
which makes your request sensible, but I since don't know your
query, I can't tell. There is also the possibility that you I have
really thought the whole way what you are asking for.

I would suggest that you post more information about your query.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

0
Reply Erland 12/5/2009 3:20:01 PM

You can use the ranking functions with a partition on the distinct columns and then select rank 1:

CREATE TABLE Foo (
  keycol INT NOT NULL PRIMARY KEY,
  col1 INT,
  col2 INT,
  col3 INT);

INSERT INTO Foo VALUES(1, 1, 1, 1);
INSERT INTO Foo VALUES(2, 1, 1, 2);
INSERT INTO Foo VALUES(3, 1, 1, 3);
INSERT INTO Foo VALUES(4, 1, 2, 4);

SELECT col1, col2
FROM (
SELECT col1, col2, col3,
        ROW_NUMBER() OVER(PARTITION BY col1, col2 ORDER BY col3) AS rn
FROM Foo) AS T
WHERE rn = 1
ORDER BY col3;

/*

col1        col2
----------- -----------
1           1
1           2

*/

-- 
Plamen Ratchev
http://www.SQLStudio.com
0
Reply Plamen 12/5/2009 3:31:16 PM

Hi Erland, thanks for this suggestion I'll give it a go.
Many thanks,
Jonathan

"Erland Sommarskog" wrote:

> Jonathan (Jonathan@discussions.microsoft.com) writes:
> > Hi, using sql2005. I want to select unique records from a table.
> > Unfortunitely the field that is used to sort the output cannot be in the
> > select distinct statement. Any ideas?
> 
> In a general perspective your request does not make sense. Consider this
> data:
> 
>    id   City
>     1   London
>     2   Brussels
>     3   Córdoba
>     4   Birmingham
>     5   London
>     6   Copenhagen
>     7   Córdoba
>     8   Copenhagen
>     9   London
> 
> So you want:
> 
>    SELECT DISTINCT City FROM tbl ORDER  BY id
> 
> But what would this mean? Should London sort before or after Brussels?
> 
> What could make sense is to sort the cities by the smallest id (or
> largest, sum of ids etc), which can be expressed as:
> 
>    SELECT City
>    FROM   (SELECT City, minid = MIN(id)
>            FROM   cities
>            GROUP  BY City) AS x
>    ORDER BY minid
> 
> Now, it may be that in your case there is a constraint of some sort
> which makes your request sensible, but I since don't know your
> query, I can't tell. There is also the possibility that you I have
> really thought the whole way what you are asking for.
> 
> I would suggest that you post more information about your query.
> 
> 
> --
> Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
> 
> Links for SQL Server Books Online:
> SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
> 
> .
> 
0
Reply Utf 12/6/2009 7:57:01 AM

Hi Plamen, thanks for this suggestion I'll give it a go.
Many thanks,
Jonathan

"Plamen Ratchev" wrote:

> You can use the ranking functions with a partition on the distinct columns and then select rank 1:
> 
> CREATE TABLE Foo (
>   keycol INT NOT NULL PRIMARY KEY,
>   col1 INT,
>   col2 INT,
>   col3 INT);
> 
> INSERT INTO Foo VALUES(1, 1, 1, 1);
> INSERT INTO Foo VALUES(2, 1, 1, 2);
> INSERT INTO Foo VALUES(3, 1, 1, 3);
> INSERT INTO Foo VALUES(4, 1, 2, 4);
> 
> SELECT col1, col2
> FROM (
> SELECT col1, col2, col3,
>         ROW_NUMBER() OVER(PARTITION BY col1, col2 ORDER BY col3) AS rn
> FROM Foo) AS T
> WHERE rn = 1
> ORDER BY col3;
> 
> /*
> 
> col1        col2
> ----------- -----------
> 1           1
> 1           2
> 
> */
> 
> -- 
> Plamen Ratchev
> http://www.SQLStudio.com
> .
> 
0
Reply Utf 12/6/2009 7:58:01 AM

Worked like a charm.
Thanks,
Jonathan

"Erland Sommarskog" wrote:

> Jonathan (Jonathan@discussions.microsoft.com) writes:
> > Hi, using sql2005. I want to select unique records from a table.
> > Unfortunitely the field that is used to sort the output cannot be in the
> > select distinct statement. Any ideas?
> 
> In a general perspective your request does not make sense. Consider this
> data:
> 
>    id   City
>     1   London
>     2   Brussels
>     3   Córdoba
>     4   Birmingham
>     5   London
>     6   Copenhagen
>     7   Córdoba
>     8   Copenhagen
>     9   London
> 
> So you want:
> 
>    SELECT DISTINCT City FROM tbl ORDER  BY id
> 
> But what would this mean? Should London sort before or after Brussels?
> 
> What could make sense is to sort the cities by the smallest id (or
> largest, sum of ids etc), which can be expressed as:
> 
>    SELECT City
>    FROM   (SELECT City, minid = MIN(id)
>            FROM   cities
>            GROUP  BY City) AS x
>    ORDER BY minid
> 
> Now, it may be that in your case there is a constraint of some sort
> which makes your request sensible, but I since don't know your
> query, I can't tell. There is also the possibility that you I have
> really thought the whole way what you are asking for.
> 
> I would suggest that you post more information about your query.
> 
> 
> --
> Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
> 
> Links for SQL Server Books Online:
> SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
> 
> .
> 
0
Reply Utf 12/6/2009 8:04:01 AM

Hi Plamen, unfortunitely I could not make this method get the desired 
outcome. However, I did appreciated exploring this approach.

Many thanks,
Jonathan

"Plamen Ratchev" wrote:

> You can use the ranking functions with a partition on the distinct columns and then select rank 1:
> 
> CREATE TABLE Foo (
>   keycol INT NOT NULL PRIMARY KEY,
>   col1 INT,
>   col2 INT,
>   col3 INT);
> 
> INSERT INTO Foo VALUES(1, 1, 1, 1);
> INSERT INTO Foo VALUES(2, 1, 1, 2);
> INSERT INTO Foo VALUES(3, 1, 1, 3);
> INSERT INTO Foo VALUES(4, 1, 2, 4);
> 
> SELECT col1, col2
> FROM (
> SELECT col1, col2, col3,
>         ROW_NUMBER() OVER(PARTITION BY col1, col2 ORDER BY col3) AS rn
> FROM Foo) AS T
> WHERE rn = 1
> ORDER BY col3;
> 
> /*
> 
> col1        col2
> ----------- -----------
> 1           1
> 1           2
> 
> */
> 
> -- 
> Plamen Ratchev
> http://www.SQLStudio.com
> .
> 
0
Reply Utf 12/6/2009 8:16:01 AM

Hi Plamen, yep I got it to work. Thanks.

Jonathan

"Plamen Ratchev" wrote:

> You can use the ranking functions with a partition on the distinct columns and then select rank 1:
> 
> CREATE TABLE Foo (
>   keycol INT NOT NULL PRIMARY KEY,
>   col1 INT,
>   col2 INT,
>   col3 INT);
> 
> INSERT INTO Foo VALUES(1, 1, 1, 1);
> INSERT INTO Foo VALUES(2, 1, 1, 2);
> INSERT INTO Foo VALUES(3, 1, 1, 3);
> INSERT INTO Foo VALUES(4, 1, 2, 4);
> 
> SELECT col1, col2
> FROM (
> SELECT col1, col2, col3,
>         ROW_NUMBER() OVER(PARTITION BY col1, col2 ORDER BY col3) AS rn
> FROM Foo) AS T
> WHERE rn = 1
> ORDER BY col3;
> 
> /*
> 
> col1        col2
> ----------- -----------
> 1           1
> 1           2
> 
> */
> 
> -- 
> Plamen Ratchev
> http://www.SQLStudio.com
> .
> 
0
Reply Utf 12/6/2009 8:34:01 AM

One more
http://dimantdatabasesolutions.blogspot.com/2007/02/dealing-with-duplicates.html



"Jonathan" <Jonathan@discussions.microsoft.com> wrote in message 
news:0E3E7D65-9770-46E2-93BE-7C006A8566A9@microsoft.com...
> Hi, using sql2005. I want to select unique records from a table.
> Unfortunitely the field that is used to sort the output cannot be in the
> select distinct statement. Any ideas?
>
> Many thanks,
> Jonathan
>
> 


0
Reply Uri 12/6/2009 1:26:46 PM

nice
Many thanks,
Jonathan

"Uri Dimant" wrote:

> One more
> http://dimantdatabasesolutions.blogspot.com/2007/02/dealing-with-duplicates.html
> 
> 
> 
> "Jonathan" <Jonathan@discussions.microsoft.com> wrote in message 
> news:0E3E7D65-9770-46E2-93BE-7C006A8566A9@microsoft.com...
> > Hi, using sql2005. I want to select unique records from a table.
> > Unfortunitely the field that is used to sort the output cannot be in the
> > select distinct statement. Any ideas?
> >
> > Many thanks,
> > Jonathan
> >
> > 
> 
> 
> .
> 
0
Reply Utf 12/13/2009 7:34:01 PM

11 Replies
115 Views

(page loaded in 1.215 seconds)

Similiar Articles:































7/22/2012 6:00:35 PM


Reply: