Finding and deleting dupe queries across two fields

  • Follow


How can I display duplicate addresses with matching postcodes. At the moment 
my SQL will only find duplicate Street Names, which will only be duplicate if 
they also have a matching postcode. Because like a lt of big cities, you can 
have more than one identical street name, but it is when combined with the 
postcode that it becomes unique, but with possible duplicates.

Below is the SQL to find duplicate Street Names, but I need it to match 
against the corresponding postcode to find the duplicate addresses.

How can i find the dupes across both fields; [StreetName] - [Postcode] and 
then delete just the dupes, leaving the one copy of each record intact?




SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street 
Names].Postcode
FROM [Street Names]
WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street 
Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
ORDER BY [Street Names].StreetName;

0
Reply Utf 5/15/2007 3:24:00 PM

You can try the following modification.  I think it may give you the result 
you are looking for.

SELECT [Street Names].StreetName
, [Street Names].StreetNameID
, [Street Names].Postcode
FROM [Street Names]
WHERE [Street Names].StreetName In
(SELECT [StreetName]
FROM [Street Names] As Tmp
GROUP BY [StreetName], PostCode
HAVING Count(*)>1  AND  Tmp.PostCode = [Street Names].PostCode)
ORDER BY [Street Names].StreetName;

-- 
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

"efandango" <efandango@discussions.microsoft.com> wrote in message 
news:87A5E01D-930C-4917-9A3A-3954AE3E6162@microsoft.com...
> How can I display duplicate addresses with matching postcodes. At the 
> moment
> my SQL will only find duplicate Street Names, which will only be duplicate 
> if
> they also have a matching postcode. Because like a lt of big cities, you 
> can
> have more than one identical street name, but it is when combined with the
> postcode that it becomes unique, but with possible duplicates.
>
> Below is the SQL to find duplicate Street Names, but I need it to match
> against the corresponding postcode to find the duplicate addresses.
>
> How can i find the dupes across both fields; [StreetName] - [Postcode] and
> then delete just the dupes, leaving the one copy of each record intact?
>
>
>
>
> SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
> Names].Postcode
> FROM [Street Names]
> WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
> Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
> ORDER BY [Street Names].StreetName;
> 


0
Reply John 5/15/2007 4:06:12 PM


Thanks John,

But how do I delete the dupes leaving a single record of each entry?



"John Spencer" wrote:

> You can try the following modification.  I think it may give you the result 
> you are looking for.
> 
> SELECT [Street Names].StreetName
> , [Street Names].StreetNameID
> , [Street Names].Postcode
> FROM [Street Names]
> WHERE [Street Names].StreetName In
> (SELECT [StreetName]
> FROM [Street Names] As Tmp
> GROUP BY [StreetName], PostCode
> HAVING Count(*)>1  AND  Tmp.PostCode = [Street Names].PostCode)
> ORDER BY [Street Names].StreetName;
> 
> -- 
> John Spencer
> Access MVP 2002-2005, 2007
> Center for Health Program Development and Management
> University of Maryland Baltimore County
> ..
> 
> "efandango" <efandango@discussions.microsoft.com> wrote in message 
> news:87A5E01D-930C-4917-9A3A-3954AE3E6162@microsoft.com...
> > How can I display duplicate addresses with matching postcodes. At the 
> > moment
> > my SQL will only find duplicate Street Names, which will only be duplicate 
> > if
> > they also have a matching postcode. Because like a lt of big cities, you 
> > can
> > have more than one identical street name, but it is when combined with the
> > postcode that it becomes unique, but with possible duplicates.
> >
> > Below is the SQL to find duplicate Street Names, but I need it to match
> > against the corresponding postcode to find the duplicate addresses.
> >
> > How can i find the dupes across both fields; [StreetName] - [Postcode] and
> > then delete just the dupes, leaving the one copy of each record intact?
> >
> >
> >
> >
> > SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
> > Names].Postcode
> > FROM [Street Names]
> > WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
> > Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
> > ORDER BY [Street Names].StreetName;
> > 
> 
> 
> 
0
Reply Utf 5/15/2007 8:06:03 PM

Do you have a primary key field in the table street names?  And if so is 
it a single field?

For instance if StreetNameID is unique for every record in the table, 
you can use the query below to delete selected records.

DELETE *
FROM [Street Names]
WHERE StreetNameID in
   (SELECT Max([Street Names].StreetNameID)
   FROM [Street Names] as Tmp2
   WHERE Tmp2.StreetName In
      (SELECT StreetName
       FROM [Street Names] As Tmp
       GROUP BY StreetName, PostCode
       HAVING Count(*)>1  AND  Tmp.PostCode = Tmp2.PostCode)
   GROUP BY Tmp2.StreetName)

That may fail due to the table name being [Street Names] instead of 
StreetNames (no spaces).  If it does, you will have to build a series of 
queries to do this.


'====================================================
  John Spencer
  Access MVP 2002-2005, 2007
  Center for Health Program Development and Management
  University of Maryland Baltimore County
'====================================================


efandango wrote:
> Thanks John,
> 
> But how do I delete the dupes leaving a single record of each entry?
> 
> 
> 
> "John Spencer" wrote:
> 
>> You can try the following modification.  I think it may give you the result 
>> you are looking for.
>>
>> SELECT [Street Names].StreetName
>> , [Street Names].StreetNameID
>> , [Street Names].Postcode
>> FROM [Street Names]
>> WHERE [Street Names].StreetName In
>> (SELECT [StreetName]
>> FROM [Street Names] As Tmp
>> GROUP BY [StreetName], PostCode
>> HAVING Count(*)>1  AND  Tmp.PostCode = [Street Names].PostCode)
>> ORDER BY [Street Names].StreetName;
>>
>> -- 
>> John Spencer
>> Access MVP 2002-2005, 2007
>> Center for Health Program Development and Management
>> University of Maryland Baltimore County
>> ..
>>
>> "efandango" <efandango@discussions.microsoft.com> wrote in message 
>> news:87A5E01D-930C-4917-9A3A-3954AE3E6162@microsoft.com...
>>> How can I display duplicate addresses with matching postcodes. At the 
>>> moment
>>> my SQL will only find duplicate Street Names, which will only be duplicate 
>>> if
>>> they also have a matching postcode. Because like a lt of big cities, you 
>>> can
>>> have more than one identical street name, but it is when combined with the
>>> postcode that it becomes unique, but with possible duplicates.
>>>
>>> Below is the SQL to find duplicate Street Names, but I need it to match
>>> against the corresponding postcode to find the duplicate addresses.
>>>
>>> How can i find the dupes across both fields; [StreetName] - [Postcode] and
>>> then delete just the dupes, leaving the one copy of each record intact?
>>>
>>>
>>>
>>>
>>> SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
>>> Names].Postcode
>>> FROM [Street Names]
>>> WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
>>> Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
>>> ORDER BY [Street Names].StreetName;
>>>
>>
>>
0
Reply John 5/15/2007 11:13:54 PM

Thanks John,

Yes, I do have a primary key field in the table street names. it's called 
[StreetNameID].

Here's a sample of the data:


StreetNameID	StreetName	Postcode
1	Abberley Mews 	SW4
2	Abbevelle Mews 	SW4
3	Abbeville Mews 	SW4
4	Abbeville Road 	SW4
5	Abbey Business Centre 	SW8

I will try your suggested SQL and let you know. It's late here now (UK) and 
I've been trying to sort this huge table of adresses all day and slowly going 
out of my mind in the process...






"John Spencer" wrote:

> Do you have a primary key field in the table street names?  And if so is 
> it a single field?
> 
> For instance if StreetNameID is unique for every record in the table, 
> you can use the query below to delete selected records.
> 
> DELETE *
> FROM [Street Names]
> WHERE StreetNameID in
>    (SELECT Max([Street Names].StreetNameID)
>    FROM [Street Names] as Tmp2
>    WHERE Tmp2.StreetName In
>       (SELECT StreetName
>        FROM [Street Names] As Tmp
>        GROUP BY StreetName, PostCode
>        HAVING Count(*)>1  AND  Tmp.PostCode = Tmp2.PostCode)
>    GROUP BY Tmp2.StreetName)
> 
> That may fail due to the table name being [Street Names] instead of 
> StreetNames (no spaces).  If it does, you will have to build a series of 
> queries to do this.
> 
> 
> '====================================================
>   John Spencer
>   Access MVP 2002-2005, 2007
>   Center for Health Program Development and Management
>   University of Maryland Baltimore County
> '====================================================
> 
> 
> efandango wrote:
> > Thanks John,
> > 
> > But how do I delete the dupes leaving a single record of each entry?
> > 
> > 
> > 
> > "John Spencer" wrote:
> > 
> >> You can try the following modification.  I think it may give you the result 
> >> you are looking for.
> >>
> >> SELECT [Street Names].StreetName
> >> , [Street Names].StreetNameID
> >> , [Street Names].Postcode
> >> FROM [Street Names]
> >> WHERE [Street Names].StreetName In
> >> (SELECT [StreetName]
> >> FROM [Street Names] As Tmp
> >> GROUP BY [StreetName], PostCode
> >> HAVING Count(*)>1  AND  Tmp.PostCode = [Street Names].PostCode)
> >> ORDER BY [Street Names].StreetName;
> >>
> >> -- 
> >> John Spencer
> >> Access MVP 2002-2005, 2007
> >> Center for Health Program Development and Management
> >> University of Maryland Baltimore County
> >> ..
> >>
> >> "efandango" <efandango@discussions.microsoft.com> wrote in message 
> >> news:87A5E01D-930C-4917-9A3A-3954AE3E6162@microsoft.com...
> >>> How can I display duplicate addresses with matching postcodes. At the 
> >>> moment
> >>> my SQL will only find duplicate Street Names, which will only be duplicate 
> >>> if
> >>> they also have a matching postcode. Because like a lt of big cities, you 
> >>> can
> >>> have more than one identical street name, but it is when combined with the
> >>> postcode that it becomes unique, but with possible duplicates.
> >>>
> >>> Below is the SQL to find duplicate Street Names, but I need it to match
> >>> against the corresponding postcode to find the duplicate addresses.
> >>>
> >>> How can i find the dupes across both fields; [StreetName] - [Postcode] and
> >>> then delete just the dupes, leaving the one copy of each record intact?
> >>>
> >>>
> >>>
> >>>
> >>> SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
> >>> Names].Postcode
> >>> FROM [Street Names]
> >>> WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
> >>> Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
> >>> ORDER BY [Street Names].StreetName;
> >>>
> >>
> >>
> 
0
Reply Utf 5/16/2007 12:06:00 AM

JOhn,

That didn't work. (it just slowly goes through small progress bar botom left 
of screen and eventually results nothing)

When you mention 'failure due to the table name being [Street Names] instead 
of StreetNames (no spaces).' Can I just rename the query (it is independent 
of everything else). I don't really understand  "If it does, you will have to 
build a series of queries to do this." I can't envision this series of 
queries.





"John Spencer" wrote:

> Do you have a primary key field in the table street names?  And if so is 
> it a single field?
> 
> For instance if StreetNameID is unique for every record in the table, 
> you can use the query below to delete selected records.
> 
> DELETE *
> FROM [Street Names]
> WHERE StreetNameID in
>    (SELECT Max([Street Names].StreetNameID)
>    FROM [Street Names] as Tmp2
>    WHERE Tmp2.StreetName In
>       (SELECT StreetName
>        FROM [Street Names] As Tmp
>        GROUP BY StreetName, PostCode
>        HAVING Count(*)>1  AND  Tmp.PostCode = Tmp2.PostCode)
>    GROUP BY Tmp2.StreetName)
> 
> That may fail due to the table name being [Street Names] instead of 
> StreetNames (no spaces).  If it does, you will have to build a series of 
> queries to do this.
> 
> 
> '====================================================
>   John Spencer
>   Access MVP 2002-2005, 2007
>   Center for Health Program Development and Management
>   University of Maryland Baltimore County
> '====================================================
> 
> 
> efandango wrote:
> > Thanks John,
> > 
> > But how do I delete the dupes leaving a single record of each entry?
> > 
> > 
> > 
> > "John Spencer" wrote:
> > 
> >> You can try the following modification.  I think it may give you the result 
> >> you are looking for.
> >>
> >> SELECT [Street Names].StreetName
> >> , [Street Names].StreetNameID
> >> , [Street Names].Postcode
> >> FROM [Street Names]
> >> WHERE [Street Names].StreetName In
> >> (SELECT [StreetName]
> >> FROM [Street Names] As Tmp
> >> GROUP BY [StreetName], PostCode
> >> HAVING Count(*)>1  AND  Tmp.PostCode = [Street Names].PostCode)
> >> ORDER BY [Street Names].StreetName;
> >>
> >> -- 
> >> John Spencer
> >> Access MVP 2002-2005, 2007
> >> Center for Health Program Development and Management
> >> University of Maryland Baltimore County
> >> ..
> >>
> >> "efandango" <efandango@discussions.microsoft.com> wrote in message 
> >> news:87A5E01D-930C-4917-9A3A-3954AE3E6162@microsoft.com...
> >>> How can I display duplicate addresses with matching postcodes. At the 
> >>> moment
> >>> my SQL will only find duplicate Street Names, which will only be duplicate 
> >>> if
> >>> they also have a matching postcode. Because like a lt of big cities, you 
> >>> can
> >>> have more than one identical street name, but it is when combined with the
> >>> postcode that it becomes unique, but with possible duplicates.
> >>>
> >>> Below is the SQL to find duplicate Street Names, but I need it to match
> >>> against the corresponding postcode to find the duplicate addresses.
> >>>
> >>> How can i find the dupes across both fields; [StreetName] - [Postcode] and
> >>> then delete just the dupes, leaving the one copy of each record intact?
> >>>
> >>>
> >>>
> >>>
> >>> SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
> >>> Names].Postcode
> >>> FROM [Street Names]
> >>> WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
> >>> Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
> >>> ORDER BY [Street Names].StreetName;
> >>>
> >>
> >>
> 
0
Reply Utf 5/16/2007 3:50:03 PM

STANDARD ADVICE:  Backup your data before doing this.  There is no undo 
available when you run the last query.

First query is one you used to identify the duplicate records (save it for 
example as  - qDeleteA)
SELECT *
FROM [Street Names]
WHERE StreetName in (
 SELECT StreetName
 FROM [Street Names] As Tmp
 GROUP BY StreetName, PostCode
 HAVING Count(*)>1  AND  Tmp.PostCode = [StreetNames].PostCode)

qDeleteA should give you all the duplicates.  Now use that in another query 
that you save as qDeleteB
SELECT StreetName, PostCode, Max(StreetNameID) as MaxStreetID
FROM qDeleteA
GROUP BY StreetName, PostCode

This should give you the records to keep

Now use that in the last query
DELETE DISTINCT ROW [A].*
FROM [Street Names]  as A INNER JOIN qDeleteB
ON A.StreetName = qDeleteB.StreetName
And A.PostCode = qDeleteB.PostCode
WHERE A.StreetNameID <> B.MaxStreetID

STANDARD ADVICE:  Backup your data before doing this.  There is no undo 
available when you run the last query.

-- 
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

"efandango" <efandango@discussions.microsoft.com> wrote in message 
news:5F840B4B-5DD0-4C0A-B1F6-7B43DFEED2DC@microsoft.com...
> JOhn,
>
> That didn't work. (it just slowly goes through small progress bar botom 
> left
> of screen and eventually results nothing)
>
> When you mention 'failure due to the table name being [Street Names] 
> instead
> of StreetNames (no spaces).' Can I just rename the query (it is 
> independent
> of everything else). I don't really understand  "If it does, you will have 
> to
> build a series of queries to do this." I can't envision this series of
> queries.
>
>
>
>
>
> "John Spencer" wrote:
>
>> Do you have a primary key field in the table street names?  And if so is
>> it a single field?
>>
>> For instance if StreetNameID is unique for every record in the table,
>> you can use the query below to delete selected records.
>>
>> DELETE *
>> FROM [Street Names]
>> WHERE StreetNameID in
>>    (SELECT Max([Street Names].StreetNameID)
>>    FROM [Street Names] as Tmp2
>>    WHERE Tmp2.StreetName In
>>       (SELECT StreetName
>>        FROM [Street Names] As Tmp
>>        GROUP BY StreetName, PostCode
>>        HAVING Count(*)>1  AND  Tmp.PostCode = Tmp2.PostCode)
>>    GROUP BY Tmp2.StreetName)
>>
>> That may fail due to the table name being [Street Names] instead of
>> StreetNames (no spaces).  If it does, you will have to build a series of
>> queries to do this.
>>
>>
>> '====================================================
>>   John Spencer
>>   Access MVP 2002-2005, 2007
>>   Center for Health Program Development and Management
>>   University of Maryland Baltimore County
>> '====================================================


0
Reply John 5/16/2007 5:17:12 PM

6 Replies
127 Views

(page loaded in 0.258 seconds)

Similiar Articles:











7/29/2012 5:10:12 AM


Reply: