How best to approach deleting records with both a duplicate address and a duplicate name in a large flat file

Hello,
I have a medium sized file (approx 30,000 records) that has many names
and addresses that have been entered twice. I need to find an
efficient way to delete all records that have both a duplicate name
and a duplicate address. Records with a duplicate name but not a
duplicate address should stay and vice versa.

Is there a best way to approach this?  Should this typically be
handled with a query? Does the Find Duplicates query handle this
problem well?

Thanks for any insight anyone can provide. I am sure this is a common
problem, but I have not had to do it before and my first few attempts
don't seem to work.

(I will also need to attempt to delete Like records as the next step,
but for now I guess I will just figure out how to get the main file's
exact duplicates out of the equation.)
Mike
3/15/2008 1:48:13 PM

This example assumes:
- duplicate defined on combination of:
    Surname + FirstName + Address + City
- you can ignore records that have nulls in those fields
- a primary key field of ClientID
- The record you want to keep is the one with the lower primary key value.

Use a subquery like this:
    DELETE FROM tblClient
    WHERE tblClient.ClientID <>
        (SELECT Min(ClientID) AS MinClient
        FROM tblClient AS Dupe
        WHERE (Dupe.Surname = tblClient.Surname)
        AND (Dupe.FirstName = tblClient.FirstName)
        AND (Dupe.Address = tblClient.Address)
        AND (Dupe.City = tblClient.City));

If subqueries are new, here's an introduction:
    http://allenbrowne.com/subquery-01.html
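If you want to try the pattern on sample data before pointing it at the real table, here is a sketch using Python's sqlite3 as a stand-in for Access/JET. The rows are made up; the table and field names follow the example above:

```python
import sqlite3

# Demo of the keep-the-lowest-key delete, using sqlite3 as a stand-in
# for Access. Table, field names, and rows follow the post's example.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE tblClient (
    ClientID INTEGER PRIMARY KEY,
    Surname TEXT, FirstName TEXT, Address TEXT, City TEXT)""")
con.executemany("INSERT INTO tblClient VALUES (?,?,?,?,?)", [
    (1, "Smith", "Ann", "1 Main St", "Perth"),  # kept: lowest ClientID
    (2, "Smith", "Ann", "1 Main St", "Perth"),  # exact duplicate: deleted
    (3, "Smith", "Ann", "9 Oak Ave", "Perth"),  # same name, new address: kept
    (4, "Jones", "Bob", "1 Main St", "Perth"),  # same address, new name: kept
])

# Same correlated subquery as above: delete any row whose key is not the
# minimum key within its (Surname, FirstName, Address, City) group.
con.execute("""
    DELETE FROM tblClient
    WHERE ClientID <>
        (SELECT Min(ClientID) FROM tblClient AS Dupe
         WHERE Dupe.Surname   = tblClient.Surname
           AND Dupe.FirstName = tblClient.FirstName
           AND Dupe.Address   = tblClient.Address
           AND Dupe.City      = tblClient.City)""")

survivors = [r[0] for r in
             con.execute("SELECT ClientID FROM tblClient ORDER BY ClientID")]
print(survivors)  # [1, 3, 4]
```

Note that the row with the minimum key is never deleted, so exactly one record per group survives, which matches the stated assumption about keeping the lower primary key value.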

-- 
Allen Browne - Microsoft MVP.  Perth, Western Australia
Tips for Access users - http://allenbrowne.com/tips.html
Reply to group, rather than allenbrowne at mvps dot org.

"Mike C" <js2k111@yahoo.com> wrote in message
news:4533dae3-67a4-4c70-9cf1-30c1b16044d1@m34g2000hsc.googlegroups.com...
Allen
3/15/2008 2:10:33 PM

Hi Mike,

Build one query at a time (warning: air code - might be a typo or something)

(qs_DupeNames) SELECT * FROM myTable WHERE Name IN (SELECT Name FROM
myTable AS tmp GROUP BY Name HAVING Count(*)>1);

(qs_DupeAddress) SELECT * FROM qs_DupeNames WHERE Address IN (SELECT
Address FROM qs_DupeNames AS tmp GROUP BY Address HAVING Count(*)>1);

Take a look at the output of qs_DupeAddress - if it looks right, you can
change it to a DELETE * etc. and that takes care of that.

There is a way to join those two into a single query, but I'm a little
shaky on the syntax and frankly, I think the above is a lot
cleaner/easier to follow.

Hope this helps,
Gordon
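The two saved queries above can be exercised against sample data. The sketch below uses Python's sqlite3 as a stand-in for Access, with the two saved queries rewritten as a CTE; the table name, field names, and rows are made up:

```python
import sqlite3

# Sketch of the two-step filter: first rows whose Name repeats, then,
# within that subset, rows whose Address repeats. sqlite3 stands in for
# Access; the saved query qs_DupeNames becomes a CTE.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE myTable (ID INTEGER PRIMARY KEY, Name TEXT, Address TEXT)")
con.executemany("INSERT INTO myTable VALUES (?,?,?)", [
    (1, "Ann Smith", "1 Main St"),  # dupe name AND dupe address
    (2, "Ann Smith", "1 Main St"),  # dupe name AND dupe address
    (3, "Ann Smith", "9 Oak Ave"),  # dupe name only: should not match
    (4, "Bob Jones", "1 Main St"),  # unique name: filtered out in step 1
])

rows = con.execute("""
    WITH qs_DupeNames AS (
        SELECT * FROM myTable
        WHERE Name IN (SELECT Name FROM myTable
                       GROUP BY Name HAVING Count(*) > 1))
    SELECT ID FROM qs_DupeNames
    WHERE Address IN (SELECT Address FROM qs_DupeNames
                      GROUP BY Address HAVING Count(*) > 1)
    ORDER BY ID""").fetchall()
print([r[0] for r in rows])  # [1, 2]
```

One subtlety worth checking on real data: step 2 looks for addresses repeated anywhere in the dupe-name subset, not per name, so two different duplicate-name people who happen to share an address would both be flagged.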


"Mike C" <js2k111@yahoo.com> wrote in message
news:4533dae3-67a4-4c70-9cf1-30c1b16044d1@m34g2000hsc.googlegroups.com...

gllincoln
3/15/2008 2:12:27 PM

Sorry, but this code is not exactly right: it will delete all the
duplicates without leaving one behind. Usually, when you want to delete
duplicates, you also want to keep at least one record from each
duplicate group.

-- 
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: sylvain aei ca (fill the blanks, no spam please)
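The difference can be seen concretely in a small sketch (sqlite3 standing in for Access; the table, field names, and the naive DELETE shown here are illustrations, not the exact queries from the earlier post):

```python
import sqlite3

# Illustration of the caveat above: a DELETE driven directly by the
# duplicate-finding query removes every copy of a duplicate, while a
# keep-the-minimum-key variant leaves one record per group.
def make_table():
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE myTable (ID INTEGER PRIMARY KEY, Name TEXT, Address TEXT)")
    con.executemany("INSERT INTO myTable VALUES (?,?,?)", [
        (1, "Ann", "1 Main St"),
        (2, "Ann", "1 Main St"),   # exact duplicate of row 1
        (3, "Bob", "9 Oak Ave"),
    ])
    return con

# Naive delete: every row whose (Name, Address) pair repeats is removed,
# so both Ann rows disappear. ('|' concatenation fakes a composite key.)
con = make_table()
con.execute("""DELETE FROM myTable
    WHERE Name || '|' || Address IN
        (SELECT Name || '|' || Address FROM myTable
         GROUP BY Name, Address HAVING Count(*) > 1)""")
left_naive = [r[0] for r in con.execute("SELECT ID FROM myTable ORDER BY ID")]
print(left_naive)  # [3] -- both Ann rows are gone

# Keep-one variant: delete only rows that are not the minimum ID
# within their duplicate group.
con = make_table()
con.execute("""DELETE FROM myTable
    WHERE ID NOT IN (SELECT Min(ID) FROM myTable GROUP BY Name, Address)""")
left_keep = [r[0] for r in con.execute("SELECT ID FROM myTable ORDER BY ID")]
print(left_keep)  # [1, 3] -- one Ann survives
```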


  "gllincoln" <gllincoln@live.com> wrote in message
  news:C75B5C7E-1423-463F-A068-0BCE28B4B8BD@microsoft.com...

Sylvain
3/15/2008 3:40:37 PM

Hi Mike,

In my initial response, I told you how to find the dupes. I did fail to
go into much detail regarding how you would select the ones to delete.
That's because I was tired, and because there are so many potential
constraints to deal with that it's mind-boggling. Most important advice:
make backups frequently (before every major alteration) and don't
overwrite the backups; make a collection of them like a janitor's key
ring. Label/sort them so you can find what you need, because it's a near
certainty that you will mess up at least once and find yourself needing
to restore something. Keep in mind that you may not realize when you
mess up, not right away - you could be 3 or 4 steps down the line before
you have your epiphany.

For one thing - if you have related transaction tables (and you probably
do), then along with the deletions you have to figure out what you want
to do about the related records. How are you using this data, and how
might it be used down the road? More than one person has discarded 'old
data' or 'dupe data' only to regret it later - they didn't need the
information at the time but then, six months or a year later - uh oh.

There is the dilemma of the phone and email contact info. The two
records may reflect alternate contact numbers or emails - and no matter
which way you go, if you keep one and dump the other, you will be wrong
close to 50% of the time. Murphy's Law ensures that the ones you get
wrong will be the ones that matter.

Where you have dupe clients, you may have purchase histories and
transactions split across both clients. You can almost guarantee that,
no matter which one you delete, 50% of the time customer service will
need to access the 'orphaned' transaction (for warranty or support) -
they will search for transactions for that client and have to tell the
client, "I can't find your purchase here - do you have your invoice
number?" <not fun for the svc rep or the client>

If these are sales leads, then the dupe might be the result of the
prospect being in contact with the company on two separate occasions,
maybe talking to two different sales reps, regarding two different
products or services. What is the value of this transaction history? If
you have a professional sales force, it can be substantial.

If prospect Jones calls back again, asking (again) about an item or a
new item, then the level of interest should be considered significantly
greater than a first contact - the sales rep should make every possible
attempt to negotiate a close. The client has a sustained interest in
doing business with your company; it's time to find out whatever the
hidden objection is and deal with it.

The sales rep could go to an upsell, or explore the feasibility of
bundling other products or services together with this one - maybe
discounting the bundle to sweeten the deal while maintaining a
comfortable transaction payout overall. The point being, these bits of
extra insight/knowledge are like money in the bank if you have a skilled
sales force. Just mentioning that previous contact (in context of the
presentation) makes the customer feel that you take him seriously, that
he is important to you, and that you are a professional - he perceives
that you have your 'stuff' together, that you know what you are doing.
This is a rather huge edge for the sales team.

You have some value judgements to make and research to do before you hit
the switch on this one. I'd make a list of every table in the
application (and in any application that links to these tables) that
uses the primary key of this table as a foreign key.

When you are done with this, you still won't be done. You will have some
dupes where the address is almost identical but not quite: Ste vs PMB vs
#; one address has apt, the next one says #, the third skips the apt
number altogether. (That is, unless your entries are being
address-corrected in real time - and if they were, you would probably
have been catching dupes on entry, right?) On the names side, you may
have a middle initial on one entry, none on the 2nd, and the whole
middle name on the third. Checking for dupe emails and dupe phones is
the best way to nail 98% of the 'rest of the dupes'.

Hope this helps,
Gordon
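Gordon's point about near-identical addresses (Ste vs PMB vs #, apt spelled three different ways) can be attacked with a normalization pass before comparing. The rules below are illustrative assumptions only, a tiny sketch rather than a real postal-standardization routine:

```python
import re

# Hypothetical abbreviation map: these few rules are illustrative
# assumptions, not a complete address-standardization table.
ABBREV = {
    "street": "st", "avenue": "ave",
    "apartment": "apt", "suite": "ste", "#": "apt",
}

def normalize_address(addr: str) -> str:
    # Lowercase, split '#' into its own token, drop commas/periods,
    # then unify common abbreviations so near-duplicates compare equal.
    addr = addr.lower().replace("#", " # ")
    tokens = re.sub(r"[.,]", " ", addr).split()
    return " ".join(ABBREV.get(t, t) for t in tokens)

a = normalize_address("1 Main Street, Apt 4")
b = normalize_address("1 main st # 4")
print(a == b)  # True -- both become "1 main st apt 4"
```

After normalizing, the same duplicate-finding queries can be run on the normalized fields; anything that still differs (middle initials, PMB vs Ste) is left for the email/phone cross-check Gordon suggests.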




"Mike C" <js2k111@yahoo.com> wrote in message
news:4533dae3-67a4-4c70-9cf1-30c1b16044d1@m34g2000hsc.googlegroups.com...

gllincoln
3/15/2008 10:36:21 PM

"Mike C" <js2k111@yahoo.com> wrote in the news message
news:4533dae3-67a4-4c70-9cf1-30c1b16044d1@m34g2000hsc.googlegroups.com...

viktor
3/17/2008 1:32:37 AM
Reply:

Similar Artilces:

Backup address book ?????
If i am going to restore outlook express how do i backup my address book as to not loose them on the restore or fresh install. ?? Thks Tim This is a group to support Outlook from the Office group of programs. Outlook Express is a part of Internet Explorer and is a quite different program despite its similar name.. You will probably get a faster and more expert answer if you post this to an outlook express news group. Try posting in one of these newsgroups: microsoft.public.windows.inetexplorer.ie5.outlookexpress for OE 5.x microsoft.public.windows.inetexplorer.ie55.outlookexpress for O...

Compressing xml file
I am using sharpziplib ti compress & zip an XML file. But the compression ration is very bad for this utility compared to that i do with winzip. Any idea how to integrate winzip with our VB.net application to compress xml files. Or is there any other utility to do the kob Devesh "Devesh Aggarwal" <solutions@unieme.com> schreef in bericht news:#TwLpW8kDHA.1708@TK2MSFTNGP12.phx.gbl... > I am using sharpziplib ti compress & zip an XML file. But the compression > ration is very bad for this utility compared to that i do with winzip. Any > idea how to integrate...

Show email address instead of contact name
How can I make Outlook 2000 show the typed email address instead of resolving to the contact name in printouts of the contacts list and in the address boxes of email ? For example, I have a list of contacts in a custom list view with the email fields showing. Several of the entries in the email fields have resolved to the name of the contact. I want to see their email address not their name again. Also, a second example. I have three email addresses for a person, in the Email1, Email2, and Email3 fields. I go to send an email and instead of selecting the default from the addressbook/contact ...

File Sharing Problems
Can anyone explain this error/information message to me? - "The file 'filename.xls' may have been changed by another user since you last saved it. In that case, what do you want to do?" Options are to Save a Copy or Overwrite Changes. The file is not being shared with anyone else - it comes up whenever making general saves in Excel. -THANKS! Hi try deleting all files in your TEMP directory, reboot your computer and try again -- Regards Frank Kabel Frankfurt, Germany "Kristi - Skills Group" <KristiSkillsGroup@discussions.microsoft.com> schrieb im...

Computer won't read files.
When we try to open files in Excel or Word the computer tells us that the file can't be read or it is unavailable. Does anybody know why we can't retrieve our files? Thank You, Gaile Storm Hi Gaile it might help if you let us know: -your operating system -version of excel & word you have installed, including service packs (found under help / about) - whether this is a new problem - whether you can create files in these programs - if you can create and save can you then open & view these files? and anything else that may impact on using these files. Cheers JulieD "...

Publisher 2000 file keeps closing when I try to work in upper lef.
Every time I try to open a text box, insert a picture or resize anything in the upper left hand portion of page 1 on my 4-page publication, the program closes and Microsoft Error Reporting comes up. This happened to me in another document, so I started all over again in this new one. Any ideas? I have installed all my applicable updates... Any chance you'd send it to me to take a look at it? If so, send it to jl dot paules at gmail dot com. -- JoAnn Paules MVP Microsoft [Publisher] "Amy~Lynne" <Amy~Lynne@discussions.microsoft.com> wrote in message news:89E12002-...

Keeping the focus on record being displayed
I'm trying to work up a set of actions in a macro that will send a message to users of a form about something they need to do before moving to another record; the macro is displaying the necessary message box and is attached to the On Current property of the form. I can do everything I need to except the fairly crucial bit of keeping the current record and not moving to another record. I think I must be being dense, but would appreciate any help, including pointing me to the right bit of an existing question here (I just couldn't find one...). Thanks in advance! Run your checks ...

copy sheet with range names
I'm making 2 copies of an existing worksheet within a workbook. The original sheet has lots of range names. Now, when I make my copies, the range names are screwed up and are referring to the new sheet instead of the original one. What's up with that? I don't see any options related to range names. Deleting and resetting the names would take forever, as would "re-defining". I shouldn't have to do that. Any ideas? -- RMC,CPA I think if you look at your range names, you'll find that xl localized the copies. When it copied the worksheet, it had to do something ...

Highlighting Updated Records
Hi I wonder whether someone may be able to help me please with a problem I have with a db I'm working on. I have a subform with approx 15 fields on it, any of which on a given day can have their data amended, added or deleted. What I would like to do is to capture which records have been amended in a report within a given month. I know how to show which records have been updated by adding another field to the table which has a date stamp, but my problem is as follows: I would like to actually show the individual fields which have been updated, rather than the whole record. Now I ha...

duplicate transactions
A fatal error occurs when I work in Money; I get booted out, and when I log back in the transactions on my Money home page have been tripled, thus rendering the home page useless. Any ideas what may be wrong? It depends on what that 'fatal error' says. Suggest you post the details of that too. -- Glyn Simpson, Microsoft MVP - Money Check http://support.microsoft.com/default.aspx?scid=fh;EN-GB;mny for UK tips and fixes for MS Money. To send Microsoft your wishes or suggestions, use http://register.microsoft.com/mswish/suggestion.asp or email mnyukwsh@microsoft.com especially if it&#...

Writing XML File from Dataset
Read many of the postings but still don't understand the process. I have a dataset with child records. I have a specific XML "Account.XSD" format that I need to write this out to. How do I marry these (2) items to create the output file? Dim dataSet As DataSet = New DataSet("AccountRecord") PremisesAdapter.Fill(dataSet, "Premises") SupAdapter.Fill(dataSet, "Supplemental") cn.Close() Dim AccountRecord As DataRelation = dataSet.Relations.Add("Prem_Species", _ data...
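The general pattern behind this kind of export is: walk the parent rows, nest the child rows that share the parent key (the DataRelation's job), then serialize the tree. A minimal Python sketch of the same idea, with made-up Premises/Supplemental records standing in for the filled DataSet tables:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the two filled tables.
premises = [{"PremiseID": "1", "Address": "12 Main St"}]
supplemental = [{"PremiseID": "1", "Note": "rear entrance"}]

root = ET.Element("AccountRecord")
for prem in premises:
    p = ET.SubElement(root, "Premises", PremiseID=prem["PremiseID"])
    ET.SubElement(p, "Address").text = prem["Address"]
    # Nest child rows that share the parent key.
    for sup in supplemental:
        if sup["PremiseID"] == prem["PremiseID"]:
            ET.SubElement(p, "Supplemental").text = sup["Note"]

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In .NET the equivalent shortcut is DataSet.WriteXml with the relation marked as nested; the element names and schema would come from Account.XSD rather than being hand-built as above.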

Example using MFC WinInet classes with POST of FILE and other INPUT data
I am looking for a good example for POSTing both FILE and other INPUT data to a URL. I understand that the FILE must use a multi-part encoding but I do not understand how the actual file contents get POSTed or how the INPUT data gets POSTed also. I am guessing the INPUT data can be added as part of the query parameters to the URL. ...

Microsoft Excel ASCII Text Files
How do I convert an Excel spreadsheet into this type of file in order to import to another program such as "MaiList & AddressBook" Hi Bill try saving as 'CSV' file in the Save As dialog. This is a comma separated text (ASCII) file HTH Frank Bill Cadwallader wrote: > How do I convert an Excel spreadsheet into this type of > file in order to import to another program such > as "MaiList & AddressBook" ...
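Frank's "Save As CSV" advice produces exactly this kind of plain comma-separated ASCII file; the format itself is trivial to emit or inspect programmatically. A small Python illustration with made-up rows:

```python
import csv
import io

# Two rows of sample data, written in the same comma-separated ASCII
# format that Excel's "Save As CSV" produces.
rows = [["Name", "Address"], ["Bill", "1 Elm St"]]
buf = io.StringIO()
csv.writer(buf).writerows(rows)
text = buf.getvalue()
print(text)
```

Any program that advertises "text file" or "ASCII" import, as "MaiList & AddressBook" apparently does, should accept a file in this shape.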

how to change business address
I can't seem to figure out for the life of me WHERE OR HOW to change my business address so my NEW address prints on my invoices now that I have moved. Can anyone help me with this? PLEASE????? LGLEIM In microsoft.public.money, LGleim wrote: > can't seem to figure out for the life of me WHERE OR HOW to change my >business address so my NEW address prints on my invoices now that I have >moved. Can anyone help me with this? PLEASE????? Is there a Tools->Settings->BusinessInformation place? YES!!! You did it! Thank you so much! "Cal Learner-- MVP" wrote: ...

PDF file
I have Publisher 2002 and would like to insert a .pdf file into a document I am working on. How do / Can I do this? Many thanks, Ger ...

Automatic email address generation
Is there an automated way to create a standard naming convention using the AD Users and Computers tool based on the user name entered? If you were to enter John Doe as a new user, the email address would be generated like John.Doe@domain.com, for example. Or any other naming convention you would care to standardize. I should have also mentioned it is Exchange 2003 Standard. "Rockn" <rmiller2112@gmail.com> wrote in message news:%232k07RSzKHA.5940@TK2MSFTNGP02.phx.gbl... > Is there an automated way to create a standard naming convention using the > AD users and...
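In Exchange 2003 this is typically done with recipient policies, which build addresses from placeholders for the given name and surname. The convention itself is just string assembly; a minimal Python sketch with a made-up helper and domain:

```python
def email_for(first: str, last: str, domain: str = "domain.com") -> str:
    # First.Last@domain, trimmed and lowercased - one common convention.
    return f"{first.strip()}.{last.strip()}@{domain}".lower()

print(email_for("John", "Doe"))  # john.doe@domain.com
```

Real deployments also need rules for collisions (a second John Doe) and for names containing spaces or diacritics, which the sketch above deliberately ignores.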

deleting pages in publisher 2002
When deleting p.5 (or any page) of my 19 page document, last page deletes instead! What's happening here? How can I delete my page? I'm using "edit" delete page. Also, I am not in Master Page mode. K. On Thu, 23 Feb 2006 22:39:35 +0000, KLeDrew wrote (in article <CA87F080-CAF5-4A38-956C-FDF8896CA92A@microsoft.com>): > When deleting p.5 (or any page) of my 19 page document, last page deletes > instead! What's happening here? How can I delete my page? I'm using "edit" > delete page. Also, I am not in Master Page mode. > > ...

Excel Services Publish
I have a 64 bit client (Windows Server 2008) and my colleague has Windows 7 64 bit client. We are trying to publish an Excel 2007 document to MOSS, which is running under https, and is sitting on a 64 bit operating system. We keep getting the "File Not Saved" error when we try to publish from our clients. However, a 3rd colleague has Windows XP 32 bit and can publish the same exact Excel files just fine up to the very same MOSS location we tried on the 64 bit clients. Can someone from MSFT confirm that this is an issue with 64 bit clients? Can you present a solut...

Best Practice
Hi Everyone, In the process of creating a new Exchange 2003 SP2 Cluster, and had some questions about breaking up the stores. I am a bit confused between Storage Groups, and Mailbox Stores. To break things down, here is the current setup on the existing Exchange Server (the non cluster one) 400+ Users Store is at 110GB Backup Time - Long When I start moving staff over to the new Exchange box, we want to implement quotas, and break up some of the accounts. Thought process was this: 1)Directors\Management\Tech - No Quotas 2)Supervisors and other Critical Staff - 700MB 3)Other staff - ...

Deleting a posted message
Is there a way I can delete a posted message OR at least take my email off. I have mistakenly given my email and have been receiving spam like anything. It would be really helpful if the admins of this group can help me delete that specific post. Use the .NET serializer to turn it to a string . Use the .Whatever in 2005, asking for MS for Permission to post an 'unmanaged' question. "Neo" <anonymous@discussions.microsoft.com> wrote in message news:026501c3a56a$f77b3870$a401280a@phx.gbl... > Is there a way I can delete a posted message OR at least > ...

Retrieving a File
I need to set up a procedure to automatically retrieve a file(s) and send to a printer at a specified time (6:45) everyday. I'm not that familiar with VBA and have no idea where to start. Can anyone help me or point me in the right direction to start this. Thanks! Tammy for starters try this Sub PrintFile() Workbooks.Open FileName:="C:\Temp\Test FILE.xls" ActiveWindow.SelectedSheets.PrintOut Copies:=1, Collate:=True Application.OnTime TimeValue("06:45:00"), "PrintFile", TimeValue("06:45:10"), True End Sub --- Message posted from http://www...
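The VBA answer works by having the procedure reschedule itself via Application.OnTime each time it runs. The underlying "compute the delay until the next 6:45, then fire" logic can be sketched in any language; here is a small Python version (the print job itself is a placeholder):

```python
from datetime import datetime, timedelta

def seconds_until(hour: int, minute: int, now: datetime) -> float:
    # Delay until the next occurrence of hour:minute after 'now'
    # (later today, or tomorrow if the time has already passed).
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)
    return (target - now).total_seconds()

# e.g. threading.Timer(seconds_until(6, 45, datetime.now()), print_job).start()
print(seconds_until(6, 45, datetime(2024, 1, 1, 6, 0)))  # 2700.0
```

Like the OnTime version, a real scheduler would re-arm itself after each run so the job fires every day, not just once.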

Call up computer name
[Excel 2003] Is there a way to retrieve the computer identity/name and display it in a listbox on a userform? regards, Roger uName = Environ("computername") Me.ListBox1.AddItem uName On Dec 3, 9:30 am, Roger on Excel <RogeronEx...@discussions.microsoft.com> wrote: > [Excel 2003] > > Is there a way to retrieve the computer identity/name and display it in a > listbox on a userform? > > regards, > > Roger Thanks - works nicely Roger "muddan madhu" wrote: > > uName = Environ("computern...
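The Environ("computername") trick reads the machine name the OS exposes to every process; most languages offer a direct equivalent. For comparison, a one-liner sketch in Python:

```python
import socket

# Analogous to VBA's Environ("computername"): ask the OS for the host name.
name = socket.gethostname()
print(name)
```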

previewing windows files in Outlook
Is there a way to preview or auto preview Windows documents using Outlook? I have placed shortcuts to various windows folders on my Outlook bar. When I open the folders in Outlook, there is no AutoPreview or preview pane feature. I would like to view these files in Outlook without having to open each file. I use Outlook 2000. Peter Henderson <anonymous@discussions.microsoft.com> wrote: > Is there a way to preview or auto preview Windows > documents using Outlook? I have placed shortcuts to > various windows folders on my Outlook bar. When I open > the folders in...

no option to add display name
I am running Outlook XP on both Win98 and XP machines. In neither case can I find or set a display name for my email account (ie so others will see my display name). I have seen various suggestions about going to email accounts/change but there is just no option there for 'display name'. Is it something to do with the 'mode' of install? any advice appreciated david marcus David Marcus <dmarcus@netspeed.NO-SPAM.com.au> wrote: > I am running Outlook XP on both Win98 and XP machines. In neither > case can I find or set a display name for my email account ...