#### how do i "de-dupe?" rows by a column value?

```I have a complex operation to perform.  My spreadsheet has rows that indicate
items (control number in column A) linked to titles (control number in column
B).  A title can have multiple items in the spredsheet.  I want to number the
titles and assign random numbers to them so that I can get two random
samples.

I also want to have my random numbering retained in the spreadsheet as a
whole, so I can see all the items assigned to my samples.

How can I "dedupe" column B, assign a random number to each title control,
then transfer that data to my "undeduped" spreadsheet (or have it
retained...)?

I found an article describing a way to assign the random numbers
(http://tutorialized.com/tutorial/Select-a-random-sample-of-rows-in-Excel/7197);
it's how to apply this only to the title controls, one per title control,
that I don't know.  Someone described how to do this in another statistical
package, I just don't know what to do with Excel.  Help, please!
```
 0
hailah (2)
5/15/2005 6:29:01 PM
excel.misc 78881 articles. 5 followers.

1 Replies
322 Views

Similar Articles

[PageSpeed] 33

```In case anyone is interested, I figured out how to do what I wanted - without
"deduping"... using If function.

Realize that the key to the item/title relationship is another column, call
it V, call number.

First, to assign random numbers to the titles...  sort spreadsheet by call
number column V, so all the rows with same title control are together.
Create a new column next to title control column B, with If function such
that if title control is the same as the one above it, value is "V",
otherwise value is RAND().

Copy that column, past over it as values only (to set the random values so
they don't recalculate every time).

Sort spreadsheet by the new column (C), ascending so all the random numbers
are at the top and V's at the bottom.  Create another column to the side of
it (D) and label (and count if desired, by using data fill handle) the number
of randomly selected titles I want.  I did two samples and then left the rest
blank.

Re-sort the spreadsheet by call number.  Now the titles in the sample are
labeled, but I want the row with volumes of the same title also to be
labeled.  Right now they're blank.

Create a new column (E) next to the sample label column, all with formula.
In E2:

=If C2<>"V",D2,E1

Was sure this wasn't going to work but it correctly copied the labels on my
samples and labeled the associated volume rows.  Now I can "fix" the values
in column E, delete D and use it instead.

Whew!  Thought I was going to have to write a macro for this one...

"hailah" wrote:

> I have a complex operation to perform.  My spreadsheet has rows that indicate
> items (control number in column A) linked to titles (control number in column
> B).  A title can have multiple items in the spredsheet.  I want to number the
> titles and assign random numbers to them so that I can get two random
> samples.
>
> I also want to have my random numbering retained in the spreadsheet as a
> whole, so I can see all the items assigned to my samples.
>
> How can I "dedupe" column B, assign a random number to each title control,
> then transfer that data to my "undeduped" spreadsheet (or have it
> retained...)?
>
> I found an article describing a way to assign the random numbers
> (http://tutorialized.com/tutorial/Select-a-random-sample-of-rows-in-Excel/7197);
> it's how to apply this only to the title controls, one per title control,
> that I don't know.  Someone described how to do this in another statistical
> package, I just don't know what to do with Excel.  Help, please!
```
 0
hailah (2)
5/20/2005 3:04:02 AM