How to randomly split a whole dataset into two sub-dataset?

 Hi,

   At your possible convenience, might anyone please kindly answer my
question? Thank you very much.

   How to "RANDOMLY" split the whole data set (n=2000) into two sub
dataset (n=1000; n=1000)  in SPSS or Excel?

     Thank you very much.

     Please take care

     Caroline

0
7/17/2007 5:27:12 AM
excel 39879 articles. 2 followers. Follow

4 Replies
975 Views

Similar Articles

[PageSpeed] 46

zencaroline <zencaroline@gmail.com> writes:

>    How to "RANDOMLY" split the whole data set (n=2000) into two sub
> dataset (n=1000; n=1000)  in SPSS or Excel?

Create a new variable whose value is randomly distributed.  Sort
the data on this variable.  Take the first 1000 cases as the
first sub-dataset, the last 1000 cases as the second sub-dataset.
-- 
Ben Pfaff 
http://benpfaff.org
0
blp (1)
7/17/2007 5:37:44 AM
Caroline  -

In Excel, one method is to use a "helper column."

Arrange the data in a single column.

In an adjacent column, enter =RAND() into the top cell, copy, and paste to 
the other cells, so there's a random number associated with each adjacent 
data value.

As an optional additional step, select the column with random numbers, copy, 
and Paste Special Values (so that the random numbers don't change after the 
sort).

Select the entire range of  data values (two columns wide and 2000 rows 
long).

Choose Data | Sort, "Sort by" the column containing the random numbers, and 
click OK.

Use the first 1000 sorted data values for one sub data set, and use the 
others for the second sub data set.

-  Mike

www.MikeMiddleton.com

"zencaroline" <zencaroline@gmail.com> wrote in message 
news:1184650032.814382.172400@i13g2000prf.googlegroups.com...
> Hi,
>   At your possible convenience, might anyone please kindly answer my
> question? Thank you very much.
>   How to "RANDOMLY" split the whole data set (n=2000) into two sub
> dataset (n=1000; n=1000)  in SPSS or Excel?
>     Thank you very much.
>     Please take care
>     Caroline


0
mike5208 (300)
7/17/2007 5:44:54 AM
Hello Caroline,

If your data is in A1:A2000, select B1:B2000 and array-enter
=UniqRandInt(2,1000)
[enter with CTRL + SHIFT + ENTER, not only with ENTER]

My user-defined function UniqRandInt you can get here:
http://www.sulprobil.com/html/uniqrandint.html
[Press ALT + F11, insert a new module, copy my macro text into that
new module and go back to your worksheet]

You will get exactly 1000 ones and 1000 twos which indicate the
subset.

Regards,
Bernd

0
bplumhoff1 (208)
7/17/2007 8:50:25 AM
* create a random variable.
compute ranorder= rv.uniform(1, 2E9).
*sort the cases in a random order.
sort cases by ranorder.
*assign to groups by odd and even casenum.
compute group = mod($casenum,2) .
value labels group 0 'even' 1 'odd'.
frequencies vars=group.


Art Kendall
Social Research Consultants


zencaroline wrote:
>  Hi,
> 
>    At your possible convenience, might anyone please kindly answer my
> question? Thank you very much.
> 
>    How to "RANDOMLY" split the whole data set (n=2000) into two sub
> dataset (n=1000; n=1000)  in SPSS or Excel?
> 
>      Thank you very much.
> 
>      Please take care
> 
>      Caroline
> 
0
7/17/2007 6:10:54 PM
Reply:

Similar Artilces:

Syntax for Refreshing Control on Sub-Subform
I'm close, but get no cigar... Can anybody please tell me the proper syntax for refreshing or requerying a combobox on a subform that is embedded within another subform from another subform on the mainform? I'm trying to set it up (in the AfterUpdate event property of the Chrgs field on Subform1) so that changes made to the Chrgs field on subform1 causes the cboLineItems combobox on Sub-Subform2 to be refreshed or requeried, so that a calculated textbox on the same sub-subform can be recalculated? I know that's a mouthful... If you need further clarification, I will de...

comparing two columns
I need to compare cells 2 columns and return a value if different. e.g. A1: 2002 B:2002 A2: 2002 B:2003 A3: 2002 B:2002 Need to identify row 2 is different (with an x and a color?) I have about 1000 rows to compare. I tried some of the formulas listed in this section e.g. =if(countif(a:a,a25)>1,a25,"") but didn't return what i wanted. Paste this formula in C1 cell =IF(A1=B1,"Both Are Same","Mismatch") Now copy the C1 cell and paste it to the remaining cells of C Column depends upon your A & B Column Data. For Applying the Co...

Counting specific text in two columns.
I have two columns (a and B) which contain the following:- Column A:- Not_Chased, Regret, Stockist, Ordered, Chased, Pending, Lost, Opp_Log Column B:- List of countries (e.g. China, U.K. Netherlands, Germany, Spain, Italy, Czech). I would like to produce a matrix as following showing say Not-Chased for all the individual countries, Regret for all countries, Stockist for all countries… (see below) China Czech France Germany India Grand Totals LOST Not_Chased OPP_LOG ORDERED PENDING REGRET STOCKIST ...

Random
How to get random numbers random(0xff) ? int RandomNumber = (rand() * 0xff) / RAND_MAX; -- Bjarne Nielsen "]GHO[" <yousefk@taux01.nsc.com> skrev i en meddelelse news:0b6a01c3a84c$5733a890$a501280a@phx.gbl... > How to get random numbers random(0xff) ? rnd()%0x100; or rnd() & 0xff; Ruben On Tue, 11 Nov 2003 04:07:13 -0800, "]GHO[" <yousefk@taux01.nsc.com> wrote: >How to get random numbers random(0xff) ? ...

split text in cell into 2 #2
I am needing help with formula. I have a cell that takes values from several other cells to create one long line text (for a command used in other app). For ex: a1=jack b1=ran c1="There was a person named"&a1&"who was hungry and"&b1&" to Mcdonalds." This process works great except when the line is greater than 72. I need an if statment that says if c1 length is greater than 72 then to split / replace line AFTER name(a1). Where the split occurs a +, and carriage return needs to be added. So in the example above (pretend length will be over 72,...

combining two woeksheets into one
Hi, Is it possible to combine two work sheets into one? ex: I have first.xls, second.xls firsr.xls has columns aa,bb,cc second.xls has columns aa,bb,cc,dd Both sheets has different data. I want to see them in a single sheet at once side by side and should be able to compare. appreciate your help. Thanks The easiest thing to do, is to copy/paste from second.xls into ff, gg, hh. Or.... if you have Excel 2003, you can go to Window-->compare side by side with or.... Use helper columns to do the comparison for you. In first.xls go to column ff. Use the following formula =EXACT(AA1,...

Can We have two separate email Icons 03-22-10
Can I set up two email icons on my desktop? One that checks my email address and one that checks my sons. Thamks, Mike No. They would both open the same instance of OE. The only way to be completely separate is to have two Windows logons. -- Bruce Hagen MS-MVP [Mail] Imperial Beach, CA "amdx" <amdx@knology.net> wrote in message news:512b5$4ba817b2$18ec6dd7$4464@KNOLOGY.NET... > Can I set up two email icons on my desktop? > One that checks my email address and one that checks my sons. > &...

Splitting a workbook
Is there an easier way for splitting a workbook into 2 smaller workbooks than copying the workbook to 2 separate files and deleting the appropriate worksheets from each? Thanks in advance. -- Tony Vella in Ottawa, Canada http://www.datapigtechnologies.com/freeware.htm Tony Vella wrote: > Is there an easier way for splitting a workbook into 2 smaller workbooks > than copying the workbook to 2 separate files and deleting the appropriate > worksheets from each? Thanks in advance. > -- > Tony Vella in Ottawa, Canada Why do you want to split the file? If you do you'll h...

DEFINING RELATIONSHIP BETWEEN TWO or more TABLES
I am often required to design queries involving two or more tables andlinking them together. However, in linking them together, i am oftenforced to define the relationship (join properties) between thesetables and I often get it wrong. As a result, i get duplicate recordsall the time. Can anyone explain to me how does one-to-many and many-to-many relationship really works in Access queries and how are theybeing retrieved. And how can i easily identify and fix this probem.Plese give some examples.Thanks,Pago On Mar 27, 8:23 am, pago_b...@yahoo.com.au wrote:> I am often required to design ...

Random Numbers not Random
When I create random numbers for an application with the following code, they do not appear to be truely random. Me.RandomNumber = Int(9 * Rnd + 1) ' Random Number 1-9 If I have two users setting side by side, they frequently get the same results when they start up Am I missing somthing? Any help would be appreciated -- Frank Wagner fwagner111@aol.com Frank, Have you tried a Randomize statement just before the RND? That should do it. -- hth Al Campagna Microsoft Access MVP http://home.comcast.net/~cccsolutions/index.html "Find a job tha...

Exchange server randomly asking random users to Authenticate randomly...
Has anyone had a problem like this where a user is in their Exchange mailbox using Outlook 2003 and the server prompts them for their credentials? If they hit cancel and exit out of Outlook and run it again it lets them into their mailbox. Thank you, Benjamin Pls make sure the outlook 2003 Client does not using Cache mode If is ,pls clearn this option -- Jammy "Benjamin Chan" <bchan[pleasenospam]@controlproductsinc.com> ���g��l��s�D :#BxdizrtEHA.3972@TK2MSFTNGP15.phx.gbl... > Has anyone had a problem like this where a user is in their Exchange mailbox > using Outlo...

Placing dataset into XPathDocument
I have a merged dataset that contains xml read from SQL Server. I need to place the data into an XPathDocument. I can do the following: mydataset.writeXML("mydata.xml") dim xpdoc as new XPathDocument("mydata.xml") Problem is it seem rather redundent to write data currently in memory to disk in order to be read on the next line. According to the documentation the writeXML method supports writing to System.IO.Stream and the XPathDocument supports load from System.IO.Stream but I can not seem to get this to work. Any suggestions? Cheers Keith Keith Chadwick wrote: &...

Generating Correlated Random Values in Excel
Hi Mike, Thanks so much for your advice. I think that I have it figured out, but can you confirm this for me? I can't afford to screw this up. If mean(inflation) = .031 stdev(inflation) = .047 mean(stock return) = .106 stdev(stock return) = .204 Column Headers: A B C D 1 Z1 Z2 Inflation Stock Return Data Generation Formulas A2 =NORMINV(RAND(),0,1) B2 =NORMINV(RAND(),0,1) C2 =0.031+0.047*A2 D2 =0.106+0.204*(A2*rho+B2*(1-rho^2)^0.5) Are these formulas correct to generate the random data? Particularly cell D2. Does this give me a complete value that t...

random pages
I have developed pages that randomly define a set of normal lung function data based on gender, age and height. I have applied formulas to this data simulate 5 classic lung function abnormalities (normal, restrictive, obstructive, mixed and early obstructive.) Currently the sheets that define the abnormal are named by one of the above lung abnormalities. I would like to present data from random pages in a work book to students and rather than have the pages named by the abnormality be named a simulated patient name. My questions. Can I have the name of a sheet set to equal data from to ...

dataset schemas
Hi All, I'm trying to create a dataset schema so that when I call the dataset function ds.writexml I get the following result: <?xml version="1.0" standalone="yes"?> <eConnect xmlns="http://tempuri.org/RMCustomerMaster.xsd"> <RMCustomerMasterType> <taUpdateCreateCustomerRcd> <CUSTNMBR>12345</CUSTNMBR> <CUSTNAME>BubbaK, Inc.</CUSTNAME> <UpdateIfExists>1</UpdateIfExists> </taUpdateCreateCustomerRcd> <taCreateInternetAddresses> <Master_Type>CUS</Master_Type...

Field name into two lines in a query?
This may be a silly question but... Is it possible to make the name of a field into two lines instead of one long line in a query? No, there is no way to do that. -- John Spencer Access MVP 2002-2005, 2007 Center for Health Program Development and Management University of Maryland Baltimore County .. "mayj" <mayj@discussions.microsoft.com> wrote in message news:9372FB14-BDDE-49E3-8A0E-7E5F3B598FAC@microsoft.com... > This may be a silly question but... > > Is it possible to make the name of a field into two lines instead of one > long line in a query? O...

Random Distribution
I need to randomly order numbers from a list or range of consecutive whole numbers, such that each number is listed only once. Can this be done in excel using functions, formulas or whatever? I want to incorporate it into a spreadsheet with a macro to automatically generate the list and use it to randomly designate selected entries. "Rick via OfficeKB.com" wrote: > I need to randomly order numbers from a list or range of consecutive whole > numbers, such that each number is listed only once. Can this be done in > excel using functions, formulas or whatever? I want to...

I had about tensubjects now only two
I have been informed that 8 of the subjects I regularly used are no longer available and I am left with 2 This one and Windows XP security What has happened to the others? Blair "bm" <Darroch@aol.com> wrote in message news:%23sxVynBILHA.4120@TK2MSFTNGP02.phx.gbl... >I have been informed that 8 of the subjects I regularly used are no >longer available and I am left with 2 > This one and Windows XP security > What has happened to the others? > Blair > Subjects meaning newsgroups? They're being terminated on MS servers. Microsof...

Split Names
How can we split the last names to another cell ? Als if we have Mr & mrs john doe mike how can we split the title ? hi you might try using text to columns. on the menu bar.... Data>text to columns...>follow the wizzard. If your have a lot of names of varing lengths, with and without titles, you may have problems. you may need to group name of similar length and title so that you can parce the data in blocks or groups. otherwise you may have to do i one name at a time. also when parcing data, make sure you have enough blank cells to the right of the data you are parcing becau...

How to Retrieve Random Records through webservice method
Hi All, I'm writing an application that retrieve records from CRM 3.0 automatically using the web service. My question is, is there any way we can retrieve records randomly by using any webservice method? Like we can use RAND() or NEWID() to achieve this in SQL. Please help me if you have any idea about it. My sample method code is given below: CrmService service = new CrmService(); RetrieveMultipleRequest request2 = new RetrieveMultipleRequest(); // Create the ConditionExpression object. ConditionExpression condition = new ConditionExpression...

How to? Exch2003srv receive e-mail and forward to two mailbox
Hi, Our company uses Exchange 2003 Server. One of the VP made a special request that I have no idea how to deal with. Basically, he wants any e-mail send to him, internal or external, not only arrive in his exchange mail box, but also arrive at his secretary's exchange mail box, all done on the server side. So, one incoming e-mail send to two different exchange mail box at the same time. If any one can give me a pointer here and there, it would be greatly appreciate. Thanks. Sincerely, Jadefox Open the VP account properties in Active Directory Users and Computers and then go to th...

Random numbers
How exactly do i generate random numbers in VC++.Is the a function similar to random() in C++.Thanks in advance Fenn On Mon, 10 Jan 2005 22:42:52 -0800, "Fenn" <fenn_j@yahoo.com> wrote: >How exactly do i generate random numbers in VC++.Is the a function similar >to random() in C++.Thanks in advance > VC++i is C++. Besides, there is no random() in C++. I think you mean rand(). And besides, that is a C function imported into C++. In any event, generate random numbers in Visual C++ using MFC exactly the same way you would in any C++ (or C) program. .. yeah, b...

Split report for email
I have a report that contains a number of columns - and where the report contains a number of lines for a range of 'cost centres'. The report has the cost centre title in one column (say B) and then a number of rows below that containing various data in the columns. At the moment the cost centre is only identified in the first row of each cost centre's data. At a point the data for one cost centre is concluded and the next row contains the data for the next cost centre - and so on. How may I 'split' this report - I guess in to a workbook for each cost cent...

Sub-Form
I am developing an Access form to track my purchases. I have a form and a subform. I am doing the calculations on the subform and transferring the data to fields in the main form. I am calculating the Sub Total, Tax and Order Total. All that works fine. Now I want to add an Option Group on the main form for Tax or No Tax (Some entries are not taxable) and use the results from the Option Group selection in the calculation(s) for Tax and Order Total. The process seems easy but I keep coming up with "#NAME? or #ERROR. If anyone can point me in the right direction...Great!! If it is s...

how do I create a chart on two axes with stacked columns and line
I am struggling to create a chart that combines stacked columns on one X axis with a line on a second axis. Both are available as standard charts. Hi, Create the stacked column chart with all of your data. Select the data series you want to plot as a line on the secondary axis. Change the chart type to line. And then on the format dialog change the axis to the secondary. Cheers Andy ian ian ian rogers wrote: > I am struggling to create a chart that combines stacked columns on one X axis > with a line on a second axis. Both are available as standard charts. -- Andy Pope, Microso...