How to RANDOMLY split a whole dataset into two sub-dataset? #3

  Hi,

    At your possible convenience, might anyone please kindly answer my
question?

    How to RANDOMLY split a whole dataset (n=2000) into two sub
dataset (n=1000, n=1000) in SPSS or Excel?

   Thank you very much.

    Please take care

    Caroline

0
7/17/2007 5:33:17 AM
excel 39879 articles. 2 followers. Follow

3 Replies
475 Views

Similar Articles

[PageSpeed] 0

You have two answers in your other posting.

Pete

On Jul 17, 6:33 am, zencaroline <zencarol...@gmail.com> wrote:
>   Hi,
>
>     At your possible convenience, might anyone please kindly answer my
> question?
>
>     How to RANDOMLY split a whole dataset (n=2000) into two sub
> dataset (n=1000, n=1000) in SPSS or Excel?
>
>    Thank you very much.
>
>     Please take care
>
>     Caroline


0
pashurst (2576)
7/17/2007 8:32:02 AM
Transform --> Compute.

Create a new target variable (say "uni") then under numeric expression
type "univariate(1)". Press OK
This will have created a variable called "uni" that will be as long as
your data (n=2000) randomly assigning each one a value between 0 and
1. As it is a uniform distribution there should be n=1000 with a "uni"
score of <=0.5 and n=1000 for a "uni" score of > 1000.


If anyone has a better way i too would be interested

Cheers

DaveL


0
7/17/2007 8:45:50 AM
The routine I use is as follows:

Function Sample(IP_Data, Nsamp, nobs, Replace As Boolean, ObsNum As Boolean)
Dim i As Double, j As Double, k As Double, n As Double, skip As Boolean
Dim p As Double, Out(), Cnt As Double, pick As Double, Samp As Double

'******************************************************************
'**  Function returns a simple random sample from an input array.**
'**  Arguments:                                                  **
'**    IP_data = input data array                                **
'**    Nsamp = number of samples                                 **
'**    nobs = number of observations in each sample              **
'**    Replace = if TRUE, sample with replacement.  If FALSE     **
'**              sample is returned without replacement.         **
'**    ObsNum = if TRUE, the number of the observation is        **
'**             included in the first column                     **
'******************************************************************

    With Application
        n = UBound(IP_Data, 1)
        p = UBound(IP_Data, 2)
        Samp = 0
        
        If Replace Then
            'Sample with replacement
            If ObsNum Then
                'Include observation number column
                ReDim Out(1 To nobs * Nsamp, 1 To p + 1)
                For i = 1 To nobs
                    For j = 1 To Nsamp
                        Cnt = Cnt + 1
                        pick = .Ceiling(Rnd(1234) * n, 1)
                        Out(Cnt, 1) = pick
                        For k = 1 To p
                            Out(Cnt, k + 1) = IP_Data(pick, k)
                        Next k
                    Next j
                Next i
            Else
                'No observation number column
                ReDim Out(1 To nobs * Nsamp, 1 To p)
                For i = 1 To nobs
                    For j = 1 To Nsamp
                        Cnt = Cnt + 1
                        pick = .Ceiling(Rnd(1234) * n, 1)
                        For k = 1 To p
                            Out(Cnt, k) = IP_Data(pick, k)
                        Next k
                    Next j
                Next i
            End If
        Else
            'Sample without replacement
            If ObsNum Then
                'Include observation number column
                ReDim Out(1 To nobs * Nsamp, 1 To p + 1)
                For j = 1 To Nsamp
                    For i = 1 To nobs
                        Cnt = Cnt + 1
                        skip = False
                        pick = .Ceiling(Rnd(1234) * n, 1)
                        'Check for obs in sample
                        If Cnt > 1 Then
                            For k = 1 To Cnt
                                If pick = Out(k + Samp, 1) Then
                                    'Found repeat
                                    skip = True
                                    Cnt = Cnt - 1
                                    i = i - 1
                                    Exit For
                                End If
                            Next k
                        End If
                        If Not skip Then
                            Out(Cnt + Samp, 1) = pick
                            For k = 1 To p
                                Out(Cnt + Samp, k + 1) = IP_Data(pick, k)
                            Next k
                        End If
                    Next i
                    Samp = Samp + nobs
                    Cnt = 0
                Next j
            Else
                'No observation number column
                ReDim Out(1 To nobs * Nsamp, 1 To p)
                For j = 1 To Nsamp
                    For i = 1 To nobs
                        Cnt = Cnt + 1
                        skip = False
                        pick = .Ceiling(Rnd(1234) * n, 1)
                        'Check for obs in sample
                        If Cnt > 1 Then
                            For k = 1 To Cnt
                                If pick = Out(k + Samp, 1) Then
                                    'Found repeat
                                    skip = True
                                    Cnt = Cnt - 1
                                    i = i - 1
                                    Exit For
                                End If
                            Next k
                        End If
                        If Not skip Then
                            For k = 1 To p
                                Out(Cnt + Samp, k) = IP_Data(pick, k)
                            Next k
                        End If
                    Next i
                    Samp = Samp + nobs
                    Cnt = 0
                Next j
            End If
        End If
                
    End With

    Sample = Out
End Function


"DaveL" wrote:

> Transform --> Compute.
> 
> Create a new target variable (say "uni") then under numeric expression
> type "univariate(1)". Press OK
> This will have created a variable called "uni" that will be as long as
> your data (n=2000) randomly assigning each one a value between 0 and
> 1. As it is a uniform distribution there should be n=1000 with a "uni"
> score of <=0.5 and n=1000 for a "uni" score of > 1000.
> 
> 
> If anyone has a better way i too would be interested
> 
> Cheers
> 
> DaveL
> 
> 
> 
0
Chad9044 (93)
7/20/2007 2:38:30 PM
Reply:

Similar Artilces:

Syntax for Refreshing Control on Sub-Subform
I'm close, but get no cigar... Can anybody please tell me the proper syntax for refreshing or requerying a combobox on a subform that is embedded within another subform from another subform on the mainform? I'm trying to set it up (in the AfterUpdate event property of the Chrgs field on Subform1) so that changes made to the Chrgs field on subform1 causes the cboLineItems combobox on Sub-Subform2 to be refreshed or requeried, so that a calculated textbox on the same sub-subform can be recalculated? I know that's a mouthful... If you need further clarification, I will de...

ESEUTIL #3
My priv.edb file is on the e: drive and there's not enough available diskspace to run the ESEUTIL defrag utility. However, I have more than enough diskspace on the d: drive. Is it possible to specify a different drive as the temporary location that the ESEUTIL defrag utility needs? Thank you in advance, Bill Yes, you can specify the temporary folder in another drive or directory. Check the parameters in the following link, http://support.microsoft.com/?id=192185 -- This posting is provided "AS IS" with no warranties, and confers no rights. Use of included script samp...

comparing two columns
I need to compare cells 2 columns and return a value if different. e.g. A1: 2002 B:2002 A2: 2002 B:2003 A3: 2002 B:2002 Need to identify row 2 is different (with an x and a color?) I have about 1000 rows to compare. I tried some of the formulas listed in this section e.g. =if(countif(a:a,a25)>1,a25,"") but didn't return what i wanted. Paste this formula in C1 cell =IF(A1=B1,"Both Are Same","Mismatch") Now copy the C1 cell and paste it to the remaining cells of C Column depends upon your A & B Column Data. For Applying the Co...

Counting specific text in two columns.
I have two columns (a and B) which contain the following:- Column A:- Not_Chased, Regret, Stockist, Ordered, Chased, Pending, Lost, Opp_Log Column B:- List of countries (e.g. China, U.K. Netherlands, Germany, Spain, Italy, Czech). I would like to produce a matrix as following showing say Not-Chased for all the individual countries, Regret for all countries, Stockist for all countries… (see below) China Czech France Germany India Grand Totals LOST Not_Chased OPP_LOG ORDERED PENDING REGRET STOCKIST ...

asking about retrieving 3 levels of heirarchy data from xml
I want to ask some question about xml and xPath This is the Heirarchy Data of XML File <OC> CEO Manager Employee </OC> Under the Employee Node - it may be a lot of sub nodes (that node has also child node) I want to retrieve the first time 3 levels of nodes. I use .NET XmlDocument.SelectNodes("//CEO") And I iterate the NodeList for 3 levels and add to TreeControl. After that I click the third level of TreeControl. At that time I need to recreate my TreeControl with new values. (using the selected node info and to make a new XPath Query)...

Random
How to get random numbers random(0xff) ? int RandomNumber = (rand() * 0xff) / RAND_MAX; -- Bjarne Nielsen "]GHO[" <yousefk@taux01.nsc.com> skrev i en meddelelse news:0b6a01c3a84c$5733a890$a501280a@phx.gbl... > How to get random numbers random(0xff) ? rnd()%0x100; or rnd() & 0xff; Ruben On Tue, 11 Nov 2003 04:07:13 -0800, "]GHO[" <yousefk@taux01.nsc.com> wrote: >How to get random numbers random(0xff) ? ...

Passwords #3
I have a summary roll up file that is linked to about 15 other workbooks that have password protection on them. Because of the passwords when I pull up the linked roll up file it asked for ALL workbook passwords (15) in all. Is there a VBA statement that I can add to my statement that calls the rollup file to not recognize the passwords??? Thanks Frank Take a look at Worbooks.Open in Help, specifically the UpdateLinks argument. In article <564901c376df$b58d8080$a601280a@phx.gbl>, "Frank B" <fborger@yahoo.com> wrote: > I have a summary roll up file that is ...

split text in cell into 2 #2
I am needing help with formula. I have a cell that takes values from several other cells to create one long line text (for a command used in other app). For ex: a1=jack b1=ran c1="There was a person named"&a1&"who was hungry and"&b1&" to Mcdonalds." This process works great except when the line is greater than 72. I need an if statment that says if c1 length is greater than 72 then to split / replace line AFTER name(a1). Where the split occurs a +, and carriage return needs to be added. So in the example above (pretend length will be over 72,...

combining two woeksheets into one
Hi, Is it possible to combine two work sheets into one? ex: I have first.xls, second.xls firsr.xls has columns aa,bb,cc second.xls has columns aa,bb,cc,dd Both sheets has different data. I want to see them in a single sheet at once side by side and should be able to compare. appreciate your help. Thanks The easiest thing to do, is to copy/paste from second.xls into ff, gg, hh. Or.... if you have Excel 2003, you can go to Window-->compare side by side with or.... Use helper columns to do the comparison for you. In first.xls go to column ff. Use the following formula =EXACT(AA1,...

Can We have two separate email Icons 03-22-10
Can I set up two email icons on my desktop? One that checks my email address and one that checks my sons. Thamks, Mike No. They would both open the same instance of OE. The only way to be completely separate is to have two Windows logons. -- Bruce Hagen MS-MVP [Mail] Imperial Beach, CA "amdx" <amdx@knology.net> wrote in message news:512b5$4ba817b2$18ec6dd7$4464@KNOLOGY.NET... > Can I set up two email icons on my desktop? > One that checks my email address and one that checks my sons. > &...

Outlook SLow #3
ANyone know why Outlook constantly freezes for about 10 seconds at random times? Will service pack 1 fix this?? >-----Original Message----- >ANyone know why Outlook constantly freezes for about 10 >seconds at random times? Will service pack 1 fix this?? >. >believe it or not, spyware may be causing the problem. Try running a program to delete any spyware on the system. This may be one reason. ...

Purchasing Excel eBook 3 Pak
I would like to buy the Excel eBook 3 Pak in a downloadable form and al that is listed on the web site is a 3 Pak in CD form. I would think it is more cost efficient to deliver in an electroni form. Is there a way to purchase the 3 Pak in an electronic form? Please advise -- Ahmed V. Ismai ----------------------------------------------------------------------- Ahmed V. Ismail's Profile: http://www.excelforum.com/member.php?action=getinfo&userid=911 View this thread: http://www.excelforum.com/showthread.php?threadid=50044 You are looking at advertising on a commercial site and ar...

Splitting a workbook
Is there an easier way for splitting a workbook into 2 smaller workbooks than copying the workbook to 2 separate files and deleting the appropriate worksheets from each? Thanks in advance. -- Tony Vella in Ottawa, Canada http://www.datapigtechnologies.com/freeware.htm Tony Vella wrote: > Is there an easier way for splitting a workbook into 2 smaller workbooks > than copying the workbook to 2 separate files and deleting the appropriate > worksheets from each? Thanks in advance. > -- > Tony Vella in Ottawa, Canada Why do you want to split the file? If you do you'll h...

DEFINING RELATIONSHIP BETWEEN TWO or more TABLES
I am often required to design queries involving two or more tables andlinking them together. However, in linking them together, i am oftenforced to define the relationship (join properties) between thesetables and I often get it wrong. As a result, i get duplicate recordsall the time. Can anyone explain to me how does one-to-many and many-to-many relationship really works in Access queries and how are theybeing retrieved. And how can i easily identify and fix this probem.Plese give some examples.Thanks,Pago On Mar 27, 8:23 am, pago_b...@yahoo.com.au wrote:> I am often required to design ...

Random Numbers not Random
When I create random numbers for an application with the following code, they do not appear to be truely random. Me.RandomNumber = Int(9 * Rnd + 1) ' Random Number 1-9 If I have two users setting side by side, they frequently get the same results when they start up Am I missing somthing? Any help would be appreciated -- Frank Wagner fwagner111@aol.com Frank, Have you tried a Randomize statement just before the RND? That should do it. -- hth Al Campagna Microsoft Access MVP http://home.comcast.net/~cccsolutions/index.html "Find a job tha...

3 Implementation questions on one.
1. What are the rules with your CRM org. name? We have an existing Exchange org; must they match? Must the org name used in the license request match the CRM org name? 2. We would like to setup a CRM staging enviornmet. How should we do this? What are the limitaitons and requirements? 3. We would like to setup CRM so that when you enter in a lead, based on the zip code of the lead, they are automatically assigned to a sales person that covers that zip. We have a spreadsheet with all our sales people and the zips in their territory. How can we make CRM do this? Thanks, Malcolm 3...

Worksheet protection #3
I want to lock all of the cells in all worksheets that have formula's. I have 26 pages and the formula's are in the same cells on each page. I want to protect the formula's but the only way I can find to do this is by protecting one page at a time. So everytime I need to change a formula, I have to unprotect each page, change the formula, then protect each page again. Is there a way to protect all of the pages at one time? When I try to use the protection for the workbook, it doesn't seem to do anything. Jeannine Wrote: > I want to lock all of the cells in all wor...

julian date #3
Is there a formula or line of code that, in a date cell, would automatically convert a date entered in the 07/16/04 format into the 04188 julian format? thank you. You can't have a formula in a cell and make an entry in it - the formula is overwritten. You can use an event macro: Put this in the worksheet code module (right-click the worksheet tab and choose View Code): Private Sub Worksheet_Change(ByVal Target As Excel.Range) Dim nYear As Long Dim nDays As Long With Target If .Count > 1 Then Exit Sub If Not Intersect(.Cells, Ra...

Outlook and Service Pack 3
Office XP service pack 3 is interfering somehow with Outlook 2002 functionality. Following installation of service pack 3, every time I am sending an email a warning pops up stating "A program is trying to access e- mail addresses you have stored in Outlook. Do you wish to allow this?" Then I have to choose yes or allow access for one minute. A virus is not present on my computer. I have offloaded and reloaded Office XP and it functions fine until I load service pack 3. I have gone through all settings but cannot find a fix. PLEASE advise what to do with this irritati...

Exchange server randomly asking random users to Authenticate randomly...
Has anyone had a problem like this where a user is in their Exchange mailbox using Outlook 2003 and the server prompts them for their credentials? If they hit cancel and exit out of Outlook and run it again it lets them into their mailbox. Thank you, Benjamin Pls make sure the outlook 2003 Client does not using Cache mode If is ,pls clearn this option -- Jammy "Benjamin Chan" <bchan[pleasenospam]@controlproductsinc.com> ���g��l��s�D :#BxdizrtEHA.3972@TK2MSFTNGP15.phx.gbl... > Has anyone had a problem like this where a user is in their Exchange mailbox > using Outlo...

Payroll Security #3
I have a 2 part question: My company is a transportation company, and we need to pay personnel in many different ways. Some of our drivers get paid by miles/stops/etc. So there is a need for us to be able to allow managers process these pay codes thgrough Business Portal. 1. Is there a way that business portal can accept more than 24 units of payroll measurement per day? Or is 24 a hard coded limit, because it is the maximum number of hours someone could potentially be paid in a single day? 2. Because we have not been able to figure out Q1, we have a department manager set up to en...

Placing dataset into XPathDocument
I have a merged dataset that contains xml read from SQL Server. I need to place the data into an XPathDocument. I can do the following: mydataset.writeXML("mydata.xml") dim xpdoc as new XPathDocument("mydata.xml") Problem is it seem rather redundent to write data currently in memory to disk in order to be read on the next line. According to the documentation the writeXML method supports writing to System.IO.Stream and the XPathDocument supports load from System.IO.Stream but I can not seem to get this to work. Any suggestions? Cheers Keith Keith Chadwick wrote: &...

Generating Correlated Random Values in Excel
Hi Mike, Thanks so much for your advice. I think that I have it figured out, but can you confirm this for me? I can't afford to screw this up. If mean(inflation) = .031 stdev(inflation) = .047 mean(stock return) = .106 stdev(stock return) = .204 Column Headers: A B C D 1 Z1 Z2 Inflation Stock Return Data Generation Formulas A2 =NORMINV(RAND(),0,1) B2 =NORMINV(RAND(),0,1) C2 =0.031+0.047*A2 D2 =0.106+0.204*(A2*rho+B2*(1-rho^2)^0.5) Are these formulas correct to generate the random data? Particularly cell D2. Does this give me a complete value that t...

random pages
I have developed pages that randomly define a set of normal lung function data based on gender, age and height. I have applied formulas to this data simulate 5 classic lung function abnormalities (normal, restrictive, obstructive, mixed and early obstructive.) Currently the sheets that define the abnormal are named by one of the above lung abnormalities. I would like to present data from random pages in a work book to students and rather than have the pages named by the abnormality be named a simulated patient name. My questions. Can I have the name of a sheet set to equal data from to ...

outlook reminder #3
My computer's default reminder is set at 15 minutes, it keeps changing the reminder to 18 hours. Is this a glitch in the program? Have you heard this before? This problem makes the reminder useless, HELP! The 18 hour reminder is the default for all day events. If you created an appointment that wasn't an all day event it should default to 15 minutes. Does it? -- Patricia Cardoza Outlook MVP www.cardozasolutions.com Author, Special Edition Using Microsoft Outlook 2003 ***Please post all replies to the newsgroups*** "eha" <anonymous@discussions.microsoft.com> wr...