table design question

I have table of 40 milion rows (sql 2005). There is one column, Id, which is 
integer (basically month), and there are 20 different values for this column 
(around 2 million rows per id value). At the same time, there is very long 
primary key on this table (12 columns, around 400 bytes), and nonclustered 
index on this primary key. Table is being accessed frequently by Id, so I 
created clustered index on this column (usually, developers create temporary 
tables per id, and then do further processing) to lower physical reads...

Does this sound like a desirable design? I don't want to have long clustered 
key, but at the same time it is not selective. I thought of having surogate 
column (identity) beside id to get selectivity of clustered (and unique in 
this case) key: id,<new_identity_column>, but then, what would i really get 
by doing this?

Thanks in advance,
Pedja
0
Utf
3/2/2010 3:20:01 PM
sqlserver.server 1327 articles. 0 followers. Follow

5 Replies
750 Views

Similar Articles

[PageSpeed] 40

Have you considered partitioning the table on Id?

"Pedja" <Pedja@discussions.microsoft.com> wrote in message 
news:7FF13B6D-5A98-4E7D-BC9E-5F0C3B264330@microsoft.com...
>I have table of 40 milion rows (sql 2005). There is one column, Id, which 
>is
> integer (basically month), and there are 20 different values for this 
> column
> (around 2 million rows per id value). At the same time, there is very long
> primary key on this table (12 columns, around 400 bytes), and nonclustered
> index on this primary key. Table is being accessed frequently by Id, so I
> created clustered index on this column (usually, developers create 
> temporary
> tables per id, and then do further processing) to lower physical reads...
>
> Does this sound like a desirable design? I don't want to have long 
> clustered
> key, but at the same time it is not selective. I thought of having 
> surogate
> column (identity) beside id to get selectivity of clustered (and unique in
> this case) key: id,<new_identity_column>, but then, what would i really 
> get
> by doing this?
>
> Thanks in advance,
> Pedja 


0
Jay
3/2/2010 4:14:36 PM
It's difficult to answer to this question without viewing the main 
queries that are based on...

Personnally I will try to place an IDENTITY columns wich type is based 
over the length of the processor's word (32 ou 64 bits).
Then making this colums as the clustered index.
If the main part of the queries are based upon the actual primary key 
(400 bytes), I will try to reduce the effort of searchnig by adding a 
persisted calculated colums wich checksum the hole components of your 
key. Then add this colum as the first colums of the primary key's index 
and add this part of search in all the queries you have....

A +

Pedja a écrit :
> I have table of 40 milion rows (sql 2005). There is one column, Id, which is 
> integer (basically month), and there are 20 different values for this column 
> (around 2 million rows per id value). At the same time, there is very long 
> primary key on this table (12 columns, around 400 bytes), and nonclustered 
> index on this primary key. Table is being accessed frequently by Id, so I 
> created clustered index on this column (usually, developers create temporary 
> tables per id, and then do further processing) to lower physical reads...
> 
> Does this sound like a desirable design? I don't want to have long clustered 
> key, but at the same time it is not selective. I thought of having surogate 
> column (identity) beside id to get selectivity of clustered (and unique in 
> this case) key: id,<new_identity_column>, but then, what would i really get 
> by doing this?
> 
> Thanks in advance,
> Pedja


-- 
Frédéric BROUARD, MVP SQL Server, expert bases de données et langage SQL
Le site sur le langage SQL et les SGBDR  :  http://sqlpro.developpez.com
Audit, conseil, expertise, formation, modélisation, tuning, optimisation
Enseignant aux Arts & Métiers PACA et à L'ISEN Toulon - Var Technologies
*********************** http://www.sqlspot.com *************************
0
Fred
3/2/2010 4:36:55 PM
A couple of thoughts.  Keep in mind that a table is physically organized by 
the clustered index.  Think Order By.  This is why you can only have one 
clustered index per table. In case of duplicates Sql Sever adds an additional 
tail in order to make each index entry unique.  Having a clustered index with 
20 unique values for ID across 40 million rows probably makes little sense.  
What makes a really good clustered index is an ever increasing value like 
transaction datetime so that all new entries are added to back this avoiding 
major sorting and page splits. Real world cases are not always so clean.  I 
would rethink your indexing choice and, as always test, test, test.  
Obviously with such a large table you must make your decisions very 
carefully.  How many inserts/updates/deletes?  vs. how  many reads?  At times 
it makes sense to have a large table be a heap(table without a clustered 
index) but this needs to be proven out and not a default position.
I saw Jay’s comment on partitioning which might be a good option assuming 
you have the Enterprise version of SS2005.

Good luck!

"Pedja" wrote:

> I have table of 40 milion rows (sql 2005). There is one column, Id, which is 
> integer (basically month), and there are 20 different values for this column 
> (around 2 million rows per id value). At the same time, there is very long 
> primary key on this table (12 columns, around 400 bytes), and nonclustered 
> index on this primary key. Table is being accessed frequently by Id, so I 
> created clustered index on this column (usually, developers create temporary 
> tables per id, and then do further processing) to lower physical reads...
> 
> Does this sound like a desirable design? I don't want to have long clustered 
> key, but at the same time it is not selective. I thought of having surogate 
> column (identity) beside id to get selectivity of clustered (and unique in 
> this case) key: id,<new_identity_column>, but then, what would i really get 
> by doing this?
> 
> Thanks in advance,
> Pedja
0
Utf
3/2/2010 5:06:01 PM
Pedja wrote:
> 
> I have table of 40 milion rows (sql 2005). There is one column, Id, which is
> integer (basically month), and there are 20 different values for this column
> (around 2 million rows per id value). At the same time, there is very long
> primary key on this table (12 columns, around 400 bytes), and nonclustered
> index on this primary key. Table is being accessed frequently by Id, so I
> created clustered index on this column (usually, developers create temporary
> tables per id, and then do further processing) to lower physical reads...
> 
> Does this sound like a desirable design? I don't want to have long clustered
> key, but at the same time it is not selective. I thought of having surogate
> column (identity) beside id to get selectivity of clustered (and unique in
> this case) key: id,<new_identity_column>, but then, what would i really get
> by doing this?
> 
> Thanks in advance,
> Pedja

Pedja,

Your naming of the column Id is very confusing to me, because Id is
short for Identifier and suggests that each Id uniquely identifies a
row. Apparently not in your case. So to avoid any confusing, I will just
refer to that columns as "month".

If many queries are done on month (and not much else), then using it as
the clustered index could be a good choice.

Two facts about clustered indexes that you may or may not know:
- the clustered index key is stored with each nonclustered index key. So
the size of the clustered index has an immediate effect on the
nonclustered index size
- any nonunique value in the clustered index gets an invisible sequence
number (an integer) that combined with the index key makes for a unique
value. This is sometimes called a uniquefier

If your clustered index is on month, then that index definition will
only add 8 bytes to each nonclustered index (2 ints).

A 12 column primary key of 400 bytes per key sounds ridiculous. I don't
think such a primary key is very practical. I wonder if the table is
properly normalized. Anyway, you definitely don't want to have that as
the clustered index if you also want to use any nonclustered indexes.
For exact matches of the primary key, the clustering of the primary
key's index is not very important. For range scans, that is another
issue.

Of course you could consider a suggorate key. But before you are
considering that, you first might want to check whether the table is
properly normalized, because it doesn't feel right.

-- 
Gert-Jan
0
Gert
3/2/2010 9:54:07 PM
Table is denormalized, it is datamart. Table gets loaded monthly as part of 
calculation, and after that different reports and processing cubes runs from 
it.

Thanks to all of you

"Gert-Jan Strik" wrote:

> Pedja wrote:
> > 
> > I have table of 40 milion rows (sql 2005). There is one column, Id, which is
> > integer (basically month), and there are 20 different values for this column
> > (around 2 million rows per id value). At the same time, there is very long
> > primary key on this table (12 columns, around 400 bytes), and nonclustered
> > index on this primary key. Table is being accessed frequently by Id, so I
> > created clustered index on this column (usually, developers create temporary
> > tables per id, and then do further processing) to lower physical reads...
> > 
> > Does this sound like a desirable design? I don't want to have long clustered
> > key, but at the same time it is not selective. I thought of having surogate
> > column (identity) beside id to get selectivity of clustered (and unique in
> > this case) key: id,<new_identity_column>, but then, what would i really get
> > by doing this?
> > 
> > Thanks in advance,
> > Pedja
> 
> Pedja,
> 
> Your naming of the column Id is very confusing to me, because Id is
> short for Identifier and suggests that each Id uniquely identifies a
> row. Apparently not in your case. So to avoid any confusing, I will just
> refer to that columns as "month".
> 
> If many queries are done on month (and not much else), then using it as
> the clustered index could be a good choice.
> 
> Two facts about clustered indexes that you may or may not know:
> - the clustered index key is stored with each nonclustered index key. So
> the size of the clustered index has an immediate effect on the
> nonclustered index size
> - any nonunique value in the clustered index gets an invisible sequence
> number (an integer) that combined with the index key makes for a unique
> value. This is sometimes called a uniquefier
> 
> If your clustered index is on month, then that index definition will
> only add 8 bytes to each nonclustered index (2 ints).
> 
> A 12 column primary key of 400 bytes per key sounds ridiculous. I don't
> think such a primary key is very practical. I wonder if the table is
> properly normalized. Anyway, you definitely don't want to have that as
> the clustered index if you also want to use any nonclustered indexes.
> For exact matches of the primary key, the clustering of the primary
> key's index is not very important. For range scans, that is another
> issue.
> 
> Of course you could consider a suggorate key. But before you are
> considering that, you first might want to check whether the table is
> properly normalized, because it doesn't feel right.
> 
> -- 
> Gert-Jan
> .
> 
0
Utf
3/2/2010 10:47:02 PM
Reply:

Similar Artilces:

Data table in chart?
This is a multi-part message in MIME format. ------=_NextPart_000_01A9_01C69782.28506D60 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable I have some line chart portraying responses to 20 items. I would like to put a table under each chart the shows the value of not = only the responses that are charted but also the percentage difference = between the two. My problem and the question is that if I use the 'data table' option in = creating the chart the row with the difference is not there. Then if I = try to paste a table under ...

Combobox for Table, From Table
Hi, The online help function tells me I cant make a combobox which finds its entries from a table and stores its entries in the same table. I want users to be able to enter areas where clients live. The combobox needs to fill the Area column in the appropriate table. But I want the combobox to be filled with previous entries from the same column, with the provision that a new entry can be made. The project is expanding thats why I need this. Or do I need to make an extra table with areas that can be filled with a separate command, which is then used as a source for the com...

Filter Question!
I have a table [tblAccountStatus] that has around 6 fields for receiving payments 2 of the fields are [ClientID] (Number) and [ModeOfPayment] (Text) I have created a query using these to fields , made a combo box using this query ,put it on my form and made Control Source [ModeOfPayment] Now what I would like to do is when I enter the ClientID on to my form the query will only show [ModeOfPayment] that has been entered with that [ClientID] Hence when I select Mr Tom Cruise , I will get a selection my from my Combo Box as the names of the checks I have entered under his ClientID Hope thi...

Copy data from on table to another ?
I have lost a lot of data in my backend, but it is only data from a few columns, in one tabel, that I have lost. My tabel is called taCustemor, and the columns in that table is called fakFirm, fakAdress. I have use a custemornumber as a primarykey. So now I want to copy only these columns from my backup of the backend to my backend in use. I'm sure that I can do this buy using SQL, but I'm not sure how to make this querie. Can some one out there help me ? You want to use an append query. See Help file for info. In the database where you want the data to be when you're done, ...

pivot table again
i have a range of fields in a pivot table some are qty others are value how can i split the table into two seperate sections one showing qty and the other showing values (if that makes sense) If your quantities and values are in the same column, you'd need to use a "helper column" that differetiates them (Qty vs Amt) Then you can include that field in the pivot table before tha field containing the quantities and amounts. Does that help? *********** Regards, Ron XL2002, WinXP "max power" wrote: > i have a range of fields in a pivot table some are qty other...

Landscape Table on Portrait Page
Is it possible to create a landscape table on a portrait page, i.e. a page with a normal portrait header and footer? I am aware that one can change the text direction of each cell in the table to simulate a landscape table, but I would like to know if there is not an easier way. See http://word.mvps.org/FAQs/Formatting/LandscapeSection.htm -- Suzanne S. Barnhill Microsoft MVP (Word) Words into Type Fairhope, Alabama USA http://word.mvps.org "Johann Swart" <JohannSwart@discussions.microsoft.com> wrote in message news:1119FBC3-2001-48C7-B02E-07D85C7F444E@mic...

Where is tables in reports
Is there anyway to get tables into reports? If there is none then is there a work around. The forms I have to make use many block cells. In word tables makes things easy. ...

Ensuring only one commission per product in Access Table
Good afternoon, Please can someone help me. In my Access Database I have a table called tblCommission, with the fields: Commission_ID;Client_account; Product_code; Commission; Exchange (£ or $) I need a method to ensure that each client can have a commission set for every product - but that a client cant have to commission's for one product: eg: I need a method to prevent this: Comm_ID Client_account Product_code Commission Exchange 1 000001 Coke 3 $ 2 000001 Coke ...

How do you create a table with a shaded header in Word 2007 or 201
Hi.. Can anybody tell me, how you create a table, and then shade the first row in it to something like gray, then set the grid lines to gray or something, and maybe set the alignment in it to centered... This is 2007 or 2010, not 2003 in which it was possible to do so. Thanks. Select the first row of the table and then right click with the mouse and select Borders and shading. For the alignment, select and right click and select Cell alignment -- Hope this helps. Please reply to the newsgroup unless you wish to avail yourself of my services on a paid consulting ba...

Recreating HR tables
I've been having some problems with HR. I've had to create some tables so the program would run. I simply copied the tables from another compagnie where is was saying I had missing tables. I'm wondering if there is a list of Tables for HR and Canadian Payroll or if it were possible to recreate these missing tables another way? I currently have this error as well: Could not find stored procedure 'Compagnie.dbo.hrCompDoesHrCompExist' ...

Table Expert
When a case is resolved, there is a dropdown to select the amount of "Billable Time" for the case. I have searched tables but cannot find where this data is stored. Is there anyone that know the schema where this data is kept? Bryon, When you resolve a case, the resolution screen fields are added into the activity table as a resolution type of activity. The Billable Time value appears in the TimeSpent field. (While building my auditing application, VAST, I've had to become much more of a CRM schema expert that I ever wanted to be - by the way, you can see these reso...

Pivot table subtotals 100% #2
This is what I am aiming for but how can I get 100% in subtotals for first year and the increase %age for the second and subsequent year please 2002 2003 Apples excellent 15 22.72% 16 24.23% good 21 31.82% 25 37.88% poor 30 46.46% 39 60.39% 66 100% 80 122.5% Greens excellent 12 17.65% 37 54.42% good 27 39.70% 51 74.98% poor 29 42.65% 47 69.12% ...

Table info
In Vendor Maintenance window , there are multiple Address ID's. In which table i can find this information. Great Plains 7.5 Thanks in advance Deepwater, PM00300 holds all the address information for each address ID. -- Victoria Yudin Dynamics GP MVP Flexible Solutions - home of GP Reports http://www.flex-solutions.com/gpreports.html "Deepwater" <Deepwater@discussions.microsoft.com> wrote in message news:F9296B22-2B44-4D64-9E3C-DC5FE4B07028@microsoft.com... > In Vendor Maintenance window , there are multiple Address ID's. In which > table i can find this...

Extender tables #2
Hello! We are implementing an integration from Microsoft CRM into Microsoft Great Plains and this would include writing CRM fields to Extender Fields. Are there any available SDK's for Extender or maybe any table lists? Thanks, Andi Saldana There are but I think only partners can get them. Please send me a message and I'll send you what I have. "Andi Saldana" wrote: > Hello! > > We are implementing an integration from Microsoft CRM into Microsoft Great > Plains and this would include writing CRM fields to Extender Fields. Are > there any avai...

Pivot Table Axis format
Hi. I have a macro that creates a pivot-table and pivot chart. The X-axis is a date. In the raw data, the date is formated as month/day. This is the way I want it on the chart. However, the X-axis formats with month/day/year. How can I force this to display correctly? Thanks, Mike. Do you see a grey little dropdown under the X axis (where you can choose your dates to show)? If yes, then rightclick on it. Select the top option: "Format pivotchart field" then click on the Number button on the next dialog. There are a bunch of date formats that you can choose from. Maybe...

Inserting table to other table
I have a form which consists of a table. I want to insert some selected value into other table when i update and at the same time I want it to save into my own table too... May I know the way please..... -- Message posted via AccessMonster.com http://www.accessmonster.com/Uwe/Forums.aspx/access-formscoding/200704/1 I'm having trouble visualizing your situation. Could you provide an example of data? -- Regards Jeff Boyce Microsoft Office/Access MVP http://mvp.support.microsoft.com/ Microsoft IT Academy Program Mentor http://microsoftitacademy.com/ Microsoft Registered Partner http...

If Formula Question #2
Hi There I am struggling with what appears to be an easy formula. Here is what I am trying to do: If the values of cell vale of B4 begins with either 20 or 30 then the (B20*0.5) Basically if a part number is for example 20-504 or 30-555 then we need to give it 50% discount. We have many part numbers beginning with 20 or 30 that need 50% discount. Any help or advice offered will be most welcome. Many thanks in advance. -- Kind Regards Ady Use the following: =IF(OR(LEFT(B1,2)="20",LEFT(B1,2)="30"),C1*0.5,C1) "Ady" <ady@nospam.com> wrote in me...

table to new table
hi thanks for your time and help i have an access table which is an automatic update so its data is constantly changing but the fields stay the same. i need to be able to transfer the data at the click of a button from the linked table into an access table without deleting the data already stored in the main table. i have tried looking at the transfer spreadsheet but this does not work. thanks for the help hi, Rivers wrote: > i have an access table which is an automatic update so its data is > constantly changing but the fields stay the same. > i need to be able to transfer...

Pivot Table not sorting correctly
In column A I have several of the same names listed more than once spelled exactly the same but when these line are pulled into a pivot table each is displayed seperately. I followed the same exact procedure for this table as I have for the others that I displayed correctly. What causes those lines to be displayed on separate lines? Perhaps some of the entries have space characters at the end, and others don't. Marian wrote: > In column A I have several of the same names listed more than once spelled > exactly the same but when these line are pulled into a pivot table each ...

Relay question #2
Hello, I will appreciate your help in the next problem: Exchange server 2003. The domain name is abc.com and i created another domain for some user accounts: xyz.com with the recipient policies. All ok. I have users that belongs to both companys and they needs to send mails from both domains. For one user I configured account abc.com as exchange account and then the xyz.com as pop account. I used outlook 2003. I allowed the rely permission for this user and his pc. The problem start when i send mail from the xyz.com account to an external account. At destination, the mail arrives as if it...

Pivote table adding Fileds
Dear All Happy new year I had creat pivote table on table that was exported from Access (it was exported from query that based on relation beteen tabled ) ,if i make refreshing data on the same filds the new data will be refrexhed withot any problem , but when i add new filds in the query and exported again (with the same name and the same orginal fields but with one or two more additional fileds ) if i make refresh he can get and match the old filds but not the new filds , so i can not add this filds to the pivote table unless i restart the pivote table from the start , MY Question is , is...

TAble Insert
I have a person table that has id, firstname, lastname, address, city, state, zip. I have set all fields except id to have required = no and allow zero length = yes. But when I try the following query it will not insert the record because the firstname is not filled out. Any ideas? INSERT INTO person ( id, lastname, firstname, address, city, state, zip ) VALUES (1783, 'Hacienda La Daniela', '', '5310 SW 192 Avenue', 'Sw Ranches', 'Fl', '33331'); Try inserting Null instead of a zero-length string: INSERT INTO ... VALUES (1783, 'Hacienda...

DelayLoad questions
I am tryng to delay load a DLL and have a few questions hopefully somebody can answer. If I correctly setup a DLL to be delay loaded would it be required to be in my PATH on startup even though I am not invoking any functions defined in the DLL at startup time. Is there some tool I can use (maybe process monitor or something along that line) that will show me that my DLL has not been loaded even though my program is running. Lastly, is there a preferred method to delay loading a DLL. Is it better to to do it programatically or through options in Visual Studio? Thanks in advance for all tha...

Loading Table with Associated Foreign Table
Hope I've expressed my question correctly in the title. I have two separate tables. Table1: fldIDRepairName << AutoNumber and Primary Key fldRepairName << Index, Unique fldIDGroupName << Link to Foreign Table Table 2 ( I call this Foreign): fldIDGroupName <<AutoNumber and Primary Key fldGroupName ================== If trying to load an alphabetical list of the "fldRepairName" in Table 1: and at the same time get the "fldGroupName" into a second list box?? =================== If I loop "Tabl...

Word and Publisher General Questions
the window keeps on moving and blinking as i choose a document to view making it difficult for me to edit the existing document or compose a new one. Have you tried to update your video/graphics driver? Read the third FAQ here. Why can I not see images/shapes/lines when editing my publication? http://ed.mvps.org/Static.aspx?=Publisher/FAQs You might check Accessibility options in the Control Panel. Turn off high contrast if it is enabled. If you are using Vista, the folder is called Ease of Access Center. -- Mary Sauer MSFT MVP http://office.microsoft.com/ http://msauer.mvps.org/ news:/...