Large XML file opinions

Hi,

I have several large XML files (500-700 MB) that get updated once a day.  I 
have an app that will need to query these files (read-only operations are all 
I'll ever need).  Obviously, querying a file this large will be somewhat of a 
challenge because it can't be loaded into memory all at once.  Also, I am 
skeptical of the speed of XPath/XQuery.

I am wondering if I shouldn't just create a table in SQL Server, then create 
a routine that dumps the XML file's values into corresponding SQL Server 
columns.

More background info: my app will need to query this XML data fairly 
regularly, about every 5 minutes while the app is being used.  So speed and 
efficiency are crucial.

Any thoughts on a best implementation strategy would be much appreciated!

Thanks,
Marc.



Marc
2/8/2005 3:24:32 PM
dotnet.xml

Without running any actual benchmarks, I would say that putting the
described XML file into a database will be a more efficient solution for the
outlined problem.  I'm also assuming that the fields that you will be
searching on can be indexed.
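
As a rough illustration of that approach, here is a minimal sketch of the daily load step. It is written in Python with the standard library's sqlite3 module standing in for SQL Server, purely so the example is self-contained. The element names (records, record, name, value), the table layout and the load_records helper are all hypothetical, and the sketch assumes every record is a direct child of the document element.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_records(xml_source, conn, batch_size=1000):
    """Stream <record> elements into a table without loading the whole file.

    Assumes a hypothetical layout: <records><record id="..."><name/><value/></record>...</records>
    """
    conn.execute("DROP TABLE IF EXISTS records")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT, value REAL)"
    )
    insert = "INSERT INTO records (id, name, value) VALUES (?, ?, ?)"
    batch, total, root = [], 0, None
    for event, elem in ET.iterparse(xml_source, events=("start", "end")):
        if root is None:
            root = elem  # the first "start" event is the document element
        if event != "end" or elem.tag != "record":
            continue
        batch.append(
            (int(elem.get("id")), elem.findtext("name"), float(elem.findtext("value")))
        )
        total += 1
        root.clear()  # drop already-processed records so memory stays flat
        if len(batch) >= batch_size:
            conn.executemany(insert, batch)
            batch.clear()
    if batch:
        conn.executemany(insert, batch)
    # Index the column the application searches on.
    conn.execute("CREATE INDEX idx_records_name ON records (name)")
    conn.commit()
    return total


# Tiny in-memory stand-in for one of the large daily files.
sample = io.BytesIO(
    b"<records>"
    b"<record id='1'><name>alpha</name><value>1.5</value></record>"
    b"<record id='2'><name>beta</name><value>2.5</value></record>"
    b"<record id='3'><name>gamma</name><value>4.0</value></record>"
    b"</records>"
)
conn = sqlite3.connect(":memory:")
loaded = load_records(sample, conn)
row = conn.execute(
    "SELECT id, value FROM records WHERE name = ?", ("beta",)
).fetchone()
print(loaded, row)
```

The point of the design is that the expensive pass over the XML happens once a day, while the queries issued every few minutes hit an indexed column. On SQL Server the same shape would presumably use a streaming reader and a bulk insert instead.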

Richard Rosenheim

2/8/2005 5:41:24 PM
Marc Thompson wrote:

> I have several large XML files (500-700 MB) that get updated once a day.  I 
> have an app that will need to Query (read only operations is all I'll ever 
> need) these files.  Obviously, querying a file this large will be somewhat 
> of a challenge because it can't be loaded into memory all at once.  Also, I 
> am skeptical of the speed of Xpath/Xquery.
> 
> I am wondering if I shouldn't just create a table in SQL server, then create 
> a routine that dumps the XML file's values into corresponding SQL server 
> columns.
> 
> More background info, my app will need to query this XML data fairly 
> regularly, about every 5 minutes that the app is being used.  So speed and 
> efficiency are crucial.

Alternatively, you may want to take a look at XML data type columns in SQL 
Server 2005.
Otherwise you would need to load that XML into an XPathDocument (read-only, 
hence simpler, faster and smaller than XmlDocument) and query it. 
Optimize your queries (never use //, etc.; the descendant axis can force a 
search of the whole document).
If your queries are similar to one another, you might try XPath indexing using 
IndexingXPathNavigator from the Mvp.Xml library (http://mvp-xml.sf.net/common).
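
To illustrate the two query tips, here is a small sketch in Python using the standard library's ElementTree. It is not the XPathDocument or Mvp.Xml API, and the element names (shop, customers, customer, order, archive) are made up for the example. It contrasts a descendant search with an explicit path, then builds a lookup table once so that repeated, similar queries do not walk the tree again, which is roughly the idea behind indexing.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document; real data would come from the parsed file.
doc = ET.fromstring(
    "<shop>"
    "<customers>"
    "<customer id='c1'><order sku='A'/><order sku='B'/></customer>"
    "<customer id='c2'><order sku='A'/></customer>"
    "</customers>"
    "<archive><note>unrelated subtree</note></archive>"
    "</shop>"
)

# Descendant search: has to visit every node, including <archive>.
slow = doc.findall(".//order")

# Explicit path: only walks the branches that can contain a match.
fast = doc.findall("./customers/customer/order")

# Build the index once, keyed on the value the queries filter by.
orders_by_sku = defaultdict(list)
for customer in doc.findall("./customers/customer"):
    for order in customer.findall("./order"):
        orders_by_sku[order.get("sku")].append(customer.get("id"))

# Later lookups are dictionary hits rather than tree walks.
print(len(slow), len(fast), orders_by_sku["A"])
```

Both path expressions return the same orders here; the difference is how much of the tree has to be visited to find them, which matters far more on a document of several hundred megabytes.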

-- 
Oleg Tkachenko [XML MVP, MCP]
http://blog.tkachenko.com
Oleg
2/9/2005 11:03:47 AM