blue       earthtones
 
Emitting Well-Formed XML from Excel
Discussion: I've actually been amazed by the volume of technical and scientific data, all of a non-statistal or numeric character, that has been compiled in Excel workbooks over time. In 2008 I was handed an extensive mineral database (geology) which converged data seldom found in other sources, and even more seldom searchable on the internet other than by gleaning individual tidbits from a wide range of textbooks and specialized sources. This data had been compiled in an Excel workbook, and after a brief analysis, I concluded that in order to present this on the web using AJAX, the first step was to emit the entire dataset from the workbook as an XML database, and proceed from there.

Solution: It's a relatively straightforward process to emit well-formed XML directly from an Excel workbook using a VB module. There are, as you might expect, a number of VB classes and methods for working with XML, but for a task such as this, it's more expedient simply to write the small amount of code required.

That being said, bear in mind that canonical XML uses UTF-8 as the character encoding while Excel uses UTF-16 internally. Consequently it's probable with certain datasets that you'll emit characters including some scientific notation, Greek character entities, and vulgar fractions that cannot be normalized in UTF-8. You can check for this condition by viewing the emitted file with the built-in XML viewers of IE7, IE8, Firefox, and Opera (Safari and Chrome's xml views are too primitive to discuss here). Symptoms include a terminating error in the XML Viewer (IE7 and IE8), or the substituted character (Firefox) if UTF-8 is conflicted. The simplest solution in this event is to set the xml header to encoding="utf-16" and save the file as 'unicode'. All XML viewers I've tested deal gracefully with UTF-16, as does the XMLHttpRequest.