Miscellaneous Ramblings

I've summarized a few cross-browser and related observations from past and current projects on this page, and am sharing some thoughts on them below. Some of you have no doubt already explored these issues, but perhaps they can save others a few minutes duplicating my experiments.

 

Viewing 'raw' XML and XSLT with built-in XML Viewers

Discussion: All XML viewers are notably different, and equally somewhat selective regarding what/how/if they'll format and display raw XML, particularly depending on whether the XML header encoding is UTF-8 or UTF-16 (Windows "Unicode").

Firefox

Firefox is clearly the developers' browser and it will make every attempt to format and display raw xml in its XML Viewer even when there is some conflict between the element data, XML header, and format on disk; and equally, diagnose and report missing closing tags and similar errors. Since I programmatically emit XML, I've found it simplest to use Firefox throughout the development cycle to avoid getting sidetracked by browser quirks. It employs the readily recognizable black  character in place of any conflicted data, consequently it's the only browser I use to check pre-production XML.

Internet Explorer

Internet Explorer has perhaps the most rigid (read: unforgiving) XML viewer from the developer's perspective. Its viewer aborts the display and reports an error under many conditions. Principal among these is any element content that it considers inconsistent with the specified encoding (I can take either side of that argument). Said another way, IE does not gracefully handle the entire range of possible characters when the XML file's encoding is specified as UTF-8 (the default encoding for XML). This is not entirely a criticism, but the problem nevertheless surfaces with certain scientific notation, Greek character entities, "vulgar fractions", and other content often present in scientific data that has been exported directly from other Microsoft products as XML elements (if not employing CDATA markers -- which are problematic for other reasons and generally to be avoided in XML). In view of this, I've found that using IE to 'final-proof' XML files is helpful, since it quickly exposes (and often diagnoses) any compatibility issues. My experience is that "If it works with the IE XML Viewer, it will work with anything".

Opera

Opera is based on Opera Software's proprietary Presto engine. Opera passes the Acid3 test, though I have admittedly undertaken only basic testing with Opera's built-in XML viewer (version 10.00), without any reportable glitches. Opera formats the XML view somewhat differently than IE or Firefox, but it's nevertheless quite readable. I'm generally agnostic with regard to browsers (other than a personal preference for Firefox and IE, in that order, for the reasons noted above), but Opera is well worth testing with XML since Presto has evolved independently of Trident, Gecko, and WebKit.

Safari (and Chrome)

Both Safari and Google Chrome are based on WebKit to render web pages, and neither — surprisingly — provides a built-in XML viewer. Frankly, I find this disappointing since neither displays XML adequately; both simply strip the "unrecognized" element tags, and render the .xml file as one exceedingly long string. If one wishes to view raw XML in either of these browsers in any remotely intelligible way, it's necessary to load the XML, find a whitespace on the page, then right-click, and 'view source'. This is frankly just too primitive to make them useful for viewing XML. Some might argue that plugins are available; I would argue that they shouldn't be required.

Solution:

Proof XML files with Firefox, IE, and Opera -- in that order. The simplest solution for a cross-browser test of XML (albeit not the only one) is to encode XML files which contain extended characters as UTF-16. This works equally well for IE, Firefox, and Opera, and consequently resolves almost all cross-browser issues.

Notes:

1 - To re-save an emitted file as UTF-16 in Windows Vista, open the file with Notepad or WordPad and "Save as" Unicode. Unix offers a wide range of open source tools to accomplish the same result.

2 - In general, if you need to display raw XML within a web page, the simplest method is to invoke a Viewer by embedding an <iframe> in your page with the src=relative_path_to_your.xml. This invokes the browser's built-in XML viewer by default. (And if the file has an associated XSL, it will, further, transform the XML as indicated in the style sheet. You can see an example of this here).

3 - Excel (and essentially all Microsoft products) use UTF-16 encoding internally. If you're emitting XML data from any MS Office Pro product, use:

<?xml version="1.0" encoding="utf-16"?>

for your XML header to avoid cratering the IE Viewer when casually emitted characters cannot be normalized to UTF-8.
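
The re-save step in Note 1 can also be scripted. The following is a minimal sketch, assuming Node.js is available (its built-in Buffer does the encoding); the function name toUtf16Xml and the sample string are hypothetical, chosen only for illustration. It replaces any existing XML declaration with the utf-16 header shown above and returns the bytes as UTF-16LE preceded by a byte-order mark:

```javascript
// Hypothetical helper (not from the original page): re-encode an XML string
// as UTF-16LE with a byte-order mark. Assumes Node.js, where Buffer is a global.
function toUtf16Xml(xmlText) {
  const declaration = '<?xml version="1.0" encoding="utf-16"?>';
  // Drop a leading byte-order mark character, if the input string carries one.
  const withoutBom = xmlText.replace(/^\uFEFF/, '');
  // Drop any existing XML declaration so the new one does not conflict with it.
  const body = withoutBom.replace(/^\s*<\?xml[^?]*\?>\s*/, '');
  const text = declaration + '\n' + body;
  // FF FE is the UTF-16 little-endian byte-order mark.
  const bom = Buffer.from([0xff, 0xfe]);
  return Buffer.concat([bom, Buffer.from(text, 'utf16le')]);
}

// Sample input containing a vulgar fraction and a Greek letter.
const sample = '<?xml version="1.0" encoding="utf-8"?>\n<m>\u00bd \u03b1</m>';
const bytes = toUtf16Xml(sample);
```

The returned Buffer can then be written to disk, for example with Node's fs.writeFileSync. This is only one of many ways to do it; Notepad's "Save as" remains the simplest for a one-off file.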

'Reset CSS' — Cross-Browser Baseline Compatibility

Most web developers will no doubt have observed — and consequently dealt with, in one way or another — the differences in the default behavior (style properties) of the major browsers and their respective rendering engines:

  • Internet Explorer (Trident)
  • Firefox (Gecko)
  • Safari (WebKit)
  • Chrome (WebKit)
  • Opera (Presto)

One practice used to "level the playing field", that is, to set a baseline from which browser rendering may be uniformly predicted, is to start from a linked css which resets the default behavior of all browsers to a common state. Such a style sheet is generally termed a "reset css". Naturally, there are pros and cons to this approach, and I'll share my parochial thoughts on these here.

A 'reset css' is most commonly the first linked style sheet in a given page <head>, which will then be cascaded or overridden by any following style sheets. It contains a very wide range of selectors and default properties to reset all of the rendering engines (browsers) which might be expected to load the page. Many popular reset stylesheets can be found with a simple web search. But what are the (admittedly subjective) pros and cons?

Pros: A Reset Stylesheet sets a uniform stage for rendering your web pages across many browsers.

Cons: Performance Overhead

That being said, I don't typically use a reset stylesheet (though I'm not categorically opposed to one), for a few reasons:

  • Performance: loading a reset stylesheet typically requires another <link> in the <head> section of every page (or an @import in your base stylesheet), which directly translates to loading another source file from the server, parsing it, and, even if cached, iterating through the many selectors to set the reset styles before beginning to load and process the specific styles for the site/page;
  • Redundancy: since I set the properties for just the subset of selectors I plan to use on a given website in my "base" css, loading a reset stylesheet first is a redundant browser burden and consumes unnecessary network time and client-side cycles.

Consequently there's no "magic bullet" regarding cross-browser compatibility using reset stylesheets; you'll need to analyze each requirement in order to draw your own conclusions as to which approach is most appropriate in your environment.

Rounded Corners with Various Techniques

Rounded corners on block elements can lend elegance to layouts, and have consequently been a subject of much discussion between creatives and developers the last few years. Many solutions have been developed, and with a little tweaking, the examples below quickly worked on all mainstream browsers.

Beyond the obvious solutions, notably including Flash, current approaches to corner rounding can be loosely organized into five categories:

  1. custom image positioned below a block element with z-index
  2. reusable corner images (gif or png) assembled with css classes and xhtml
  3. pure css-only classes with no images, painstakingly assembled with xhtml
  4. JavaScript/jQuery-driven corner rounding
  5. eventually — css 3.0 properties (naturally, only in Firefox at present)

Example - Category 4

This example uses the jQuery JavaScript "corners" plugin to round block elements.

JavaScript must be enabled in the visitor's browser for this corner rounding technique to work.

Each technique clearly brings its own strengths and weaknesses. But, in the final analysis, we're all eagerly awaiting the finalization of the css 3.0 spec and implementation in the Trident, WebKit, Presto, and Gecko rendering engines. The examples above work in Firefox, IE, Safari, Opera, and Chrome (which make up over 99.9% of visitors to my sites). Meanwhile, there are literally dozens of examples and tutorials for the corner-rounding groups listed above to be found with a simple web search.

The rounded corners on the animated gif at right are a simple illusion created by specifying a conventional (rectangular) animated gif as the background-image in css with the div's content consisting only of a rounded transparent gif.

Cross-browser Page Centering

Centering the content of a web page or an entire website on a <body> background is simple for a given browser. But of course there are more than a dozen rendering engines (including the notables Trident, WebKit, Presto, and Gecko), so it takes a dual-pronged approach to achieve a generalized cross-browser page-centering solution.

The example below assumes a <body> with a css-specified background color or image. The objective is to 'float' the site's pages in the center of the browser window regardless of the window or monitor size:

  1. IE (Trident) works as might be anticipated with style="text-align: center"; (but remember that it's subsequent enclosed layers (div) that are centered)
  2. Firefox (Gecko), Safari and Chrome (WebKit), and Opera (Presto) deliver the same results, but only with style="margin-left: auto; margin-right: auto";

So, a generalized cross-browser solution to center-float the inner content of the body regardless of window size simply uses both:

a) In your css:

  1. specify the background-color or background-image in the default <body> tag,
  2. create an id for an over-arching container with the css properties:

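        #parent-container {
             margin-left: auto;
             margin-right: auto;
             text-align: center;
             }
(auto margins only center a block that is narrower than its container, so give the inner layer a uniform width and the same auto margins:)
        #body-content { width: 780px; margin-left: auto; margin-right: auto; }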

b) in your template or markup:

  1. place a container ( <div id="parent-container"> ) immediately below the <body>
  2. all subsequent layers should lie within this parent container, which gets closed just before the </body> tag

in other words, the markup on every page you need centered employs this envelope:


              <body>
                  <div id="parent-container">
                      <div id="body-content">
                          the rest of your markup containered here
                      </div>
                  </div>
              </body>
            

note that this does not disturb the behavior of absolute and relative positioning of elements within the container.

Relative and Absolute Positioning

The css position property, especially when combined with the z-index property, provides all of the flexibility you might want to achieve a perfect pixel-positioned layout. However, the interactions between relative and absolute positioning are not immediately apparent and take a little experimentation to grok. Let's consider those interactions in this discussion.

"position: absolute" means just that: if you casually specify absolute positioning for an element within the <body>, it will hang at the same absolute position, displaced by its signed left and top offsets from the top-left origin of the page (i.e. the <body>), regardless of the window/browser/monitor size, and regardless of how you resize the window. If this is not your goal (and it usually isn't), then simply enclose it within a block element (e.g., a <div>) which lies in the normal flow and whose class/id contains "position: relative". In this event the child's 'absolute' position will be conveniently relative to its parent layer's top-left origin.

Said another way, unless you're really trying to pin an element to a fixed location, a "position: absolute" element should lie within a "position: relative" block element.

Expanding on the example used earlier in the Page Centering topic, experiment with:

css:


        #parent-container {
             position: relative;
             margin-left: auto;
             margin-right: auto;
             text-align: center;
             }
        #body-content { position: relative; width: 780px; margin: 0 auto; text-align: left; }
        #page-heading { position: absolute; left: 50px; top: -5px; }

Given that, it then follows that the markup:


          <body>
              <div id="parent-container">
                  <div id="body-content">
                      <div id="page-heading">
                          An absolutely-positioned Heading
                      </div>
                      <div>
                          <p>other elements follow normal flow</p>
                      </div>
                  </div>
              </div>
          </body>
          

will place the page heading 50 pixels right of, and 5 pixels above, the top-left corner of body-content, which is its nearest positioned ancestor (body-content is itself "position: relative", so it, rather than parent-container, becomes the reference). You can arbitrarily pixel-position any number of layers using this technique by adding additional classes or IDs to your css. The example above shows two animated gifs which are absolutely positioned within this jQuery pane and overlapped with z-index. Note that the images are separately specified in the preferred and alternate style sheets, so they'll also change with the color theme.

Depending on your needs for multiple <div> tags above (enclosing) an absolutely-positioned element, it may be necessary to make one or more of those parent layers "position: relative" to get the desired result.

But note that rendering engines treat position:relative and position:absolute differently with respect to the normal flow. If you specify position:relative in css and further specify top and/or left values, the space originally occupied by the repositioned element remains reserved and may consequently display as an open/blank space. To avoid this, use position:absolute within a position:relative parent block element instead; an absolutely positioned element is removed from the normal flow, so its space is "released" and no blank space is rendered.

   

Variable Height <iframe>

(currently updating this discussion with additional techniques for dynamic iframe height...)

Although I haven't used frames in years, I continue to come across occasional but perfect applications for iframes (inline frames). High among these is invoking the built-in XML viewer of Firefox, IE, or Opera (but note that you must relax your XHTML doctype from 'strict' to 'transitional' to validate a page which contains an iframe).

That being said, one of the principal issues I (like just about every developer) have battled with iframes is the fixed "height" limitation. If you don't require the automatic invocation of an XML Viewer or XSL formatter, then with the advent of the XMLHttpRequest object it became a simple matter to substitute another solution, one which doesn't suffer from the fixed-height limitation.

Solution: Consider instead using an XMLHttpRequest object response loaded into the text node of a named <div id="xxx"> — which will automatically expand to the necessary height. This approach is used liberally elsewhere on this site, but in brief, simply retrieve a text file from the server and stuff it into the id.innerHTML (or id.childNodes[0].nodeValue DOM object).
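
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code used on this site: the function name loadIntoDiv and its url and targetId parameters are hypothetical, and it assumes a browser that provides a native XMLHttpRequest object and a text file served from the same origin as the page.

```javascript
// Sketch: fetch a text resource and place it in a <div>, which then grows
// to whatever height the content needs. `url` and `targetId` are placeholders.
function loadIntoDiv(url, targetId) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true); // true = asynchronous request
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) { return; } // 4 = request complete
    var target = document.getElementById(targetId);
    if (xhr.status === 200) {
      // innerHTML interprets any markup in the response,
      // so use it only with files you trust.
      target.innerHTML = xhr.responseText;
    } else {
      target.innerHTML = 'Unable to load ' + url + ' (HTTP ' + xhr.status + ')';
    }
  };
  xhr.send(null);
}
```

Called as loadIntoDiv('some-file.txt', 'xxx') once the page has loaded, it would fill the <div id="xxx"> mentioned above; the file name here is, again, only a placeholder.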

Naming Conventions and Style Guides

As a freelance UI developer, one of the things I find myself continually doing is taking a "reset" on client lab conventions regarding naming. This applies equally to css classes and IDs, html/xhtml node names, JavaScript function names, and others.

For all practical purposes, nearly everyone employs lower camel case for JavaScript functions, but there are at least four approaches regarding the naming conventions used in css classes and IDs and xhtml nodes that I've observed. If you were to read the css sheets from the sites in my portfolio, you'd no doubt notice a range of conventions. They are:

  • lower Camel Case
  • upper Camel Case
  • hyphenated name
  • underscore separated

Let's take the example of a css class, generically "left column", which is a hypothetical float:left column on a page. The class or id might be variously named (respectively):

  • leftColumn
  • LeftColumn
  • left-column
  • left_column
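
To make the four conventions concrete, here is a small sketch that derives each variant from a plain phrase. The function name namingVariants and its property names are hypothetical, invented only for this illustration:

```javascript
// Hypothetical helper: derive the four naming variants from a phrase.
function namingVariants(phrase) {
  // Split on whitespace after trimming and lower-casing the phrase.
  const words = phrase.trim().toLowerCase().split(/\s+/);
  const capitalize = (word) => word.charAt(0).toUpperCase() + word.slice(1);
  return {
    lowerCamel: words[0] + words.slice(1).map(capitalize).join(''),
    upperCamel: words.map(capitalize).join(''),
    hyphenated: words.join('-'),
    underscored: words.join('_'),
  };
}

const variants = namingVariants('left column');
```

For "left column", the four properties hold leftColumn, LeftColumn, left-column, and left_column respectively, matching the list above.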

Observation: My first rule is: "Consistency beats subjective quality every time". If the shop convention is 'hyphenated', then it's far more productive to adhere to extant local convention and name the class "left-column" despite any personal preferences you might have. If you're a contractor, it's important to identify these (and other conventions) when commencing a new engagement; if you're a new full-time employee, then it is equally important to obtain, learn, and adhere to the corporate/site Style Guide -- or offer to produce this material if, as is often the case, a Style Guide doesn't exist.

Revising any extant convention to something arguably more appropriate is non-trivial, but nevertheless possible given the power of tools such as Dreamweaver. With that being said, decisions regarding pervasive changes such as these should not be taken lightly, since they affect essentially every page (and every developer) on a site and its branches, are costly, must first be carefully considered, and then funded as a distinct 'project' with many deliverables. Parenthetically, I'm a hand-coder, but would enthusiastically use tools such as Dreamweaver to accomplish a project of this nature at affordable cost.

Style Guides

Not intended as a monograph on UI Style Guides, here are nevertheless a few pivotal issues to keep in mind regarding style, and to understand thoroughly before embarking on page, template, or css development. A web development Style Guide must clearly indicate much more than simply naming conventions; beyond that it must also clearly specify peripheral detail including:

  • the corporate colors in hex notation
  • the corporate font-family, font-size, and related conventions
  • registered® and common-law™ Trademarks
  • required trademark marking
  • an intranet-based library of images and logos approved by corporate counsel
  • trademark image source location on the development intranet

Browser Trends and Display Resolution

In order to confirm which browsers to cross-check, I periodically collect statistics from some live web sites and watch for trends. Here are some interesting browser statistics from July 2009 and June 2010.

Browser Statistics
site        IE7     IE6     IE8     Firefox  Chrome  Safari  Opera
July 2009   15.9%   14.4%   9.1%    47.9%    6.5%    3.3%    2.1%
tgz         37.4%   14.6%   22.8%   16.3%    0.5%    6.4%    0.2%

June 2010   8.1%    7.2%    15.7%   46.6%    15.9%   3.6%    2.1%
tgz         17.3%   5.9%    44.1%   19.2%    4.6%    7.4%    0.5%

The W3C's extensive statistics are a great resource found here. Naturally, the visitors to W3C's website tend to be more technically savvy than the general population, so those stats reflect their browser choices. Conversely, visitors to theGeoZone.com ("tgz" in the above table) tend to be on the other end of the bell curve, so they reflect a microcosm of the general population. In general, we see a slow but steady drift from IE (all versions) to Firefox and, to a smaller extent, Chrome. Safari visitors were mostly using Macs.

Display Resolution

The current trend is that most computers are using a screen size of 1024 x 768 pixels or larger. The W3C's trend data can be seen here. Prior to 2008 we designed largely for 800 x 600 pixels; since that time we've moved to 1024 x 768 pixels as a baseline (and we regularly see 1280 x 1024 and larger in our web stats).

Apache Configuration Notes — Blocking Rogue Spyders and setup.php Penetration Attempts

Discussion: I recently spent a full day tuning the behavior and performance of one of my Apache/Tomcat websites, and found myself reading through a ton of (useful) reference material from apache.org and from other webmasters faced with similar needs. Among other things, my mission included blocking the BaiDuSpider, since it blatantly ignores robots.txt and indexes things that are there for the convenience of my web development team and are not intended to be indexed.

Blocking BaiDuSpider: To save you a non-trivial amount of research time: the BaiDuSpider (a spider used by Chinese and Japanese search engines) visits from too many IP addresses (which you can nevertheless find with a web search) to make blocking specific addresses a practical solution. I normally attempt to block rogue spiders by their user-agent string (HTTP_USER_AGENT) instead. That being said, there are a few variations in capitalization to deal with, and I expect more will arise as time goes on.

Solution: Since I have no Asian content or language support on the target website, and would prefer it not be indexed in Japan or China, I simply blocked the ill-behaved spider in total using rewrite rules. Parenthetically, it's more conventional to put rules in the httpd.conf file, and more efficient not to use .htaccess at all for a variety of performance reasons, particularly those related to the dynamics of determining inheritance of .htaccess restrictions. But if it's impractical to edit the base httpd.conf file (as it was with this ISP), then .htaccess represents the simplest solution.


Blocking BaiDuSpider in .htaccess:  Place the following rewrite rule in the .htaccess file in your website's root directory:


RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^baiduspider [NC]
RewriteRule .* - [F,L]	   
	   

but first be very sure you have mod_rewrite enabled in the DSOs in apache's httpd.conf! You should find (or add) this line:


LoadModule rewrite_module modules/mod_rewrite.so
	  

Consequently, BaiDuSpider now receives an HTTP: 403 - Forbidden response whenever they request any resource from the target website. Perhaps some day BaiDu will learn to observe the directions in robots.txt; or perhaps, arrogantly, not. In either event, I haven't seen BaiDuSpider in several weeks now, but have seen the hundreds of resulting HTTP:403 entries in my logs.


Blocking Purebot: Purebot is a surprisingly lame spider. Not the least of its odd behaviors is that it leaves bizarre 404s behind, making it easy to spot in your error log: Purebot will relentlessly request non-existent paths with dozens of erroneously repeated nodes in the path. Try adding the following rewrite rule to the .htaccess above:


RewriteCond %{HTTP_USER_AGENT} ^(.*)purebot(.*) [NC]
RewriteRule .* - [F,L]	  
	  

This appears to be working on my sites, which now decline to serve any request from Purebot with an HTTP: 403 - Forbidden result.


Blocking setup.php Exploits: While I'm on the subject of rewrite rules, another mission was to block the dramatically increasing number of penetration attempts to /phpMyAdmin.../setup.php. I've recently (June 2010 and beyond) seen several hundred HTTP:404 entries in my logs every month for many variations of the path to setup.php, which originate principally from Russia, Guatemala, British Columbia, and the Philippines. Adding these simple RewriteRules to the code above moved all of them from 404: Not Found to 403: Forbidden.


RewriteRule ^phpmyadmin(.*) - [F,NC,L]
RewriteRule ^(.*)setup\.php$ - [F,NC,L]
	   

Bear in mind that I cannot urge you strongly enough to thoroughly test any changes to .htaccess on your development or staging server before putting it into production! An .htaccess file presents a binary proposition — it is either precisely correct, or categorically wrong. And it will completely hose the behavior of Apache, Tomcat and other engines when it's wrong.
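
One inexpensive sanity check before touching a staging server is to try the patterns against a few sample strings. The sketch below is only an approximation: it uses JavaScript regular expressions as stand-ins for the Apache patterns above (with the dot in setup.php escaped), the "i" flag plays the role of [NC], and the function name isForbidden and all sample user-agent strings and paths are hypothetical. It is no substitute for testing the real .htaccess.

```javascript
// JavaScript stand-ins for the rewrite patterns; "i" approximates [NC].
const blockedAgents = [/^baiduspider/i, /^(.*)purebot(.*)/i];
const blockedPaths = [/^phpmyadmin(.*)/i, /^(.*)setup\.php$/i];

// Returns true when a request would be answered with 403 under these rules.
function isForbidden(userAgent, path) {
  // In per-directory (.htaccess) context the leading slash is stripped
  // before a RewriteRule pattern is matched, so strip it here as well.
  const relative = path.replace(/^\//, '');
  return blockedAgents.some((re) => re.test(userAgent)) ||
         blockedPaths.some((re) => re.test(relative));
}

// Hypothetical sample requests: [user agent, path].
const samples = [
  ['Baiduspider+(+http://www.baidu.com/search/spider.htm)', '/index.html'],
  ['Mozilla/5.0 (compatible; Purebot/1.1)', '/'],
  ['Mozilla/5.0', '/phpMyAdmin-2.11/scripts/setup.php'],
  ['Mozilla/5.0', '/index.html'],
];
samples.forEach(([agent, path]) => {
  console.log(isForbidden(agent, path) ? '403' : 'ok ', agent, path);
});
```

Note that because the baiduspider pattern is anchored with ^, an agent string that merely contains the name somewhere after the start would slip past it; the purebot pattern, by contrast, matches the name anywhere in the string.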

Efficiently Transferring Arrays Between Microsoft Excel Worksheets and Visual Basic

Discussion: I was recently investigating the encryption algorithm behind the classical Jefferson Wheel Cipher and decided to use Excel coupled with a VBA module as a tool to quickly prototype, implement, and explore its behavior. I 'knocked out' a spreadsheet and VB module which faithfully implemented the algorithm; however, I was startled at how slowly it ran.

With a little profiling, it quickly became apparent that transferring individual Cells() from a worksheet to a VB array is a slow process, and transferring individual Cells() from a VB array back to the worksheet is a *really slow* process. My initial approach had been simply to use the convenience of nested For/Next loops to retrieve the clear text and subsequently return the cipher text, transferring one character (Cell) at a time between the worksheet and a VB array. But with only a small array (26x16 cells, or 416 total cells), the roundtrip execution time was an astounding 20 seconds.

Solution: Abandoning the convenience of the .Cells() property (which accommodates simple calculation of the row and column as variables), I substituted a Range() assignment with a precalculated string parameter denoting the entire range. Transferring the 2D array as a single object executed about 20x faster: less than one second!

One difference in the supporting code is that regardless of the Option Base specified in the VB module, the array variant containing the cells transferred from the worksheet inevitably behaved as if "Option Base 1" was declared. In my case, it required declaring Option Base 1 to eliminate the constant mindset adjustment between Option Base 0 for my VB arrays and Option Base 1 for the objects transferred from/to the worksheet.

Starting point snippet (all-caps are convenience constants defining the location of the worksheet array):


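	' Slow version: one worksheet access per cell.
	' TOP_LEFT is taken to be the first row, and LEFT_COL the first column, of the block.
	Dim arrayVariant(1 To ROW_COUNT, 1 To COL_COUNT) As Variant
	Dim r As Long, c As Long

	For r = 1 To ROW_COUNT
		For c = 1 To COL_COUNT
			Worksheets(SHEET_NAME).Cells(TOP_LEFT + r - 1, LEFT_COL + c - 1).Value = arrayVariant(r, c)
		Next c
	Next r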
	   

This solution snippet executed 20x faster:


	Dim arrayObject As Variant
	Dim responseRange As String
	' Build an A1-style address such as "B3:K18" (single-letter columns A-Z only).
	' TOP_LEFT is taken to be the first row, and LEFT_COL the first column, of the block.
	responseRange = Chr(Asc("A") + LEFT_COL - 1) & Format(TOP_LEFT) & ":" & _
	                Chr(Asc("A") + LEFT_COL + COL_COUNT - 2) & Format(TOP_LEFT + ROW_COUNT - 1)

	'then to load the array from the worksheet:
	arrayObject = Worksheets(SHEET_NAME).Range(responseRange).Value
	'or to move the array back to the worksheet:
	Worksheets(SHEET_NAME).Range(responseRange).Value = arrayObject