
XPath query result order


For another question, I wrote some XML-related code that works on my development machine but not on Viper Codepad, where I tested it before adding it to my answer.



I narrowed the problem down to this: the order of the nodes returned by DOMXPath::query() differs between my system and the codepad.



XML: <test>This is some <span>text</span>, fine.</test>



When I query all text nodes with //child::text(), the result differs:



Viper Codepad:




#0: This is some
#1: , fine.
#2: text



My Machine:




#0: This is some
#1: text
#2: , fine.
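
For reference, a minimal sketch of a script that should reproduce the comparison above. The variable names are illustrative and not taken from the original code, and the order of the printed lines depends on the libxml2 version PHP is built against.

```php
<?php
// Minimal reproduction sketch; variable names are assumptions.
$doc = new DOMDocument();
$doc->loadXML('<test>This is some <span>text</span>, fine.</test>');

$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//child::text()');

// Collect one line per text node, in the order the query returned them.
$lines = array();
for ($i = 0; $i < $nodes->length; $i++) {
    $lines[] = sprintf('#%d: %s', $i, $nodes->item($i)->nodeValue);
}

echo implode("\n", $lines), "\n";
```

The script uses `array()` and an index loop so that it also runs on older PHP 5 installations such as the one on the codepad.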



I'm not experienced enough with XPath to understand why this happens, or whether the return order can be influenced in the PHP implementation.



Edit:



Further testing has revealed that LIBXML_VERSION differs between the two systems:




Viper Codepad: 20626 (2.6.26; 6 Jun 2006)
My Machine...: 20707 (2.7.7; 15 Mar 2010)
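
The version numbers above can be read directly from PHP's libxml constants. The following is a small sketch; `libxmlVersionParts()` is a hypothetical helper added for illustration, and it assumes the integer is encoded as major * 10000 + minor * 100 + micro, which matches both values listed.

```php
<?php
// LIBXML_VERSION is an integer (e.g. 20707); LIBXML_DOTTED_VERSION is a string (e.g. "2.7.7").
printf("libxml2: %d (%s)\n", LIBXML_VERSION, LIBXML_DOTTED_VERSION);

// Hypothetical helper: split the integer form into array(major, minor, micro).
function libxmlVersionParts($version)
{
    return array(
        (int) floor($version / 10000),
        (int) floor($version / 100) % 100,
        $version % 100,
    );
}

// 20626 decodes to 2.6.26 and 20707 to 2.7.7.
echo implode('.', libxmlVersionParts(20626)), "\n";
echo implode('.', libxmlVersionParts(20707)), "\n";
```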


Source: Tips4all

Comments

  1. Technically XPath 1.0 returns node-sets rather than node sequences. In the XPath 1.0 specification there is no statement about the order of these node-sets - indeed, being sets, they have no intrinsic order.

    However, XSLT 1.0 always processes the node-sets returned by XPath 1.0 in document order, and because of that precedent, there is a widespread expectation that XPath results will be in document order when XPath is invoked from languages other than XSLT. However, there is nothing in the spec to guarantee this. In XPath 2.0 the user expectation becomes part of the spec, and the results of a path expression MUST be in document order.

  2. This looks like a bug in version 20626 (libxml2 2.6.26):

    It first processes all child text nodes in document order, and only then the content of child element nodes. The result on your machine is the correct one.

  3. I found the following bug report, which looks like this issue: Bug 363252 - proximity position in libxml2's xmlXPathEvalExpression(). It was reported on 18 Oct 2006 and confirmed to date back to May 2006, which is before the 2.6.26 release in question.

    This should have been fixed in libxml2 2.6.27.

  4. It appears that Viper Codepad is not returning the selected text() nodes in depth-first document order, but is doing a breadth-first evaluation.

    It is supposed to be a depth-first traversal.

    Saxon, MSXML, and Altova XML each returned the results in depth-first order.

  5. XPath is a query language, so it should only read the structure of the XML document as it is and never modify it. That includes the node order. In your first example (Viper Codepad), however, the nodes come back in a different order, so this is definitely a bug.
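
The depth-first versus breadth-first distinction raised in the comments can be illustrated without XPath at all. The sketch below walks the DOM by hand; both function names are hypothetical helpers, not part of PHP's DOM extension. The depth-first walk is also one possible workaround on an affected libxml2 version, since it does not rely on the order in which the XPath engine returns nodes. It only looks at plain text nodes; CDATA sections are ignored here for brevity.

```php
<?php
// Hypothetical helper: collect text node values depth-first (document order).
function textNodesDepthFirst(DOMNode $node, array &$out)
{
    for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
        if ($child->nodeType === XML_TEXT_NODE) {
            $out[] = $child->nodeValue;
        } else {
            textNodesDepthFirst($child, $out);
        }
    }
}

// Hypothetical helper: collect text node values breadth-first (level by level).
function textNodesBreadthFirst(DOMNode $root)
{
    $out = array();
    $queue = array($root);
    while ($queue) {
        $node = array_shift($queue);
        for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
            if ($child->nodeType === XML_TEXT_NODE) {
                $out[] = $child->nodeValue;
            } else {
                $queue[] = $child; // visit this element's children later
            }
        }
    }
    return $out;
}

$doc = new DOMDocument();
$doc->loadXML('<test>This is some <span>text</span>, fine.</test>');

$depthFirst = array();
textNodesDepthFirst($doc, $depthFirst);
$breadthFirst = textNodesBreadthFirst($doc);

print_r($depthFirst);   // "This is some ", "text", ", fine."  (the order seen on "My Machine")
print_r($breadthFirst); // "This is some ", ", fine.", "text"  (the order seen on Viper Codepad)
```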

