
List:       postgresql-general
Subject:    [HACKERS] Fix xpath() to return namespace definitions
From:       Ali Akbar <the.apaan () gmail ! com>
Date:       2014-05-30 9:04:33
Message-ID: CACQjQLo18s5Lpx9ngh5Qd1mhB4OukC12MzcjFwy0LQr2kw2DoQ () mail ! gmail ! com

Hi all,

While developing some XML processing queries, I stumbled on an old bug
listed at http://wiki.postgresql.org/wiki/Todo#XML: "Fix Nested or
repeated xpath() that apparently mess up namespaces".

The source of the bug is that libxml2's xmlNodeDump does not output XML
namespace definitions that are declared in the node's parents. According to
https://bug639036.bugzilla-attachments.gnome.org/attachment.cgi?id=177858,
this behavior is intentional.

This patch uses the function xmlCopyNode, which copies a node together with
the namespace definitions it requires (and without namespaces that are
unused by the node or its children). When the copy is dumped, the resulting
XML is complete with its namespaces. Calling xmlCopyNode needs additional
memory, but reimplementing its namespace handling would add considerable
complexity to the code.

Note: This is my very first PostgreSQL patch.

-- 
Ali Akbar

["xpath-ns-fix.patch" (text/x-patch)]

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 422be69..93e335c 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3602,19 +3602,28 @@ xml_xmlnodetoxmltype(xmlNodePtr cur)
 	if (cur->type == XML_ELEMENT_NODE)
 	{
 		xmlBufferPtr buf;
+		xmlNodePtr cur_copy;
 
 		buf = xmlBufferCreate();
+
+		/* xmlNodeDump omits namespace definitions declared in the node's
+		 * ancestors, but libxml2 has xmlCopyNode, which duplicates a node
+		 * along with the namespace definitions it requires.
+		 */
+		cur_copy = xmlCopyNode(cur, 1);
 		PG_TRY();
 		{
-			xmlNodeDump(buf, NULL, cur, 0, 1);
+			xmlNodeDump(buf, NULL, cur_copy, 0, 1);
 			result = xmlBuffer_to_xmltype(buf);
 		}
 		PG_CATCH();
 		{
+			xmlFreeNode(cur_copy);
 			xmlBufferFree(buf);
 			PG_RE_THROW();
 		}
 		PG_END_TRY();
+		xmlFreeNode(cur_copy);
 		xmlBufferFree(buf);
 	}
 	else
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 382f9df..a6d26f7 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -584,6 +584,12 @@ SELECT xpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><loc
  {1,2}
 (1 row)
 
+SELECT xpath('//loc:piece', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
+                                                                     xpath                                                                      
+------------------------------------------------------------------------------------------------------------------------------------------------
+ {"<local:piece xmlns:local=\"http://127.0.0.1\" id=\"1\">number one</local:piece>","<local:piece xmlns:local=\"http://127.0.0.1\" id=\"2\"/>"}
+(1 row)
+
 SELECT xpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
           xpath          
 -------------------------
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index a34d1f4..c7bcf91 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -498,6 +498,12 @@ LINE 1: SELECT xpath('//loc:piece/@id', '<local:data xmlns:local="ht...
                                         ^
 DETAIL:  This functionality requires the server to be built with libxml support.
 HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath('//loc:piece', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('//loc:piece', '<local:data xmlns:local="http:/...
+                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
 SELECT xpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
 ERROR:  unsupported XML feature
 LINE 1: SELECT xpath('//b', '<a>one <b>two</b> three <b>etc</b></a>'...
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index 90d4d67..241a5d6 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -174,6 +174,7 @@ SELECT xpath(NULL, NULL) IS NULL FROM xmltest;
 SELECT xpath('', '<!-- error -->');
 SELECT xpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
 SELECT xpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
+SELECT xpath('//loc:piece', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
 SELECT xpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
 SELECT xpath('//text()', '<root>&lt;</root>');
 SELECT xpath('//@value', '<root value="&lt;"/>');



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

