Python-libxml2
From Docunext Technology Wiki
Documentation Snippets
This text is taken from the python-libxml2 source file /var/lib/python-support/python2.4/libxml2.py and is therefore covered by the same license as python-libxml2:
def Preserve(self):
    """This tells the XML Reader to preserve the current node. The
       caller must also use xmlTextReaderCurrentDoc() to keep an
       handle on the resulting document once parsing has finished """
    ret = libxml2mod.xmlTextReaderPreserve(self._o)
    if ret is None:raise treeError('xmlTextReaderPreserve() failed')
    __tmp = xmlNode(_obj=ret)
    return __tmp
libxml2 parser
There are lots of options and methods here. Examples:
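import libxml2

# Load the DTD and substitute entities while parsing.
parse_options = libxml2.XML_PARSE_DTDLOAD | libxml2.XML_PARSE_NOENT
libxml2.initParser()
# query_file is the path to the XML file to parse.
xqf = libxml2.readFile(query_file, None, parse_options)
no_ent_xml = xqf.serialize()
print(no_ent_xml)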
This also worked for me:
no_ent_xml = "<blah><barf/></blah>"
reader = libxml2.readerForMemory(no_ent_xml, len(no_ent_xml), "", "UTF-8", 0)
Segmentation Fault Bug?
However, I get a segmentation fault with this:
parse_options = libxml2.XML_PARSE_DTDLOAD | libxml2.XML_PARSE_NOENT
reader = libxml2.readerForFile(query_file, "UTF-8", parse_options)
The crash only happens once I start the reader's while loop, not when the reader is created.
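The reader loop itself is not shown on this page. For reference, here is a minimal sketch of the usual pattern, assuming the conventional Read() return values (1 when a node was read, 0 at end of input, -1 on error). The helper names walk_reader and walk_string are hypothetical, not part of libxml2, and this is not necessarily the loop that crashed.

```python
def walk_reader(reader):
    """Advance `reader` to the end of its input.

    Returns a list of (depth, node_type, name) tuples, one per node visited.
    """
    nodes = []
    ret = reader.Read()  # assumed: 1 = node read, 0 = end of input, -1 = error
    while ret == 1:
        nodes.append((reader.Depth(), reader.NodeType(), reader.Name()))
        ret = reader.Read()
    if ret != 0:
        raise RuntimeError("reader stopped with a parse error")
    return nodes


def walk_string(xml_text):
    """Hypothetical helper: run walk_reader over an in-memory document."""
    # Imported lazily so walk_reader can be exercised without libxml2 installed.
    import libxml2
    reader = libxml2.readerForMemory(xml_text, len(xml_text), "", "UTF-8", 0)
    return walk_reader(reader)
```

Keeping the loop in its own function makes it easy to swap the reader construction (readerForMemory, readerForFile, newTextReaderFilename) while narrowing down which one triggers the crash.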
Thinking I was using the wrong options, I tried this:
reader_options = libxml2.XML_PARSER_LOADDTD + libxml2.XML_PARSER_SUBST_ENTITIES
reader = libxml2.readerForFile(query_file, "UTF-8", reader_options)
This also results in a segmentation fault:
reader = libxml2.newTextReaderFilename(query_file)
reader.SetParserProp(libxml2.PARSER_SUBST_ENTITIES, 1)
reader.SetParserProp(libxml2.PARSER_SUBST_ENTITIES, 1)
Here's the strace:
open("/var/www/dev/phunkybb/apps/phunkybb/data/sql/topics_get_by_forum_id.xml", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1475, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e8000
read(3, "<!--\nProgram: PhunkyBB\nComponent"..., 16384) = 1475
read(3, "", 12288) = 0
_llseek(3, 0, [1475], SEEK_CUR) = 0
_llseek(3, 0, [0], SEEK_SET) = 0
read(3, "<!--\nProgram: PhunkyBB\nComponent"..., 4096) = 1475
write(1, "1\n", 21
) = 2
read(3, "", 4096) = 0
stat64("/var/www/dev/phunkybb/apps/phunkybb/data/sql/__default_table_names__.txt", {st_mode=S_IFREG|0644, st_size=507, ...}) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Process 4981 detached
libxml2 reader python
Which one should the reader be given: XML_PARSER_SUBST_ENTITIES or XML_PARSE_NOENT?
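As far as I can tell from the libxml2 C headers, the two names belong to different enums. The XML_PARSE_* values are bit flags meant to be combined into an options mask. The XML_PARSER_* values are sequential property identifiers meant to be passed one at a time to SetParserProp(). The sketch below hardcodes the values as I read them in parser.h and xmlreader.h (an assumption worth checking against your installed version) and shows what adding two property identifiers turns into when it is read as an options mask. decode_options is a hypothetical helper, and only three option flags are listed.

```python
# Values as they appear in libxml2's parser.h (xmlParserOption); partial list.
XML_PARSE_OPTIONS = {
    "XML_PARSE_RECOVER": 1 << 0,
    "XML_PARSE_NOENT": 1 << 1,
    "XML_PARSE_DTDLOAD": 1 << 2,
}

# Values as they appear in xmlreader.h (xmlParserProperties); not bit flags.
XML_PARSER_PROPERTIES = {
    "XML_PARSER_LOADDTD": 1,
    "XML_PARSER_DEFAULTATTRS": 2,
    "XML_PARSER_VALIDATE": 3,
    "XML_PARSER_SUBST_ENTITIES": 4,
}


def decode_options(mask):
    """Return the sorted names of the known option flags set in `mask`."""
    return sorted(name for name, bit in XML_PARSE_OPTIONS.items() if mask & bit)


# What the "wrong options" attempt actually passed as an options mask.
mistaken_mask = (XML_PARSER_PROPERTIES["XML_PARSER_LOADDTD"]
                 + XML_PARSER_PROPERTIES["XML_PARSER_SUBST_ENTITIES"])
print(decode_options(mistaken_mask))

# The mask built from real option flags.
intended_mask = (XML_PARSE_OPTIONS["XML_PARSE_DTDLOAD"]
                 | XML_PARSE_OPTIONS["XML_PARSE_NOENT"])
print(decode_options(intended_mask))
```

If these values are right, LOADDTD + SUBST_ENTITIES requests recovery mode and DTD loading, with no entity substitution. So readerForFile() would want the XML_PARSE_* flags, and SetParserProp() the XML_PARSER_* (or PARSER_*) identifiers. This does not by itself explain the segmentation fault.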