Python-libxml2


From Docunext Technology Wiki

Documentation Snippets

This text is taken from the python-libxml2 source file /var/lib/python-support/python2.4/libxml2.py and is therefore under the same license as python-libxml2:

    def Preserve(self):
        """This tells the XML Reader to preserve the current node. The
          caller must also use xmlTextReaderCurrentDoc() to keep an
           handle on the resulting document once parsing has finished """
        ret = libxml2mod.xmlTextReaderPreserve(self._o)
        if ret is None:raise treeError('xmlTextReaderPreserve() failed')
        __tmp = xmlNode(_obj=ret)
        return __tmp

libxml2 parser

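The parser has lots of options and methods. For example:

import libxml2

# query_file is the path to the XML file to parse.
# Parser options are bit flags, so they are combined with | rather than +.
parse_options = libxml2.XML_PARSE_DTDLOAD | libxml2.XML_PARSE_NOENT
libxml2.initParser()
xqf = libxml2.readFile(query_file, None, parse_options)
no_ent_xml = xqf.serialize()
print(no_ent_xml)
xqf.freeDoc()  # documents are not garbage collected by the bindings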

This also worked for me:

no_ent_xml = "<blah><barf/></blah>"
reader = libxml2.readerForMemory(no_ent_xml, len(no_ent_xml), "", "UTF-8", 0)

Segmentation Fault Bug?

However, I get a segmentation fault when I pass the same options to the reader interface:

parse_options = libxml2.XML_PARSE_DTDLOAD | libxml2.XML_PARSE_NOENT
reader = libxml2.readerForFile(query_file, "UTF-8", parse_options)

The crash only occurs once I start the reader's while loop.
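The while loop itself is not shown on this page. For reference, a reader loop usually looks something like the sketch below. It uses readerForMemory, which worked above, rather than readerForFile. The function name element_names and the guarded import are only there for illustration; Read(), NodeType() and Name() are the reader methods from the bindings. This is a minimal sketch, not the code that crashed.

```python
try:
    import libxml2  # python-libxml2 bindings
except ImportError:
    libxml2 = None  # lets this file load where the bindings are missing

XML_READER_TYPE_ELEMENT = 1  # node type the reader reports for a start tag


def element_names(xml_text):
    """Return the names of all elements in xml_text, in document order.

    Hypothetical helper, written only to illustrate the reader loop.
    """
    if libxml2 is None:
        raise RuntimeError("python-libxml2 is not installed")
    # xml_text stays referenced for the whole loop, so the buffer handed
    # to the reader remains valid while it is being read.
    reader = libxml2.readerForMemory(xml_text, len(xml_text), "", "UTF-8", 0)
    names = []
    ret = reader.Read()
    while ret == 1:  # 1: a node was read, 0: end of input, -1: parse error
        if reader.NodeType() == XML_READER_TYPE_ELEMENT:
            names.append(reader.Name())
        ret = reader.Read()
    if ret != 0:
        raise ValueError("failed to parse XML")
    return names
```

Calling element_names("<blah><barf/></blah>") walks the same document as the readerForMemory example above. Checking the return value of Read() on every pass makes a parse error show up as an exception instead of silently ending the loop.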

Thinking I was using the wrong options, I tried this:

reader_options = libxml2.XML_PARSER_LOADDTD + libxml2.XML_PARSER_SUBST_ENTITIES
reader = libxml2.readerForFile(query_file, "UTF-8", reader_options)

This also results in a segmentation fault:

reader = libxml2.newTextReaderFilename(query_file)
reader.SetParserProp(libxml2.PARSER_SUBST_ENTITIES, 1)
reader.SetParserProp(libxml2.PARSER_SUBST_ENTITIES, 1)

Here's the strace:

open("/var/www/dev/phunkybb/apps/phunkybb/data/sql/topics_get_by_forum_id.xml", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1475, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e8000
read(3, "<!--\nProgram: PhunkyBB\nComponent"..., 16384) = 1475
read(3, "", 12288)                      = 0
_llseek(3, 0, [1475], SEEK_CUR)         = 0
_llseek(3, 0, [0], SEEK_SET)            = 0
read(3, "<!--\nProgram: PhunkyBB\nComponent"..., 4096) = 1475
write(1, "1\n", 2)                      = 2
read(3, "", 4096)                       = 0
stat64("/var/www/dev/phunkybb/apps/phunkybb/data/sql/__default_table_names__.txt", {st_mode=S_IFREG|0644, st_size=507, ...}) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Process 4981 detached

libxml2 reader python

XML_PARSER_SUBST_ENTITIES or XML_PARSE_NOENT?
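One way to look at this question: in the libxml2 C headers the two names come from different enums. XML_PARSE_NOENT is an xmlParserOption, a bit flag meant for an options argument. XML_PARSER_SUBST_ENTITIES is an xmlParserProperties value, a plain identifier meant for SetParserProp. The sketch below copies the numeric values from the C headers so that it runs without the bindings; decode_options is a hypothetical helper, not part of libxml2. It only shows what a sum of property identifiers looks like when read as option bits. Whether this has anything to do with the segmentation fault is not established here.

```python
# Values copied from the libxml2 C headers; they are repeated here so the
# example runs without the bindings installed.
XML_PARSE_OPTIONS = {  # enum xmlParserOption: one bit per option
    "XML_PARSE_RECOVER": 1 << 0,
    "XML_PARSE_NOENT": 1 << 1,
    "XML_PARSE_DTDLOAD": 1 << 2,
    "XML_PARSE_DTDATTR": 1 << 3,
    "XML_PARSE_DTDVALID": 1 << 4,
}

# enum xmlParserProperties: plain identifiers, not bits
XML_PARSER_LOADDTD = 1
XML_PARSER_SUBST_ENTITIES = 4


def decode_options(value):
    """List the xmlParserOption names whose bit is set in value.

    Hypothetical helper; it only knows the five options listed above.
    """
    by_bit = sorted(XML_PARSE_OPTIONS.items(), key=lambda item: item[1])
    return [name for name, bit in by_bit if value & bit]


# Adding two property identifiers and reading the sum as an options bitmask:
summed = XML_PARSER_LOADDTD + XML_PARSER_SUBST_ENTITIES
print(decode_options(summed))
# prints ['XML_PARSE_RECOVER', 'XML_PARSE_DTDLOAD']

# Combining the two option flags directly:
flags = XML_PARSE_OPTIONS["XML_PARSE_DTDLOAD"] | XML_PARSE_OPTIONS["XML_PARSE_NOENT"]
print(decode_options(flags))
# prints ['XML_PARSE_NOENT', 'XML_PARSE_DTDLOAD']
```

Read this way, the summed property identifiers do not set the XML_PARSE_NOENT bit at all, so entity substitution would not be requested through the options argument.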


http://xmlsoft.org/xmlreader.html