Elementflow is a Python library for generating XML as a stream. Some existing XML producing libraries (like ElementTree, lxml) build a whole XML tree in memory and then serialize it. It might be inefficient for moderately large XML payloads (think of a content-oriented Web service producing lots of XML data output). Python’s built-in
xml.sax.saxutils.XMLGenerator
is very low-level and requires closing elements by hand. Also, most XML libraries, to be honest, suck when dealing with namespaces.
Let’s assume we have a fridge with products on its shelves that is represented by this code:
FRIDGE = (
{
'sausage': 1,
'cake': 3,
},
{
'juice': 1,
'apple': 7,
},
{
'water': 17,
},
)
Our goal is to produce XML like this:
<?xml version="1.0" encoding="utf-8"?>
<fridge>
<shelf>
<product name="cake" quantity="3"/>
<product name="sausage" quantity="1"/>
</shelf>
<shelf>
<product name="juice" quantity="1"/>
<product name="apple" quantity="7"/>
</shelf>
<shelf>
<product name="water" quantity="17"/>
</shelf>
</fridge>
There is nothing special about XML generation for fridge, except that we want to reuse shelf generator (that can be used without fridge :) By default elementflow’s XML generators output XML declaration, but in the middle of the document that wouldn’t be nice. So we need to write generator that can accept declaration
argument (which can be True
or False
).
import elementflow
BUFSIZE = 1024
class IndentingGenerator(elementflow.IndentingGenerator):
def __init__(self, file, root,
attrs={}, namespaces={}, declaration=True, **kwargs):
super(IndentingGenerator, self).__init__(file, root,
attrs, namespaces, **kwargs)
if not declaration:
self.file.pop()
self.container(root, attrs, **kwargs)
It’s not elegant solution, but it works. Now let’s write XML generation:
def shelf_to_xml(shelf, declaration=True):
xml = IndentingGenerator(elementflow.Queue(), 'shelf',
declaration=declaration)
with xml:
for name, quantity in shelf.items():
xml.element('product', {'name': unicode(name),
'quantity': unicode(quantity)})
if len(xml.file) > BUFSIZE:
yield xml.file.pop()
yield xml.file.pop()
def fridge_to_xml(fridge):
with IndentingGenerator(elementflow.Queue(), 'fridge') as xml:
for shelf in fridge:
for chunk in shelf_to_xml(shelf, declaration=False):
xml.file.write(chunk.decode('utf-8'))
if len(xml.file) > BUFSIZE:
yield xml.file.pop()
yield xml.file.pop()
And now let’s test:
print ''.join(fridge_to_xml(FRIDGE))
It works.
This blog is about things I encounter while doing web and non-web software development.