genshi.output

This module provides different kinds of serialization methods for XML event streams.

genshi.output.encode(iterator, method='xml', encoding=None, out=None)

Encode serializer output into a string.

Parameters:
  • iterator – the iterator returned from serializing a stream (basically any iterator that yields unicode objects)
  • method – the serialization method; determines how characters not representable in the specified encoding are treated
  • encoding – how the output string should be encoded; if set to None, this method returns a unicode object
  • out – a file-like object that the output should be written to instead of being returned as one big string; note that if this is a file or socket (or similar), the encoding must not be None (that is, the output must be encoded)
Returns:

a str or unicode object (depending on the encoding parameter), or None if the out parameter is provided

Since:

version 0.4.1

Note:

Changed in 0.5: added the out parameter

genshi.output.get_serializer(method='xml', **kwargs)

Return a serializer object for the given method.

Parameters: method – the serialization method; can be either “xml”, “xhtml”, “html”, “text”, or a custom serializer class

Any additional keyword arguments are passed to the serializer, and thus depend on the method parameter value.

See: XMLSerializer, XHTMLSerializer, HTMLSerializer, TextSerializer
Since: version 0.4.1
class genshi.output.DocType

Defines a number of commonly used DOCTYPE declarations as constants.

classmethod get(name)

Return the (name, pubid, sysid) tuple of the DOCTYPE declaration for the specified name.

The following names are recognized in this version:
  • “html” or “html-strict” for the HTML 4.01 strict DTD
  • “html-transitional” for the HTML 4.01 transitional DTD
  • “html-frameset” for the HTML 4.01 frameset DTD
  • “html5” for the DOCTYPE proposed for HTML5
  • “xhtml” or “xhtml-strict” for the XHTML 1.0 strict DTD
  • “xhtml-transitional” for the XHTML 1.0 transitional DTD
  • “xhtml-frameset” for the XHTML 1.0 frameset DTD
  • “xhtml11” for the XHTML 1.1 DTD
  • “svg” or “svg-full” for the SVG 1.1 DTD
  • “svg-basic” for the SVG Basic 1.1 DTD
  • “svg-tiny” for the SVG Tiny 1.1 DTD
Parameters: name – the name of the DOCTYPE
Returns: the (name, pubid, sysid) tuple for the requested DOCTYPE, or None if the name is not recognized
Since: version 0.4.1
class genshi.output.XMLSerializer(doctype=None, strip_whitespace=True, namespace_prefixes=None, cache=True)

Produces XML text from an event stream.

>>> from genshi.builder import tag
>>> from genshi.output import XMLSerializer
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print(''.join(XMLSerializer()(elem.generate())))
<div><a href="foo"/><br/><hr noshade="True"/></div>
class genshi.output.XHTMLSerializer(doctype=None, strip_whitespace=True, namespace_prefixes=None, drop_xml_decl=True, cache=True)

Produces XHTML text from an event stream.

>>> from genshi.builder import tag
>>> from genshi.output import XHTMLSerializer
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print(''.join(XHTMLSerializer()(elem.generate())))
<div><a href="foo"></a><br /><hr noshade="noshade" /></div>
class genshi.output.HTMLSerializer(doctype=None, strip_whitespace=True, cache=True)

Produces HTML text from an event stream.

>>> from genshi.builder import tag
>>> from genshi.output import HTMLSerializer
>>> elem = tag.div(tag.a(href='foo'), tag.br, tag.hr(noshade=True))
>>> print(''.join(HTMLSerializer()(elem.generate())))
<div><a href="foo"></a><br><hr noshade></div>
class genshi.output.TextSerializer(strip_markup=False)

Produces plain text from an event stream.

Only text events are included in the output. Unlike the other serializers, special XML characters are not escaped:

>>> from genshi.builder import tag
>>> from genshi.output import TextSerializer
>>> elem = tag.div(tag.a('<Hello!>', href='foo'), tag.br)
>>> print(elem)
<div><a href="foo">&lt;Hello!&gt;</a><br/></div>
>>> print(''.join(TextSerializer()(elem.generate())))
<Hello!>

If text events contain literal markup (instances of the Markup class), that markup is by default passed through unchanged:

>>> from genshi.core import Markup
>>> elem = tag.div(Markup('<a href="foo">Hello &amp; Bye!</a><br/>'))
>>> print(elem.generate().render(TextSerializer, encoding=None))
<a href="foo">Hello &amp; Bye!</a><br/>

You can use the strip_markup option to change this behavior, so that tags are stripped from the output and entities are replaced with the equivalent characters:

>>> print(elem.generate().render(TextSerializer, strip_markup=True,
...                              encoding=None))
Hello & Bye!