genshi.input
Support for constructing markup streams from files, strings, or other sources.
genshi.input.ET(element)
Convert a given ElementTree element to a markup stream.
Parameters: element – an ElementTree element
Returns: a markup stream
exception genshi.input.ParseError(message, filename=None, lineno=-1, offset=-1)
Exception raised when fatal syntax errors are found in the input being parsed.
class genshi.input.XMLParser(source, filename=None, encoding=None)
Generator-based XML parser based on roughly equivalent code in Kid/ElementTree.
The parsing is initiated by iterating over the parser object:
>>> parser = XMLParser(StringIO('<root id="2"><child>Foo</child></root>'))
>>> for kind, data, pos in parser:
...     print('%s %s' % (kind, data))
START (QName('root'), Attrs([(QName('id'), u'2')]))
START (QName('child'), Attrs())
TEXT Foo
END child
END root
parse()
Generator that parses the XML source, yielding markup events.
Returns: a markup event stream
Raises ParseError: if the XML text is not well formed
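To make the idea of a generator-based parser concrete, here is a rough sketch built only on the standard library's `xml.parsers.expat`. It is not Genshi's implementation: the function name `iter_events`, the plain-string event kinds, and the `(line, column)` position tuple are all assumptions chosen for illustration.

```python
from io import BytesIO
from xml.parsers import expat


def iter_events(source, bufsize=4096):
    """Yield (kind, data, pos) tuples from a binary file-like XML source.

    Illustrative only: kinds are plain strings and pos is (line, column).
    """
    parser = expat.ParserCreate()
    queue = []  # events collected by the callbacks for the current chunk

    def pos():
        return (parser.CurrentLineNumber, parser.CurrentColumnNumber)

    def start(tag, attrs):
        queue.append(('START', (tag, attrs), pos()))

    def end(tag):
        queue.append(('END', tag, pos()))

    def text(data):
        queue.append(('TEXT', data, pos()))

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = text

    while True:
        chunk = source.read(bufsize)
        # An empty chunk marks the end of input, so flag it as final.
        parser.Parse(chunk, not chunk)
        # Hand the buffered events to the consumer, then reset the buffer.
        for event in queue:
            yield event
        del queue[:]
        if not chunk:
            break


for kind, data, pos in iter_events(BytesIO(b'<root id="2"><child>Foo</child></root>')):
    print('%s %s' % (kind, data))
```

Because the function is a generator, nothing is parsed until the caller starts iterating, and malformed input surfaces as an `expat.ExpatError` partway through the loop. The documented parser reports the same condition as ParseError.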
genshi.input.XML(text)
Parse the given XML source and return a markup stream.
Unlike with XMLParser, the returned stream is reusable, meaning it can be iterated over multiple times:
>>> xml = XML('<doc><elem>Foo</elem><elem>Bar</elem></doc>')
>>> print(xml)
<doc><elem>Foo</elem><elem>Bar</elem></doc>
>>> print(xml.select('elem'))
<elem>Foo</elem><elem>Bar</elem>
>>> print(xml.select('elem/text()'))
FooBar
Parameters: text – the XML source
Returns: the parsed XML event stream
Raises ParseError: if the XML text is not well-formed
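The difference between a one-shot parser and a reusable stream can be illustrated without Genshi. In the sketch below, a plain generator is a hypothetical stand-in for a parser and a list is a stand-in for the buffered stream. Whether XML buffers its events in exactly this way is an assumption.

```python
def events():
    """Stand-in for a parser: a generator that can be consumed only once."""
    yield ('START', 'doc')
    yield ('TEXT', 'Foo')
    yield ('END', 'doc')


one_shot = events()
first_pass = list(one_shot)
second_pass = list(one_shot)  # the generator is already exhausted

reusable = list(events())     # buffer the events once
pass_a = list(reusable)
pass_b = list(reusable)       # iterating again yields the same events

print(len(first_pass), len(second_pass), len(pass_a), len(pass_b))
```

The trade-off is memory: buffering holds every event at once, whereas iterating the parser directly processes events as they are produced.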
class genshi.input.HTMLParser(source, filename=None, encoding=None)
Parser for HTML input based on the Python HTMLParser module.
This class provides the same interface for generating stream events as XMLParser, and attempts to automatically balance tags.
The parsing is initiated by iterating over the parser object:
>>> parser = HTMLParser(BytesIO(u'<UL compact><LI>Foo</UL>'.encode('utf-8')), encoding='utf-8')
>>> for kind, data, pos in parser:
...     print('%s %s' % (kind, data))
START (QName('ul'), Attrs([(QName('compact'), u'compact')]))
START (QName('li'), Attrs())
TEXT Foo
END li
END ul
parse()
Generator that parses the HTML source, yielding markup events.
Returns: a markup event stream
Raises ParseError: if the HTML text is not well formed
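One way to picture automatic tag balancing is a stack of open elements that is unwound whenever an end tag arrives or the input ends. The sketch below uses the standard library's `html.parser` and is only an approximation of the idea, not Genshi's code. The names `BalancingParser` and `balanced_events` are hypothetical, and events are simplified to `(kind, data)` pairs.

```python
from html.parser import HTMLParser as StdHTMLParser


class BalancingParser(StdHTMLParser):
    """Collect (kind, data) events and close any tags the input left open."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.open_tags = []  # stack of currently open element names

    def handle_starttag(self, tag, attrs):
        # Valueless attributes such as `compact` arrive with value None;
        # use the attribute name as its value, as in the documented output.
        attrs = [(name, name if value is None else value)
                 for name, value in attrs]
        self.events.append(('START', (tag, attrs)))
        self.open_tags.append(tag)

    def handle_data(self, data):
        self.events.append(('TEXT', data))

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag with no matching start: ignore it
        # Close every element opened after `tag`, then `tag` itself.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.events.append(('END', open_tag))
            if open_tag == tag:
                break

    def close(self):
        super().close()
        # Close whatever is still open at the end of the input.
        while self.open_tags:
            self.events.append(('END', self.open_tags.pop()))


def balanced_events(html):
    """Parse an HTML string and return the balanced list of events."""
    parser = BalancingParser()
    parser.feed(html)
    parser.close()
    return parser.events


for kind, data in balanced_events('<UL compact><LI>Foo</UL>'):
    print(kind, data)
```

This simplified version does not special-case void elements such as `br`, so an unclosed one stays on the stack until an enclosing end tag or the end of input closes it.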
genshi.input.HTML(text, encoding=None)
Parse the given HTML source and return a markup stream.
Unlike with HTMLParser, the returned stream is reusable, meaning it can be iterated over multiple times:
>>> html = HTML('<body><h1>Foo</h1></body>', encoding='utf-8')
>>> print(html)
<body><h1>Foo</h1></body>
>>> print(html.select('h1'))
<h1>Foo</h1>
>>> print(html.select('h1/text()'))
Foo
Parameters: text – the HTML source
Returns: the parsed XML event stream
Raises ParseError: if the HTML text is not well-formed and error recovery fails