genshi.util
Various utility classes and functions.
class genshi.util.LRUCache(capacity)
A dictionary-like object that stores only a certain number of items, and discards its least recently used item when full.
>>> cache = LRUCache(3)
>>> cache['A'] = 0
>>> cache['B'] = 1
>>> cache['C'] = 2
>>> len(cache)
3

>>> cache['A']
0
Adding new items to the cache does not increase its size. Instead, the least recently used item is dropped:
>>> cache['D'] = 3
>>> len(cache)
3
>>> 'B' in cache
False
Iterating over the cache returns the keys, starting with the most recently used:
>>> for key in cache:
...     print(key)
D
A
C
This code is based on the LRUCache class from myghtyutils.util, written by Mike Bayer and released under the MIT license. See:
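As an illustration only (this is neither the myghtyutils code nor Genshi's implementation), the eviction and ordering behaviour shown in the examples above can be sketched on top of the standard library's `collections.OrderedDict`. The class name `SimpleLRUCache` is hypothetical, and the sketch assumes Python 3.

```python
from collections import OrderedDict


class SimpleLRUCache:
    """Hypothetical stand-in for LRUCache, built on OrderedDict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # most recently used key is kept first

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        # Keys come out starting with the most recently used.
        return iter(list(self._items))

    def __getitem__(self, key):
        value = self._items[key]
        self._items.move_to_end(key, last=False)  # a read counts as a use
        return value

    def __setitem__(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key, last=False)
        if len(self._items) > self.capacity:
            self._items.popitem(last=True)  # drop the least recently used


cache = SimpleLRUCache(3)
cache['A'] = 0
cache['B'] = 1
cache['C'] = 2
cache['A']        # reading 'A' makes it more recent than 'B' and 'C'
cache['D'] = 3    # the cache is full, so 'B' is evicted
print(list(cache))
```

Both reads and writes move a key to the front, which is why 'B', and not 'A', is the entry that gets dropped.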
genshi.util.flatten(items)
Flattens a potentially nested sequence into a flat list.
Parameters: items – the sequence to flatten
>>> flatten((1, 2))
[1, 2]
>>> flatten([1, (2, 3), 4])
[1, 2, 3, 4]
>>> flatten([1, (2, [3, 4]), 5])
[1, 2, 3, 4, 5]
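The flattening shown above can be sketched as a short recursive function. This is a minimal sketch, not Genshi's code: the name `flatten_sketch` is hypothetical, and it assumes that only lists and tuples count as nested sequences.

```python
def flatten_sketch(items):
    """Recursively flatten nested lists and tuples into one flat list."""
    result = []
    for item in items:
        if isinstance(item, (list, tuple)):
            # Descend into the nested sequence and splice in its elements.
            result.extend(flatten_sketch(item))
        else:
            result.append(item)
    return result


print(flatten_sketch([1, (2, [3, 4]), 5]))
```

Strings are deliberately not treated as sequences here, so a string element is kept whole rather than split into characters.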
genshi.util.plaintext(text, keeplinebreaks=True)
Return the text with all entities and tags removed.
>>> plaintext('<b>1 &lt; 2</b>')
u'1 < 2'
The keeplinebreaks parameter can be set to False to replace any line breaks with simple spaces:
>>> plaintext('''<b>1
... &lt;
... 2</b>''', keeplinebreaks=False)
u'1 < 2'
Parameters:
- text – the text to convert to plain text
- keeplinebreaks – whether line breaks in the text should be kept intact
Returns: the text with tags and entities removed
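One way to picture this function is as tag removal followed by entity decoding, with an optional line-break pass at the end. The sketch below uses only the Python 3 standard library (`re` and `html.unescape`); the name `plaintext_sketch` and the regular expression are assumptions made for illustration, not Genshi's implementation.

```python
import html
import re

# Matches an HTML/XML comment or any single tag.
_MARKUP = re.compile(r'<!--.*?-->|<[^>]*>', re.DOTALL)


def plaintext_sketch(text, keeplinebreaks=True):
    """Remove tags, decode entities, and optionally flatten line breaks."""
    # Strip tags first, so a decoded '<' is never mistaken for markup.
    result = html.unescape(_MARKUP.sub('', text))
    if not keeplinebreaks:
        result = result.replace('\n', ' ')
    return result


print(plaintext_sketch('<b>1\n&lt;\n2</b>', keeplinebreaks=False))
```

The order of the two steps matters: decoding entities before stripping tags would turn `&lt;` into a literal `<` that the tag pattern could then swallow.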
genshi.util.stripentities(text, keepxmlentities=False)
Return a copy of the given text with any character or numeric entities replaced by the equivalent UTF-8 characters.
>>> stripentities('1 &lt; 2')
u'1 < 2'
>>> stripentities('more &hellip;')
u'more \u2026'
>>> stripentities('&#8230;')
u'\u2026'
>>> stripentities('&#x2026;')
u'\u2026'
If the keepxmlentities parameter is provided and is a true value, the core XML entities (&amp;, &apos;, &gt;, &lt; and &quot;) are left intact.
>>> stripentities('1 &lt; 2 &hellip;', keepxmlentities=True)
u'1 &lt; 2 \u2026'
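The replacement rules described above (named entities, decimal and hexadecimal numeric entities, and the option to keep the five core XML entities) can be sketched with a single regular expression and the standard `html.entities.name2codepoint` table. This is a hedged Python 3 sketch: `stripentities_sketch`, the pattern, and the choice to leave unknown names untouched are all assumptions, not Genshi's behaviour.

```python
import re
from html.entities import name2codepoint

# Decimal (&#8230;), hexadecimal (&#x2026;) or named (&hellip;) references.
_ENTITY = re.compile(r'&(#[0-9]+|#[xX][0-9a-fA-F]+|\w+);')
_XML_ENTITIES = {'amp', 'apos', 'gt', 'lt', 'quot'}
# name2codepoint covers HTML 4 names only, so 'apos' is added by hand.
_NAMES = dict(name2codepoint, apos=39)


def stripentities_sketch(text, keepxmlentities=False):
    """Replace entity references in text with the characters they name."""
    def replace(match):
        ref = match.group(1)
        if ref.startswith('#'):
            digits = ref[1:]
            if digits[0] in 'xX':
                return chr(int(digits[1:], 16))
            return chr(int(digits))
        if keepxmlentities and ref in _XML_ENTITIES:
            return match.group(0)  # leave the core XML entity as written
        if ref in _NAMES:
            return chr(_NAMES[ref])
        return match.group(0)  # assumption: unknown names stay untouched
    return _ENTITY.sub(replace, text)


print(stripentities_sketch('1 &lt; 2 &hellip;', keepxmlentities=True))
```

Keeping the core XML entities is useful when the result will be embedded in markup again, since a decoded `<` or `&` would otherwise need re-escaping.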
genshi.util.striptags(text)
Return a copy of the text with any XML/HTML tags removed.
>>> striptags('<span>Foo</span> bar')
'Foo bar'
>>> striptags('<span class="bar">Foo</span>')
'Foo'
>>> striptags('Foo<br />')
'Foo'
HTML/XML comments are stripped, too:
>>> striptags('<!-- <blub>hehe</blah> -->test')
'test'
Parameters: text – the string to remove tags from
Returns: the text with tags removed
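Tag and comment removal of this kind can be approximated with one regular expression. The sketch below is an assumption-laden illustration (the name `striptags_sketch` and the pattern are hypothetical, and regular expressions are not a full HTML parser), but it reproduces the examples above.

```python
import re

# Try the comment alternative first so that markup inside a comment
# is removed together with the comment itself.
_TAG_OR_COMMENT = re.compile(r'<!--.*?-->|<[^>]*>', re.DOTALL)


def striptags_sketch(text):
    """Return text with comments and tags removed; other text is kept."""
    return _TAG_OR_COMMENT.sub('', text)


print(striptags_sketch('<!-- <blub>hehe</blah> -->test'))
```

Listing the comment pattern before the generic tag pattern is what makes the whole comment disappear, rather than leaving "hehe" behind.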