weboob.browser.filters.html

class weboob.browser.filters.html.CSS(selector=None, default=NO_DEFAULT)

Bases: weboob.browser.filters.base._Selector

Select HTML elements with a CSS selector

For example:

obj_foo = CleanText(CSS('div.main'))

will take the text of all <div> elements that have the CSS class “main”.

Parameters: default – default value in case the filter fails to find or parse the requested value
select(selector, item)
class weboob.browser.filters.html.XPath(selector=None, default=NO_DEFAULT)

Bases: weboob.browser.filters.base._Selector

Select HTML elements with an XPath selector

Parameters: default – default value in case the filter fails to find or parse the requested value
exception weboob.browser.filters.html.XPathNotFound

Bases: weboob.browser.filters.base.FilterError

exception weboob.browser.filters.html.AttributeNotFound

Bases: weboob.browser.filters.base.FilterError

class weboob.browser.filters.html.Attr(selector, attr, default=NO_DEFAULT)

Bases: weboob.browser.filters.base.Filter

Get the text value of an HTML attribute.

Returns the value of the attribute attr of the HTML element matched by selector.

For example:

obj_foo = Attr('//img[@id="thumbnail"]', 'src')

will take the “src” attribute of the <img> element whose “id” is “thumbnail”.

Parameters:
  • selector – selector targeting the element
  • attr – name of the attribute to take
filter(value)
Raises: XPathNotFound if no element is found
Raises: AttributeNotFound if the element doesn’t have the requested attribute
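
The two failure cases above can be illustrated with a small standard-library sketch. This is not weboob's implementation: get_attr and both exception classes are hypothetical stand-ins defined here, xml.etree.ElementTree replaces the real HTML parser, and taking the first matching element is an assumption.

```python
import xml.etree.ElementTree as ET


class XPathNotFound(Exception):
    """Hypothetical stand-in for weboob's XPathNotFound."""


class AttributeNotFound(Exception):
    """Hypothetical stand-in for weboob's AttributeNotFound."""


def get_attr(root, path, attr):
    """Return attribute `attr` of the first element matching `path`."""
    matches = root.findall(path)
    if not matches:
        # No element matched the selector at all.
        raise XPathNotFound("no element matches %r" % path)
    element = matches[0]  # assumption: only the first match is used
    if attr not in element.attrib:
        # The element exists but lacks the requested attribute.
        raise AttributeNotFound("element has no %r attribute" % attr)
    return element.attrib[attr]


doc = ET.fromstring(
    '<div><img id="thumbnail" src="/thumb.png"/><img id="banner"/></div>'
)
thumbnail_src = get_attr(doc, ".//img[@id='thumbnail']", "src")
print(thumbnail_src)
```

Keeping the two exceptions distinct lets a caller tell "the page layout changed" (no element) apart from "the element is there but incomplete" (no attribute).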

class weboob.browser.filters.html.Link(selector=None, default=NO_DEFAULT)

Bases: weboob.browser.filters.html.Attr

Get the link URI of an element.

If the <a> tag is not found, an IndexError exception is raised.

class weboob.browser.filters.html.AbsoluteLink(selector=None, default=NO_DEFAULT)

Bases: weboob.browser.filters.html.Link

Get the absolute link URI of an element.
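
As a rough illustration of what "absolute link URI" means, the sketch below resolves a few href values against a page URL with the standard library. The page URL and the hrefs are made up for the example, and resolving with urljoin is an assumption about the behaviour, not a statement of how the filter is written.

```python
from urllib.parse import urljoin

# Hypothetical URL of the page the links were found on.
page_url = "https://example.com/articles/index.html"

# Hypothetical href values as they might appear in <a> tags.
hrefs = [
    "page2.html",               # relative to the current directory
    "/about",                   # relative to the site root
    "../images/logo.png",       # relative to the parent directory
    "https://other.example/x",  # already absolute
]

# Resolve every href against the page URL.
absolute = [urljoin(page_url, href) for href in hrefs]

for href, url in zip(hrefs, absolute):
    print(href, "->", url)
```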

class weboob.browser.filters.html.CleanHTML(selector=None, options=None, default=NO_DEFAULT)

Bases: weboob.browser.filters.base.Filter

Convert HTML to text (Markdown) using html2text.

See also

html2text site

Parameters: options (dict) – options suitable for html2text
classmethod clean(txt, options=None)
filter(value)
class weboob.browser.filters.html.FormValue(selector=None, default=NO_DEFAULT)

Bases: weboob.browser.filters.base.Filter

Extract a Python value from a form element.

Checkboxes and radio buttons return booleans, while other elements return text. For <select> tags, the filter returns the user-visible text of the option rather than its value attribute.

Parameters: default – default value in case the filter fails to find or parse the requested value
filter(value)
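
The mapping from form elements to Python values can be sketched as follows. This is a simplified stand-in, not the library code: form_value is a hypothetical helper, xml.etree.ElementTree replaces the real HTML parser, and falling back to the first option when none is selected is an assumption.

```python
import xml.etree.ElementTree as ET


def form_value(element):
    """Map a form element to a Python value (simplified sketch)."""
    if element.tag == "input" and element.get("type") in ("checkbox", "radio"):
        # Checkboxes and radio buttons become booleans.
        return "checked" in element.attrib
    if element.tag == "select":
        options = element.findall(".//option")
        selected = [o for o in options if "selected" in o.attrib]
        # Assumption: with nothing selected, fall back to the first option.
        chosen = selected[0] if selected else options[0]
        # Return the user-visible text, not the value attribute.
        return (chosen.text or "").strip()
    # Everything else is treated as a text-like input.
    return element.get("value", "")


form = ET.fromstring(
    "<form>"
    '<input name="news" type="checkbox" checked="checked"/>'
    '<input name="spam" type="checkbox"/>'
    '<input name="user" type="text" value="alice"/>'
    '<select name="lang">'
    '<option value="fr">French</option>'
    '<option value="en" selected="selected">English</option>'
    "</select>"
    "</form>"
)
values = {el.get("name"): form_value(el) for el in form}
print(values)
```

Note that the select yields "English" (the visible text), not "en" (the submitted value).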
class weboob.browser.filters.html.HasElement(selector, yesvalue=True, novalue=False)

Bases: weboob.browser.filters.base.Filter

Returns yesvalue if the selector finds elements, novalue otherwise.

filter(value)
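
The yesvalue/novalue behaviour amounts to a truth test on the selector's matches. A minimal sketch, assuming a hypothetical has_element helper and using xml.etree.ElementTree in place of the real parser:

```python
import xml.etree.ElementTree as ET


def has_element(root, path, yesvalue=True, novalue=False):
    """Return yesvalue if `path` matches at least one element, else novalue."""
    return yesvalue if root.findall(path) else novalue


page = ET.fromstring('<div><span class="error">Bad password</span></div>')

# Default values: plain booleans.
login_failed = has_element(page, ".//span[@class='error']")

# Custom values: any pair of objects can stand for "found" / "not found".
status = has_element(page, ".//table", yesvalue="has table", novalue="no table")

print(login_failed, status)
```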
class weboob.browser.filters.html.TableCell(*names, **kwargs)

Bases: weboob.browser.filters.base._Filter

Used with TableElement, gets the cell element from its name.

For example:

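>>> from weboob.capabilities.bank import Transaction
>>> from weboob.browser.elements import TableElement, ItemElement
>>> from weboob.browser.filters.standard import CleanText, Date
>>> from weboob.browser.filters.html import TableCell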
>>> class table(TableElement):
...     head_xpath = '//table/thead/th'
...     item_xpath = '//table/tbody/tr'
...     col_date =    u'Date'
...     col_label =   [u'Name', u'Label']
...     class item(ItemElement):
...         klass = Transaction
...         obj_date = Date(TableCell('date'))
...         obj_label = CleanText(TableCell('label'))
...
exception weboob.browser.filters.html.ColumnNotFound

Bases: weboob.browser.filters.base.FilterError
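
The name-based cell lookup in the TableCell example can be pictured with plain Python lists. The sketch below is an illustration under assumptions, not weboob's code: build_column_index and table_cell are hypothetical helpers, the ColumnNotFound class is a local stand-in, header titles are matched exactly, and raising ColumnNotFound when no requested name is known is presumed rather than documented above.

```python
class ColumnNotFound(Exception):
    """Hypothetical stand-in for weboob's ColumnNotFound."""


def build_column_index(headers, columns):
    """Map each logical column name to the position of its header."""
    index = {}
    for name, titles in columns.items():
        # A column may accept one title or a list of alternative titles.
        accepted = titles if isinstance(titles, list) else [titles]
        for position, header in enumerate(headers):
            if header in accepted:
                index[name] = position
                break
    return index


def table_cell(row, index, *names):
    """Return the cell for the first of `names` present in the table."""
    for name in names:
        if name in index:
            return row[index[name]]
    raise ColumnNotFound("none of %r is a known column" % (names,))


# Mirrors col_date and col_label from the example above.
headers = ["Date", "Label", "Amount"]
columns = {"date": "Date", "label": ["Name", "Label"]}
index = build_column_index(headers, columns)

row = ["2015-03-01", "Coffee", "-2.50"]
print(table_cell(row, index, "date"), table_cell(row, index, "label"))
```

Declaring alternatives such as ['Name', 'Label'] lets one table definition cope with pages that title the same column differently.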