woob.browser.elements

class AbstractElement(*args, **kwargs)[source]

Bases: object

condition: None | bool | _Filter | Callable[[], Any] = None

The condition to parse the element.

This allows ignoring certain elements if certain fields are not valid, or if the element should actually be parsed using another class.

This property can be defined as:

  • None or True, to signify that the element should always be parsed.

  • False, to signify that the element should never be parsed.

  • A filter returning a falsy or non-falsy object, evaluated with the constructed document section (HTML element or JSON data) for the element.

  • A method returning a falsy or non-falsy object, evaluated with the element object directly.
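The four forms above can be illustrated with a small stand-alone sketch. It does not import woob: FakeFilter and resolve_condition are hypothetical names invented for this illustration, and the dispatch order is an assumption about how such a condition could be resolved, not a copy of what check_condition() does internally.

```python
from typing import Any, Callable


class FakeFilter:
    """Hypothetical stand-in for a woob filter (not the real _Filter class).

    It is evaluated with the document section (HTML element or JSON data).
    """

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __call__(self, section: Any) -> Any:
        return self.func(section)


def resolve_condition(condition: Any, element: Any, section: Any) -> bool:
    """Return True when the element should be parsed (assumed semantics)."""
    # None or True: always parse.
    if condition is None or condition is True:
        return True
    # False: never parse.
    if condition is False:
        return False
    # A filter: evaluated with the constructed document section.
    if isinstance(condition, FakeFilter):
        return bool(condition(section))
    # Anything else is treated as a method: evaluated with the element itself.
    return bool(condition(element))


class DummyElement:
    """Hypothetical element object carrying a flag read by a method condition."""

    ready = 0


section = {"status": "active"}
element = DummyElement()

always = resolve_condition(None, element, section)
never = resolve_condition(False, element, section)
by_filter = resolve_condition(
    FakeFilter(lambda data: data.get("status") == "active"), element, section
)
by_method = resolve_condition(lambda el: el.ready, element, section)
```

In this sketch the method form receives the element explicitly; on a real element class the condition would typically be written as a regular method taking self.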

use_selector(func, key=None)[source]
parse(obj)[source]
cssselect(*args, **kwargs)[source]
xpath(*args, **kwargs)[source]
handle_loaders()[source]
fill_env(page, parent=None)[source]
check_condition()[source]

Get whether our condition is respected or not.

exception DataError[source]

Bases: Exception

Data returned from pages is incoherent.

class DictElement(*args, **kwargs)[source]

Bases: ListElement

find_elements()[source]

Get the nodes that will have to be processed. This method can be overridden if xpath filters are not sufficient.

class ItemElement(*args, **kwargs)[source]

Bases: AbstractElement

klass: Type | None = None
validate: Callable[[Any], bool] | None = None
skip_optional_fields_errors: bool = False
item_xpath: str | None = None

The xpath to reroot the element in.

This will be evaluated only once the object and environment are set, so it can be defined as a property using self.obj and self.env, i.e. call parameters.
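As a rough illustration of why late evaluation matters, the sketch below defines item_xpath as a property that reads a value from the environment. It is a stand-alone approximation: AccountSketch, its reroot() helper and the account_id key are hypothetical, and it uses the limited XPath subset of the standard library's xml.etree.ElementTree instead of lxml.

```python
import xml.etree.ElementTree as ET


class AccountSketch:
    """Hypothetical stand-in for an ItemElement subclass (not woob code)."""

    def __init__(self, env):
        # In this sketch, env plays the role of the call parameters.
        self.env = env

    @property
    def item_xpath(self):
        # Computed lazily, so it can depend on self.env.
        return f".//account[@id='{self.env['account_id']}']"

    def reroot(self, el):
        # Assumed behaviour: continue parsing from the node matched by item_xpath.
        return el.find(self.item_xpath)


doc = ET.fromstring(
    "<accounts>"
    "<account id='1'><label>Checking</label></account>"
    "<account id='2'><label>Savings</label></account>"
    "</accounts>"
)

node = AccountSketch({"account_id": "2"}).reroot(doc)
label = node.findtext("label")
```

Because the property is only read after env is available, the same class can be rerooted on a different node for each call.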

class Index[source]

Bases: object

build_object()[source]
should_highlight()[source]
Return type:

bool

reroot_element(el)[source]

Reroot a given element for parsing a given object.

This will be called once the object and environment are set.

Return type:

Any

handle_attr(key, func)[source]
class ItemElementFromAbstractPage[source]

Bases: object

Don’t use this class; import woob_modules.other_module.etc instead.

class ListElement(*args, **kwargs)[source]

Bases: AbstractElement

item_xpath = None
empty_xpath = None
flush_at_end = False
ignore_duplicate = False
find_elements()[source]

Get the nodes that will have to be processed. This method can be overridden if xpath filters are not sufficient.
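A minimal sketch of such an override follows, assuming a base implementation that yields the nodes matched by item_xpath (or the root node when it is None). ListSketch and NonEmptyRows are hypothetical classes written for this example with the standard library's xml.etree.ElementTree; they are not woob's ListElement.

```python
import xml.etree.ElementTree as ET


class ListSketch:
    """Hypothetical stand-in for a list element (not woob's ListElement)."""

    item_xpath = None

    def __init__(self, el):
        self.el = el

    def find_elements(self):
        # Assumed default: one node per item_xpath match, else the root itself.
        if self.item_xpath is None:
            yield self.el
        else:
            yield from self.el.findall(self.item_xpath)


class NonEmptyRows(ListSketch):
    item_xpath = ".//tr"

    def find_elements(self):
        # Extra filtering that a path expression alone does not express here:
        # keep only rows whose first cell contains non-blank text.
        for row in super().find_elements():
            if (row.findtext("td") or "").strip():
                yield row


doc = ET.fromstring(
    "<table>"
    "<tr><td>a</td></tr>"
    "<tr><td> </td></tr>"
    "<tr><td>b</td></tr>"
    "</table>"
)

cells = [row.findtext("td") for row in NonEmptyRows(doc).find_elements()]
```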

flush()[source]
check_next_page()[source]
store(obj)[source]
class MetaAbstractItemElement(name, bases, dct)[source]

Bases: type

exception SkipItem[source]

Bases: Exception

Raise this exception in an ItemElement subclass to skip an item.
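The control flow can be sketched without woob as follows. The SkipItem class below is a local stand-in with the same name, and parse_row and iter_items are hypothetical helpers; the point is only that raising the exception drops the current item while iteration continues.

```python
class SkipItem(Exception):
    """Local stand-in for woob.browser.elements.SkipItem."""


def parse_row(row):
    # Hypothetical parsing step: rows without an amount are not worth keeping.
    if row.get("amount") is None:
        raise SkipItem()
    return {"label": row["label"], "amount": row["amount"]}


def iter_items(rows):
    # Assumed behaviour of the surrounding list: a skipped item is ignored
    # and the next row is processed normally.
    for row in rows:
        try:
            yield parse_row(row)
        except SkipItem:
            continue


rows = [
    {"label": "rent", "amount": 700},
    {"label": "pending", "amount": None},
    {"label": "salary", "amount": 2100},
]

items = list(iter_items(rows))
```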

class TableElement(*args, **kwargs)[source]

Bases: ListElement

head_xpath = None
cleaner

alias of CleanText

get_colnum(name)[source]
generate_table_element(doc, head_xpath, cleaner=CleanText)[source]

Prints generated base code for TableElement/TableCell usage. It is intended for development purposes, typically in woob-debug.

Parameters:

  • doc: lxml tree of the page (e.g. browser.page.doc)

  • head_xpath (str): xpath of header columns (e.g. //table//th)

  • cleaner (Filter): cleaner class (Filter) (default: CleanText)

magic_highlight(els, open_browser=True)[source]

Open a web browser with the document loaded and the given elements highlighted.

method(klass)[source]

Class decorator that allows the decorated class to be called as a method.
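The idea can be sketched in a few lines of plain Python. This is a simplified illustration and may differ from woob's actual implementation: the method function below is a local re-creation, and Element, Page and iter_words are hypothetical names.

```python
def method(klass):
    """Sketch of a class decorator exposing a class as a method."""

    def inner(self, *args, **kwargs):
        # Instantiate the class with the owner (here, the page), then call it.
        return klass(self)(*args, **kwargs)

    # Keep a reference to the wrapped class for introspection.
    inner.klass = klass
    return inner


class Element:
    """Hypothetical callable element bound to a page."""

    def __init__(self, page):
        self.page = page

    def __call__(self, prefix=""):
        return [prefix + word for word in self.page.words]


class Page:
    """Hypothetical page holding the data the element reads."""

    def __init__(self, words):
        self.words = words

    @method
    class iter_words(Element):
        pass


page = Page(["a", "b"])
result = page.iter_words(prefix="> ")
```

Since the decorator returns a plain function, it is bound like any other method when looked up on a Page instance, so page.iter_words(...) builds the element and calls it in one step.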