class weboob.browser.browsers.APIBrowser(baseurl=None, *args, **kwargs)

Bases: weboob.browser.browsers.DomainBrowser

A browser for API websites.

build_request(*args, **kwargs)
open(*args, **kwargs)

Do a JSON request.

The “Content-Type” header is always set to “application/json”.

  • data (dict) – if specified, format as JSON and send as request body
  • headers (dict) – if specified, add these headers to the request
request(*args, **kwargs)

Do a JSON request and parse the response.

Returns: a dict containing the parsed JSON server response
Return type: dict
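
The JSON convention described for open() and request() can be illustrated without weboob itself. The sketch below is not the library's implementation: prepare_json_call and parse_json_response are hypothetical helper names, and the code only mirrors the documented behavior (Content-Type forced to application/json, extra headers merged in, data serialized as the request body, response text parsed as JSON).

```python
import json


def prepare_json_call(data=None, headers=None):
    """Return (headers, body) for a JSON request (hypothetical helper)."""
    merged = dict(headers or {})
    # The Content-Type header is always set, whatever the caller passed.
    merged["Content-Type"] = "application/json"
    # Only serialize a body when data was actually given.
    body = json.dumps(data) if data is not None else None
    return merged, body


def parse_json_response(text):
    """Parse the server response text into Python objects."""
    return json.loads(text)


headers, body = prepare_json_call({"query": "weboob"}, {"X-Token": "secret"})
print(headers)
print(body)
print(parse_json_response('{"ok": true}'))
```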
class weboob.browser.browsers.AbstractBrowser(logger=None, proxy=None, responses_dirname=None, weboob=None)

Bases: weboob.browser.browsers.Browser

AbstractBrowser allows inheriting from a browser defined in another module.

Websites can share many pages and much of their code base. This class makes it possible to load a browser provided by another module and to build our own browser on top of it (like standard Python inheritance). Weboob will download and install the PARENT module for you.

PARENT is a mandatory attribute: it is the name of the module providing the parent Browser.

PARENT_ATTR is an optional attribute used when the parent module does not have exactly one browser defined as its BROWSER class attribute: it lets you customize the path of the object to load.

Note that you must pass a valid weboob instance as first argument of the constructor.

exception weboob.browser.browsers.AbstractBrowserMissingParentError

Bases: exceptions.Exception

class weboob.browser.browsers.Browser(logger=None, proxy=None, responses_dirname=None, weboob=None)

Bases: object

Simple browser class. Acts like a browser and doesn’t try to do too much.

PROFILE = <weboob.browser.profiles.Firefox object>
REFRESH_RE = <_sre.SRE_Pattern object>
TIMEOUT = 10.0
classmethod asset(localfile)

Absolute file path for a module local file.

async_open(url, **kwargs)

Shortcut to open(url, is_async=True).

build_request(url, referrer=None, data_encoding=None, **kwargs)

Does the same job as open(), but returns a Request without submitting it. This allows further customization of the Request.

get_referrer(oldurl, newurl)

Get the referrer to send when doing a request. If we should not send a referrer, it will return None.

Reference: https://en.wikipedia.org/wiki/HTTP_referer

The behavior can be controlled through the ALLOW_REFERRER attribute: True always allows the referrer to be sent, False never does, and None sends it only when the target is within the same domain.
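
The three ALLOW_REFERRER settings can be sketched as a small standalone function. This is an illustration under assumptions, not weboob's code: pick_referrer is a hypothetical name, "same domain" is approximated by comparing the network location of both URLs, and the real method may apply further checks.

```python
from urllib.parse import urlparse


def pick_referrer(oldurl, newurl, allow_referrer=None):
    """Return the referrer to send, or None (hypothetical helper)."""
    # No current URL, or referrers disabled: send nothing.
    if oldurl is None or allow_referrer is False:
        return None
    # Referrers always allowed.
    if allow_referrer is True:
        return oldurl
    # allow_referrer is None: only send within the same domain
    # (assumption: "same domain" means identical netloc).
    if urlparse(oldurl).netloc == urlparse(newurl).netloc:
        return oldurl
    return None


print(pick_referrer("http://a.org/x", "http://a.org/y"))
print(pick_referrer("http://a.org/x", "http://b.org/y"))
```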

  • oldurl (str or None) – Current absolute URL
  • newurl (str) – Target absolute URL
Return type: str or None


Called by open() to handle the Refresh HTTP header.

It only redirects to the refresh URL if the sleep time is less than REFRESH_MAX.
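
A minimal sketch of that rule, assuming a header value of the form "5; url=http://example.org/next". The pattern and the refresh_target name below are assumptions made for illustration; the actual REFRESH_RE used by Browser is not shown in this reference.

```python
import re

# Assumed pattern for values such as "5; url=http://example.org/next".
REFRESH_PATTERN = re.compile(
    r"^\s*(?P<sleep>\d+(?:\.\d+)?)\s*;\s*url=(?P<url>\S+)\s*$",
    re.IGNORECASE,
)


def refresh_target(header_value, refresh_max):
    """Return the URL to follow, or None if the refresh is ignored."""
    match = REFRESH_PATTERN.match(header_value)
    if match is None:
        return None
    # Follow the refresh only when the delay is below the limit.
    if float(match.group("sleep")) >= refresh_max:
        return None
    return match.group("url")


print(refresh_target("5; url=http://example.org/next", 10.0))
print(refresh_target("30; url=http://example.org/next", 10.0))
```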

location(url, **kwargs)

Like open() but also changes the current URL and response. This is the most common method to request web pages.

Other than that, it has exactly the same behavior as open().

open(url, referrer=None, allow_redirects=True, stream=None, timeout=None, verify=None, cert=None, proxies=None, data_encoding=None, is_async=False, callback=<function <lambda>>, **kwargs)
Make an HTTP request like a browser does:
  • follow redirects (unless disabled)
  • provide referrers (unless disabled)

Unless a method is explicitly provided, it makes a GET request, or a POST if data is not None. Empty data (not None, such as ‘’ or {}) still makes a POST.

It is a wrapper around session.request(), and all session.request() options are available. Use location() or open() rather than session.request(): they add some useful behavior, each part of which can be disabled individually through the arguments.
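
The distinction between "no data" and "empty data" is easy to get wrong, so here is a small sketch of the selection rule. choose_method is a hypothetical helper, not part of weboob; it only restates the rule above in code.

```python
def choose_method(data=None, method=None):
    """Pick the HTTP method the way the rule above describes it."""
    # An explicit method always wins.
    if method is not None:
        return method
    # The test is "is not None", not truthiness: '' and {} mean POST.
    return "POST" if data is not None else "GET"


print(choose_method())
print(choose_method(data=""))
print(choose_method(data={}, method="PUT"))
```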

Call this instead of location() if you do not want to “visit” the URL (for instance, you are downloading a file).

When is_async is True, open() returns a Future object (see concurrent.futures for more details), which can be evaluated with its result() method. If any exception is raised while processing the request, it is caught and re-raised when calling result().

For example:

>>> Browser().open('http://google.com', is_async=True).result().text 
  • url (str or dict or None) – URL
  • data – POST data
  • referrer (str or False or None) – Force referrer. False to disable sending it, None for guessing
  • is_async (bool) – Process request in a non-blocking way
  • callback (function) – Callback to be called when request has finished, with response as its first and only argument
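
The Future semantics that is_async relies on can be tried with the standard library alone. In this sketch fake_request is a hypothetical stand-in for a real HTTP call and a ThreadPoolExecutor is assumed as the executor; it shows that result() blocks until completion and that an exception raised in the worker is re-raised by result().

```python
from concurrent.futures import ThreadPoolExecutor


def fake_request(url):
    """Stand-in for a real HTTP request (hypothetical)."""
    if not url.startswith("http"):
        raise ValueError("unsupported URL: %s" % url)
    return "response for %s" % url


with ThreadPoolExecutor(max_workers=2) as pool:
    ok = pool.submit(fake_request, "http://example.org")
    bad = pool.submit(fake_request, "ftp://example.org")

    # result() blocks until the request has finished.
    print(ok.result())
    try:
        bad.result()
    except ValueError as exc:
        # The exception raised in the worker is re-raised here.
        print("failed:", exc)
```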
Return type:



Get a prepared request from a Request object.

This method is meant to be overridden by child classes.


Like Response.raise_for_status but will use other classes if needed.

save_response(response, warning=False, **kwargs)
set_normalized_url(response, **kwargs)
class weboob.browser.browsers.DomainBrowser(baseurl=None, *args, **kwargs)

Bases: weboob.browser.browsers.Browser

A browser that handles relative URLs and can have a base URL (usually a domain).

For instance, self.location(‘/hello’) will get http://weboob.org/hello if BASEURL is ‘http://weboob.org/’.

absurl(uri, base=None)

Get the absolute URL, relative to a base URL. If base is None, it will try to use the current URL. If there is no current URL, it will try to use BASEURL.

If base is False, it will always try to use the current URL. If base is True, it will always try to use BASEURL.
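
The base-selection rules can be sketched with urllib.parse.urljoin. The resolve function and its current_url and baseurl parameters are hypothetical stand-ins for the browser's current URL and its BASEURL attribute; returning the URI unchanged when no base is available is a choice made for this sketch only.

```python
from urllib.parse import urljoin


def resolve(uri, base=None, current_url=None, baseurl=None):
    """Make uri absolute following the rules above (hypothetical helper)."""
    if base is None:
        # Prefer the current URL, fall back to BASEURL.
        base = current_url if current_url is not None else baseurl
    elif base is False:
        base = current_url
    elif base is True:
        base = baseurl
    if base is None:
        # Nothing to resolve against (sketch's own choice).
        return uri
    # urljoin leaves already-absolute URIs untouched.
    return urljoin(base, uri)


print(resolve("/hello", baseurl="http://weboob.org/"))
print(resolve("page2", base="http://a.org/dir/page1"))
```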

  • uri (str) – URI to make absolute. It can be already absolute.
  • base (str or None or False or True) – Base absolute URL.
Return type:



Go to the “home” page, usually the BASEURL.

open(req, *args, **kwargs)

Like Browser.open() but handles URLs without domains, using the BASEURL attribute.


Checks if we are allowed to visit a URL. See RESTRICT_URL.

Parameters: url (str) – Absolute URL
Return type: bool
class weboob.browser.browsers.LoginBrowser(username, password, *args, **kwargs)

Bases: weboob.browser.browsers.PagesBrowser

A browser which supports login.


Abstract method to implement in order to log in to the website.

It is called when a login is needed.


Log out from the website.

By default, simply clears the cookies.

class weboob.browser.browsers.PagesBrowser(*args, **kwargs)

Bases: weboob.browser.browsers.DomainBrowser

A browser which works with pages and keeps the navigation state.

To use it, subclass it and create URL objects as class attributes. When open() or location() is called and the URL matches one of the URL objects, it returns a Page object. In the case of location(), the page is also stored in self.page.


>>> from .pages import HTMLPage
>>> class HomePage(HTMLPage):
...     pass
>>> class ListPage(HTMLPage):
...     pass
>>> class MyBrowser(PagesBrowser):
...     BASEURL = 'http://example.org'
...     home = URL(r'/(index\.html)?', HomePage)
...     list = URL(r'/list\.html', ListPage)

You can then use URL instances to go on pages.
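
The idea of mapping URL patterns to page classes can be illustrated with plain regular expressions. This is only a sketch of the dispatch principle: ROUTES and page_for are hypothetical names, the page classes are empty placeholders, and weboob's URL objects do considerably more than this.

```python
import re


class HomePage:
    pass


class ListPage:
    pass


# (pattern, page class) pairs, playing the role of URL class attributes.
ROUTES = [
    (re.compile(r"/(index\.html)?$"), HomePage),
    (re.compile(r"/list\.html$"), ListPage),
]


def page_for(path):
    """Return a page instance for the first matching pattern, else None."""
    for pattern, page_class in ROUTES:
        if pattern.match(path):
            return page_class()
    return None


print(type(page_for("/index.html")).__name__)
print(type(page_for("/list.html")).__name__)
print(page_for("/unknown"))
```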

location(*args, **kwargs)

Same method as weboob.browser.browsers.Browser.location(), but if the URL matches any URL object, an attribute page is added to the response, and the attribute PagesBrowser.page is set.

open(*args, **kwargs)

Same method as weboob.browser.browsers.DomainBrowser.open(), but the response contains an attribute page if the URL matches any URL object.

pagination(func, *args, **kwargs)

This helper function can be used to handle paginated pages easily.

When the called function raises a NextPage exception, the browser goes to the requested page and calls the function again.

The NextPage constructor accepts a URL or a Request object.

>>> from .pages import HTMLPage
>>> class Page(HTMLPage):
...     def iter_values(self):
...         for el in self.doc.xpath('//li'):
...             yield el.text
...         for next in self.doc.xpath('//a'):
...             raise NextPage(next.attrib['href'])
>>> class Browser(PagesBrowser):
...     BASEURL = 'https://people.symlink.me'
...     list = URL(r'/~rom1/projects/weboob/list-(?P<pagenum>\d+).html', Page)
>>> b = Browser()
>>> b.list.go(pagenum=1) 
<weboob.browser.browsers.Page object at 0x...>
>>> list(b.pagination(lambda: b.page.iter_values()))
['One', 'Two', 'Three', 'Four']
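
The control flow behind pagination() can be reproduced without a network. The sketch below is an assumption-laden illustration rather than weboob's implementation: this NextPage class, paginate, and the in-memory PAGES table are all hypothetical, and "going to a page" is reduced to a dictionary lookup.

```python
class NextPage(Exception):
    """Signals that iteration should continue on another page."""

    def __init__(self, request):
        super().__init__()
        self.request = request


def paginate(fetch, func, start):
    """Yield items from func(page), following NextPage until exhausted."""
    page = fetch(start)
    while True:
        try:
            for item in func(page):
                yield item
        except NextPage as exc:
            # Go to the requested page and call the function again.
            page = fetch(exc.request)
        else:
            # The function finished without asking for another page.
            return


# Fake "site": page number -> (values, next page number or None).
PAGES = {1: (["One", "Two"], 2), 2: (["Three", "Four"], None)}


def iter_values(page):
    values, next_page = page
    for value in values:
        yield value
    if next_page is not None:
        raise NextPage(next_page)


result = list(paginate(PAGES.__getitem__, iter_values, 1))
print(result)
```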
class weboob.browser.browsers.StatesMixin

Bases: object

Mixin to store the state of the browser.

exception weboob.browser.browsers.UrlNotAllowed

Bases: exceptions.Exception

Raised by DomainBrowser when RESTRICT_URL is set and an attempt is made to visit a URL that does not match BASEURL.


Decorator used to require being logged in before the decorated function is called.
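
A login-requiring decorator of this kind can be sketched as follows. The names require_login and DemoBrowser, and the logged attribute used to track the session state, are assumptions made for this example; how weboob actually detects a logged-in state is not described in this reference.

```python
import functools


def require_login(func):
    """Call browser.do_login() first if the browser is not logged in."""

    @functools.wraps(func)
    def wrapper(browser, *args, **kwargs):
        # 'logged' is a hypothetical flag kept by the demo browser.
        if not browser.logged:
            browser.do_login()
        return func(browser, *args, **kwargs)

    return wrapper


class DemoBrowser:
    def __init__(self):
        self.logged = False
        self.login_calls = 0

    def do_login(self):
        self.login_calls += 1
        self.logged = True

    @require_login
    def get_accounts(self):
        return ["account-1"]


browser = DemoBrowser()
print(browser.get_accounts())
print(browser.login_calls)
```

Calling get_accounts() a second time in this sketch does not trigger another login, because the flag is already set.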