woob.browser.filters.javascript

class JSPayload(selector=None, default=_NO_DEFAULT)

Bases: Filter

Get JavaScript code from a tag’s text, with all comments removed.

The code is filtered so that corner cases are handled, such as comment markers inside string literals and comments nested inside other comments.

The following snippet is borrowed from <https://ostermiller.org/findcomment.html>:

>>> JSPayload.filter('''someString = "An example comment: /* example */";
...
... // The comment around this code has been commented out.
... // /*
... some_code();
... // */''')
'someString = "An example comment: /* example */";\n\nsome_code();\n'
classmethod filter(value)

This method has to be overridden by child classes.
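
As a rough illustration of why string literals need special care, the sketch below strips comments with a single regular expression that tries string literals first, so comment markers inside them are left alone. It is not woob's implementation: `strip_js_comments` and `_TOKEN` are hypothetical names, JavaScript regex literals and template strings are not handled, and the whitespace left behind may differ from what JSPayload returns.

```python
import re

# Hypothetical helper, not part of woob: one alternation that matches either
# a quoted string literal or a comment. Strings are tried first, so a "/*"
# inside a string is consumed as part of the string, not as a comment.
_TOKEN = re.compile(r'''
    (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<comment>/\*.*?\*/|//[^\n]*)
''', re.VERBOSE | re.DOTALL)


def strip_js_comments(code):
    """Return code with comments dropped and string literals kept verbatim."""
    # Keep the matched string literal; replace a matched comment with nothing.
    return _TOKEN.sub(lambda m: m.group('string') or '', code)


sample = '''someString = "An example comment: /* example */";

// The comment around this code has been commented out.
// /*
some_code();
// */'''

cleaned = strip_js_comments(sample)
```

Because the line comment `// /*` is matched as a whole before the block-comment opener is ever considered, `some_code();` survives, which is the same corner case the doctest above exercises.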

class JSValue(selector=None, need_type=None, **kwargs)

Bases: Regexp

Get one or many JavaScript literals.

It only understands literal values, but should parse them well. Values are converted to Python values; the enclosing quotes and the escaping backslashes in strings are stripped.

>>> JSValue().filter('boringVar = "boring string"')
u'boring string'
>>> JSValue().filter('somecode(); doConfuse(0xdead, cat);')
57005
>>> JSValue(need_type=int, nth=2).filter('fazboo("3", "5", 7, "9");')
7
>>> JSValue(nth='*').filter('foo([1, 2, 3], "blah", 5.0, true, null]);')
[1, 2, 3, u'blah', 5.0, True, None]
pattern = '(?x)\n        (?:(?P<float>(?:[-+]\\s*)?                     # float ?\n               (?:(?:\\d+\\.\\d*|\\d*\\.\\d+)(?:[eE]\\d+)?\n                 |\\d+[eE]\\d+))\n          |(?P<int>(?:[-+]\\s*)?(?:0[bB][01]+          # int ?\n                                 |0[oO][0-7]+\n                                 |0[xX][0-9a-fA-F]+\n                                 |\\d+))\n          |(?:(?:(?:new\\s+)?String\\()?(?P<str>(?:(?<!\\\\)"(?:\\\\"|[^"])*"|(?<!\\\\)\'(?:\\\\\'|[^\'])*\')))  # str ?\n          |(?P<bool>true|false)                       # bool ?\n          |(?P<None>null))                            # None ?\n    '
to_python(m)

Convert a match object to a Python value.
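
The conversion can be pictured roughly as follows: the named group that matched decides which Python type is produced. This is a simplified sketch, not woob's code. `LITERAL` is a reduced version of the pattern above (no binary or octal integers, no `String(...)` wrapper, and the `None` group is renamed `null`), and this standalone `to_python` function is an assumption made for illustration.

```python
import re

# Reduced, hypothetical version of the documented pattern: one named group
# per kind of JavaScript literal, tried in the same order (float before int).
LITERAL = re.compile(r'''
    (?P<float>[-+]?(?:\d+\.\d*|\d*\.\d+)(?:[eE]\d+)?)
  | (?P<int>[-+]?(?:0[xX][0-9a-fA-F]+|\d+))
  | (?P<str>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<bool>true|false)
  | (?P<null>null)
''', re.VERBOSE)


def to_python(match):
    """Convert a match of LITERAL into the corresponding Python value."""
    kind = match.lastgroup          # name of the group that matched
    text = match.group(kind)
    if kind == 'float':
        return float(text)
    if kind == 'int':
        # Hexadecimal literals carry an 0x prefix; everything else is decimal.
        return int(text, 16) if 'x' in text.lower() else int(text)
    if kind == 'str':
        # Drop the enclosing quotes, then remove escaping backslashes.
        return re.sub(r'\\(.)', r'\1', text[1:-1])
    if kind == 'bool':
        return text == 'true'
    return None                     # the 'null' group


values = [to_python(m) for m in
          LITERAL.finditer('foo([1, 2, 3], "blah", 5.0, true, null]);')]
first = to_python(LITERAL.search('somecode(); doConfuse(0xdead, cat);'))
```

Here `values` collects every literal in order, similar in spirit to `nth='*'`, while `first` takes only the first match.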

class JSVar(selector=None, var=None, need_type=None, **kwargs)

Bases: JSValue

Get the value assigned to a variable, either in its initialisation or in a later assignment. Use Regexp’s nth parameter to be more specific.

See JSValue for more details about parsed values.

>>> JSVar(var='test').filter("var test = .1;\nsomecode()")
0.1
>>> JSVar(var='test').filter("test = 666;\nsomecode()")
666
>>> JSVar(var='test').filter("test = 'Some \\'string\\' value, isn\\'t it ?';\nsomecode()")
u"Some 'string' value, isn't it ?"
>>> JSVar(var='test').filter('test = "Some \\"string\\" value";\nsomecode()')
u'Some "string" value'
>>> JSVar(var='test').filter("var test = false;\nsomecode()")
False
>>> JSVar(var='test', nth=1).filter("var test = false; test = true;\nsomecode()")
True
pattern_template = '(?x)\n        (?:var\\s+)?                                   # optional var keyword\n        \\b%s                                          # var name\n        \\s*=\\s*                                       # equal sign\n    (?x)\n        (?:(?P<float>(?:[-+]\\s*)?                     # float ?\n               (?:(?:\\d+\\.\\d*|\\d*\\.\\d+)(?:[eE]\\d+)?\n                 |\\d+[eE]\\d+))\n          |(?P<int>(?:[-+]\\s*)?(?:0[bB][01]+          # int ?\n                                 |0[oO][0-7]+\n                                 |0[xX][0-9a-fA-F]+\n                                 |\\d+))\n          |(?:(?:(?:new\\s+)?String\\()?(?P<str>(?:(?<!\\\\)"(?:\\\\"|[^"])*"|(?<!\\\\)\'(?:\\\\\'|[^\'])*\')))  # str ?\n          |(?P<bool>true|false)                       # bool ?\n          |(?P<None>null))                            # None ?\n    '
filter(txt)
Raises:

RegexpError if the pattern was not found
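
To make the role of `pattern_template` and `nth` more concrete, here is a minimal standalone sketch of the same idea: the variable name is substituted into a template, and every assignment is collected so that one can be picked by index. It does not use woob. `TEMPLATE` and `find_assignments` are hypothetical, the right-hand side is captured as raw text up to the next semicolon or newline instead of being parsed as a literal, and escaping the name with `re.escape` is an assumption rather than documented behaviour.

```python
import re

# Hypothetical, simplified template: optional "var", the variable name on a
# word boundary, an equal sign, then the raw right-hand side.
TEMPLATE = r'(?:var\s+)?\b%s\s*=\s*(?P<value>[^;\n]+)'


def find_assignments(var, code):
    """Return the raw text of every value assigned to `var`, in order."""
    pattern = re.compile(TEMPLATE % re.escape(var))
    return [m.group('value').strip() for m in pattern.finditer(code)]


assignments = find_assignments(
    'test', "var test = false; test = true;\nsomecode()")
second = assignments[1]   # zero-based, like nth=1 in the example above
```

The word boundary keeps a name such as `latest` from being mistaken for `test`; an empty result list corresponds to the case where the real filter raises RegexpError.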