
Commit b5e331f

gpshead, tiran, and mdickinson authored
[3.8] gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96503)
* Correctly pre-check for int-to-str conversion

Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)

The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.

The justification for the current check. The C code check is:

```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```

In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:

$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$

From this it follows that

$$\frac{M}{3L} < \frac{s-1}{10}$$

hence that

$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$

So

$$2^{L(s-1)} > 10^M.$$

But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.

* Issue: gh-95778

Co-authored-by: Gregory P. Smith [Google LLC] <[email protected]>
Co-authored-by: Christian Heimes <[email protected]>
Co-authored-by: Mark Dickinson <[email protected]>
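The inequality chain above can be spot-checked numerically. The following is a minimal sketch, not the C implementation: it assumes 30-bit digits (`PYLONG_SHIFT = 30`, typical for 64-bit builds) and `size_a >= 11`, and the helper names are hypothetical. It confirms that whenever the pre-check fires, even the smallest integer with that many internal digits already exceeds $10^M$.

```python
# Assumption: 30-bit internal digits; the real value is sys.int_info.bits_per_digit.
PYLONG_SHIFT = 30


def precheck_rejects(max_str_digits: int, size_a: int) -> bool:
    """Python mirror of the C expression quoted above (size_a >= 11 assumed)."""
    return max_str_digits // (3 * PYLONG_SHIFT) <= (size_a - 11) // 10


def smallest_magnitude(size_a: int) -> int:
    """Smallest |a| that occupies size_a internal digits: 2**(L*(s-1))."""
    return 1 << (PYLONG_SHIFT * (size_a - 1))


# Collect any (M, s) pair where the pre-check fires for a value <= 10**M.
violations = [
    (max_digits, size_a)
    for max_digits in (640, 4300, 10_000)
    for size_a in range(11, 1500)
    if precheck_rejects(max_digits, size_a)
    and smallest_magnitude(size_a) <= 10 ** max_digits
]
print("violations:", violations)
```

All comparisons are done on integers, so the sketch itself never triggers a decimal string conversion.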
1 parent 4f100fe commit b5e331f

26 files changed, with 885 additions and 23 deletions

Doc/data/python3.8.abi

Lines changed: 4 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -2381,7 +2381,7 @@
23812381
</data-member>
23822382
</class-decl>
23832383
<pointer-type-def type-id='type-id-55' size-in-bits='64' id='type-id-56'/>
2384-
<class-decl name='_is' size-in-bits='21696' is-struct='yes' visibility='default' filepath='./Include/internal/pycore_pystate.h' line='67' column='1' id='type-id-66'>
2384+
<class-decl name='_is' size-in-bits='21760' is-struct='yes' visibility='default' filepath='./Include/internal/pycore_pystate.h' line='67' column='1' id='type-id-66'>
23852385
<data-member access='public' layout-offset-in-bits='0'>
23862386
<var-decl name='next' type-id='type-id-67' visibility='default' filepath='./Include/internal/pycore_pystate.h' line='69' column='1'/>
23872387
</data-member>
@@ -2490,6 +2490,9 @@
24902490
<data-member access='public' layout-offset-in-bits='21632'>
24912491
<var-decl name='audit_hooks' type-id='type-id-60' visibility='default' filepath='./Include/internal/pycore_pystate.h' line='137' column='1'/>
24922492
</data-member>
2493+
<data-member access='public' layout-offset-in-bits='21696'>
2494+
<var-decl name='int_max_str_digits' type-id='type-id-7' visibility='default' filepath='./Include/internal/pycore_pystate.h' line='139' column='1'/>
2495+
</data-member>
24932496
</class-decl>
24942497
<pointer-type-def type-id='type-id-66' size-in-bits='64' id='type-id-67'/>
24952498
<typedef-decl name='__int64_t' type-id='type-id-36' filepath='/usr/include/x86_64-linux-gnu/bits/types.h' line='44' column='1' id='type-id-77'/>

Doc/library/functions.rst

Lines changed: 8 additions & 0 deletions
@@ -838,6 +838,14 @@ are always available. They are listed here in alphabetical order.
838838
.. versionchanged:: 3.8
839839
Falls back to :meth:`__index__` if :meth:`__int__` is not defined.
840840

841+
.. versionchanged:: 3.8.14
842+
:class:`int` string inputs and string representations can be limited to
843+
help avoid denial of service attacks. A :exc:`ValueError` is raised when
844+
the limit is exceeded while converting a string *x* to an :class:`int` or
845+
when converting an :class:`int` into a string would exceed the limit.
846+
See the :ref:`integer string conversion length limitation
847+
<int_max_str_digits>` documentation.
848+
841849

842850
.. function:: isinstance(object, classinfo)
843851

Doc/library/json.rst

Lines changed: 11 additions & 0 deletions
@@ -18,6 +18,11 @@ is a lightweight data interchange format inspired by
1818
`JavaScript <https://en.wikipedia.org/wiki/JavaScript>`_ object literal syntax
1919
(although it is not a strict subset of JavaScript [#rfc-errata]_ ).
2020

21+
.. warning::
22+
Be cautious when parsing JSON data from untrusted sources. A malicious
23+
JSON string may cause the decoder to consume considerable CPU and memory
24+
resources. Limiting the size of data to be parsed is recommended.
25+
2126
:mod:`json` exposes an API familiar to users of the standard library
2227
:mod:`marshal` and :mod:`pickle` modules.
2328

@@ -255,6 +260,12 @@ Basic Usage
255260
be used to use another datatype or parser for JSON integers
256261
(e.g. :class:`float`).
257262

263+
.. versionchanged:: 3.8.14
264+
The default *parse_int* of :func:`int` now limits the maximum length of
265+
the integer string via the interpreter's :ref:`integer string
266+
conversion length limitation <int_max_str_digits>` to help avoid denial
267+
of service attacks.
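As a rough illustration of the *parse_int* change, the sketch below feeds the decoder a JSON array holding one 5000-digit integer while the limit is pinned to 4300. The payload and variable names are made up for the example, and the `hasattr` guard is there because the `sys` APIs only exist on patched interpreters.

```python
import json
import sys

# Hypothetical untrusted payload: a JSON array with a single 5000-digit integer.
payload = "[" + "9" * 5000 + "]"

rejected = None  # stays None on interpreters without the limit
if hasattr(sys, "set_int_max_str_digits"):
    previous_limit = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(4300)
    try:
        try:
            json.loads(payload)
            rejected = False
        except ValueError:
            # The default parse_int is int(), which enforces the limit.
            rejected = True
    finally:
        sys.set_int_max_str_digits(previous_limit)

print("oversized JSON integer rejected:", rejected)
```

Code that parses untrusted JSON may want to catch `ValueError` around the decode call rather than only `json.JSONDecodeError`.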
268+
258269
*parse_constant*, if specified, will be called with one of the following
259270
strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``.
260271
This can be used to raise an exception if invalid JSON numbers

Doc/library/stdtypes.rst

Lines changed: 159 additions & 0 deletions
@@ -4870,6 +4870,165 @@ types, where they are relevant. Some of these are not reported by the
48704870
[<class 'bool'>]
48714871

48724872

4873+
.. _int_max_str_digits:
4874+
4875+
Integer string conversion length limitation
4876+
===========================================
4877+
4878+
CPython has a global limit for converting between :class:`int` and :class:`str`
4879+
to mitigate denial of service attacks. This limit *only* applies to decimal or
4880+
other non-power-of-two number bases. Hexadecimal, octal, and binary conversions
4881+
are unlimited. The limit can be configured.
4882+
4883+
The :class:`int` type in CPython is an abitrary length number stored in binary
4884+
form (commonly known as a "bignum"). There exists no algorithm that can convert
4885+
a string to a binary integer or a binary integer to a string in linear time,
4886+
*unless* the base is a power of 2. Even the best known algorithms for base 10
4887+
have sub-quadratic complexity. Converting a large value such as ``int('1' *
4888+
500_000)`` can take over a second on a fast CPU.
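One way to get a feel for the cost difference is to time the same digit string parsed as base 10 and as base 16 while the input length doubles. This is only a sketch: the sizes are arbitrary, timings vary by machine and CPython version, and the limit is disabled for the duration of the measurement (guarded with `hasattr`, since older interpreters lack the API).

```python
import sys
import time


def parse_seconds(text: str, base: int) -> float:
    """Wall-clock time for a single int(text, base) call."""
    start = time.perf_counter()
    int(text, base)
    return time.perf_counter() - start


has_limit_api = hasattr(sys, "set_int_max_str_digits")
previous_limit = sys.get_int_max_str_digits() if has_limit_api else None
if has_limit_api:
    sys.set_int_max_str_digits(0)  # disable the limit while measuring
try:
    timings = []
    for num_digits in (20_000, 40_000, 80_000):
        text = "1" * num_digits  # valid in both base 10 and base 16
        timings.append(
            (num_digits, parse_seconds(text, 10), parse_seconds(text, 16))
        )
finally:
    if has_limit_api:
        sys.set_int_max_str_digits(previous_limit)

for num_digits, decimal_s, hex_s in timings:
    print(f"{num_digits:>6} digits: base 10 {decimal_s:.4f}s, base 16 {hex_s:.4f}s")
```

On builds that use the quadratic algorithm, the base-10 column would be expected to grow noticeably faster than the base-16 column as the input doubles.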
4889+
4890+
Limiting conversion size offers a practical way to avoid `CVE-2020-10735
4891+
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
4892+
4893+
The limit is applied to the number of digit characters in the input or output
4894+
string when a non-linear conversion algorithm would be involved. Underscores
4895+
and the sign are not counted towards the limit.
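The counting rule can be illustrated with a short sketch that temporarily lowers the limit to 640 and parses a signed, underscore-separated literal. The helper name `converts` is hypothetical; the string at the limit has 800 characters but only 640 digit characters.

```python
import sys


def converts(text: str) -> bool:
    """Return True if int(text) succeeds under the current limit."""
    try:
        int(text)
    except ValueError:
        return False
    return True


results = {}
if hasattr(sys, "set_int_max_str_digits"):
    previous_limit = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(640)
    try:
        # 160 groups of 4 digits: 640 digits, 159 underscores, 1 sign.
        at_limit = "-" + "_".join(["1234"] * 160)
        over_limit = at_limit + "5"  # one more digit: 641 digits
        results["at_limit"] = converts(at_limit)
        results["over_limit"] = converts(over_limit)
    finally:
        sys.set_int_max_str_digits(previous_limit)

print(results)
```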
4896+
4897+
When an operation would exceed the limit, a :exc:`ValueError` is raised:
4898+
4899+
.. doctest::
4900+
4901+
>>> import sys
4902+
>>> sys.set_int_max_str_digits(4300) # Illustrative, this is the default.
4903+
>>> _ = int('2' * 5432)
4904+
Traceback (most recent call last):
4905+
...
4906+
ValueError: Exceeds the limit (4300) for integer string conversion: value has 5432 digits.
4907+
>>> i = int('2' * 4300)
4908+
>>> len(str(i))
4909+
4300
4910+
>>> i_squared = i*i
4911+
>>> len(str(i_squared))
4912+
Traceback (most recent call last):
4913+
...
4914+
ValueError: Exceeds the limit (4300) for integer string conversion: value has 8599 digits.
4915+
>>> len(hex(i_squared))
4916+
7144
4917+
>>> assert int(hex(i_squared), base=16) == i*i # Hexadecimal is unlimited.
4918+
4919+
The default limit is 4300 digits as provided in
4920+
:data:`sys.int_info.default_max_str_digits <sys.int_info>`.
4921+
The lowest limit that can be configured is 640 digits as provided in
4922+
:data:`sys.int_info.str_digits_check_threshold <sys.int_info>`.
4923+
4924+
Verification:
4925+
4926+
.. doctest::
4927+
4928+
>>> import sys
4929+
>>> assert sys.int_info.default_max_str_digits == 4300, sys.int_info
4930+
>>> assert sys.int_info.str_digits_check_threshold == 640, sys.int_info
4931+
>>> msg = int('578966293710682886880994035146873798396722250538762761564'
4932+
... '9252925514383915483333812743580549779436104706260696366600'
4933+
... '571186405732').to_bytes(53, 'big')
4934+
...
4935+
4936+
.. versionadded:: 3.8.14
4937+
4938+
Affected APIs
4939+
-------------
4940+
4941+
The limitation only applies to potentially slow conversions between :class:`int`
4942+
and :class:`str` or :class:`bytes`:
4943+
4944+
* ``int(string)`` with default base 10.
4945+
* ``int(string, base)`` for all bases that are not a power of 2.
4946+
* ``str(integer)``.
4947+
* ``repr(integer)``.
4948+
* any other string conversion to base 10, for example ``f"{integer}"``,
4949+
``"{}".format(integer)``, or ``b"%d" % integer``.
4950+
4951+
The limitation does not apply to functions with a linear algorithm:
4952+
4953+
* ``int(string, base)`` with base 2, 4, 8, 16, or 32.
4954+
* :func:`int.from_bytes` and :func:`int.to_bytes`.
4955+
* :func:`hex`, :func:`oct`, :func:`bin`.
4956+
* :ref:`formatspec` for hex, octal, and binary numbers.
4957+
* :class:`str` to :class:`float`.
4958+
* :class:`str` to :class:`decimal.Decimal`.
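The two lists above can be exercised side by side. The sketch below covers only a subset of the listed operations, pins the limit to 4300 for the duration, and builds the test value arithmetically so that constructing it does not itself hit the limit. The helper name `is_limited` is hypothetical.

```python
import sys


def is_limited(convert) -> bool:
    """Return True if calling convert() raises ValueError."""
    try:
        convert()
    except ValueError:
        return True
    return False


outcomes = {}
if hasattr(sys, "set_int_max_str_digits"):
    previous_limit = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(4300)
    try:
        n = 10 ** 5000  # 5001 decimal digits, built without any string parsing
        outcomes = {
            # Decimal conversions: expected to be limited.
            "str": is_limited(lambda: str(n)),
            "repr": is_limited(lambda: repr(n)),
            "f-string": is_limited(lambda: f"{n}"),
            "str.format": is_limited(lambda: "{}".format(n)),
            # Power-of-two bases and byte conversions: expected to be unlimited.
            "hex": is_limited(lambda: hex(n)),
            "oct": is_limited(lambda: oct(n)),
            "bin": is_limited(lambda: bin(n)),
            "format x": is_limited(lambda: format(n, "x")),
            "to_bytes": is_limited(
                lambda: n.to_bytes(n.bit_length() // 8 + 1, "big")
            ),
            "int base 16": is_limited(lambda: int(hex(n), 16)),
        }
    finally:
        sys.set_int_max_str_digits(previous_limit)

for name, limited in outcomes.items():
    print(f"{name:<12} limited={limited}")
```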
4959+
4960+
Configuring the limit
4961+
---------------------
4962+
4963+
Before Python starts up you can use an environment variable or an interpreter
4964+
command line flag to configure the limit:
4965+
4966+
* :envvar:`PYTHONINTMAXSTRDIGITS`, e.g.
4967+
``PYTHONINTMAXSTRDIGITS=640 python3`` to set the limit to 640 or
4968+
``PYTHONINTMAXSTRDIGITS=0 python3`` to disable the limitation.
4969+
* :option:`-X int_max_str_digits <-X>`, e.g.
4970+
``python3 -X int_max_str_digits=640``
4971+
* :data:`sys.flags.int_max_str_digits` contains the value of
4972+
:envvar:`PYTHONINTMAXSTRDIGITS` or :option:`-X int_max_str_digits <-X>`.
4973+
If both the env var and the ``-X`` option are set, the ``-X`` option takes
4974+
precedence. A value of *-1* indicates that both were unset, thus a value of
4975+
:data:`sys.int_info.default_max_str_digits` was used during initilization.
4976+
4977+
From code, you can inspect the current limit and set a new one using these
4978+
:mod:`sys` APIs:
4979+
4980+
* :func:`sys.get_int_max_str_digits` and :func:`sys.set_int_max_str_digits` are
4981+
a getter and setter for the interpreter-wide limit. Subinterpreters have
4982+
their own limit.
4983+
4984+
Information about the default and minimum can be found in :attr:`sys.int_info`:
4985+
4986+
* :data:`sys.int_info.default_max_str_digits <sys.int_info>` is the compiled-in
4987+
default limit.
4988+
* :data:`sys.int_info.str_digits_check_threshold <sys.int_info>` is the lowest
4989+
accepted value for the limit (other than 0 which disables it).
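A short sketch of the runtime APIs described above: it sets the limit to the lowest accepted value, shows that a smaller non-zero value is rejected with :exc:`ValueError`, disables the limit with ``0``, and restores the original setting afterwards. The variable names are illustrative.

```python
import sys

observed = {}
if hasattr(sys, "set_int_max_str_digits"):
    original_limit = sys.get_int_max_str_digits()
    threshold = sys.int_info.str_digits_check_threshold
    try:
        # The threshold itself is the lowest accepted non-zero value.
        sys.set_int_max_str_digits(threshold)
        observed["lowest accepted"] = sys.get_int_max_str_digits()

        # Anything between 1 and threshold - 1 is rejected.
        try:
            sys.set_int_max_str_digits(threshold - 1)
            observed["below threshold rejected"] = False
        except ValueError:
            observed["below threshold rejected"] = True

        # Zero disables the limitation entirely.
        sys.set_int_max_str_digits(0)
        observed["disabled"] = sys.get_int_max_str_digits()
    finally:
        sys.set_int_max_str_digits(original_limit)
    observed["restored"] = sys.get_int_max_str_digits() == original_limit

print(observed)
```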
4990+
4991+
.. versionadded:: 3.8.14
4992+
4993+
.. caution::
4994+
4995+
Setting a low limit *can* lead to problems. While rare, code exists that
4996+
contains integer constants in decimal in their source that exceed the
4997+
minimum threshold. A consequence of setting the limit is that Python source
4998+
code containing decimal integer literals longer than the limit will
4999+
encounter an error during parsing, usually at startup time or import time or
5000+
even at installation time, any time an up-to-date ``.pyc`` does not already
5001+
exist for the code. A workaround for source that contains such large
5002+
constants is to convert them to ``0x`` hexadecimal form as it has no limit.
5003+
5004+
Test your application thoroughly if you use a low limit. Ensure your tests
5005+
run with the limit set early via the environment or flag so that it applies
5006+
during startup and even during any installation step that may invoke Python
5007+
to precompile ``.py`` sources to ``.pyc`` files.
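The effect on source code can be reproduced with :func:`compile`. In this sketch the limit is pinned to 4300 and two hypothetical one-line modules are compiled: one with a 5000-digit decimal literal and one with a hexadecimal literal of the same length. Both ``SyntaxError`` and ``ValueError`` are caught because the exception type reported by the parser is treated here as an assumption that may differ between versions.

```python
import sys


def compiles(source: str) -> bool:
    """Return True if the source text compiles under the current limit."""
    try:
        compile(source, "<example>", "exec")
    except (SyntaxError, ValueError):
        return False
    return True


literal_results = {}
if hasattr(sys, "set_int_max_str_digits"):
    previous_limit = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(4300)
    try:
        decimal_source = "BIG = " + "7" * 5000 + "\n"
        hex_source = "BIG = 0x" + "f" * 5000 + "\n"
        literal_results["decimal literal"] = compiles(decimal_source)
        literal_results["hex literal"] = compiles(hex_source)
    finally:
        sys.set_int_max_str_digits(previous_limit)

print(literal_results)
```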
5008+
5009+
Recommended configuration
5010+
-------------------------
5011+
5012+
The default :data:`sys.int_info.default_max_str_digits` is expected to be
5013+
reasonable for most applications. If your application requires a different
5014+
limit, set it from your main entry point using Python version agnostic code as
5015+
these APIs were added in security patch releases in versions before 3.11.
5016+
5017+
Example::
5018+
5019+
>>> import sys
5020+
>>> if hasattr(sys, "set_int_max_str_digits"):
5021+
...     upper_bound = 68000
5022+
...     lower_bound = 4004
5023+
...     current_limit = sys.get_int_max_str_digits()
5024+
...     if current_limit == 0 or current_limit > upper_bound:
5025+
...         sys.set_int_max_str_digits(upper_bound)
5026+
...     elif current_limit < lower_bound:
5027+
...         sys.set_int_max_str_digits(lower_bound)
5028+
5029+
If you need to disable it entirely, set it to ``0``.
5030+
5031+
48735032
.. rubric:: Footnotes
48745033

48755034
.. [1] Additional information on these special methods may be found in the Python

Doc/library/sys.rst

Lines changed: 46 additions & 13 deletions
@@ -445,9 +445,9 @@ always available.
445445
The :term:`named tuple` *flags* exposes the status of command line
446446
flags. The attributes are read only.
447447

448-
============================= =============================
448+
============================= ==============================================================================================================
449449
attribute flag
450-
============================= =============================
450+
============================= ==============================================================================================================
451451
:const:`debug` :option:`-d`
452452
:const:`inspect` :option:`-i`
453453
:const:`interactive` :option:`-i`
@@ -463,7 +463,8 @@ always available.
463463
:const:`hash_randomization` :option:`-R`
464464
:const:`dev_mode` :option:`-X` ``dev``
465465
:const:`utf8_mode` :option:`-X` ``utf8``
466-
============================= =============================
466+
:const:`int_max_str_digits` :option:`-X int_max_str_digits <-X>` (:ref:`integer string conversion length limitation <int_max_str_digits>`)
467+
============================= ==============================================================================================================
467468

468469
.. versionchanged:: 3.2
469470
Added ``quiet`` attribute for the new :option:`-q` flag.
@@ -481,6 +482,9 @@ always available.
481482
Added ``dev_mode`` attribute for the new :option:`-X` ``dev`` flag
482483
and ``utf8_mode`` attribute for the new :option:`-X` ``utf8`` flag.
483484

485+
.. versionchanged:: 3.8.14
486+
Added the ``int_max_str_digits`` attribute.
487+
484488

485489
.. data:: float_info
486490

@@ -661,6 +665,15 @@ always available.
661665

662666
.. versionadded:: 3.6
663667

668+
669+
.. function:: get_int_max_str_digits()
670+
671+
Returns the current value for the :ref:`integer string conversion length
672+
limitation <int_max_str_digits>`. See also :func:`set_int_max_str_digits`.
673+
674+
.. versionadded:: 3.8.14
675+
676+
664677
.. function:: getrefcount(object)
665678

666679
Return the reference count of the *object*. The count returned is generally one
@@ -934,19 +947,31 @@ always available.
934947

935948
.. tabularcolumns:: |l|L|
936949

937-
+-------------------------+----------------------------------------------+
938-
| Attribute | Explanation |
939-
+=========================+==============================================+
940-
| :const:`bits_per_digit` | number of bits held in each digit. Python |
941-
| | integers are stored internally in base |
942-
| | ``2**int_info.bits_per_digit`` |
943-
+-------------------------+----------------------------------------------+
944-
| :const:`sizeof_digit` | size in bytes of the C type used to |
945-
| | represent a digit |
946-
+-------------------------+----------------------------------------------+
950+
+----------------------------------------+-----------------------------------------------+
951+
| Attribute | Explanation |
952+
+========================================+===============================================+
953+
| :const:`bits_per_digit` | number of bits held in each digit. Python |
954+
| | integers are stored internally in base |
955+
| | ``2**int_info.bits_per_digit`` |
956+
+----------------------------------------+-----------------------------------------------+
957+
| :const:`sizeof_digit` | size in bytes of the C type used to |
958+
| | represent a digit |
959+
+----------------------------------------+-----------------------------------------------+
960+
| :const:`default_max_str_digits` | default value for |
961+
| | :func:`sys.get_int_max_str_digits` when it |
962+
| | is not otherwise explicitly configured. |
963+
+----------------------------------------+-----------------------------------------------+
964+
| :const:`str_digits_check_threshold` | minimum non-zero value for |
965+
| | :func:`sys.set_int_max_str_digits`, |
966+
| | :envvar:`PYTHONINTMAXSTRDIGITS`, or |
967+
| | :option:`-X int_max_str_digits <-X>`. |
968+
+----------------------------------------+-----------------------------------------------+
947969

948970
.. versionadded:: 3.1
949971

972+
.. versionchanged:: 3.8.14
973+
Added ``default_max_str_digits`` and ``str_digits_check_threshold``.
974+
950975

951976
.. data:: __interactivehook__
952977

@@ -1220,6 +1245,14 @@ always available.
12201245

12211246
.. availability:: Unix.
12221247

1248+
.. function:: set_int_max_str_digits(n)
1249+
1250+
Set the :ref:`integer string conversion length limitation
1251+
<int_max_str_digits>` used by this interpreter. See also
1252+
:func:`get_int_max_str_digits`.
1253+
1254+
.. versionadded:: 3.8.14
1255+
12231256
.. function:: setprofile(profilefunc)
12241257

12251258
.. index::

Doc/library/test.rst

Lines changed: 10 additions & 0 deletions
@@ -1283,6 +1283,16 @@ The :mod:`test.support` module defines the following functions:
12831283
.. versionadded:: 3.6
12841284

12851285

1286+
.. function:: adjust_int_max_str_digits(max_digits)
1287+
1288+
This function returns a context manager that will change the global
1289+
:func:`sys.set_int_max_str_digits` setting for the duration of the
1290+
context to allow execution of test code that needs a different limit
1291+
on the number of digits when converting between an integer and string.
1292+
1293+
.. versionadded:: 3.8.14
1294+
1295+
12861296
The :mod:`test.support` module defines the following classes:
12871297

12881298
.. class:: TransientResource(exc, **kwargs)

Doc/using/cmdline.rst

Lines changed: 13 additions & 0 deletions
@@ -437,6 +437,9 @@ Miscellaneous options
437437
* ``-X showalloccount`` to output the total count of allocated objects for
438438
each type when the program finishes. This only works when Python was built with
439439
``COUNT_ALLOCS`` defined.
440+
* ``-X int_max_str_digits`` configures the :ref:`integer string conversion
441+
length limitation <int_max_str_digits>`. See also
442+
:envvar:`PYTHONINTMAXSTRDIGITS`.
440443
* ``-X importtime`` to show how long each import takes. It shows module
441444
name, cumulative time (including nested imports) and self time (excluding
442445
nested imports). Note that its output may be broken in multi-threaded
@@ -487,6 +490,9 @@ Miscellaneous options
487490
The ``-X pycache_prefix`` option. The ``-X dev`` option now logs
488491
``close()`` exceptions in :class:`io.IOBase` destructor.
489492

493+
.. versionadded:: 3.8.14
494+
The ``-X int_max_str_digits`` option.
495+
490496

491497
Options you shouldn't use
492498
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -646,6 +652,13 @@ conflict.
646652

647653
.. versionadded:: 3.2.3
648654

655+
.. envvar:: PYTHONINTMAXSTRDIGITS
656+
657+
If this variable is set to an integer, it is used to configure the
658+
interpreter's global :ref:`integer string conversion length limitation
659+
<int_max_str_digits>`.
660+
661+
.. versionadded:: 3.8.14
649662

650663
.. envvar:: PYTHONIOENCODING
651664

Doc/whatsnew/3.8.rst

Lines changed: 14 additions & 0 deletions
@@ -2325,3 +2325,17 @@ any leading zeros.
23252325

23262326
(Originally contributed by Christian Heimes in :issue:`36384`, and backported
23272327
to 3.8 by Achraf Merzouki)
2328+
2329+
Notable security feature in 3.8.14
2330+
==================================
2331+
2332+
Converting between :class:`int` and :class:`str` in bases other than 2
2333+
(binary), 4, 8 (octal), 16 (hexadecimal), or 32, such as base 10 (decimal),
2334+
now raises a :exc:`ValueError` if the number of digits in string form is
2335+
above a limit to avoid potential denial of service attacks due to the
2336+
algorithmic complexity. This is a mitigation for `CVE-2020-10735
2337+
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
2338+
This limit can be configured or disabled by environment variable, command
2339+
line flag, or :mod:`sys` APIs. See the :ref:`integer string conversion
2340+
length limitation <int_max_str_digits>` documentation. The default limit
2341+
is 4300 digits in string form.
