
normalize() method of Decimal class does not always preserve value #105774


Closed
hs-vc opened this issue Jun 14, 2023 · 10 comments
Labels
docs Documentation in the Doc dir


@hs-vc

hs-vc commented Jun 14, 2023

I have encountered unexpected behavior while using the normalize() method of the Decimal class. I performed the following test using Python 3.10.10 and 3.11.4.

from decimal import Decimal
v1 =  Decimal("0.99999999999999999999999999999")
v2 = v1.normalize()
print(v1) # Output: 0.99999999999999999999999999999
print(v2) # Output: 1
assert(v1 == v2) # trigger AssertionError

Based on my understanding, the normalize() method is intended to produce a canonical representation of a decimal number. However, in this case, the values of v1 and v2 are not matching, leading to an AssertionError.


@hs-vc hs-vc added the type-bug An unexpected behavior, bug, or error label Jun 14, 2023
@tomasr8
Member

tomasr8 commented Jun 14, 2023

From the docs:

Unlike hardware based binary floating point, the decimal module has a user alterable precision (defaulting to 28 places)

Your example is 29 places so it gets rounded
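A minimal sketch of the point above, assuming the OP's value has 29 nines after the decimal point as stated in this comment. It uses `localcontext()` so that the precision is set explicitly rather than relying on whatever the global context happens to be; the variable names are illustrative only.

```python
from decimal import Decimal, localcontext

# Same shape as the OP's value: 29 nines after the decimal point
v1 = Decimal("0." + "9" * 29)
digit_count = len(v1.as_tuple().digits)  # number of coefficient digits stored in v1

with localcontext() as ctx:
    ctx.prec = 28                # the module's documented default precision
    rounded = v1.normalize()     # 29 digits do not fit in 28, so the result is rounded
    ctx.prec = digit_count       # enough precision to hold every digit
    preserved = v1.normalize()   # nothing to round away

print(digit_count, rounded, preserved)
```

With the precision raised to the number of stored digits, `normalize()` returns a value equal to the input.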

@terryjreedy
Member

@rhettinger Decimal question

@ericvsmith
Member

I couldn't see it documented anywhere that .normalize() does any rounding. Maybe we should add that?

@mdickinson

@mdickinson
Member

@ericvsmith Yes, updating the docs is probably a good idea. (I'm not offering to do it, I'm afraid; I don't have the bandwidth in the near future.)

FWIW, the behaviour itself is deliberate, even if questionable: the normalize method implements the reduce operation of the specification. (Aside: the specification changed the name of the operation from "normalize" to "reduce", but Python kept the old name; I'm guessing that the reason for the name change is that "normalize" is confusing, since it has no relationship to "normal" and "subnormal".)

The reduce operation is documented (in the spec) as equivalent to the "plus" operation along with trimming of trailing zeros, and the "plus" operation similarly rounds to the current context (and we have test cases from Mike Cowlishaw that confirm that intention).
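As a rough illustration of "plus, then trim trailing zeros", the sketch below compares unary plus with `normalize()` under the same context. The precision of 5 and the sample value are arbitrary choices for the example, not taken from the specification.

```python
from decimal import Decimal, localcontext

w = Decimal("1.20000001")

with localcontext() as ctx:
    ctx.prec = 5
    plus_w = +w            # "plus": rounds to 5 significant digits, keeps trailing zeros
    norm_w = w.normalize() # same rounding, then trailing zeros are stripped

print(plus_w)  # 1.2000
print(norm_w)  # 1.2
```

Both results compare equal; they differ only in representation, and both differ from the unrounded `w`.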

Slightly surprisingly, IEEE 754 doesn't seem to have any equivalent operation (perhaps because there are issues at the top end of the dynamic range - e.g., in the IEEE 754 decimal32 format, 1.234000e96 is representable, but 1.234e96 is not), so we can't use IEEE 754 as a guide to whether the operation "should" round or not. But then we don't need to, since the IBM spec is reasonably clear here.

@rhettinger rhettinger self-assigned this Jun 17, 2023
@rhettinger
Contributor

I couldn't see it documented anywhere that .normalize() does any rounding.

That is because rounding is the default for all operations that return a Decimal object except for __new__. The principle is that all numbers are exact even if they exceed the current context precision, that operations are applied to those exact inputs, and that the context (including rounding) is applied after the computation. This leads to surprises if your mental model incorrectly assumes that the numbers are rounded before the operation. To continue the OP's example:

>>> v1 =  Decimal("0.99999999999999999999999999999")
>>> v1 == v1 + 0
False
>>> v1 == v1 * 1
False
>>> v1 == v1 / 1
False
>>> v1 == + v1
False
>>> v1 == - - v1
False
>>> v1.quantize(Decimal('1.0000000000000000000000000000'))
Traceback (most recent call last):
   ...
decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]

The docs say that normalize() is "used for producing canonical values for attributes of an equivalence class." We could amend that to say it is "used for producing canonical values for an equivalence class within either the current context or the specified context". That would explain why these two sets have different sizes:

>>> len({+v1, v1*1, v1/1, v1.normalize()})
1
>>> len({+v1, v1*1, v1/1, v1.normalize(Context(prec=50))})
2

@rhettinger
Contributor

rhettinger commented Jun 18, 2023

Also, I'm thinking of adding an entry to the Decimal FAQ section to demonstrate and explain the notion that numbers are considered exact, that they are created independent of the current context (and can have greater precision), and that contexts are applied after an operation:

>>> getcontext().prec = 5
>>> pi = Decimal('3.1415926535')   # More than 5 digits
>>> pi                             # All digits are retained
Decimal('3.1415926535')
>>> pi + 0                         # Rounded after an addition
Decimal('3.1416')
>>> pi - Decimal('0.00005')        # Subtract unrounded numbers, then round
Decimal('3.1415')
>>> pi + 0 - Decimal('0.00005')    # Intermediate values are rounded
Decimal('3.1416')

@rhettinger rhettinger added docs Documentation in the Doc dir and removed type-bug An unexpected behavior, bug, or error labels Jun 18, 2023
@rhettinger
Contributor

Also, I'm thinking that the docs and docstring for normalize() should include the wording from the specification:

It has the same semantics as the plus operation, except that if the final result is finite it is reduced to its simplest form, with all trailing zeros removed and its sign preserved. That is, while the coefficient is non-zero and a multiple of ten the coefficient is divided by ten and the exponent is incremented by 1. Otherwise (the coefficient is zero) the exponent is set to 0. In all cases the sign is unchanged.
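A few small cases that exercise the quoted wording: trailing zeros removed, a zero coefficient forcing the exponent to 0, and the sign left unchanged. The inputs are short enough that no rounding occurs under any ordinary context precision; the sample values are illustrative.

```python
from decimal import Decimal

trimmed = Decimal("120.00").normalize()   # coefficient 12000 -> 12, exponent -2 -> 1
zero = Decimal("0.000").normalize()       # zero coefficient: exponent set to 0
neg_zero = Decimal("-0.00").normalize()   # sign is preserved, even for zero

print(trimmed)   # 1.2E+2
print(zero)      # 0
print(neg_zero)  # -0
```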

rhettinger added a commit to rhettinger/cpython that referenced this issue Jun 25, 2023
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 27, 2023
(cherry picked from commit a8210b6)

Co-authored-by: Raymond Hettinger <[email protected]>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 27, 2023
(cherry picked from commit a8210b6)

Co-authored-by: Raymond Hettinger <[email protected]>
@HansBrende

@mdickinson so do we have no recourse for normalizing an arbitrary (unknown precision, unknown exponent, etc.) decimal without rounding other than doing something like:

Context(
    prec=len(value.as_tuple().digits),
    Emin=[calculate an Emin that won't throw], 
    Emax=[calculate an Emax that won't throw]
).normalize(value)

If that's the case, it actually sounds easier to do the normalization manually by stripping zeros from value.as_tuple() and modifying the exponent ourselves rather than using the built-in functionality at all. Is it really supposed to be this hard to simply normalize without rounding, throwing an exception, etc.?

@mdickinson
Member

@HansBrende

so do we have no recourse for normalizing an arbitrary (unknown precision, unknown exponent, etc.) decimal without rounding other than doing something like [...]

Yes, I believe that's currently the case.

it actually sounds easier to do the normalization manually by stripping zeros from value.as_tuple() and modifying the exponent ourselves rather than using the built-in functionality [...]

Agreed. I'd either do it this way, or via the string representation.
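One possible shape for the manual approach discussed here, working on `as_tuple()` so that no context is consulted. `normalize_exact` is a hypothetical helper name, not part of the `decimal` module, and the sketch does not try to handle exponents that end up beyond the module's limits.

```python
from decimal import Decimal


def normalize_exact(value: Decimal) -> Decimal:
    """Strip trailing zeros from the coefficient without any rounding.

    Hypothetical helper: it rebuilds the Decimal from its tuple form,
    and the constructor never applies the context precision.
    """
    if not value.is_finite():
        return value  # leave NaN and Infinity untouched
    sign, digits, exponent = value.as_tuple()
    if not any(digits):
        # Zero coefficient: exponent becomes 0, sign is kept
        return Decimal((sign, (0,), 0))
    digits = list(digits)
    while digits[-1] == 0:
        # Drop one trailing zero and compensate in the exponent
        digits.pop()
        exponent += 1
    return Decimal((sign, tuple(digits), exponent))


v1 = Decimal("0." + "9" * 29)  # more digits than the default precision
print(normalize_exact(v1) == v1)
print(normalize_exact(Decimal("120.00")))
```

Because only the constructor is used, the result always compares equal to the input, whatever the current context precision is.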

@ghedsouza

ghedsouza commented Jul 11, 2024

@mdickinson so do we have no recourse for normalizing an arbitrary (unknown precision, unknown exponent, etc.) decimal without rounding other than doing something like:

For a practical recipe, this code makes a best effort and raises an exception (via the Inexact trap) if there is any precision loss. MAX_PREC is so high that you would run out of memory and disk space before getting close to the limit anyway (999999999999999999 = 888.18PiB).

normalized_value = decimal.Decimal(value).normalize(decimal.Context(
    traps=[decimal.Inexact],
    prec=decimal.MAX_PREC,
    Emax=decimal.MAX_EMAX,
    Emin=decimal.MIN_EMIN
))
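A short, self-contained run of the recipe above, plus a deliberately narrow context to show what the `Inexact` trap does when digits would be lost. The sample value and the names `EXACT_CONTEXT` and `narrow` are assumptions made for the example.

```python
import decimal

# Context from the recipe: maximal limits, with Inexact trapped
EXACT_CONTEXT = decimal.Context(
    traps=[decimal.Inexact],
    prec=decimal.MAX_PREC,
    Emax=decimal.MAX_EMAX,
    Emin=decimal.MIN_EMIN,
)

# 29 nines followed by three trailing zeros
value = decimal.Decimal("0." + "9" * 29 + "000")
normalized_value = value.normalize(EXACT_CONTEXT)  # zeros stripped, nothing rounded

# For contrast: a 28-digit context with the same trap refuses to round silently
narrow = decimal.Context(prec=28, traps=[decimal.Inexact])
try:
    value.normalize(narrow)
    trapped = False
except decimal.Inexact:
    trapped = True

print(normalized_value, trapped)
```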

8 participants