user defined quote literals, eg: -12.3E+4022'dec128 ; -128'i8 is transformed as `i8("-128")`

## summary
* builtin literals (eg `-128'i8`) become regular user defined literals (UDL) that the parser represents as a literal-call expression `i8("-128")`, with AST `nkLitCall(nkIdent("i8"), nkStrLit("-128"))`, which has the same semantics as `nkCall`
* parser becomes lazy: it makes no attempt at parsing the string into a numerical type (parsing is deferred until actually needed inside the sem phase, if at all)
* new UDLs (bigint, rational, dec128, f80 for 80 bit FP) become possible as library code, see examples below
* existing literal kinds `nkCharLit .. nkFloat64Lit` get replaced by a single `nkLit` kind, and `TNode` is simplified as follows:
```nim
    # of nkCharLit..nkUInt64Lit:
    #  intVal*: BiggestInt
    # of nkFloatLit..nkFloat64Lit:
    #  floatVal*: BiggestFloat
    of nkLit: # new
      value*: uint64 # represents a float or int depending on `typ`, cast as uint64
    of nkStrLit..nkTripleStrLit:
      strVal*: string
... 
```
(if `nkFloat128Lit` is still needed, `value` would be an `array[2, uint64]` instead of a `uint64`)

* all operations involving nkLit involve casting `value` from `uint64` to the appropriate type as specified by `typ : PType` (then casting back to `uint64`)
* IMO it's possible to do all this without introducing breaking changes, but this can be discussed separately
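The cast round-trip described above can be sketched in isolation. This is only an illustration, not compiler code: `LitKind`, `Lit`, `toLit`, `intVal` and `floatVal` are hypothetical names, with `LitKind` standing in for `typ: PType` and `Lit` for the `nkLit` branch of `TNode`.

```nim
type
  LitKind = enum lkInt8, lkInt64, lkFloat64  # hypothetical stand-in for `typ: PType`
  Lit = object
    kind: LitKind   # tells how to interpret `value`
    value: uint64   # raw bits of the int or float

proc toLit(x: int64, kind: LitKind): Lit =
  # store the integer's bit pattern unchanged
  Lit(kind: kind, value: cast[uint64](x))

proc toLit(x: float64): Lit =
  # store the IEEE 754 bit pattern of the float
  Lit(kind: lkFloat64, value: cast[uint64](x))

proc intVal(n: Lit): int64 =
  doAssert n.kind in {lkInt8, lkInt64}
  cast[int64](n.value)

proc floatVal(n: Lit): float64 =
  doAssert n.kind == lkFloat64
  cast[float64](n.value)

let a = toLit(-128, lkInt8)
let b = toLit(1.5)
doAssert a.intVal == -128
doAssert b.floatVal == 1.5
```

The payload is only meaningful together with its kind, which is why every accessor checks `kind` before casting back.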

## details
* nim parser transforms all quote literals (eg `123'i8`) into literal-call expressions as follows:
```nim
let a = -0x12e4567'bar
# parser transforms into:
bar("-0x12e4567") # AST: nkLitCall(nkIdent("bar"), nkStrLit("-0x12e4567"))
```
* the string literal consists of all characters preceding `'` that belong to some set (eg: digits + `-` + letters; precise set TBD), eg:
```nim
[-12'i8] => [i8("-12")]
a=-123e-12'f64 => a=f64("-123e-12")
```
* the parsing of the string literal (eg `-123e-12`) is delayed until it's needed (eg for cgen, or vm)
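A minimal sketch of the splitting step, assuming the token has already been isolated by the lexer: `LitCall` and `toLitCall` are hypothetical names, and the tuple only stands in for `nkLitCall(nkIdent(...), nkStrLit(...))`. A real lexer would also have to tell quote literals apart from char literals such as `'a'`, which this sketch ignores.

```nim
import std/strutils

type LitCall = tuple[callee, arg: string]  # stands in for nkLitCall(nkIdent, nkStrLit)

proc toLitCall(tok: string): LitCall =
  ## Splits `body'suffix` at the last `'`. The body is kept as an
  ## unparsed string: no numeric conversion happens here.
  let i = tok.rfind('\'')
  doAssert i > 0 and i < tok.high, "not a quote literal: " & tok
  (callee: tok[i+1 .. ^1], arg: tok[0 ..< i])

doAssert toLitCall("-128'i8") == (callee: "i8", arg: "-128")
doAssert toLitCall("-0x12e4567'bar") == (callee: "bar", arg: "-0x12e4567")
```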

## benefits
* removes the edge case where `T.low` can't be written as a literal for a signed type `T`, eg `-128'i8` (see https://github.com/timotheecour/Nim/issues/125), because it would now be parsed as `i8("-128")`

* `repr` (and runnableExamples rendering etc) would preserve the original source code formatting (refs https://github.com/nim-lang/Nim/issues/8871), in particular for binary, octal and `1_000_000`-style literals; it would also make things easier for `nimpretty` and `nim doc`

* parsing becomes lazier, leading to potentially faster compile times in case some large chunk of code is statically disabled, eg via `when defined cpp: let a = 12.3'f32` => `12.3` won't need to be parsed into a float

* no more redundancy between the type (eg `tyInt32`) and the literal kind (eg `nkInt32Lit`), since we now just have the type + a single kind `nkLit`; the AST is simplified, and user macros and compiler code have fewer `TNodeKind` values to deal with

* parser backward compatibility when new literals are introduced
suppose this new literal handling is implemented in 1.3.7; then code that guards a newer literal with `since` still parses on every nim version from 1.3.7 onward, eg:
```nim
since 1.3.9: # time when 80 bit float literals are introduced
  let a = 1.2'f80 # this would break nim < 1.3.7 (parser error) but not nim 1.3.7
```
the "generalized" literal handling could also be backported to older nim (eg 1.2.2) using a simple hack: turn unrecognized literals (eg `1.2'f80`) into an error PNode instead of a parser error, so that `since 1.3.7:` would work and not give a parser error

* enables user defined quote literals
everything becomes user defined, so `dec128` (https://forum.nim-lang.org/t/6310#38884) can be written via:
```nim
let a = -12.3E+4022'dec128 # calls dec128("-12.3E+4022"), returning a `Decimal128`
```
the builtin literals are not special builtins anymore, and require symbols defined in system.nim, eg:
```nim
# system.nim
proc i8*(a: string): int8 # but we can hardcode these as `builtinI8` instead of `i8` if needed
proc f32*(a: string): float32 # these don't even have to be magic, but can be
# etc
```
with `-0x12e4567'bar`, if `bar` isn't defined in scope, it gives a regular compile-time error (`bar` not defined)
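For illustration, a non-magic body for such a symbol could look like the sketch below. The body is an assumption (the proposal only gives the signature), and it handles plain decimal input only; under the proposal, `-128'i8` would reach this proc as `i8("-128")`.

```nim
import std/strutils

proc i8(a: string): int8 =
  ## Hypothetical library-side body for the `i8` symbol.
  ## The string is only parsed here, when the call is evaluated.
  let v = parseInt(a)   # raises ValueError on malformed input
  if v < int8.low.int or v > int8.high.int:
    raise newException(ValueError, a & " does not fit in int8")
  int8(v)

doAssert i8("-128") == int8.low
```

Because the sign is part of the string, `int8.low` needs no special casing.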

## examples
all these types can be implemented as library code while keeping a nice, native-looking syntax; they would also render as numerical literals rather than strings (once syntax highlighters, including github linguist, are updated, as evidenced by the ugly highlighting in this post)

* 80 bit float
```nim
let a = -1.2'f80
```
* bigint
```nim
let a = -123456789'bigint # instead of bigint"-123456789"
```

* decimal128
```nim
let a = -12.3E+4022'dec128 
```

* rational numbers
```nim
let a = 12/3'rational # or some other syntax if `/` is not in valid set of literals
```

* complex numbers
```nim
let a = 1.2+3.2i'c # or some other syntax
```
* symbolic math
```nim
let expr = diff(x^2's + y^2's, x's) # symbolic differentiation wrt symbolic variable x; half baked idea here
```

## note
* since it's user defined, module-scoped aliases are possible, eg if a module deals a lot with rationals it can write:
```nim
template r(a: string): untyped = rational(a)
let a = 12/3'r * -4/5'r
```
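A sketch of what the library side could look like, written with explicit calls since the quote syntax is not available yet. The `rational` proc here is hypothetical and built on `std/rationals`; it assumes `/` ends up in the accepted character set and that the input has the form `num/den`.

```nim
import std/[rationals, strutils]

proc rational(a: string): Rational[int] =
  ## Hypothetical UDL handler: parses "num/den" into a reduced rational.
  let parts = a.split('/')
  doAssert parts.len == 2, "expected num/den, got: " & a
  initRational(parseInt(parts[0]), parseInt(parts[1]))

template r(a: string): untyped = rational(a)  # module-scoped alias

# what `12/3'r * -4/5'r` would expand to under the proposal
doAssert r("12/3") * r("-4/5") == initRational(-16, 5)
```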

* I originally suggested an initial concept of this in https://github.com/nim-lang/compilerdev/issues/7 but then realized it could be generalized to support arbitrary user defined literals and simplify the AST thanks to parser transformation, so that builtin literals (eg `123'i8`) are no longer builtin and naturally extend to other user defined literals

* literals without quote (quote as in `1'i8`) can be handled uniformly with quoted literals via a fake literal-call, eg:
```nim
const a = 1.2e12 => openLitteral("1.2e12")
const b = 1234 => openLitteral("1234")
```
`openLitteral` preserves the same semantics as `const b = 1234`, in that the type is not bound but kept open so that this remains valid:
```nim
const b = 1234
let b2: seq[float32] = @[b, 1.32]
```

## VM
likewise for VM: `TFullReg` could be simplified as:
```nim
  TFullReg* = object
    case kind*: TRegisterKind
    of rkNone: nil
    # of rkInt: intVal*: BiggestInt
    # of rkFloat: floatVal*: BiggestFloat
    of rkLit:
      value*: uint64 # raw bits, interpreted according to `typ`
      typ*: PType
...
```
with following benefits:
* avoid need to represent all the integer types (including int8 etc) as int128 (wasteful)
* make RT semantics match CT semantics, eg for float32 (see https://github.com/nim-lang/Nim/issues/12884)


