Skip to content

proposal: go/types: an API for normalized interface terms #61013

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
findleyr opened this issue Jun 26, 2023 · 15 comments
Open

proposal: go/types: an API for normalized interface terms #61013

findleyr opened this issue Jun 26, 2023 · 15 comments
Assignees
Labels
Milestone

Comments

@findleyr
Copy link
Member

findleyr commented Jun 26, 2023

This is a placeholder for planning purposes, to be exchanged for a proper proposal at a future date.

EDIT(2024-02-06): Please see the actual proposal in this comment below: #61013 (comment)

As discussed in #60994, there are some missing go/types APIs that are currently papered over with the x/exp/typeparams package.

In particular, we should propose a go/types API that serves the purpose of the NormalTerms function -- some way to traverse a normalized representation of the terms of an interface types.

We could expose an equivalent API to NormalTerms, or do something simpler. Let's decide early in the go1.22 cycle.

CC @griesemer @adonovan @timothy-king @mdempsky

@findleyr findleyr added this to the Go1.22 milestone Jun 26, 2023
@findleyr findleyr self-assigned this Jun 26, 2023
@findleyr findleyr added the early-in-cycle A change that should be done early in the 3 month dev cycle. label Jun 26, 2023
@timothy-king
Copy link
Contributor

Might be helpful to go over what x/tools/go/ssa uses. The heaviest usage of NormalTerms is indirect via x/tools/internal/typeparams.CoreType. If we only release NormalTerms, CoreType will be the performance sensitive path. Other uses of NormalTerm go through the function x/tools/go/ssa.typeSetOf. The use of this is somewhat varied. FWIW all uses only pay attention to the underlying types in the list.

@gopherbot
Copy link
Contributor

This issue is currently labeled as early-in-cycle for Go 1.22.
That time is now, so a friendly reminder to look at it again.

@findleyr
Copy link
Member Author

findleyr commented Nov 7, 2023

Unfortunately, this isn't going to happen for 1.22, which is perhaps not unreasonable given #63940. Let's wait another cycle to see if the correct API emerges.

@findleyr findleyr modified the milestones: Go1.22, Go1.23 Nov 7, 2023
@gopherbot
Copy link
Contributor

This issue is currently labeled as early-in-cycle for Go 1.23.
That time is now, so a friendly reminder to look at it again.

@findleyr
Copy link
Member Author

I still think we should do this, but it's been an unfortunately low priority as most users have access to terms via x/tools/internal or x/exp/typeparams.

Moving to the backlog, as it seems unlikely that we'll take this on before there is more demand. As mentioned above, #63940 is a higher priority for the team, and may affect the design of this API.

@findleyr findleyr removed the early-in-cycle A change that should be done early in the 3 month dev cycle. label May 14, 2024
@findleyr findleyr modified the milestones: Go1.23, Backlog May 14, 2024
@seankhliao seankhliao added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 13, 2024
@findleyr
Copy link
Member Author

findleyr commented Aug 6, 2024

@timothy-king as you are working on go/types and have a lot of experience with core types / normalized interface terms from your work on go/ssa, you seem like the perfect person to design a nice API here :)

Tentatively reassigning, though I will leave it to you to prioritize.

@findleyr
Copy link
Member Author

findleyr commented Jan 28, 2025

We really need to expose an API here, independent of discussion around core types. It is too important for tools, and we have for too long relied on the x/exp/typeparams package.

I think the reason we don't have a solid proposal is that we didn't really know what the API should be.

To kick start this discussion, here's a proposed API:

// A TermSet is an ordered set of [Terms] representing the [type element] restrictions on the
// type set of a type, if any.
//
// The set is minimal, in the sense that there is no term set with fewer terms that represents
// the same set of types. As described at [link], every [Interface] type has a unique minimal
// term set.
//
// See [NewTermSet] for details on what the TermSet means for non-interface types.
type TermSet struct { /* ... */ }

// NewTermSet returns the normalized [Term] set for the given type.
//
// If the underlying of the type t is an [Interface] type, the result corresponds to the set
// of type terms in t, after normalizing as described at [link]. If the interface does not 
// contain any terms after normalizing, the resulting set is a "singleton set" containing
// a single Term with Tilde() == false and Type() == t.
//
// If the underlying of the type t is a [Union] type, the result is equivalent to the result
// for interface{t}.
//
// For all other types, the result is a singleton set for t, as described above.
//
// An error is returned if the type is invalid or exceeds complexity bounds.
// If the type has an empty type set (meaning no type satisfies its terms), the resulting
// set has Len == 0.
//
// NewTermSet makes no guarantees about the order of terms, except that it is 
// deterministic.
func NewTermSet(t types.Type) (*TermSet, error)

// Len returns the number of terms in the normalized term set.
func (*TermSet) Len() int

// At returns the ith term in the term set.
func (*TermSet) At(i int) *Term

// Terms returns a go1.23 iterator over the terms in the term set.
//
// Example: for t := range NewTermSet(t).Terms() { ... }
func (*TermSet) Terms() iter.Seq[*Term]

This may not be ideal, but I did consider a few problems with the existing NormalTerms API. A few notes:

  • TermSet API was chosen to be symmetric to MethodSet, TypeList, TypeParamList, etc. We could theoretically reuse Union here, but I chose not to as we really don't want this API to return a Type. (Perhaps we could have avoided Union altogether, in retrospect).
  • This also avoids exposing the underlying term slice.
  • ErrEmptyTypeSet is avoided by the more careful handling of Interface and Union types. In particular, by treating Union like interface{t}, we can differentiate the empty term list from the any term list.
  • I chose not to make this a method on Interface, for symmetry with NewMethodSet and because this API is more useful as it degenerates into the singleton term set for ordinary types, which is typically what the caller would want.

@griesemer @adonovan @dominikh WDYT?

@findleyr findleyr assigned findleyr and unassigned timothy-king Jan 28, 2025
@adonovan
Copy link
Member

This seems fine to me. We should add (*TermSet) Terms() iter.Seq[*Term] too.

@fbbdev
Copy link

fbbdev commented Jan 29, 2025

Thanks @findleyr for this proposal.

The API should suffice to solve the issue discussed at #71464.

Perhaps the complexity bounds mentioned in the comment should be better documented (are these set by the spec? enforced by the compiler?...).

@fbbdev
Copy link

fbbdev commented Jan 29, 2025

Another question that comes to mind is whether this comes with the same performance issues as NewMethodSet, i.e. is it advisable to cache results? Will it benefit from the internal caching of type sets that is done by go/types for interface types? If yes, are there going to be any caveats w.r.t. concurrent access?

@griesemer
Copy link
Contributor

Some questions:

  • Can you be a bit more specific about how the empty TermSet is different from the TermSet for any?
  • Since we're talking about a canonicalized form, which (I assume) means that two term sets with the same set of terms are represented identically, should this be called TermList?
  • In the type checker we typically iterate over all terms (if any), so an iterator API is more useful than a Len and At accessor.
  • If we don't have an interface, why not return nil rather than the singleton type, so there's an easy way to tell the cases apart?

Further out:

  • If the compiler becomes smarter (less restrictive), a term will consist of a type, ~ information, and methods. We should make sure we can expand the API to accomodate for that (Methods).

@findleyr findleyr marked this as a duplicate of #71464 Jan 29, 2025
@findleyr
Copy link
Member Author

Thanks all. Responses:

@adonovan says:

This seems fine to me. We should add (*TermSet) Terms() iter.Seq[*Term] too.

Indeed, thanks. Added the iterator method to the proposal.

@fbbdev says:

Perhaps the complexity bounds mentioned in the comment should be better documented (are these set by the spec? enforced by the compiler?...).

Hmm, good point. @griesemer should we add an implementation to the spec, similar to what we do for the precision of numeric constants?

Another question that comes to mind is whether this comes with the same performance issues as NewMethodSet, i.e. is it advisable to cache results?

No, I don't think so: the normalized term set is already cached on all interface types. Maybe it could be a problem to request term sets of Unions (which is not cached), but since Unions don't appear as the type of an Object, this is unlikely to be a problem.

@griesemer asks:

Can you be a bit more specific about how the empty TermSet is different from the TermSet for any?

Sure. This is the crux of the API problem. Consider the term sets of the following two union types:

  1. ~MyType
  2. int | any

Internally, we use a nil slice to represent 1 (empty), and a non-nil empty slice to represent 2 (no restrictions = all terms).
That's not a good API for the caller. In the x/exp/typeparams.NormalTerms API, we differentiate these two cases by returning ErrEmptyTermSet for 1, and nil slice for 2. But that means each caller must generally have special handling for the empty slice.

In the proposed API, we return nil for 1, and {{any, false}} for 2. That seemed more generally useful, for the following reason: callers will typically be asking a question about what the underlying of the type could be (consider uses of go/types.underIs). If the underlying could be anything, we can represent that with any. (of course, that begs the question of whether the term set of interface{M()} should also be {any, false}. But consider that the type below is a valid var type and its underlying is interface{ M() }, so we can't differentiate whether an interface is a constraint or a value type.

var _ interface {
	M()
	int | any
}

The most important case to consider is a type parameter T with no specific type restrictions. Should its TermSet be {any, false}, {T, false}, or {nil, false}? To me, {C, false} (where C is its constraint interface) seemed most generally useful, as in this case the behavior of the type param typically degenerates to that of its constraint interface. I'm not positive about this, though.

Since we're talking about a canonicalized form, which (I assume) means that two term sets with the same set of terms are represented identically, should this be called TermList?

I don't follow this. I'd think that since it is representing the logical set (where a|a ~ a), that's exactly a reason not to call it a list. See also MethodSet.

In the type checker we typically iterate over all terms (if any), so an iterator API is more useful than a Len and At accessor.

I agree, but we probably want to have Len, and probably want to be symmetric with other collection types (MethodSet, TypeList, TypeParamList). An iterator has been included per Alan's suggestion.

If we don't have an interface, why not return nil rather than the singleton type, so there's an easy way to tell the cases apart?

We could, but shouldn't the type parameter restricted by interface{int} be equivalent to int?

Let me highlight that this API is not natural or obvious. Just as we discovered that the underlying of TypeParam is awkward for tools, it will likely be the case that the API proposed here is awkward in some scenarios. We should endeavor to make it as minimally awkward as possible in most cases, while still preserving all requisite information, and viable for future relaxation of the spec.

If the compiler becomes smarter (less restrictive), a term will consist of a type, ~ information, and methods. We should make sure we can expand the API to accomodate for that (Methods).

I think that would be OK, or am I missing something? I guess a concrete example would be string | fmt.Stringer. In that case, what should we do here? We could return {{string, false}, {fmt.Stringer, false}}. I think that's actually an argument for returning {Iface, false} for interface types with no term list restrictions.

@findleyr
Copy link
Member Author

findleyr commented Jan 29, 2025

I realize the above may be a wall of text. It may be helpful to frame the critical API decisions in terms of examples. Here I'll use some short hand notation for types, but hopefully my meaning is clear.

  • For a type parameter T C, NewTermSet(T) should be the same as NewTermSet(C).
  • For a type parameter T interface{int} NewTermSet(T) should be the same as NewTermSet(int).
  • In a theoretical future where we support ~string | fmt.Stringer, NewTermSet(~string | fmt.Stringer) should differ from NewTermSet(fmt.Stringer) by exactly one term ({string, true}).

I think if you agree with those examples, the rest of the API behavior is implied.

@fbbdev
Copy link

fbbdev commented Feb 1, 2025

Thanks @findleyr for the additional details.

the normalized term set is already cached on all interface types

I still have some doubts about thread safety. In go/types/interface.go:16-26 we read:

// An Interface represents an interface type.
type Interface struct {
	// [...]

	tset *_TypeSet // type set described by this interface, computed lazily
}

And in go/types/typeset.go:155-196:

// computeInterfaceTypeSet may be called with check == nil.
func computeInterfaceTypeSet(check *Checker, pos token.Pos, ityp *Interface) *_TypeSet {
	if ityp.tset != nil {
		return ityp.tset
	}

	// [...]

	ityp.tset = &_TypeSet{terms: allTermlist} // TODO(gri) is this sufficient?

	// [...]

That looks like a potential data race, not to speak of later field accesses in the same function. The docs are silent on whether methods that involve access to type sets are safe for concurrent use. Should consumers of those APIs (and by extension the proposed term set API) worry about that? Is the cache guaranteed to be safely initialized before interface types are passed on to external consumers?

@fbbdev
Copy link

fbbdev commented Feb 1, 2025

Nevermind, I had missed this

To avoid race conditions, the interface's type set should be computed before concurrent use of the interface, by explicitly calling Complete.

Which of course is implicitly called while type checking packages, and has to be explicitly called by external consumers when constructing interface types.

@findleyr findleyr changed the title go/types: expose an API for normalized interface terms Proposal: go/types: an API for normalized interface terms Feb 6, 2025
@findleyr findleyr changed the title Proposal: go/types: an API for normalized interface terms proposal: go/types: an API for normalized interface terms Feb 6, 2025
@ianlancetaylor ianlancetaylor modified the milestones: Backlog, Proposal Feb 12, 2025
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Feb 12, 2025
@findleyr findleyr removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

8 participants