Skip to content

deepcopy as the default copying mechanism is inefficient #560

@rfourquet

Description

@rfourquet

deepcopy is supposed to recreate the exact structure of the passed object, including aliasing. For example:

julia> x = big(1); a = [x, x]; b = deepcopy(a)
 2-element Array{BigInt,1}:
 1
 1
julia> b[1] === b[2]
true

To provide this feature, deepcopy creates a dictionary, which is passed to deepcopy_internal, to keep track of the identity of objects which "have been seen so far".

We have some problems with that:

  1. we often (always?) ignore the dictionary in our implementations of deepcopy_internal, which makes it non-conformant with the definition;
  2. I believe we sometimes use deepcopy to not only take ownership of an object, but also, for a matrix (maybe other objects), we assume all the entries are independant after deepcopy; this is consistent with 1) for what we control, but this is wrong for user-provided matrices which implement deepcopy in a conformant way;
  3. this is inneficient: we pay for a feature we don't need (and even need to not have) and ignore; for example:
julia> x = FlintZZ(1);

julia> @btime deepcopy($x);
 83.200 ns (3 allocations: 384 bytes)

julia> function fmpz_set(a::fmpz)
           # creates directly a copy with a ccall to the flint library
           x = fmpz()
           ccall((:fmpz_set, :libflint), Nothing, (Ref{fmpz}, Ref{fmpz}), x, a)
           x
       end

julia> @btime fmpz_set($x);  
  18.568 ns (1 allocation: 16 bytes)

But currently no API is exposed to make a copy of an fmpz besides deepcopy.

In other words, deepcopy doesn't have the needed semantics, which have been shoehorned into this function for convenience, but I think we should move away from it. I propose to introduce a new function, which provides the guaranties we need (point 2. above) and is efficient. We could call it dealias or dealias_copy, or whatever better name. To make the transition easier, it could temporarily default to deepcopy as a first approximation.

As a bonus, I'm pretty sure we could come up with an API which allows to define automatically this recursive dealias_copy for many recursive types, based on 1) copy (which copies the outer structure) and 2) an API to get and set "entries" of an object together with with the information of what constitutes the indices/coefficients of the objects. Then only leave types would have to define dealias_copy, which can then safely (but innefficiently) default to deepcopy.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions