Skip to content

New implementation of LinkedHashSet/LinkedHashMap #11369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
joshlemer opened this issue Dec 27, 2018 · 32 comments · Fixed by scala/scala#10221
Closed

New implementation of LinkedHashSet/LinkedHashMap #11369

joshlemer opened this issue Dec 27, 2018 · 32 comments · Fixed by scala/scala#10221

Comments

@joshlemer
Copy link

Recently, @szeiger rewrote the implementations of mutable.HashMap and mutable.HashSet here. So far as I see there's no reason not to implement LinkedHashSet/Maps the same way, and in fact doing so will allow LinkedHashMaps/Sets to participate in hashcode sharing between:

  • s.c.i.HashMap
  • s.c.i.HashSet
  • s.c.i.VectorMap
  • s.c.m.HashMap
  • s.c.m.HashSet
  • s.c.m.HashMap#keySet
  • s.c.i.HashMap#keySet
  • s.c.i.VectorMap#keySet

as started here

@joshlemer
Copy link
Author

@szeiger would you approve this?

@szeiger
Copy link

szeiger commented Jan 10, 2019

I haven't paid much attention to LinkedHashMap so far. I think any performance improvement would be welcome, especially if we can share code again with the new HashMap implementation.

@mghildiy
Copy link

I would like to contribute here.

@joshlemer
Copy link
Author

@mghildiy it's all yours if you want it 😄

@adriaanm adriaanm transferred this issue from scala/scala-dev Jan 16, 2019
@adriaanm
Copy link
Contributor

Is this still under consideration for RC1?

@adriaanm adriaanm added this to the 2.13.0-RC1 milestone Jan 16, 2019
@mghildiy
Copy link

I went through mutable.HashMap and found changes broadly as:

  • mixesin trait mutable.AbstractMap

  • Class level parameters added: initialCapacity: Int, loadFactor: Double

  • Replaced HashTable field with Array named: private[this] var table = new ArrayNode[K, V]

  • field 'contentSize' intoduced

  • method 'size' overriden

  • inline method 'computeHash' added

  • inline method 'index' added

  • inline metod 'findNode' added to find an entry for input key(k) in array 'table'

  • API methods overriden: sizeHint, addAll, remove

  • abstract iterator class HashMapIterator added

  • private classes Node, DeserializationFactory added

  • Modified methods: apply, getOrElse, getOrElseUpdate, put

@szeiger
Copy link

szeiger commented Jan 17, 2019

"changes" is an understatement - it's a completely new implementation. I think the idea here is to start with a copy of the new HashMap implementation and make the required changes for keeping track of the links.

@joshlemer
Copy link
Author

@mghildiy @szeiger is right, as well as for LinkedHashSet

@mghildiy
Copy link

Thanks for the inputs.

@pavelpavlov
Copy link

I would also like to contribute if there's need for it.
We use linked collections in our code very heavily.

@mghildiy
Copy link

mghildiy commented Jan 19, 2019

Some methods like writeObject, readObject are never called in current implementation of LinkedHashMap.
I would remove them.
Also any other code too which may not be needed as now many implementations would be inherited from HashMap.

@joshlemer
Copy link
Author

joshlemer commented Jan 19, 2019

@mghildiy actually those methods are being used, these are kind of "magic methods" that Java knows to call during Java serialization/deserialization. Also I would recommend not extending hashmap to implement linked hash map, but maybe some of the hashtable code can be factored out for reuse.

@mghildiy
Copy link

mghildiy commented Jan 20, 2019

OK, so you mean a better approach would be to pull out reusable code from mutable.HashTable into a class/object and use it in classes wherever needed, like those enlisted in original post.
PS: Extending HashMap is approach also used in java''s LinkedHashMap.

@mghildiy
Copy link

As far as I understand, new implementation of HashMap/HashSet no more has hashtable related code.

@mghildiy
Copy link

mghildiy commented Feb 6, 2019

Hi,

Any more inputs here?

@joshlemer
Copy link
Author

@mghildiy If I were doing it, I would simply copy paste mutable.HashMap as LinkedHashMap and make necessary changes to make the nodes linked (taking the linking logic from the current LinkedHashMap), and same for LinkedHashSet

@joshlemer
Copy link
Author

@mghildiy the first step to changing the mutable.HashMap code into LinkedHashMap would be to change its private Node[K, V] class to include fields var earlier: Node[K, V], var later: Node[K, V]

Let me know if you have any more questions 😄

@mghildiy
Copy link

mghildiy commented Feb 7, 2019

Thanks for the inputs @joshlemer . Would look into it this weekend.

@mghildiy
Copy link

There are some methods,like apply, which are not implemented in current version of immutable.LinkedHashMap, but they are there in immutable.HashMap.
Should they be implemented?

@SethTisue
Copy link
Member

some methods [...] are not implemented

you mean because the implementations are inherited? in many cases the inherited implementations are fine and there's no reason to override them. concrete collections may override some of them, usually for efficiency

@mghildiy
Copy link

Sorry, my mistake. I was actually referring to mutable versions, not immutable..

@mghildiy
Copy link

I am done with changes in LinkedHashMap.
Do I need to write a benchmark class too, jst like done for HashMap?
Also, currently there is a single test for LinkedHashMap(in LinkedHashMapTest.scala). Should I add tests for entire code in LinkedHahsMap.scala?

@joshlemer
Copy link
Author

Do I need to write a benchmark class too, jst like done for HashMap?

No not necessarily,

Also, currently there is a single test for LinkedHashMap(in LinkedHashMapTest.scala). Should I add tests for entire code in LinkedHahsMap.scala?

I think if you just get all existing LinkedHashMap tests to pass, that should be enough to put up a preliminary [WIP] pull request

@adriaanm adriaanm self-assigned this Mar 5, 2019
@adriaanm
Copy link
Contributor

adriaanm commented Mar 6, 2019

👋 I haven't seen a PR for this yet. As we're closing in rapidly on the RC1 due date, I'm going to have to defer this unless we can have a PR in a mergeable state this week or early next.

@mghildiy
Copy link

mghildiy commented Mar 7, 2019

I am facing 2 test failures for my change. Trying to find the cause.

@SethTisue
Copy link
Member

SethTisue commented Mar 7, 2019

@mghildiy sounds good enough for a WIP PR, maybe if we take a look at it, it'll be obvious to somebody where the failures might be coming from

@mghildiy
Copy link

mghildiy commented Mar 7, 2019

I would create a PR in next 2-3 days.
BTW, WIP PR is just like any other PR..right? May be it just has a WIP tag?

@SethTisue
Copy link
Member

May be it just has a WIP tag?

right, or you can use GitHub's new "draft" feature, https://github.blog/2019-02-14-introducing-draft-pull-requests/

@mghildiy
Copy link

mghildiy commented Mar 9, 2019

Draft PR:
scala/scala#7839

@joshlemer
Copy link
Author

Thanks a bunch @mghildiy 😄

@adriaanm
Copy link
Contributor

It seems this PR won't be ready this week, which is the deadline for RC1. I'm going to put this on the backlog, since it doesn't sound like something we could do in 2.13.1. I haven't followed in detail, so please correct me if I'm missing something.

@SethTisue
Copy link
Member

the PR seems abandoned...? perhaps someone else would be interested in picking it up?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants