Offer Int32Map and Int64Map #355
Comments
I don't think there is much benefit in 32-bit maps.
For Int64 it looks different though, since we can't use an IntMap for 64-bit keys on 32-bit platforms. If that approach is deemed acceptable, I might add an Int64Map (by taking the IntMap implementation and changing the types). It looks like we will need one for GHC in the near future, and it would be good to be able to use containers for this. PS: If there is strong demand for an Int32Map and it's shown to be worth the cost, it might still be good. I just personally don't think it will have enough of a benefit to look into that myself. |
I've converted https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10568 It would require some polishing before upstreaming it. I might have time for it soon, but I'd gladly hand it over if there are other volunteers. |
I would definitely want these, but how can we do it without increasing the maintenance burden too much? |
That depends on what you consider to be too much. The simplest way to upstream these is to just copy the files, but I get the impression you consider adding ~7000 lines of code as adding too much burden, which is a fair point. But I don't see any easy solutions. The code is very similar in many places, but the types are pretty much all different. I could imagine making a poor man's backpack by using something like |
Just throwing an idea around: once our GHC lower bound rises high enough, we'll be able to have multiple "libraries" in the package. Will this give us enough power to have the necessary type definitions in a module that has different contents for the different libraries? |
To be clear, my objection isn't the amount of code, per se, but the duplication. Having to make every change in triplicate is pretty darn annoying. The set/map split is bad enough. |
@Bodigrim suggested we use CPP following other boot libraries like
{-# LANGUAGE CPP #-}
#define FILEPATH_NAME OsPath
#define OSSTRING_NAME OsString
#define WORD_NAME OsChar
...
#include "OsPath/Common.hs"
And then in OsPath/Common.hs there is code like this:
...
splitSearchPath :: OSSTRING_NAME -> [FILEPATH_NAME]
splitSearchPath (OSSTRING_NAME x) = fmap OSSTRING_NAME . C.splitSearchPath $ x
... |
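For readers who want to try the CPP approach in isolation, here is a minimal single-file sketch. It is not the filepath or containers code: the macro names `KEY_TYPE` and `MAP_NAME` and the helper functions are hypothetical, and `Data.Map` stands in for a real specialised implementation. In an actual setup the shared definitions would presumably live in a separate file that is `#include`d once per key type, as in the snippet above.

```haskell
{-# LANGUAGE CPP #-}
-- Hypothetical macro names, chosen for this sketch only.
#define KEY_TYPE Word64
#define MAP_NAME Word64Map

import qualified Data.Map.Strict as M
import Data.Word (Word64)

-- After preprocessing this reads: newtype Word64Map a = Word64Map (M.Map Word64 a)
newtype MAP_NAME a = MAP_NAME (M.Map KEY_TYPE a)

emptyMap :: MAP_NAME a
emptyMap = MAP_NAME M.empty

insertKey :: KEY_TYPE -> a -> MAP_NAME a -> MAP_NAME a
insertKey k v (MAP_NAME m) = MAP_NAME (M.insert k v m)

lookupKey :: KEY_TYPE -> MAP_NAME a -> Maybe a
lookupKey k (MAP_NAME m) = M.lookup k m

main :: IO ()
main = print (lookupKey 42 (insertKey 42 "answer" emptyMap))
```

Changing the two `#define` lines is all it takes to stamp out another variant, which is the appeal. The cost is that tooling sees only the macro-expanded result.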
How does that affect source links in the Haddocks? I would expect it to destroy them, which is ... not great. |
Yes, that seems like a disadvantage, e.g. this link doesn't point to anything useful: https://hackage.haskell.org/package/filepath-1.4.100.3/docs/src/System.OsPath.Posix.html#extSeparator |
Ugh. Using the (long-standing) private library support is enough for what I was talking about (no need for the new multiple public libraries), but there's a big problem: the names of the modules, and more importantly the names of the types won't work out right without some CPP. I don't know just how bad that is in practice; probably better than what |
Since it is unclear if this will make it into containers, are there plans to keep GHC's copy in sync with containers? Otherwise GHC is cut off from optimizations and bug fixes made here. IMO the easiest way to get
newtype Word64Map a = Word64Map (IntMap (IntMap a))
The top map is indexed by the high 32 bits of the
high, low :: Word64 -> Int
lookup :: Word64 -> Word64Map a -> Maybe a
lookup k (Word64Map m) = IM.lookup (high k) m >>= IM.lookup (low k)
union :: Word64Map a -> Word64Map a -> Word64Map a
union (Word64Map m1) (Word64Map m2) = Word64Map (IM.unionWith IM.union m1 m2)
...
Admittedly this would be slightly slower compared to a directly implemented version, but I don't expect it to be too bad. Not sure if this was considered for GHC, but it is simple and allows benefiting from development in (cc @noughtmare) |
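The nested-map idea can be fleshed out into a small runnable sketch. The `lookup` and `union` definitions follow the comment above. The bodies of `high` and `low`, plus `empty`, `insert`, `delete` and `size`, are my own assumptions about how the rest might look, not code from GHC or containers.

```haskell
import Data.Bits (shiftR, (.&.))
import qualified Data.IntMap.Strict as IM
import Data.Word (Word64)
import Prelude hiding (lookup)

-- Outer map keyed by the high 32 bits, inner maps keyed by the low 32 bits.
newtype Word64Map a = Word64Map (IM.IntMap (IM.IntMap a))

-- Each half fits in an Int even when Int is only 32 bits wide
-- (values >= 2^31 wrap to negative there, which is still injective).
high, low :: Word64 -> Int
high k = fromIntegral (k `shiftR` 32)
low k = fromIntegral (k .&. 0xFFFFFFFF)

empty :: Word64Map a
empty = Word64Map IM.empty

-- IM.union is left-biased, so a new value replaces an existing one.
insert :: Word64 -> a -> Word64Map a -> Word64Map a
insert k v (Word64Map m) =
  Word64Map (IM.insertWith IM.union (high k) (IM.singleton (low k) v) m)

lookup :: Word64 -> Word64Map a -> Maybe a
lookup k (Word64Map m) = IM.lookup (high k) m >>= IM.lookup (low k)

-- Drop the inner map once it becomes empty, so no empty inner maps linger.
delete :: Word64 -> Word64Map a -> Word64Map a
delete k (Word64Map m) = Word64Map (IM.update shrink (high k) m)
  where
    shrink inner =
      let inner' = IM.delete (low k) inner
      in if IM.null inner' then Nothing else Just inner'

union :: Word64Map a -> Word64Map a -> Word64Map a
union (Word64Map m1) (Word64Map m2) = Word64Map (IM.unionWith IM.union m1 m2)

size :: Word64Map a -> Int
size (Word64Map m) = sum (fmap IM.size m)

main :: IO ()
main = do
  let k1 = 0x0000000100000002 :: Word64
      k2 = 0xFFFFFFFF00000002 :: Word64
      m = insert k2 "b" (insert k1 "a" empty)
  print (lookup k1 m, lookup k2 m, lookup 2 m)
  print (size (delete k1 m))
```

One caveat with this encoding: on a 32-bit platform the halves at or above 2^31 become negative `Int` keys, so any ordered traversal built on top of it would not follow unsigned key order without extra work.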
Whatever we decide to do, we should do it in So, I think there are four options:
I don't have time to champion any of these solutions in the near future, but I'm willing to help anyone who does. |
Nested |
I see, that's unfortunate.
Are there parties other than GHC who would like to see this?
It's not twice as deep, it's deeper by exactly one. In other words, a path from root to leaf passes through only one extra step, the |
I think you're right; I didn't think that through enough. But can it be two deeper sometimes, depending on where the branches are? |
I'm not seeing it, do you have a concrete example perhaps?
I'm beginning to realize that it may even be faster, since we are on a 32-bit system.
It looks like a promising idea to test out on GHC with a 32-bit system, should someone volunteer. |
Why? Ah, I see you've recently made some contributions that could potentially also benefit GHC. Maybe it would be worthwhile to port over some of them. Perhaps it would be useful to collect a number of those changes and port them over at the same time. I should mention that this is my personal opinion and does not reflect the opinion of the GHC team. And I'm no longer a very active contributor to (that part of) GHC. Perhaps @mpickering could chime in.
This issue wasn't opened by GHC, but perhaps GHC is the first that actually ran into issues with
For people who do want to try this, you don't need a 32-bit system. You can instead use ghc-docker-jobs.sh with the |
Someone suggested this weekend that we should really offer both 32-bit and 64-bit maps regardless of architecture, and I thoroughly agree. The same applies to IntSet, of course.