Use NIO for zip/jar access #9683
Comments
Imported From: https://issues.scala-lang.org/browse/SI-9683?orig=1 |
@retronym said: |
@Ichoran said: |
@Ichoran said:
def zipsumA(p: java.nio.file.Path) = {
  import java.nio.file._
  // Ascribing ClassLoader keeps the call unambiguous on JDK 13+, which adds a (Path, Map) overload.
  val fs = FileSystems.newFileSystem(p, null: ClassLoader)
  val root = fs.getPath("/")
  var l = 0L
  var n = 0L
  Files.walkFileTree(root, new SimpleFileVisitor[Path] {
    override def visitFile(path: Path, attr: attribute.BasicFileAttributes) = {
      val bb = java.nio.ByteBuffer.wrap(Files.readAllBytes(path))
      n += bb.remaining
      var i = 0
      while (bb.remaining > 4) i ^= bb.getInt
      l += i
      FileVisitResult.CONTINUE
    }
  })
  fs.close
  (n, l)
}
with older-style
def zipsumB(p: java.nio.file.Path) = {
  import java.util.zip._
  val zf = new ZipFile(p.toFile)
  val zes = zf.entries
  var n, l = 0L
  while (zes.hasMoreElements) {
    val ze = zes.nextElement
    val b = new Array[Byte](ze.getSize.toInt)
    val zis = zf.getInputStream(ze)
    var i = 0
    while (i < b.length) i += zis.read(b, i, b.length - i)
    n += b.length
    val bb = java.nio.ByteBuffer.wrap(b)
    var j = 0
    while (bb.remaining > 4) j ^= bb.getInt
    l += j
    zis.close
  }
  zf.close
  (n, l)
}
and
def zipsumC(p: java.nio.file.Path) = {
  import java.util.zip._
  val zis = new ZipInputStream(new java.io.FileInputStream(p.toFile))
  var n, l = 0L
  var ze = zis.getNextEntry
  while (ze ne null) {
    if (ze.getSize > 0) {
      val b = new Array[Byte](ze.getSize.toInt)
      var i = 0
      while (i < b.length) i += zis.read(b, i, b.length - i)
      n += b.length
      val bb = java.nio.ByteBuffer.wrap(b)
      var j = 0
      while (bb.remaining > 4) j ^= bb.getInt
      l += j
    }
    else if (ze.getSize < 0) throw new Exception("This only works for ZipEntries with known size annotation")
    zis.closeEntry
    ze = zis.getNextEntry
  }
  zis.close
  (n, l)
}
I see that ZipFileSystem (method A) is about 6% slower than ZipFile (method B) and 5% faster than ZipInputStream (method C). Given such modest differences, I wouldn't bother prioritizing this for speed, and if the code is using ZipInputStream now, I'd tend to go for ZipFile before ZipFileSystem. If you know differently, can you please provide some evidence? (Granted, this is only reading, but generally there's more reading than writing.) |
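The comment does not say how the percentages were measured. As a rough illustration only, a comparison like this could be timed with a small helper such as the one below. `TimingSketch` and `bestOf` are hypothetical names, not part of the thread, and a naive wall-clock loop like this is sensitive to JIT warm-up and disk caching, so a harness such as JMH would be the safer choice for real numbers.

```scala
object TimingSketch {
  /** Evaluates `body` `warmup` times untimed, then `runs` times timed.
    * Returns the last result and the best (smallest) wall-clock time in milliseconds.
    */
  def bestOf[A](warmup: Int, runs: Int)(body: => A): (A, Double) = {
    require(runs > 0, "runs must be positive")
    // Untimed warm-up iterations give the JIT a chance to compile the hot path.
    for (_ <- 0 until warmup) body
    var best = Long.MaxValue
    var last: Option[A] = None
    for (_ <- 0 until runs) {
      val t0 = System.nanoTime()
      val r = body
      val dt = System.nanoTime() - t0
      if (dt < best) best = dt
      last = Some(r)
    }
    (last.get, best / 1e6)
  }
}
```

With the three functions above in scope, something like `TimingSketch.bestOf(5, 20)(zipsumA(p))` could be run against the same archive for each method and the best times compared.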
@retronym said: The project that offers these guidelines is itself an in-memory filesystem implementation. We currently have an implementation of the same concept (https://github.com/scala/scala/blob/v2.11.6/src/reflect/scala/reflect/io/VirtualDirectory.scala) built on top of our internal IO abstractions. I'd like to get rid of that code and push the logic down into the NIO filesystem abstraction instead. |
@retronym said:
I have witnessed poor performance of SBT and scalac on Windows file systems in the past, which I believe was related to many individual lookups of file attributes for source and class files, so the ability to do this en masse sounds enticing. Our next step towards this is to standardize on |
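To make the "en masse" idea concrete: `Files.walkFileTree` passes each file's `BasicFileAttributes` to the visitor, so size and modification time can be collected during the walk without a separate attribute query per file. The sketch below is only an illustration under that assumption. `BulkAttributesSketch`, `Entry` and `snapshot` are made-up names, and how much this saves in practice depends on the platform and filesystem provider.

```scala
import java.nio.file.{FileVisitResult, Files, Path, SimpleFileVisitor}
import java.nio.file.attribute.BasicFileAttributes
import scala.collection.mutable.ArrayBuffer

object BulkAttributesSketch {
  // Hypothetical record of the attributes a compiler or build tool might care about.
  final case class Entry(path: Path, size: Long, lastModifiedMillis: Long)

  /** Walks `root` once and records size and mtime for every regular file found. */
  def snapshot(root: Path): Vector[Entry] = {
    val out = ArrayBuffer.empty[Entry]
    Files.walkFileTree(root, new SimpleFileVisitor[Path] {
      override def visitFile(file: Path, attrs: BasicFileAttributes): FileVisitResult = {
        // The attributes arrive with the visit; no extra Files.size or getLastModifiedTime call.
        out += Entry(file, attrs.size, attrs.lastModifiedTime.toMillis)
        FileVisitResult.CONTINUE
      }
    })
    out.toVector
  }
}
```

Because the walk only depends on `Path`, the same function should work on a directory of the default filesystem or on the root of a zip filesystem.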
@SethTisue said: |
#9632 and #9682 stand as examples of why the JarFile API is really, really buggy (especially on Windows).
Java 7 introduced a new API for dealing with zips/jars, the Zip File System Provider (http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/zipfilesystemprovider.html), which is also much higher performance.
It would be good if Scala 2.12 could make use of these NIO APIs, since the target platform has shifted to Java 8.
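For readers unfamiliar with the API being proposed, here is a minimal sketch of writing and reading a zip/jar entry through the JDK's zip filesystem provider. It is not the scalac implementation. `ZipNioSketch`, `writeEntry` and `readEntry` are illustrative names, and the only provider option assumed is the documented `create` property.

```scala
import java.net.URI
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{FileSystems, Files, Path}
import java.util.Collections

object ZipNioSketch {
  // The zip provider is addressed through a "jar:" URI wrapping the archive's file URI.
  private def jarUri(zip: Path): URI = URI.create("jar:" + zip.toUri)

  /** Writes `text` to `entry`, creating the archive if it does not exist yet. */
  def writeEntry(zip: Path, entry: String, text: String): Unit = {
    val env = Collections.singletonMap("create", "true")
    val fs = FileSystems.newFileSystem(jarUri(zip), env)
    try {
      val target = fs.getPath(entry)
      if (target.getParent != null) Files.createDirectories(target.getParent)
      Files.write(target, text.getBytes(UTF_8))
    } finally fs.close() // closing the filesystem flushes the archive to disk
  }

  /** Reads `entry` from an existing archive as UTF-8 text. */
  def readEntry(zip: Path, entry: String): String = {
    // The ClassLoader ascription selects the (Path, ClassLoader) overload explicitly.
    val fs = FileSystems.newFileSystem(zip, null: ClassLoader)
    try new String(Files.readAllBytes(fs.getPath(entry)), UTF_8)
    finally fs.close()
  }
}
```

Entries are ordinary `Path` values here, so the usual `Files` operations apply to them in the same way as to files on disk.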