I tried to use ReadBufferDataHandle instead of a plain FileHandle to speed up TIFF parsing in my analog of TiffParser, and I was very surprised that the speed did not increase.
So I wrote the following very simple test, which reads up to the first 250000 4-byte integer values from a test file:
import java.io.File;
import java.io.IOException;
import java.util.Locale;

import io.scif.FormatException; // assumed to be the SCIFIO exception class
import org.scijava.Context;
import org.scijava.io.handle.DataHandle;
import org.scijava.io.handle.DataHandleService;
import org.scijava.io.location.FileLocation;
import org.scijava.io.location.Location;

public class ReadArraySpeed {
    public static void main(String[] args) throws IOException, FormatException {
        if (args.length < 1) {
            System.out.println("Usage:");
            System.out.println("    " + ReadArraySpeed.class.getName() + " any_file");
            return;
        }
        final Location location = new FileLocation(new File(args[0]));
        for (int test = 1; test <= 5; test++) { // repeated runs help to warm up the JVM
            System.out.printf("Test #%d...%n", test);
            try (Context context = new Context();
                 DataHandle<Location> in = context.getService(DataHandleService.class).create(location);
                 DataHandle<Location> inBuffer = context.getService(DataHandleService.class).readBuffer(location)) {
                final int elementSize = 4;
                // Read at most 1,000,000 bytes, i.e. 250000 integers
                final int size = (int) (Math.min(in.length(), 1_000_000) / elementSize);
                int[] a = new int[size];
                int[] b = new int[size];
                // Unbuffered handle
                long t1 = System.nanoTime();
                in.seek(0);
                for (int k = 0; k < a.length; k++) {
                    a[k] = in.readInt();
                }
                // Buffered handle
                long t2 = System.nanoTime();
                inBuffer.seek(0);
                for (int k = 0; k < b.length; k++) {
                    b[k] = inBuffer.readInt();
                }
                long t3 = System.nanoTime();
                System.out.printf(Locale.US, "Reading %d 32-bit integers, %s: %.3f ms, %.3f MB/s%n",
                        size, in.getClass(),
                        (t2 - t1) * 1e-6, (size * elementSize) / 1048576.0 / ((t2 - t1) * 1e-9));
                System.out.printf(Locale.US, "Reading %d 32-bit integers, %s: %.3f ms, %.3f MB/s%n",
                        size, inBuffer.getClass(),
                        (t3 - t2) * 1e-6, (size * elementSize) / 1048576.0 / ((t3 - t2) * 1e-9));
            }
        }
    }
}
On my computer, the results are as follows:
Test #1...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 317.620 ms, 3.003 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 2609.762 ms, 0.365 MB/s
Test #2...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 336.619 ms, 2.833 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 2769.079 ms, 0.344 MB/s
Test #3...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 302.576 ms, 3.152 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 2784.298 ms, 0.343 MB/s
Test #4...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 307.343 ms, 3.103 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 2815.823 ms, 0.339 MB/s
Test #5...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 317.017 ms, 3.008 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 3378.953 ms, 0.282 MB/s
As you can see, ReadBufferDataHandle is not just no faster: it is roughly 8 to 11 times slower than the plain FileHandle.
The debugger shows that it makes a lot of calls to low-level java.io file methods such as exists() and length(). Maybe that is the reason.
Could you clean up ReadBufferDataHandle so that it is really fast for simple sequential reading?
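To illustrate the kind of fast path meant here, below is a minimal sketch written against plain java.io rather than SciJava. The class name BufferedIntReader and its buffer handling are hypothetical; this is not how ReadBufferDataHandle is implemented, only an example of assembling a big-endian int directly from an in-memory buffer and touching the file only when fewer than 4 bytes remain.

```java
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper (not part of SciJava): buffered sequential big-endian int reads. */
final class BufferedIntReader implements Closeable {
    private final RandomAccessFile file;
    private final byte[] buffer;
    private int position = 0; // index of the next unread byte in the buffer
    private int limit = 0;    // number of valid bytes currently in the buffer

    BufferedIntReader(RandomAccessFile file, int bufferSize) {
        if (bufferSize < 4) {
            throw new IllegalArgumentException("Buffer must hold at least one int");
        }
        this.file = file;
        this.buffer = new byte[bufferSize];
    }

    /** Reads one big-endian int; the file is accessed only when the buffer runs low. */
    int readInt() throws IOException {
        if (limit - position < 4) {
            refill();
        }
        final int value = ((buffer[position] & 0xFF) << 24)
                | ((buffer[position + 1] & 0xFF) << 16)
                | ((buffer[position + 2] & 0xFF) << 8)
                | (buffer[position + 3] & 0xFF);
        position += 4;
        return value;
    }

    /** Moves the unread tail to the start of the buffer and reads more bytes after it. */
    private void refill() throws IOException {
        final int remaining = limit - position;
        System.arraycopy(buffer, position, buffer, 0, remaining);
        position = 0;
        limit = remaining;
        while (limit < 4) {
            final int n = file.read(buffer, limit, buffer.length - limit);
            if (n < 0) {
                throw new EOFException("Fewer than 4 bytes left in the file");
            }
            limit += n;
        }
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

With a buffer of, say, 64 KB, the loop from the test above would hit the file only about 16 times for 1,000,000 bytes, and no per-call exists() or length() checks are needed.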
For comparison, if you replace the readInt() call with readByte(), you get the expected result: ReadBufferDataHandle is much faster. That method is implemented much more simply and is really efficient. But reading a single byte is a rare case; usually we need to read other types or short buffers.
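A possible workaround in the meantime, sketched under assumptions: read the whole block with a single readFully() call and decode the integers in memory. The helper BulkIntReader below is hypothetical and uses only java.io.DataInput, which I assume DataHandle implements; whether this is actually faster with ReadBufferDataHandle would still need to be measured.

```java
import java.io.DataInput;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not part of SciJava): reads many ints with one bulk call. */
final class BulkIntReader {
    private BulkIntReader() {
    }

    /** Reads count 32-bit integers from the input and decodes them in the given byte order. */
    static int[] readInts(DataInput in, int count, ByteOrder order) throws IOException {
        // One bulk read instead of count separate readInt() calls
        final byte[] bytes = new byte[Math.multiplyExact(count, 4)];
        in.readFully(bytes);
        // Decode in memory; no further I/O calls are involved
        final int[] result = new int[count];
        ByteBuffer.wrap(bytes).order(order).asIntBuffer().get(result);
        return result;
    }
}
```

In the test above, the inner loop over inBuffer.readInt() would then become a single call such as BulkIntReader.readInts(inBuffer, size, ByteOrder.BIG_ENDIAN), assuming the handle is in its default big-endian mode.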