Extremely low speed of ReadBufferDataHandle #467

@Daniel-Alievsky

I tried to use ReadBufferDataHandle instead of a plain FileHandle to speed up TIFF parsing in my analog of TiffParser, and I was very surprised that the speed did not increase.

Then I created the following very simple test, which reads the first 250,000 4-byte integer values (up to 1,000,000 bytes) from a test file, once through a plain handle and once through a read-buffer handle:

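import java.io.File;
import java.io.IOException;
import java.util.Locale;

import org.scijava.Context;
import org.scijava.io.handle.DataHandle;
import org.scijava.io.handle.DataHandleService;
import org.scijava.io.location.FileLocation;
import org.scijava.io.location.Location;

public class ReadArraySpeed {
    public static void main(String[] args) throws IOException {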
        if (args.length < 1) {
            System.out.println("Usage:");
            System.out.println("    " + ReadArraySpeed.class.getName() + " any_file");
            return;
        }

        final Location location = new FileLocation(new File(args[0]));
        for (int test = 1; test <= 5; test++) { // repeat to warm up the JVM
            System.out.printf("Test #%d...%n", test);
            try (Context context = new Context();
                 DataHandle<Location> in = context.getService(DataHandleService.class).create(location);
                 DataHandle<Location> inBuffer = context.getService(DataHandleService.class).readBuffer(location)) {
                final int elementSize = 4;
                final int size = (int) (Math.min(in.length(), 1_000_000) / elementSize);
                int[] a = new int[size];
                int[] b = new int[size];
                long t1 = System.nanoTime();
                in.seek(0);
                for (int k = 0; k < a.length; k++) {
                    a[k] = in.readInt();
                }
                long t2 = System.nanoTime();
                inBuffer.seek(0);
                for (int k = 0; k < b.length; k++) {
                    b[k] = inBuffer.readInt();
                }
                long t3 = System.nanoTime();
                System.out.printf(Locale.US, "Reading %d 32-bit integers, %s: %.3f ms, %.3f MB/s%n",
                        size, in.getClass(),
                        (t2 - t1) * 1e-6, (size * elementSize) / 1048576.0 / ((t2 - t1) * 1e-9));
                System.out.printf(Locale.US, "Reading %d 32-bit integers, %s: %.3f ms, %.3f MB/s%n",
                        size, inBuffer.getClass(),
                        (t3 - t2) * 1e-6, (size * elementSize) / 1048576.0 / ((t3 - t2) * 1e-9));
            }
        }
    }
}

On my computer, the results are as follows:
Test #1...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 317.620 ms, 3.003 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 2609.762 ms, 0.365 MB/s
Test #2...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 336.619 ms, 2.833 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 2769.079 ms, 0.344 MB/s
Test #3...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 302.576 ms, 3.152 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 2784.298 ms, 0.343 MB/s
Test #4...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 307.343 ms, 3.103 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 2815.823 ms, 0.339 MB/s
Test #5...
Reading 250000 32-bit integers, class org.scijava.io.handle.FileHandle: 317.017 ms, 3.008 MB/s
Reading 250000 32-bit integers, class org.scijava.io.handle.ReadBufferDataHandle: 3378.953 ms, 0.282 MB/s

As you can see, ReadBufferDataHandle is not only not faster, it is much slower than the plain FileHandle: roughly 8 to 11 times slower in these runs (about 2600 to 3400 ms versus 300 to 340 ms).
The debugger shows that it makes a lot of calls to low-level file methods (java.io.File / RandomAccessFile), such as exists() and length(). Maybe this is the reason.

Could you clean up ReadBufferDataHandle so that it is really fast for simple sequential reading?
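To make the request concrete, here is a minimal sketch of the behavior I would expect from a read buffer: readInt() is served from an in-memory byte array, and the underlying file is touched only when the buffer runs out. This uses only the JDK and is not the SciJava implementation; the class name BufferedIntReader, the refillCount() counter and the fixed big-endian byte order are assumptions made for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical sketch (not SciJava code): a read buffer over RandomAccessFile. */
final class BufferedIntReader implements AutoCloseable {
    private final RandomAccessFile file;
    private final byte[] buffer;
    private int pos = 0;      // index of the next unread byte in the buffer
    private int limit = 0;    // number of valid bytes currently in the buffer
    private long refills = 0; // how many times the underlying file was read

    public BufferedIntReader(RandomAccessFile file, int bufferSize) {
        if (bufferSize < 4) {
            throw new IllegalArgumentException("Buffer must hold at least one int");
        }
        this.file = file;
        this.buffer = new byte[bufferSize];
    }

    /** Reads one 32-bit integer, assuming big-endian byte order. */
    public int readInt() throws IOException {
        if (limit - pos < 4) {
            refill(); // the only place where the file is accessed
        }
        int v = ((buffer[pos] & 0xFF) << 24)
              | ((buffer[pos + 1] & 0xFF) << 16)
              | ((buffer[pos + 2] & 0xFF) << 8)
              | (buffer[pos + 3] & 0xFF);
        pos += 4;
        return v;
    }

    /** Number of reads issued against the underlying file so far. */
    public long refillCount() {
        return refills;
    }

    private void refill() throws IOException {
        // Keep the unread tail (0..3 bytes) and move it to the front.
        int remaining = limit - pos;
        System.arraycopy(buffer, pos, buffer, 0, remaining);
        pos = 0;
        limit = remaining;
        // Read until at least one whole int is available.
        while (limit < 4) {
            int n = file.read(buffer, limit, buffer.length - limit);
            if (n < 0) {
                throw new EOFException("Fewer than 4 bytes left in file");
            }
            limit += n;
            refills++;
        }
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

With a buffer of, say, 64 KB, reading 250,000 integers this way should need only a handful of file reads, with no per-call exists() or length() checks. The actual numbers would have to be measured.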

For comparison, if you replace the readInt() call with readByte(), you get the expected result: ReadBufferDataHandle is much faster. That method is implemented much more simply and is really efficient. But reading a single byte is a rare case; usually we need to read other types or short buffers.
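Since readByte() is fast, one possible temporary workaround is to assemble each integer from four readByte() calls. The sketch below is written against the JDK's java.io.DataInput interface, on the assumption that the handle can be passed as a DataInput. The class and method names are hypothetical, the byte order must be chosen explicitly by the caller, and I have not measured whether this is actually faster than readInt() on ReadBufferDataHandle.

```java
import java.io.DataInput;
import java.io.IOException;

/** Hypothetical helper: builds 32-bit integers from single-byte reads. */
final class ByteWiseInts {
    private ByteWiseInts() {
    }

    /** Reads four bytes and combines them as a big-endian int. */
    public static int readIntBE(DataInput in) throws IOException {
        int b0 = in.readByte() & 0xFF;
        int b1 = in.readByte() & 0xFF;
        int b2 = in.readByte() & 0xFF;
        int b3 = in.readByte() & 0xFF;
        return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
    }

    /** Reads four bytes and combines them as a little-endian int. */
    public static int readIntLE(DataInput in) throws IOException {
        int b0 = in.readByte() & 0xFF;
        int b1 = in.readByte() & 0xFF;
        int b2 = in.readByte() & 0xFF;
        int b3 = in.readByte() & 0xFF;
        return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
    }
}
```

In the test above, this would mean replacing `b[k] = inBuffer.readInt();` with `b[k] = ByteWiseInts.readIntBE(inBuffer);` (or readIntLE for little-endian data). This is only a stopgap; the real fix belongs in ReadBufferDataHandle itself.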
