Java BufferedInputStream

Jakob Jenkov
Last update: 2019-11-07

The Java BufferedInputStream class, java.io.BufferedInputStream, provides transparent reading of chunks of bytes and buffering for a Java InputStream, including any subclasses of InputStream. Reading larger chunks of bytes and buffering them can speed up IO quite a bit. Rather than read one byte at a time from the network or disk, the BufferedInputStream reads a larger block at a time into an internal buffer. When you read a byte from the Java BufferedInputStream you are therefore reading it from its internal buffer. When the buffer is fully read, the BufferedInputStream reads another larger block of data into the buffer. This is typically much faster than reading a single byte at a time from an InputStream, especially for disk access and larger data amounts.

Java BufferedInputStream Example

To add buffering to an InputStream simply wrap it in a BufferedInputStream. Here is how that looks:

BufferedInputStream bufferedInputStream = new BufferedInputStream(
                      new FileInputStream("c:\\data\\input-file.txt"));

As you can see, using a BufferedInputStream to add buffering to a non-buffered InputStream is pretty easy. The BufferedInputStream creates a byte array internally, and attempts to fill the array by calling the read(byte[], int, int) method on the underlying InputStream.
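The effect of the internal buffer can be made visible with a small experiment. The sketch below is not part of the original example: BufferingDemo and CountingInputStream are hypothetical helpers that count how many read calls reach the underlying stream, and an in-memory ByteArrayInputStream stands in for a file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BufferingDemo {

    /** Hypothetical helper: counts the read calls that reach the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        int readCalls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            readCalls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            readCalls++;
            return super.read(b, off, len);
        }
    }

    /** Reads the stream one byte at a time until end of stream. */
    static void drain(InputStream in) {
        try {
            while (in.read() != -1) {
                // discard the byte; only the number of calls matters here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Number of read calls on the source when it is read directly. */
    static int unbufferedReadCalls(byte[] data) {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        drain(source);
        return source.readCalls;
    }

    /** Number of read calls on the source when it is wrapped in a BufferedInputStream. */
    static int bufferedReadCalls(byte[] data, int bufferSize) {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        drain(new BufferedInputStream(source, bufferSize));
        return source.readCalls;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10_000];
        System.out.println("unbuffered read calls: " + unbufferedReadCalls(data));
        System.out.println("buffered read calls:   " + bufferedReadCalls(data, 1024));
    }
}
```

Read directly, the source receives one call per byte. Wrapped in a BufferedInputStream, it should receive only a handful of calls, each filling the internal buffer.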

Setting Buffer Size of a BufferedInputStream

You can set the size of the buffer that the Java BufferedInputStream uses internally. You provide the buffer size as a parameter to the BufferedInputStream constructor, like this:

int bufferSize = 8 * 1024;
    
BufferedInputStream bufferedInputStream = new BufferedInputStream(
                      new FileInputStream("c:\\data\\input-file.txt"),
                      bufferSize
    );

This example sets the internal buffer used by the BufferedInputStream to 8 KB. Buffer sizes that are multiples of 1024 bytes are a good default choice, because they tend to line up with the block sizes used by hard disks and operating systems.

Except for adding buffering to your input streams, BufferedInputStream behaves exactly like an InputStream.

Optimal Buffer Size for a BufferedInputStream

You should run some experiments with different buffer sizes to find out which buffer size gives you the best performance on your specific hardware. The optimal buffer size may depend on whether you are using the Java BufferedInputStream with a disk or network InputStream.

With both disk and network streams, the optimal buffer size may also depend on the specific hardware in the computer. If the hard disk reads a minimum of 4KB at a time anyway, there is little point in using a buffer smaller than 4KB. It is also better to use a buffer size that is a multiple of 4KB. A 6KB buffer, for instance, would not line up with the 4KB blocks.

Even if your disk reads blocks of e.g. 4KB at a time, it can still be a good idea to use a buffer that is larger than this. A disk is good at reading data sequentially - meaning it is good at reading multiple blocks that are located after each other. Thus, using a 16KB buffer, or a 64KB buffer (or even larger) with a BufferedInputStream may still give you a better performance than using just a 4KB buffer.

Also keep in mind that some hard disks have a read cache of several megabytes. If your hard disk reads, say, 64KB of your file into its internal cache anyway, you might as well get all of that data into your BufferedInputStream using one read operation, instead of using multiple read operations. Multiple read operations will be slower, and you risk that the hard disk's read cache is overwritten between read operations, causing the hard disk to re-read that block into the cache.

To find the optimal BufferedInputStream buffer size, find out the block size your hard disk reads in, and possibly also its cache size, and make the buffer a multiple of that size. You will definitely have to experiment to find the optimal buffer size. Do so by measuring read speeds with different buffer sizes.
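A measurement like the one described above could be sketched as follows. This is a crude illustration, not a proper benchmark: the BufferSizeBenchmark class, the temporary test file, and the list of buffer sizes are all assumptions made for the example, and the operating system's file cache and JVM warm-up will influence the numbers. Repeating each measurement several times, and using a file that resembles your real data, gives more reliable results.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferSizeBenchmark {

    /** Reads the whole file byte by byte through a buffer of the given size; returns the byte count. */
    static long readAll(Path file, int bufferSize) {
        try (InputStream in = new BufferedInputStream(new FileInputStream(file.toFile()), bufferSize)) {
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a temporary file of the given size, filled with zero bytes. */
    static Path createTestFile(int sizeInBytes) {
        try {
            Path file = Files.createTempFile("buffer-test", ".bin");
            Files.write(file, new byte[sizeInBytes]);
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = createTestFile(16 * 1024 * 1024);

        // Candidate buffer sizes, all multiples of 1024 bytes (chosen arbitrarily for this sketch).
        int[] bufferSizes = {1024, 4 * 1024, 8 * 1024, 64 * 1024, 1024 * 1024};

        for (int size : bufferSizes) {
            long start = System.nanoTime();
            long bytes = readAll(file, size);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(size + " byte buffer: " + bytes + " bytes in " + millis + " ms");
        }
    }
}
```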

mark() and reset()

An interesting aspect to note about the BufferedInputStream is that it supports the mark() and reset() methods inherited from the InputStream. Not all InputStream subclasses support these methods. In general you can call the markSupported() method to find out if mark() and reset() are supported on a given InputStream or not, but the BufferedInputStream supports them.
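Here is a minimal sketch of how mark() and reset() can be used to peek at the start of a stream and then read the same bytes again. The MarkResetDemo class, the peekThenReread() method and the sample data are made up for this illustration, and the method assumes the data contains at least one byte.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class MarkResetDemo {

    /** Reads up to n bytes, rewinds to the mark, reads them again; returns "first|second". */
    static String peekThenReread(byte[] data, int n) {
        try (BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            // Remember the current position. The mark stays valid
            // as long as no more than n bytes are read after it.
            in.mark(n);

            byte[] first = new byte[n];
            int firstLength = in.read(first, 0, n);

            // Jump back to the marked position.
            in.reset();

            byte[] second = new byte[n];
            int secondLength = in.read(second, 0, n);

            return new String(first, 0, firstLength, StandardCharsets.US_ASCII)
                    + "|"
                    + new String(second, 0, secondLength, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "HEADER-body".getBytes(StandardCharsets.US_ASCII);
        System.out.println(peekThenReread(data, 6));
    }
}
```

The argument to mark() is the read limit: if more bytes than that are read before reset() is called, the mark may become invalid and reset() may throw an IOException.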

Closing a BufferedInputStream

When you are finished reading data from a Java BufferedInputStream you must close it. You close a BufferedInputStream by calling the close() method inherited from InputStream. Closing a Java BufferedInputStream will also close the InputStream from which the BufferedInputStream is reading and buffering data. Here is an example of opening a Java BufferedInputStream, reading all data from it, and then closing it:

BufferedInputStream bufferedInputStream = new BufferedInputStream(
                      new FileInputStream("c:\\data\\input-file.txt"));

int data = bufferedInputStream.read();
while(data != -1) {
  data = bufferedInputStream.read();
}
bufferedInputStream.close();

Notice how the while loop continues until a -1 value is read from the BufferedInputStream read() method. After that, the while loop exits, and the BufferedInputStream close() method is called.

The above code is not 100% robust. If an exception is thrown while reading data from the BufferedInputStream, the close() method is never called. To make the code more robust, you will have to use the Java try-with-resources construct. Proper exception handling for use of Java IO classes is also explained in my tutorial on Java IO Exception Handling.

Here is an example of closing a Java BufferedInputStream using the try-with-resources construct:

try(BufferedInputStream bufferedInputStream =
        new BufferedInputStream( new FileInputStream("c:\\data\\input-file.txt") ) ) {

    int data = bufferedInputStream.read();
    while(data != -1){
        data = bufferedInputStream.read();
    }
}

Notice that the BufferedInputStream is declared inside the parentheses after the try keyword. This signals to Java that this BufferedInputStream is to be managed by the try-with-resources construct.

Once the executing thread exits the try block, the BufferedInputStream is closed. If an exception is thrown from inside the try block, the exception is caught, the BufferedInputStream is closed, and then the exception is rethrown. You are thus guaranteed that the BufferedInputStream is closed, when used inside a try-with-resources block.
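To see this guarantee in action, the following sketch uses a source stream that always fails when read and records whether close() was called on it. FailingInputStream and CloseDemo are hypothetical classes written only for this demonstration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

class CloseDemo {

    /** Hypothetical source: every read fails, and close() calls are recorded. */
    static class FailingInputStream extends InputStream {
        boolean closed = false;

        @Override
        public int read() throws IOException {
            throw new IOException("simulated read failure");
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Returns true if the source was closed even though reading from it threw an exception. */
    static boolean closedAfterFailure() {
        FailingInputStream source = new FailingInputStream();

        try (BufferedInputStream bufferedInputStream = new BufferedInputStream(source)) {
            bufferedInputStream.read();   // throws the simulated IOException
        } catch (IOException e) {
            // When this catch block runs, the BufferedInputStream has already been closed.
        }

        return source.closed;
    }

    public static void main(String[] args) {
        System.out.println("source closed after failure: " + closedAfterFailure());
    }
}
```

Closing the BufferedInputStream also closes the source it wraps, which is why the flag on the source is set.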

Reusable BufferedInputStream

One of the weaknesses of the standard Java BufferedInputStream is that it can only be used once. Once you close it, it's no longer usable. If you need to read a lot of files, or network streams, you have to create a new BufferedInputStream for each file or network stream you want to read. This means that you are creating a new object, and more importantly, a new byte array which is used as buffer inside the BufferedInputStream. This can put pressure on the Java garbage collector, if the number of files or streams read is high, and if they are read quickly after each other.

An alternative is to create a reusable BufferedInputStream where you can replace the underlying source InputStream, so the BufferedInputStream and its internal byte array buffer can be reused. To save you the trouble, I have created such a ReusableBufferedInputStream, and included the code for it further down this tutorial. First, I want to show you how using this ReusableBufferedInputStream looks.

Create a ReusableBufferedInputStream

First you need to create a ReusableBufferedInputStream. Here is an example of how to create a ReusableBufferedInputStream:

ReusableBufferedInputStream reusableBufferedInputStream =
    new ReusableBufferedInputStream(new byte[1024 * 1024]);

This example creates a ReusableBufferedInputStream with a 1 MB byte array as its internal buffer.

Set Source

When you have created a ReusableBufferedInputStream you need to set the InputStream on it to use as underlying data source. Here is how you set a source InputStream on a ReusableBufferedInputStream:

FileInputStream inputStream = new FileInputStream("/mydata/somefile.txt");

reusableBufferedInputStream.setSource(inputStream);

The setSource() method actually returns a reference to the ReusableBufferedInputStream, so you can actually create a ReusableBufferedInputStream and set the source in a single instruction:

ReusableBufferedInputStream reusableBufferedInputStream =
    new ReusableBufferedInputStream(new byte[1024 * 1024])
        .setSource(new FileInputStream("/mydata/somefile.txt"));

Reusing a ReusableBufferedInputStream

When you are done using the ReusableBufferedInputStream you need to close it. Closing it will only close the underlying source InputStream. After closing a ReusableBufferedInputStream you can use it again, simply by setting a new source InputStream on it. Here is how it looks to reuse a ReusableBufferedInputStream:

reusableBufferedInputStream.setSource(new FileInputStream("/mydata/file-1.txt"));

//read data from ReusableBufferedInputStream

reusableBufferedInputStream.close();


reusableBufferedInputStream.setSource(new FileInputStream("/mydata/file-2.txt"));

//read data from ReusableBufferedInputStream

reusableBufferedInputStream.close();

ReusableBufferedInputStream Code

Here is the code for the ReusableBufferedInputStream described above. Note that this implementation only overrides the read() and close() methods of the InputStream class that it extends. The rest of the InputStream methods have been left out to keep the code shorter - but you can implement them yourself in case you need them.

import java.io.IOException;
import java.io.InputStream;

public class ReusableBufferedInputStream extends InputStream {

    private byte[]      buffer = null;
    private int         writeIndex = 0;
    private int         readIndex  = 0;
    private InputStream source = null;

    public ReusableBufferedInputStream(byte[] buffer) {
        this.buffer = buffer;
    }

    public ReusableBufferedInputStream setSource(InputStream source){
        this.source = source;
        this.writeIndex = 0;
        this.readIndex  = 0;
        return this;
    }

    @Override
    public int read() throws IOException {

        if(readIndex == writeIndex) {
            if(writeIndex == buffer.length) {
                writeIndex = 0;
                readIndex  = 0;
            }
            //data should be read into buffer.
            int bytesRead = readBytesIntoBuffer();
            while(bytesRead == 0) {
                //continue until you actually get some bytes !
                bytesRead = readBytesIntoBuffer();
            }

            //if no more data could be read in, return -1;
            if(bytesRead == -1) {
                return -1;
            }
        }

        return 255 & this.buffer[readIndex++];
    }



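    private int readBytesIntoBuffer() throws IOException {
        int bytesRead = this.source.read(this.buffer, this.writeIndex, this.buffer.length - this.writeIndex);
        //only advance writeIndex if bytes were actually read.
        //at end of stream the source returns -1, which must not be added to writeIndex.
        if(bytesRead > 0) {
            writeIndex += bytesRead;
        }
        return bytesRead;
    }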

    @Override
    public void close() throws IOException {
        this.source.close();
    }
}
