Java BufferedInputStream
Jakob Jenkov
The Java BufferedInputStream class, java.io.BufferedInputStream, adds transparent buffering to a Java InputStream, including any subclass of InputStream. Reading larger chunks of bytes and buffering them can speed up IO quite a bit. Rather than reading one byte at a time from the network or disk, the BufferedInputStream reads a larger block at a time into an internal buffer. When you read a byte from the Java BufferedInputStream you are therefore reading it from its internal buffer. When the buffer has been fully read, the BufferedInputStream reads another block of data into the buffer. This is typically much faster than reading a single byte at a time from an InputStream, especially for disk access and larger amounts of data.
Java BufferedInputStream Example
To add buffering to an InputStream simply wrap it in a BufferedInputStream. Here is how that looks:
    BufferedInputStream bufferedInputStream = new BufferedInputStream(
        new FileInputStream("c:\\data\\input-file.txt"));
As you can see, using a BufferedInputStream to add buffering to a non-buffered InputStream is pretty easy. The BufferedInputStream creates a byte array internally, and attempts to fill the array by calling the read(byte[], int, int) method on the underlying InputStream.
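To make the effect of the internal byte array visible, here is a minimal sketch that counts how many read calls actually reach the underlying stream, with and without a BufferedInputStream in front of it. The classes ReadCallCounter and BufferingDemo are hypothetical helpers invented for this illustration, and a ByteArrayInputStream stands in for a real file or network stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: counts every read call that reaches the wrapped stream.
class ReadCallCounter extends FilterInputStream {
    int calls = 0;

    ReadCallCounter(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        calls++;
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        calls++;
        return super.read(b, off, len);
    }
}

class BufferingDemo {
    // Reads the stream one byte at a time until the end and returns
    // how many read calls the counter has seen.
    static int drain(InputStream in, ReadCallCounter counter) throws IOException {
        while (in.read() != -1) {
            // discard the data; only the call count matters here
        }
        return counter.calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];

        // No buffering: every read() goes straight to the source.
        ReadCallCounter plain = new ReadCallCounter(new ByteArrayInputStream(data));
        System.out.println("unbuffered calls: " + drain(plain, plain));

        // With buffering: the source is only asked for whole blocks.
        ReadCallCounter counted = new ReadCallCounter(new ByteArrayInputStream(data));
        InputStream buffered = new BufferedInputStream(counted, 1024);
        System.out.println("buffered calls:   " + drain(buffered, counted));
    }
}
```

The buffered run should report far fewer calls than the unbuffered run, because each call on the source fills a large part of the 1 KB buffer instead of delivering a single byte.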
Setting Buffer Size of a BufferedInputStream
You can set the buffer size to use internally by the Java BufferedInputStream. You provide the buffer size as a parameter to the BufferedInputStream constructor, like this:
    int bufferSize = 8 * 1024;

    BufferedInputStream bufferedInputStream = new BufferedInputStream(
        new FileInputStream("c:\\data\\input-file.txt"),
        bufferSize
    );
This example sets the internal buffer used by the BufferedInputStream to 8 KB. It is best to use buffer sizes that are multiples of 1024 bytes. That works best with most built-in buffering in hard disks etc.
Except for adding buffering to your input streams, BufferedInputStream behaves exactly like an InputStream.
Optimal Buffer Size for a BufferedInputStream
You should experiment with different buffer sizes to find out which buffer size gives you the best performance on your concrete hardware. The optimal buffer size may depend on whether you are using the Java BufferedInputStream with a disk or network InputStream.
With both disk and network streams, the optimal buffer size may also depend on the concrete hardware in the computer. If the hard disk reads a minimum of 4 KB at a time anyway, there is no point in using a buffer smaller than 4 KB. It is also better to use a buffer size that is a multiple of 4 KB. A 6 KB buffer, for instance, would not line up with 4 KB blocks.
Even if your disk reads blocks of e.g. 4 KB at a time, it can still be a good idea to use a buffer that is larger than this. A disk is good at reading data sequentially - meaning it is good at reading multiple blocks that are located after each other. Thus, using a 16 KB buffer, or a 64 KB buffer (or even larger) with a BufferedInputStream may still give you better performance than using just a 4 KB buffer.
Also keep in mind that some hard disks have a read cache of several megabytes. If your hard disk reads, say, 64 KB of your file into its internal cache anyway, you might as well get all of that data into your BufferedInputStream using one read operation instead of several. Multiple read operations will be slower, and you risk that the hard disk's read cache is overwritten between read operations, causing the hard disk to re-read that block into the cache.
To find the optimal BufferedInputStream buffer size, find out the block size your hard disk reads in, and possibly also its cache size, and make the buffer a multiple of that size. You will definitely have to experiment to find the optimal buffer size. Do so by measuring read speeds with different buffer sizes.
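A rough way to run such an experiment is sketched below. The class BufferSizeBenchmark, the method readAll() and the 16 MB temporary file are assumptions made for this illustration and are not part of the Java API. Treat the timings as indicative only: the operating system's file cache and JVM warm-up will skew individual runs, so repeat the measurement several times on a file of realistic size.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferSizeBenchmark {

    // Reads the whole file one byte at a time through a BufferedInputStream
    // with the given buffer size and returns the number of bytes read.
    static long readAll(Path file, int bufferSize) throws IOException {
        long count = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file), bufferSize)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical test data: a 16 MB temporary file filled with zeros.
        Path file = Files.createTempFile("buffer-test", ".bin");
        try {
            Files.write(file, new byte[16 * 1024 * 1024]);

            int[] sizes = {1024, 4 * 1024, 8 * 1024, 64 * 1024, 1024 * 1024};
            for (int size : sizes) {
                long start = System.nanoTime();
                long bytes = readAll(file, size);
                long micros = (System.nanoTime() - start) / 1_000;
                System.out.println(size + " byte buffer: " + bytes + " bytes in " + micros + " us");
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```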
mark() and reset()
An interesting aspect to note about the BufferedInputStream is that it supports the mark() and reset() methods inherited from InputStream. Not all InputStream subclasses support these methods. In general you can call the markSupported() method to find out if mark() and reset() are supported on a given InputStream or not, but the BufferedInputStream supports them.
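As an illustration of what mark() and reset() make possible, here is a small sketch that peeks at the start of a stream and then rewinds, so the same bytes can be read again. The class MarkResetDemo, the method peek() and the sample data are made up for this example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class MarkResetDemo {

    // Returns up to `length` bytes from the current position as text,
    // then rewinds so the caller can read the same bytes again.
    static String peek(InputStream in, int length) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("mark/reset not supported by " + in.getClass());
        }
        in.mark(length);                  // remember the current position
        byte[] head = new byte[length];
        int n = in.read(head, 0, length); // may return fewer bytes than requested
        in.reset();                       // jump back to the marked position
        return n <= 0 ? "" : new String(head, 0, n, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "HEADER:payload".getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            System.out.println(peek(in, 6));      // prints HEADER
            System.out.println((char) in.read()); // prints H, because reset() rewound the stream
        }
    }
}
```

The argument to mark() is the read limit: the number of bytes that can be read before the mark may become invalid. Reading further than that before calling reset() can make reset() fail with an IOException.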
Closing a BufferedInputStream
When you are finished reading data from a Java BufferedInputStream you must close it. You close a BufferedInputStream by calling the close() method inherited from InputStream. Closing a Java BufferedInputStream will also close the InputStream from which the BufferedInputStream is reading and buffering data.
Here is an example of opening a Java BufferedInputStream, reading all data from it, and then closing it:
    BufferedInputStream bufferedInputStream = new BufferedInputStream(
        new FileInputStream("c:\\data\\input-file.txt"));

    int data = bufferedInputStream.read();
    while(data != -1) {
        data = bufferedInputStream.read();
    }
    bufferedInputStream.close();
Notice how the while loop continues until a -1 value is read from the BufferedInputStream read() method. After that, the while loop exits, and the BufferedInputStream close() method is called.
The above code is not 100% robust. If an exception is thrown while reading data from the BufferedInputStream, the close() method is never called. To make the code more robust, you will have to use the Java try-with-resources construct. Proper exception handling for use of Java IO classes is also explained in my tutorial on Java IO Exception Handling.
Here is an example of closing a Java BufferedInputStream using the try-with-resources construct:
    try(BufferedInputStream bufferedInputStream = new BufferedInputStream(
            new FileInputStream("c:\\data\\input-file.txt") ) ) {

        int data = bufferedInputStream.read();
        while(data != -1){
            data = bufferedInputStream.read();
        }
    }
Notice that the BufferedInputStream is declared inside the parentheses after the try keyword. This signals to Java that this BufferedInputStream is to be managed by the try-with-resources construct.
Once the executing thread exits the try block, the BufferedInputStream is closed. If an exception is thrown from inside the try block, the BufferedInputStream is closed first, and then the exception propagates out of the try block. You are thus guaranteed that the BufferedInputStream is closed when it is used inside a try-with-resources block.
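The guarantee can be checked with a small experiment. The sketch below uses two hypothetical classes invented for this illustration: FailingStream, whose read() always throws, and TryWithResourcesDemo, which reads from it inside a try-with-resources block and reports whether the source ended up closed.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical source that fails on every read and records whether it was closed.
class FailingStream extends InputStream {
    boolean closed = false;

    @Override
    public int read() throws IOException {
        throw new IOException("simulated read failure");
    }

    @Override
    public void close() {
        closed = true;
    }
}

class TryWithResourcesDemo {

    // Returns true if the source was closed even though reading failed.
    static boolean closedAfterFailure() {
        FailingStream source = new FailingStream();
        try (BufferedInputStream in = new BufferedInputStream(source)) {
            in.read(); // the IOException from the source propagates from here
        } catch (IOException e) {
            // by the time this handler runs, the resource has already been closed
        }
        return source.closed;
    }

    public static void main(String[] args) {
        System.out.println("closed: " + closedAfterFailure()); // prints closed: true
    }
}
```

Closing the BufferedInputStream also closes the FailingStream it wraps, which is why the flag on the source is set.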
Reusable BufferedInputStream
One of the weaknesses of the standard Java BufferedInputStream is that it can only be used once. Once you close it, it's no longer usable. If you need to read a lot of files, or network streams, you have to create a new BufferedInputStream for each file or network stream you want to read. This means that you are creating a new object, and more importantly, a new byte array which is used as buffer inside the BufferedInputStream. This can put pressure on the Java garbage collector, if the number of files or streams read is high, and if they are read quickly after each other.
An alternative is to create a reusable BufferedInputStream where you can replace the underlying source InputStream, so the BufferedInputStream and its internal byte array buffer can be reused. To save you the trouble, I have created such a ReusableBufferedInputStream, and included the code for it further down in this tutorial. First, I want to show you how using this ReusableBufferedInputStream looks.
Create a ReusableBufferedInputStream
First you need to create a ReusableBufferedInputStream. Here is an example of how to create a ReusableBufferedInputStream:
    ReusableBufferedInputStream reusableBufferedInputStream =
        new ReusableBufferedInputStream(new byte[1024 * 1024]);
This example creates a ReusableBufferedInputStream with a 1 MB byte array as its internal buffer.
Set Source
When you have created a ReusableBufferedInputStream you need to set the InputStream on it to use as underlying data source. Here is how you set a source InputStream on a ReusableBufferedInputStream:
    FileInputStream inputStream = new FileInputStream("/mydata/somefile.txt");

    reusableBufferedInputStream.setSource(inputStream);
The setSource() method returns a reference to the ReusableBufferedInputStream, so you can create a ReusableBufferedInputStream and set the source in a single statement:
    ReusableBufferedInputStream reusableBufferedInputStream =
        new ReusableBufferedInputStream(new byte[1024 * 1024])
            .setSource(new FileInputStream("/mydata/somefile.txt"));
Reusing a ReusableBufferedInputStream
When you are done using the ReusableBufferedInputStream you need to close it. Closing it will only close the underlying source InputStream. After closing a ReusableBufferedInputStream you can use it again, simply by setting a new source InputStream on it. Here is how it looks to reuse a ReusableBufferedInputStream:
    reusableBufferedInputStream.setSource(new FileInputStream("/mydata/file-1.txt"));

    //read data from ReusableBufferedInputStream

    reusableBufferedInputStream.close();


    reusableBufferedInputStream.setSource(new FileInputStream("/mydata/file-2.txt"));

    //read data from ReusableBufferedInputStream

    reusableBufferedInputStream.close();
ReusableBufferedInputStream Code
Here is the code for the ReusableBufferedInputStream described above. Note that this implementation only overrides the read() and close() methods of the InputStream class that it extends. The rest of the InputStream methods have been left out to keep the code shorter - but you can implement them yourself in case you need them.
    import java.io.IOException;
    import java.io.InputStream;

    public class ReusableBufferedInputStream extends InputStream {

        private byte[]      buffer     = null;
        private int         writeIndex = 0;   // next free position in the buffer
        private int         readIndex  = 0;   // next position to return from read()
        private InputStream source     = null;

        public ReusableBufferedInputStream(byte[] buffer) {
            this.buffer = buffer;
        }

        // Sets a new source and resets the buffer, so this instance can be reused.
        public ReusableBufferedInputStream setSource(InputStream source){
            this.source     = source;
            this.writeIndex = 0;
            this.readIndex  = 0;
            return this;
        }

        @Override
        public int read() throws IOException {
            if(readIndex == writeIndex) {
                // all buffered data has been consumed
                if(writeIndex == buffer.length) {
                    writeIndex = 0;
                    readIndex  = 0;
                }
                // data should be read into buffer.
                int bytesRead = readBytesIntoBuffer();
                while(bytesRead == 0) {
                    // continue until you actually get some bytes!
                    bytesRead = readBytesIntoBuffer();
                }

                // if no more data could be read in, return -1
                if(bytesRead == -1) {
                    return -1;
                }
            }

            return 255 & this.buffer[readIndex++];
        }

        private int readBytesIntoBuffer() throws IOException {
            int bytesRead = this.source.read(this.buffer, this.writeIndex, this.buffer.length - this.writeIndex);
            // only advance on success; adding -1 at end of stream would corrupt the index
            if(bytesRead > 0) {
                writeIndex += bytesRead;
            }
            return bytesRead;
        }

        @Override
        public void close() throws IOException {
            this.source.close();
        }
    }