Java Reader
Jakob Jenkov |
The Java Reader class, java.io.Reader, is the base class for all Reader subclasses in the Java IO API. A Java Reader is like a Java InputStream except that it is character based rather than byte based. In other words, a Java Reader is intended for reading text (characters), whereas an InputStream is intended for reading raw bytes.
Readers and Sources
A Reader is typically connected to some source of data such as a file, char array, network socket etc. This is also explained in more detail in the Java IO Overview text.
Characters in Unicode
Today, many applications use Unicode (encoded as UTF-8 or UTF-16) to store text data. In UTF-8 a single character takes between one and four bytes to represent. In UTF-16 a character takes either 2 or 4 bytes to represent. Therefore, when reading text data, a single byte in the data may not correspond to one character. If you just read one byte at a time of UTF-8 data via an InputStream and try to convert each byte into a char, you may not end up with the text you expected.
To solve this problem we have the Reader class. A Reader that is connected to a byte source is capable of decoding the bytes into characters. You need to tell the Reader which character set to use for the decoding. This is done when you instantiate the Reader (actually, when you instantiate one of its subclasses).
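The difference between casting bytes and decoding them can be sketched as follows. This is a minimal illustration, not part of the original text: the class name DecodeDemo and its two helper methods are made up for the example, and the data comes from an in-memory byte array rather than a file. The character set is passed to the InputStreamReader constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, assumed for illustration only.
class DecodeDemo {

    // Naive approach: treat every byte as one character.
    static String castEachByte(byte[] data) throws IOException {
        StringBuilder text = new StringBuilder();
        try (InputStream in = new ByteArrayInputStream(data)) {
            int b = in.read();
            while (b != -1) {
                text.append((char) b);
                b = in.read();
            }
        }
        return text.toString();
    }

    // Reader approach: an InputStreamReader decodes the bytes as UTF-8.
    static String decodeWithReader(byte[] data) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int c = reader.read();
            while (c != -1) {
                text.append((char) c);
                c = reader.read();
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // The last character (U+00E9) is encoded as two bytes in UTF-8.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(castEachByte(utf8).length());     // 5, one char per byte
        System.out.println(decodeWithReader(utf8).length()); // 4, the original text
    }
}
```

The byte-casting version produces one char per byte, so the two-byte character comes out as two unrelated characters, while the Reader version restores the original four characters.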
Java Reader Subclasses
You will normally use a Reader subclass rather than a Reader directly. Java IO contains a lot of Reader subclasses. Here is a list of the Java Reader subclasses:
- InputStreamReader
- CharArrayReader
- FileReader
- PipedReader
- BufferedReader
- FilterReader
- PushbackReader
- LineNumberReader
- StringReader
Here is an example of creating a Java FileReader, which is a subclass of Java Reader:
Reader reader = new FileReader("/path/to/file/thefile.txt");
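Because all of these classes are Reader subclasses, code written against the Reader type works with any of them. The sketch below assumes a made-up helper, readAll(), and uses two in-memory subclasses (StringReader and CharArrayReader) so that it runs without any file on disk.

```java
import java.io.CharArrayReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, assumed for illustration only.
class ReaderSourcesDemo {

    // Works with any Reader subclass; the caller decides the source.
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        int c = reader.read();
        while (c != -1) {
            text.append((char) c);
            c = reader.read();
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        try (Reader fromString = new StringReader("from a String");
             Reader fromChars = new CharArrayReader("from a char array".toCharArray())) {
            System.out.println(readAll(fromString));
            System.out.println(readAll(fromChars));
        }
    }
}
```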
Read Characters From a Reader
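The read() method of a Java Reader returns an int which contains the char value of the next character read. If the read() method returns -1, there is no more data to read in the Reader, and it can be closed. That is, -1 as int value, not -1 as byte or char value. There is a difference here! A char is an unsigned 16-bit value and cannot hold -1, which is why read() returns an int, and why you must compare the returned int to -1 before casting it to a char.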
Here is an example of reading all characters from a Java Reader:
Reader reader = new FileReader("/path/to/file/thefile.txt");

int theCharNum = reader.read();
while(theCharNum != -1) {
    char theChar = (char) theCharNum;
    System.out.print(theChar);
    theCharNum = reader.read();
}
Notice how the code example first reads a single character from the Java Reader and checks if the numerical value returned is equal to -1. If not, it processes that char and continues reading until -1 is returned from the Reader read() method.
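Here is a small, self-contained variation of the same loop, sketched with a StringReader so it can run without a file. The class name ReadCharDemo and the countChars() helper are assumptions made for this example. The last line shows why the end-of-data check has to happen on the int value.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, assumed for illustration only.
class ReadCharDemo {

    // Counts the characters in a Reader by reading them one at a time.
    static int countChars(Reader reader) throws IOException {
        int count = 0;
        int value = reader.read();   // keep the result as an int
        while (value != -1) {        // compare to -1 before any cast to char
            count++;
            value = reader.read();
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        try (Reader reader = new StringReader("Hello")) {
            System.out.println(countChars(reader)); // 5
        }
        // Casting -1 to char gives 65535, so a char can never equal -1.
        System.out.println((int) (char) -1);        // 65535
    }
}
```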
Read Array of Characters From Reader
The Java Reader class also has a read() method that takes a char array as parameter, as well as a start offset and length. The char array is where the read() method will read the characters into. The offset parameter is the index in the char array at which the read() method should start writing characters. The length parameter is the maximum number of characters the read() method should read into the char array from the offset and forward. Here is an example of reading an array of characters into a char array with a Java Reader:
Reader reader = new FileReader("/path/to/file/thefile.txt");

char[] theChars = new char[128];

int charsRead = reader.read(theChars, 0, theChars.length);
while(charsRead != -1) {
    System.out.println(new String(theChars, 0, charsRead));
    charsRead = reader.read(theChars, 0, theChars.length);
}
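The read(char[], offset, length) method returns the number of characters read into the char array, or -1 if there are no more characters to read in the Reader, for instance if the end of the file the Reader is connected to has been reached. The number returned can be smaller than the length you requested, so only that many entries of the char array contain valid data.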
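The following sketch shows the same pattern in a runnable form, using a StringReader and a deliberately tiny buffer so that several rounds of read() are needed. The class name ReadArrayDemo and the readAll() helper are assumptions made for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, assumed for illustration only.
class ReadArrayDemo {

    // Reads everything from the Reader in chunks of up to bufferSize chars.
    // bufferSize must be greater than 0.
    static String readAll(Reader reader, int bufferSize) throws IOException {
        char[] buffer = new char[bufferSize];
        StringBuilder text = new StringBuilder();
        int charsRead = reader.read(buffer, 0, buffer.length);
        while (charsRead != -1) {
            // Only the first charsRead entries are valid in this round.
            text.append(buffer, 0, charsRead);
            charsRead = reader.read(buffer, 0, buffer.length);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        try (Reader reader = new StringReader("abcdefghij")) {
            System.out.println(readAll(reader, 4)); // abcdefghij
        }
    }
}
```

Using the returned count, rather than the full array length, is what keeps leftover characters from an earlier round out of the result.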
Read Performance
Reading an array of characters at a time is faster than reading a single character at a time from a Java Reader. The difference can easily be a factor of 10 or more. The exact speedup depends on the size of the char array you read into, and on the OS, hardware etc. of the computer you are running the code on. You should study the disk buffer sizes etc. of the target system before deciding on a size. Buffer sizes of 8KB and up will typically give a good speedup. However, once your char array exceeds the capacity of the underlying OS and hardware, you won't get a bigger speedup from a bigger char array.
You will probably have to experiment with different char array sizes and measure read performance to find the optimal char array size.
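One way to run such an experiment is sketched below. It is a rough measurement harness, not a proper benchmark: the class name ReadTimingDemo, the temporary file, and the chosen buffer sizes are all assumptions made for this example, and the timings it prints will vary from machine to machine and from run to run.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class, assumed for illustration only.
class ReadTimingDemo {

    // Reads the whole file one character at a time; returns the char count.
    static long readOneByOne(Path file) throws IOException {
        long total = 0;
        try (Reader reader = new FileReader(file.toFile())) {
            while (reader.read() != -1) {
                total++;
            }
        }
        return total;
    }

    // Reads the whole file into a char array of the given size, round by round.
    static long readInChunks(Path file, int bufferSize) throws IOException {
        long total = 0;
        char[] buffer = new char[bufferSize];
        try (Reader reader = new FileReader(file.toFile())) {
            int charsRead = reader.read(buffer, 0, buffer.length);
            while (charsRead != -1) {
                total += charsRead;
                charsRead = reader.read(buffer, 0, buffer.length);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Test data: plain ASCII, so one byte per character.
        byte[] data = new byte[1_000_000];
        Arrays.fill(data, (byte) 'a');
        Path file = Files.createTempFile("reader-timing", ".txt");
        try {
            Files.write(file, data);

            long start = System.nanoTime();
            readOneByOne(file);
            System.out.println("single chars: " + (System.nanoTime() - start) / 1_000 + " us");

            for (int size : new int[] {128, 1024, 8 * 1024, 64 * 1024}) {
                start = System.nanoTime();
                readInChunks(file, size);
                System.out.println("char[" + size + "]: " + (System.nanoTime() - start) / 1_000 + " us");
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

For numbers you intend to rely on, repeat each measurement several times and discard the first runs, since JVM warm-up and OS file caching affect the early results.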
Transparent Buffering via BufferedReader
You can add transparent, automatic reading and buffering of an array of characters from a Reader using a Java BufferedReader. The BufferedReader reads a chunk of chars into a char array from the underlying Reader. You can then read the characters one by one from the BufferedReader and still get a lot of the speedup that comes from reading an array of chars rather than one character at a time. Here is an example of wrapping a Java Reader in a BufferedReader:
Reader input = new BufferedReader(
    new FileReader("c:\\data\\input-file.txt"),
    1024 * 1024 /* buffer size */
);
Notice that a BufferedReader is a Reader subclass and can be used in any place where a Reader can be used.
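To make the buffering visible, the sketch below wraps the source in a small counting Reader and then reads through a BufferedReader one character at a time. CountingReader and underlyingCalls() are made-up helpers for this example, not part of the Java IO API, and the source is a StringReader rather than a file.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

// Hypothetical demo class, assumed for illustration only.
class BufferedReaderDemo {

    // Hypothetical helper: counts how often the wrapped Reader is asked for data.
    static class CountingReader extends FilterReader {
        int calls = 0;

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            calls++;
            return super.read(cbuf, off, len);
        }
    }

    // Reads all of the text one char at a time through a BufferedReader and
    // returns the number of read calls that reached the underlying Reader.
    static int underlyingCalls(String text, int bufferSize) throws IOException {
        CountingReader source = new CountingReader(new StringReader(text));
        try (Reader reader = new BufferedReader(source, bufferSize)) {
            while (reader.read() != -1) {
                // nothing to do, we only count the underlying calls
            }
        }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        char[] chars = new char[10_000];
        Arrays.fill(chars, 'x');
        // 10,000 single-char reads on the BufferedReader result in far
        // fewer read calls on the underlying Reader.
        System.out.println(underlyingCalls(new String(chars), 1024));
    }
}
```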
Skip Characters
The Java Reader class has a method named skip() which can be used to skip over a number of characters in the input that you do not want to read. You pass the number of characters to skip as parameter to the skip() method. Here is an example of skipping characters from a Java Reader:
long charsSkipped = reader.skip(24);
This example tells the Java Reader to skip over the next 24 characters in the Reader. The skip() method returns the actual number of characters skipped. In most cases that will be the same number as you requested, but if there are fewer characters left in the Reader than the number you asked to skip, the returned number will be smaller than the number you requested.
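Both cases can be sketched with a StringReader, where the amount of remaining data is known exactly. The class name SkipDemo and its two helper methods are assumptions made for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, assumed for illustration only.
class SkipDemo {

    // Skips n characters and returns the first character after the skipped part.
    static char charAfterSkip(String text, long n) throws IOException {
        try (Reader reader = new StringReader(text)) {
            reader.skip(n);
            return (char) reader.read();
        }
    }

    // Returns how many characters skip() reports as actually skipped.
    static long actuallySkipped(String text, long n) throws IOException {
        try (Reader reader = new StringReader(text)) {
            return reader.skip(n);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(charAfterSkip("0123456789", 4));     // 4
        System.out.println(actuallySkipped("0123456789", 4));   // 4
        // Only 3 characters are available, so only 3 can be skipped.
        System.out.println(actuallySkipped("abc", 100));        // 3
    }
}
```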
Closing a Reader
When you are finished reading characters from a Reader you should remember to close it. Closing a Reader is done by calling its close() method. Here is how closing a Reader looks:
reader.close();
You can also use the Java try-with-resources construct introduced in Java 7. Here is how using and closing a FileReader looks with the try-with-resources construct:
try(Reader reader = new FileReader("/path/to/file/thefile.txt")){
    int data = reader.read();
    while(data != -1) {
        System.out.print((char) data);
        data = reader.read();
    }
}
Notice how there is no longer any explicit close() method call. The try-with-resources construct takes care of that.
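If you want to convince yourself that the Reader really is closed when the try block ends, a sketch like the one below can help. It relies on the fact that a StringReader throws an IOException when you read from it after it has been closed. The class name TryWithResourcesDemo and the closedAfterTry() helper are assumptions made for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, assumed for illustration only.
class TryWithResourcesDemo {

    // Returns true if the Reader was closed by the try-with-resources block.
    static boolean closedAfterTry() throws IOException {
        Reader reader = new StringReader("data");

        try (Reader r = reader) {
            r.read(); // use the Reader; no explicit close() call
        }

        try {
            reader.read();  // reading from a closed StringReader fails
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(closedAfterTry()); // true
    }
}
```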