Java Reader
Jakob Jenkov
The Java Reader class, java.io.Reader, is the base class for all
Reader subclasses in the Java IO API.
A Java Reader is like a Java InputStream except that it is character based rather
than byte based. In other words, a Java Reader is intended for reading text (characters), whereas an
InputStream is intended for reading raw bytes.
Readers and Sources
A Reader is typically connected to some source of data like a file, char array, network socket
etc. This is also explained in more detail in the Java IO Overview text.
Characters in Unicode
Today, many applications use Unicode (UTF-8 or UTF-16) to store text data. In UTF-8 a single character takes
between one and four bytes to represent. In UTF-16 a character takes either two or four bytes. Therefore,
when reading text data, a single byte in the data may not correspond to one character. If you just
read one byte at a time of UTF-8 data via an InputStream
and try to convert each byte into a char, you may not end up with the text you expected.
To solve this problem we have the Reader class. Reader subclasses such as InputStreamReader are capable of
decoding bytes into characters. You need to tell the Reader which character set to decode the bytes with.
This is done when you instantiate the Reader (actually, when you instantiate one of its subclasses).
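To make the decoding step concrete, here is a minimal sketch that decodes a UTF-8 byte array through an InputStreamReader and compares the result with casting each byte to a char. The class name DecodeExample and its two helper methods are made up for this illustration; the byte array stands in for a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class DecodeExample {

    // Decodes the bytes as UTF-8 by reading them through an InputStreamReader.
    static String decodeUtf8(byte[] bytes) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    // Naive approach: turns every byte into a char and ignores the encoding.
    static String castEachByte(byte[] bytes) {
        StringBuilder text = new StringBuilder();
        for (byte b : bytes) {
            text.append((char) (b & 0xFF));
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // The last character (U+00E9) is encoded as two bytes in UTF-8.
        byte[] bytes = "caf\u00E9".getBytes(StandardCharsets.UTF_8);

        System.out.println(bytes.length);                  // 5
        System.out.println(decodeUtf8(bytes).length());    // 4
        System.out.println(castEachByte(bytes).length());  // 5
    }
}
```

The Reader produces four characters from the five bytes, while the byte-by-byte cast produces five chars, the last two of which are not the character that was stored.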
Java Reader Subclasses
You will normally use a Reader subclass rather than a Reader directly. Java IO contains
a lot of Reader subclasses. Here is a list of the Java Reader subclasses:
- InputStreamReader
- CharArrayReader
- FileReader
- PipedReader
- BufferedReader
- FilterReader
- PushbackReader
- LineNumberReader
- StringReader
Here is an example of creating a Java FileReader which is a subclass of Java Reader:
Reader reader = new FileReader("/path/to/file/thefile.txt");
Read Characters From a Reader
The read() method of a Java Reader returns an int which contains the char value of the next
character read. If the read() method returns -1, there is no more data to read in the Reader,
and it can be closed. Note that this is -1 as an int value, not -1 as a byte or char value. The difference matters, because a char cannot hold a negative value.
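The following sketch illustrates why the end-of-stream check has to be made on the int value. It uses a StringReader so that it runs without a file; the class name EndOfStreamExample and the method countChars() are made up for this illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class EndOfStreamExample {

    // Counts the characters in the Reader by comparing the int result with -1.
    static int countChars(Reader reader) throws IOException {
        int count = 0;
        while (reader.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // U+FFFF is a legal char value. read() reports it as the int 65535, not -1.
        Reader reader = new StringReader("a\uFFFFb");
        System.out.println(countChars(reader));        // 3

        // Casting -1 to char gives U+FFFF, so a check on the char value
        // could not tell that character apart from the end of the stream.
        System.out.println((char) (-1) == '\uFFFF');   // true
    }
}
```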
Here is an example of reading all characters from a Java Reader:
Reader reader = new FileReader("/path/to/file/thefile.txt");
int theCharNum = reader.read();
while(theCharNum != -1) {
    char theChar = (char) theCharNum;
    System.out.print(theChar);
    theCharNum = reader.read();
}
Notice how the code example first reads a single character from the Java Reader and checks if
the char numerical value is equal to -1. If not, it processes that char and continues reading
until -1 is returned from the Reader read() method.
Read Array of Characters From Reader
The Java Reader class also has a read() method that takes a char array
as parameter, as well as a start offset and length. The char array is where the read()
method will store the characters it reads. The offset parameter is the index in the char array where the read()
method should start storing characters. The length parameter is the maximum number of characters the read() method
should store in the char array from the offset and forward. Here is an example of reading
characters into a char array with a Java Reader:
Reader reader = new FileReader("/path/to/file/thefile.txt");
char[] theChars = new char[128];
int charsRead = reader.read(theChars, 0, theChars.length);
while(charsRead != -1) {
    // print() rather than println(), so no extra line breaks are added between chunks
    System.out.print(new String(theChars, 0, charsRead));
    charsRead = reader.read(theChars, 0, theChars.length);
}
The read(char[], offset, length) method returns the number of characters read into the
char array, or -1 if there are no more characters to read in the Reader, for instance
if the end of the file the Reader is connected to has been reached.
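Since the returned count can be smaller than the length you asked for, only the first charsRead entries of the array should be used after each call. Here is a small sketch of that pattern which collects everything into a String. The class name ReadArrayExample and the method readAll() are made up for this illustration, and the buffer is deliberately tiny so that several read() calls are needed.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadArrayExample {

    // Reads the Reader to the end, one char array at a time.
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[8];
        int charsRead;
        while ((charsRead = reader.read(buffer, 0, buffer.length)) != -1) {
            // Append only the part of the array that was filled by this call.
            text.append(buffer, 0, charsRead);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        Reader reader = new StringReader("The quick brown fox");
        System.out.println(readAll(reader));   // The quick brown fox
    }
}
```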
Read Performance
Reading an array of characters at a time is faster than reading a single character at a time from a
Java Reader. The difference can easily be a factor of 10 or more.
The exact speedup gained depends on the size of the char array you read into, and on the OS, hardware etc.
of the computer you are running the code on. You should study the hard disk buffer sizes etc. of the target
system before deciding. Buffer sizes of 8KB and up will typically give a good speedup. However, once your char
array exceeds the capacity of the underlying OS and hardware, you won't get a bigger speedup from a bigger
char array.
You will probably have to experiment with different char array sizes and measure read performance to find the
optimal char array size.
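One rough way to run such an experiment is sketched below. It writes a temporary file, then times reading it one character at a time and with char arrays of a few sizes. The class name ReadTimingSketch, the method names, the file size and the array sizes are all arbitrary choices for this illustration. The timings come from System.nanoTime() around a single run, so treat them as a rough indication rather than a proper benchmark.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class ReadTimingSketch {

    // Reads the file one character at a time and returns the number of chars read.
    static long countSingle(Path file) throws IOException {
        long count = 0;
        try (Reader reader = new FileReader(file.toFile())) {
            while (reader.read() != -1) {
                count++;
            }
        }
        return count;
    }

    // Reads the file through a char array of the given size.
    static long countWithArray(Path file, int arraySize) throws IOException {
        long count = 0;
        char[] buffer = new char[arraySize];
        try (Reader reader = new FileReader(file.toFile())) {
            int charsRead;
            while ((charsRead = reader.read(buffer, 0, buffer.length)) != -1) {
                count += charsRead;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Test data: plain ASCII, so the platform default charset
        // used by FileReader is assumed to decode it as one char per byte.
        byte[] data = new byte[5_000_000];
        Arrays.fill(data, (byte) 'x');
        Path file = Files.createTempFile("reader-timing", ".txt");
        try {
            Files.write(file, data);

            long start = System.nanoTime();
            countSingle(file);
            System.out.println("single char: " + (System.nanoTime() - start) / 1_000_000 + " ms");

            for (int size : new int[] {128, 8 * 1024, 64 * 1024}) {
                start = System.nanoTime();
                countWithArray(file, size);
                System.out.println("char[" + size + "]: " + (System.nanoTime() - start) / 1_000_000 + " ms");
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```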
Transparent Buffering via BufferedReader
You can add transparent, automatic reading and buffering of an array of characters from a Reader
using a Java BufferedReader. The BufferedReader
reads a chunk of chars into a char array from the underlying
Reader. You can then read
the characters one by one from the BufferedReader and still get a lot of the speedup that comes
from reading an array of chars rather than one character at a time. Here is an example of wrapping a
Java Reader in a BufferedReader:
Reader input = new BufferedReader(
    new FileReader("c:\\data\\input-file.txt"),
    1024 * 1024 /* buffer size */
);
Notice that a BufferedReader is a Reader subclass and can be used
in any place where a Reader can be used.
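As a small illustration, the sketch below wraps an arbitrary Reader in a BufferedReader and reads it line by line. The readLine() method is part of BufferedReader but is not covered in the text above. The class name BufferedReaderExample, the method readLines() and the 8 * 1024 buffer size are choices made for this example only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class BufferedReaderExample {

    // Wraps any Reader in a BufferedReader and collects its lines.
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source, 8 * 1024)) {
            String line;
            // readLine() returns null when the end of the input has been reached.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Reader source = new StringReader("first\nsecond\nthird");
        System.out.println(readLines(source));   // [first, second, third]
    }
}
```

Closing the BufferedReader also closes the Reader it wraps, so the source does not need a separate close() call here.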
Skip Characters
The Java Reader class has a method named skip() which can be used to skip over
a number of characters in the input that you do not want to read. You pass the number of characters to
skip as parameter to the skip() method. Here is an example of skipping characters from
a Java Reader :
long charsSkipped = reader.skip(24);
This example tells the Java Reader to skip over the next 24 characters in the Reader.
The skip() method returns the actual number of characters skipped. In most cases that will be
the same number as you requested skipped, but if there are fewer characters left in the Reader
than the number you requested, the returned number will be smaller than the number
you requested.
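Here is a short sketch of both cases, using a StringReader so it can run without a file. The class name SkipExample and the method readAfterSkipping() are made up for this illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class SkipExample {

    // Skips up to charsToSkip characters, then reads the rest of the Reader.
    // Returns "<number of chars actually skipped>:<remaining text>".
    static String readAfterSkipping(Reader reader, long charsToSkip) throws IOException {
        long skipped = reader.skip(charsToSkip);
        StringBuilder rest = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            rest.append((char) c);
        }
        return skipped + ":" + rest;
    }

    public static void main(String[] args) throws IOException {
        // Enough characters available: all 7 requested characters are skipped.
        System.out.println(readAfterSkipping(new StringReader("Hello, Reader"), 7));   // 7:Reader

        // Only 3 characters available: skip() returns 3 and nothing is left to read.
        System.out.println(readAfterSkipping(new StringReader("abc"), 100));           // 3:
    }
}
```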
Closing a Reader
When you are finished reading characters from a Reader you should remember to close it.
Closing a Reader is done by calling its close() method. Here is how
closing a Reader looks:
reader.close();
You can also use the Java try-with-resources construct
introduced in Java 7. Here is how using and closing a FileReader looks with the try-with-resources
construct:
try(Reader reader = new FileReader("/path/to/file/thefile.txt")){
    int data = reader.read();
    while(data != -1) {
        System.out.print((char) data);
        data = reader.read();
    }
}
Notice how there is no longer any explicit close() method call. The try-with-resources construct
takes care of that.
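If you want to see the automatic close happen, the sketch below uses a small StringReader subclass that records whether close() has been called. The classes TryWithResourcesExample and TrackingReader and the method closedAfterTry() are hypothetical helpers written for this illustration, not part of the Java IO API.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class TryWithResourcesExample {

    // A StringReader that remembers whether close() has been called.
    static class TrackingReader extends StringReader {
        boolean closed = false;

        TrackingReader(String text) {
            super(text);
        }

        @Override
        public void close() {
            closed = true;
            super.close();
        }
    }

    // Reads all characters inside a try-with-resources block and
    // reports whether the Reader was closed when the block ended.
    static boolean closedAfterTry(String text) throws IOException {
        TrackingReader tracking = new TrackingReader(text);
        try (Reader reader = tracking) {
            while (reader.read() != -1) {
                // The characters are ignored here, only the close behavior matters.
            }
        }
        return tracking.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(closedAfterTry("some text"));   // true
    }
}
```

The Reader is closed in the same way if an exception is thrown inside the try block.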