The first and most important thing to remember about binary I/O is
that opening a file with ios::binary is not, repeat not, the only
thing you have to do. It is not a silver bullet, and will not allow
you to use the <</>> operators of the normal fstreams to do binary
I/O.
Sorry. Them's the breaks.
This isn't going to try to be a complete tutorial on reading and writing binary files (because "binary" covers a lot of ground), but we will try to clear up a couple of misconceptions and common errors.
First, ios::binary has exactly one defined effect, no more and no
less. Normal text mode has to be concerned with the newline
characters, and the runtime system will translate between (for
example) '\n' and the appropriate end-of-line sequence (LF on Unix,
CRLF on DOS, CR on classic Mac OS, etc). (There are other things that
normal mode does, but that's the most obvious.) Opening a file in
binary mode disables this conversion, so reading a CRLF sequence
under Windows won't accidentally get mapped to a '\n' character, etc.
Binary mode is not supposed to suddenly give you a bitstream, and
if it is doing so in your program then you've discovered a bug in
your vendor's compiler (or some other part of the C++ implementation,
possibly the runtime system).
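As a rough illustration of that one effect, the sketch below writes a few bytes through a stream opened with ios::binary and reads them back the same way. The function name round_trip_binary and the file name in the usage note are made up for this example. With both ends in binary mode, the bytes should come back untouched on any platform, including the CR, LF and Ctrl-Z characters that text mode may treat specially on some systems.

```cpp
#include <fstream>
#include <ios>
#include <iterator>
#include <string>

// Hypothetical helper: write `bytes` to `path` in binary mode, then read
// the whole file back, also in binary mode.
std::string round_trip_binary(const std::string& path, const std::string& bytes)
{
    {
        std::ofstream out(path.c_str(),
                          std::ios::out | std::ios::binary | std::ios::trunc);
        // write() is unformatted: it copies exactly bytes.size() characters.
        out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    }   // the destructor flushes and closes the file here

    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    // Read every character up to end-of-file (empty if the open failed).
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling round_trip_binary("demo.bin", std::string("a\r\nb\n\x1a", 6)) should hand back the same six bytes. Dropping ios::binary from either stream is where platform differences would start to show.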
Second, using << to write and >> to read isn't going to work with
the standard file stream classes, even if you turn off whitespace
skipping with noskipws during reading. Why not? Because
ifstream and ofstream exist for the purpose of formatting,
not reading and writing. Their job is to interpret the data into
text characters, and that's exactly what you don't want to happen
during binary I/O.
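A small sketch of what that means in practice: the hypothetical function write_with_inserter below opens a file with ios::binary, inserts an int with <<, and returns the file's contents. Assuming the default "C" locale, the file should hold the decimal digits of the number, not the bytes of its in-memory representation.

```cpp
#include <fstream>
#include <ios>
#include <iterator>
#include <string>

// Hypothetical helper: insert `value` with operator<< into a file opened
// in binary mode, then return whatever ended up in the file.
std::string write_with_inserter(const std::string& path, int value)
{
    {
        std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
        out << value;   // formatted output: the int is converted to text
    }   // flushed and closed here

    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

For example, write_with_inserter("demo.txt", 258) yields the three characters "258", whatever sizeof(int) happens to be.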
Third, using the get() and put()/write() member functions still
isn't guaranteed to help you. These are "unformatted" I/O
functions, but they are still character-based.
(This may or may not be what you want, see below.)
Notice how all the problems here are due to the inappropriate use of formatting functions and classes to perform something which requires that formatting not be done? There are a seemingly infinite number of solutions, and a few are listed here:
“Derive your own fstream-type classes and write your own <</>> operators to do binary I/O on whatever data types you're using.”
This is a Bad Thing, because while the compiler would probably be just fine with it, other humans are going to be confused. The overloaded bitshift operators have a well-defined meaning (formatting), and this breaks it.
“Build the file structure in memory, then mmap() the file and copy
the structure.”
Well, this is easy to make work, and easy to break, and is
pretty equivalent to using ::read() and ::write() directly, and
makes no use of the iostream library at all...
“Use streambufs, that's what they're there for.”
While not trivial for the beginner, this is the best of all solutions. The streambuf/filebuf layer is the layer that is responsible for actual I/O. If you want to use the C++ library for binary I/O, this is where you start.
How to go about using streambufs is a bit beyond the scope of this document (at least for now), but while streambufs go a long way, they still leave a couple of things up to you, the programmer. As an example, byte ordering is completely between you and the operating system, and you have to handle it yourself.
Deriving a streambuf or filebuf class from the standard ones, one
that is specific to your data types (or an abstraction thereof) is
probably a good idea, and lots of examples exist in journals and on
Usenet. Using the standard filebufs directly (either by declaring
your own or by using the pointer returned from an fstream's rdbuf())
is certainly feasible as well.
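For the simpler of those two routes, here is a minimal sketch of using a std::filebuf directly. The helper names write_bytes and read_bytes are invented for this example; only open(), sputn(), sgetn() and close() come from the standard library. It is one way to move raw bytes without any formatting layer, not the only one.

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: write raw bytes through a filebuf.
bool write_bytes(const std::string& path, const std::vector<unsigned char>& bytes)
{
    std::filebuf buf;
    if (!buf.open(path.c_str(),
                  std::ios::out | std::ios::binary | std::ios::trunc))
        return false;

    const std::streamsize n = static_cast<std::streamsize>(bytes.size());
    std::streamsize written = 0;
    if (n > 0)
        written = buf.sputn(reinterpret_cast<const char*>(bytes.data()), n);

    // close() flushes the buffer and returns a null pointer on failure.
    const bool closed = (buf.close() != nullptr);
    return closed && written == n;
}

// Hypothetical helper: read a whole file as raw bytes through a filebuf.
std::vector<unsigned char> read_bytes(const std::string& path)
{
    std::vector<unsigned char> bytes;
    std::filebuf buf;
    if (!buf.open(path.c_str(), std::ios::in | std::ios::binary))
        return bytes;   // empty on failure

    char chunk[256];
    std::streamsize got = 0;
    // sgetn() returns the number of characters actually read; 0 at EOF.
    while ((got = buf.sgetn(chunk, static_cast<std::streamsize>(sizeof chunk))) > 0)
        bytes.insert(bytes.end(), chunk, chunk + got);
    return bytes;
}
```

The same calls are available on the pointer returned by an fstream's rdbuf(), so a program that already has a stream open does not need a separate filebuf object.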
One area that causes problems is trying to do bit-by-bit operations
with filebufs. C++ is no different from C in this respect: I/O
must be done at the byte level. If you're trying to read or write
a few bits at a time, you're going about it the wrong way. You
must read/write an integral number of bytes and then process the
bytes. (For example, the streambuf functions take and return
variables of type int_type.)
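One way to follow that advice is to collect bits in memory and hand the streambuf only whole bytes. The BitWriter class and pack_bits function below are hypothetical names for this sketch, and the choices of most-significant-bit-first order and zero padding for the final partial byte are assumptions made for the example, not requirements of the library.

```cpp
#include <ios>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: accumulates single bits and emits whole bytes.
class BitWriter {
public:
    explicit BitWriter(std::streambuf& sink) : sink_(sink), acc_(0), count_(0) {}

    // Append one bit; bits fill each byte starting from the most significant.
    void put_bit(bool bit)
    {
        acc_ = static_cast<unsigned char>((acc_ << 1) | (bit ? 1u : 0u));
        if (++count_ == 8)
            flush_byte();
    }

    // Pad any partial final byte with zero bits and emit it.
    void finish()
    {
        if (count_ == 0)
            return;
        acc_ = static_cast<unsigned char>(acc_ << (8 - count_));
        flush_byte();
    }

private:
    void flush_byte()
    {
        sink_.sputc(static_cast<char>(acc_));   // the only actual I/O call
        acc_ = 0;
        count_ = 0;
    }

    std::streambuf& sink_;
    unsigned char acc_;
    int count_;
};

// Pack a string of '0'/'1' characters into bytes, using a stringbuf as a
// stand-in for a filebuf opened in binary mode.
std::string pack_bits(const std::string& bits)
{
    std::stringbuf sink(std::ios::out);
    BitWriter writer(sink);
    for (char c : bits)
        writer.put_bit(c == '1');
    writer.finish();
    return sink.str();
}
```

Since BitWriter only needs a std::streambuf&, swapping the stringbuf for a std::filebuf changes nothing in the bit-handling code.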
Another area of problems is opening text files in binary mode. Generally, binary mode is intended for binary files, and opening text files in binary mode means that you now have to deal with all of those end-of-line and end-of-file problems that we mentioned before.
An instructive thread from comp.lang.c++.moderated delved off into this topic starting more or less at this post and continuing to the end of the thread. (The subject heading is "binary iostreams" on both comp.std.c++ and comp.lang.c++.moderated.) Take special note of the replies by James Kanze and Dietmar Kühl.
Briefly, the problems of byte ordering and type sizes mean that
the unformatted functions like ostream::put() and istream::get()
cannot safely be used to communicate between arbitrary programs, or
across a network, or from one invocation of a program to another
invocation of the same program on a different platform, etc.
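Handling byte order yourself usually means picking a fixed order and width for the external format and converting explicitly. The sketch below assumes a 32-bit unsigned value stored most significant byte first. The names put_u32_be and get_u32_be are invented for this example, and the format itself is an arbitrary choice for illustration.

```cpp
#include <cstdint>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: write `v` as exactly four bytes, most significant
// byte first, regardless of the host's native byte order.
void put_u32_be(std::streambuf& sb, std::uint32_t v)
{
    for (int shift = 24; shift >= 0; shift -= 8) {
        const unsigned char byte = static_cast<unsigned char>((v >> shift) & 0xFFu);
        sb.sputc(static_cast<char>(byte));
    }
}

// Hypothetical helper: read four bytes written by put_u32_be().
// Returns false (leaving `v` untouched) if fewer than four bytes remain.
bool get_u32_be(std::streambuf& sb, std::uint32_t& v)
{
    typedef std::streambuf::traits_type traits;
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        const traits::int_type c = sb.sbumpc();   // int_type, so EOF is distinguishable
        if (traits::eq_int_type(c, traits::eof()))
            return false;
        result = (result << 8) | static_cast<unsigned char>(traits::to_char_type(c));
    }
    v = result;
    return true;
}
```

Writing 0x01020304 this way always produces the bytes 01 02 03 04 in that order, so a reader on a machine with a different native byte order or a different sizeof(int) can still decode it.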