C++, 8-bit bytes, portability, and self-documenting code

I’ve started working on a Chip-8 emulator in C++ for practice, and it seemed obvious to me to use uint8_t to represent the 8-bit bytes in the system. However, reading through stuff online, I found that everyone seems to use char for 8-bit bytes. This seems odd to me for two reasons:

  1. It’s a bit counter-intuitive to use “char” to refer to something that isn’t meant to represent characters, not to mention uint8_t would make for more clearly self-documenting code because it communicates intent
  2. I think I recall reading that the size of char is implementation-defined, so it might not even be 8 bits at all

But given that the use of char for this purpose is so widespread and seemingly idiomatic for C++, am I just being stupid here?

I would use uint8_t as well. In practice it doesn’t make any difference, but it shows your intent better than char. I really don’t use char for anything anymore. It’s obviously not suited for characters, and if I need fixed-width integers the <cstdint> header has them all.
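On the question of how big a char is, here is a minimal sketch of what the language does and doesn’t promise. sizeof(char) is always 1; what is implementation-defined is the number of bits in that byte (CHAR_BIT, which must be at least 8). std::uint8_t is optional and only exists where an exact 8-bit type is available. The names Byte and make_opcode are made up for illustration.

```cpp
#include <climits>      // CHAR_BIT
#include <cstdint>      // std::uint8_t, std::uint16_t
#include <type_traits>  // std::is_unsigned

// sizeof(char) is 1 by definition; CHAR_BIT says how many bits that is.
static_assert(sizeof(char) == 1, "true on every conforming implementation");
static_assert(CHAR_BIT >= 8, "the standard requires at least 8 bits per byte");

// If std::uint8_t did not exist on the target, this alias would fail to
// compile, which is a louder failure than silently using a wider char.
using Byte = std::uint8_t;  // hypothetical alias name
static_assert(sizeof(Byte) == 1, "an exact 8-bit type occupies one byte");
static_assert(std::is_unsigned<Byte>::value, "uint8_t is unsigned");

// Hypothetical helper: join two 8-bit values into one 16-bit value,
// high byte first (Chip-8 opcodes are stored that way).
constexpr std::uint16_t make_opcode(Byte hi, Byte lo) {
    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

For example, make_opcode(0x12, 0x34) yields 0x1234. On a typical desktop platform all of the assertions pass, which is why the choice between char and uint8_t rarely matters in practice.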

Use uint8_t. A lot of old code uses char because the _t typedefs for fixed-width types haven’t been in C++ from the beginning (<cstdint> only arrived with C++11).

Here is a good guide to writing modern C++:
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md

That’s reassuring. I have a follow up question:
In the emulator I need to read some binary data into “memory”, and what I thought of doing was something like this:

std::array<uint8_t, 4096> memory;
std::ifstream ifs {path-to-file, std::ios::binary};  // binary mode: no newline translation
// load file into memory starting at 0x200
auto mem_it {memory.begin() + 512};
std::copy(std::istreambuf_iterator<char>(ifs),
          std::istreambuf_iterator<char>(),
          mem_it);

This seems to work, but the use of char is unfortunate, and replacing it with uint8_t makes the compiler complain about no known conversion from std::basic_ifstream<char> to std::basic_streambuf<unsigned char, std::char_traits<unsigned char> >*. Is there some obvious way of using uint8_t as the type parameter to std::istreambuf_iterator that I’m missing here, or am I just going to have to live with the char?

ifstream is just a typedef for std::basic_ifstream<char>, as follows:

typedef basic_ifstream<char> ifstream;

Which is why it expects you to use char when declaring iterators for the stream.

If you want to use uint8_t you would probably need to declare your file stream using the basic_ifstream template as:

std::basic_ifstream<uint8_t> ifs {path-to-file};

Thanks! That got the code to compile with uint8_t. It’s curious though, that when I tried to write back out to disk using a basic_ofstream<uint8_t>, I got a bad_cast exception from the stream constructor. It kind of seems like the STL really doesn’t want to let go of char. I don’t suppose there’s yet another thing I’m missing here?

Edit: hurrrrr
The bad_cast wasn’t coming from the constructor to basic_ofstream, it was coming from

copy(istreambuf_iterator<uint8_t>(ifs), istreambuf_iterator<uint8_t>(), mem_it);

Either way I’m kind of getting the impression that the STL insists on doing characterwise IO, rather than letting me insist on 8-bit bytewise IO. Maybe I’ll look and see if there’s something in C that I can use.

That makes sense, because char is defined as the smallest addressable unit of memory, i.e. a byte. This is actually a valid use case for char.
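The bad_cast most likely comes from the file stream looking for a character-conversion (codecvt) facet for uint8_t in its locale, which the standard library is not required to provide. A minimal sketch of one way around it: keep the memory as uint8_t, use the ordinary char-based file streams, and view the buffer as char only at the read and write calls. Inspecting any object through a char pointer is permitted, so the cast is well defined. Memory, load_file and save_file are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>

using Memory = std::array<std::uint8_t, 4096>;  // hypothetical alias

// Hypothetical helper: read a file into memory starting at offset.
// Returns the number of bytes read (0 if the file cannot be opened).
std::size_t load_file(const std::string& path, Memory& memory,
                      std::size_t offset = 0x200) {
    if (offset >= memory.size()) return 0;
    std::ifstream ifs {path, std::ios::binary};
    if (!ifs) return 0;
    // Read at most the space remaining after offset.
    ifs.read(reinterpret_cast<char*>(memory.data() + offset),
             static_cast<std::streamsize>(memory.size() - offset));
    return static_cast<std::size_t>(ifs.gcount());
}

// Hypothetical helper: write count bytes of memory, starting at offset,
// to a file. Returns true on success.
bool save_file(const std::string& path, const Memory& memory,
               std::size_t offset, std::size_t count) {
    if (offset > memory.size() || count > memory.size() - offset) return false;
    std::ofstream ofs {path, std::ios::binary};
    if (!ofs) return false;
    ofs.write(reinterpret_cast<const char*>(memory.data() + offset),
              static_cast<std::streamsize>(count));
    ofs.flush();  // surface write errors before reporting success
    return static_cast<bool>(ofs);
}
```

The char only shows up inside these two functions; the rest of the emulator can work with uint8_t throughout.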

As long as you keep the types the same there shouldn’t be any issues. And I would recommend trying to stick with the STL, because once you learn it, it’s so much more powerful than just using C functions.

A cursory look at C didn’t yield anything that seemed sensible in this case. It seems like char and byte being synonymous is just how C++ is after all. I don’t like it, but I guess it works. I ended up using normal fstreams in the end after all. Thanks again for all the help.