Multithreading advice for a game engine?

So I’ve been building a game engine for my thesis, and now that I’m done with the thesis, I want to rewrite/refactor it and put it in better shape.

I want to make this new version multithreaded, but I have very limited knowledge and experience with that. I am thinking about a task system, where you have a pool of threads and use them to do tasks (i.e. pathfinding for multiple entities, narrow phase of collision detection, etc).

But I don’t wanna fuck it up and lock on everything, degrading performance.

Can anybody recommend good reading on multithreading theory?

A good first place to start is learning the differences (pros vs. cons) between multithreading and multiprocessing.

If by multiprocessing you mean multi-CPU/core, I don’t really see how that is relevant.

If you mean using multiple processes, that’s nuts coz IPC will eat my frame times.

I need multiple threads in the same address space, but I am not sure how to architect a system that behaves nicely with this execution model, since I’ve never really done anything that complex and multithreaded.

Yes, but if you aren’t familiar with these subjects then that is obviously the place to start. Clearly I was wrong and you have some familiarity already. :slight_smile:

It wasn’t clear from the OP (well, to me anyway). To be fair though, I’m not a programmer. I might write some shaders and hack together some game logic from time to time though.

The title is probably bad, but in the body I did mention that I have some experience with that. It’s just that I feel it’s not on par with what I want to do.

Don’t get me wrong, I appreciate that you’re trying to help :wink:


EDIT: Changed topic title

Have you heard of the Rust programming language? One of the big selling points of the language is that it is memory-safe (i.e., it prevents segfaults and eliminates use-after-free/double-free bugs at compile time, though leaks are still possible) without a garbage collector. It does this with a compiler pass called the “borrow checker”, which keeps track of the lifetimes of references and makes sure that a reference can never be left dangling.

It turns out that something you get for free with this is thread-safety. In safe Rust, code with data races won’t compile, which is pretty dang cool.

rayon is the de facto standard data-parallelism library for the Rust programming language. IIRC, the default configuration consists of a work-stealing threadpool like the one you describe. It’s also pretty easy to use; for example, let’s say you wanted to update the physics for all the entities in the game. In Rust, this could look something like:

// Brings rayon's parallel iterator methods into scope.
use rayon::prelude::*;

// Syntax note: variables are `name: Type` rather than
// `Type name` like C-family languages. Also, the `&mut`
// just means "mutable reference to".
fn update_entity_physics(entity: &mut Entity) {
    // update code goes here...
}

// Let's say our entities are in a vector called `entities`.
// This is all the code you would need to write to make
// it run in parallel.
fn update_all(entities: &mut Vec<Entity>) {
    entities.par_iter_mut().for_each(update_entity_physics);
}

rayon extends most of the collection types in Rust with a couple of methods, one of which is called par_iter_mut. This method produces an iterator over the elements of the collection (that’s what the iter in the name means). This iterator can iterate over the elements in parallel (that’s what the par in the name means), and it will be mutating the elements (that’s what the mut in the name means). rayon handles splitting up the work for you, which is nice.
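
To give a rough idea of what that splitting involves, here is a minimal sketch that uses only the standard library's scoped threads. It is not how rayon is implemented (rayon uses work stealing, this is a fixed split), and the `Entity` fields, the `DT` constant, and the `update_all` helper are made up for illustration.

```rust
use std::thread;

// Hypothetical entity type, only for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Entity {
    pub position: f32,
    pub velocity: f32,
}

// Hypothetical fixed timestep in seconds.
pub const DT: f32 = 0.5;

// Hypothetical per-entity update: one explicit Euler step.
pub fn update_entity_physics(entity: &mut Entity) {
    entity.position += entity.velocity * DT;
}

// Splits `entities` into at most `workers` contiguous chunks and updates
// each chunk on its own thread. The chunks do not overlap, so no locking
// is needed. This is a static split with no work stealing.
pub fn update_all(entities: &mut [Entity], workers: usize) {
    if entities.is_empty() {
        return;
    }
    let workers = workers.max(1);
    // Ceiling division so every entity lands in some chunk.
    let chunk_len = (entities.len() + workers - 1) / workers;
    thread::scope(|scope| {
        for chunk in entities.chunks_mut(chunk_len) {
            scope.spawn(move || {
                for entity in chunk {
                    update_entity_physics(entity);
                }
            });
        }
        // The scope waits for every spawned thread before returning.
    });
}
```

Something like `update_all(&mut entities, 4)` would then update the whole vector using up to four threads. A real pool would keep its threads alive between frames instead of spawning them on every call.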

Rust doesn’t have a runtime (or at least, no more of one than C does), and there’s support for both calling into C code and being called from it, so an option might be to just rewrite the parts that you want to be multithreaded in Rust, and have your existing code (which is presumably C/C++?) call the Rust code when you need multithreading.

If I’ve caught your interest about the language, I’ll point you in the direction of “The Book”. It’s the definitive starting point for learning the Rust language. Unfortunately, the language has a bit of a bad reputation when it comes to learning it for the first time, but IMO if you wrote a game engine for your thesis I think you’ll be fine. There’s a subreddit of people always willing to help too.

I really like Rust. It’s one of my favorites. However, I have found that the documentation for implementing game engines in it is rather lacking, because the ecosystem isn’t really mature/mainstream yet. So a lot of the tools out there are fledglings.

There is also a subreddit for Rust game development.

Exactly this :+1:

Take a look at Intel Threading Building Blocks. It’s a C++ library that provides helper functions and data structures optimized for multithreading. Perhaps most importantly, it comes with a task system, which is exactly what you need.

I mostly write high-performance multithreaded code in Go these days (because I can), but modern C++ also has pretty much everything you’d need in the standard library.

A useful primitive to have is an efficient thread-safe queue, aka a channel.
Once you have it, and assuming you also have threads, you can build a threadpool. A threadpool should have as many threads as you have cores to dedicate to your work. Threadpool threads generally just loop, pull callbacks off the queue, and run them. If you have more threads than cores, you’re letting the OS context-switch between your tasks, which hurts performance somewhat; if you have fewer, you’re wasting cores.
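
As a rough illustration of that loop, here is a minimal sketch in Rust (to match the snippet earlier in the thread), using only the standard library. The `ThreadPool` type and its `submit`/`join` methods are hypothetical names, and std's channel is single-consumer, so the workers share the receiving end behind a mutex. A production queue would likely be more efficient than this.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// A task is any callback that can be handed to another thread.
type Task = Box<dyn FnOnce() + Send + 'static>;

pub struct ThreadPool {
    sender: Option<Sender<Task>>,
    workers: Vec<JoinHandle<()>>,
}

impl ThreadPool {
    // One worker per available core, falling back to 1 if the count is unknown.
    pub fn with_core_count() -> ThreadPool {
        let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
        ThreadPool::new(cores)
    }

    pub fn new(threads: usize) -> ThreadPool {
        let (sender, receiver) = channel::<Task>();
        let receiver = Arc::new(Mutex::new(receiver));
        let mut workers = Vec::new();
        for _ in 0..threads.max(1) {
            let receiver = Arc::clone(&receiver);
            workers.push(thread::spawn(move || loop {
                // The lock guard is a temporary that is dropped at the end of
                // this statement, so the task itself runs without the lock held.
                let next = receiver.lock().unwrap().recv();
                match next {
                    Ok(task) => task(),
                    Err(_) => break, // queue closed and drained: shut down
                }
            }));
        }
        ThreadPool { sender: Some(sender), workers }
    }

    // Pushes a callback onto the queue; some worker will pick it up.
    pub fn submit<F: FnOnce() + Send + 'static>(&self, task: F) {
        self.sender.as_ref().unwrap().send(Box::new(task)).unwrap();
    }

    // Closes the queue, lets the workers drain it, then waits for them.
    pub fn join(mut self) {
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
    }
}
```

Usage would be along the lines of `let pool = ThreadPool::with_core_count(); pool.submit(|| do_work()); pool.join();`, where `do_work` stands in for whatever the task is.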

Try not to sleep or otherwise block much while running code on a threadpool. This usually means segmenting the work and splitting it across several task callbacks. You’ll become very familiar with lambdas and std::bind, and with mutexes and condition variables.
Sometimes you’ll want to run syscalls or allocate/deallocate memory from your threadpool tasks. That’s OK in general, as long as you’re not wasting CPU time (a major source of syscalls and libc calls is usually malloc; see tcmalloc, which amortizes the cost of coordinating memory freelists across threads).

As a computing concept, it helps to have some awareness of how cache coherency is maintained in a modern multi-core system. A safe bet is to split your data so that mostly read-only stuff is kept separate from writable stuff, and so that the writable stuff each threadpool task works on at a time is decently sized; this is so that you don’t have many cores stepping on each other’s cache lines. If your thing takes seconds, aim for tens of milliseconds of work per callback; if your thing to schedule takes milliseconds, aim for about 100 µs of work per callback.
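
Here is a small sketch of that data split, again in Rust to match the earlier snippet. The input is shared read-only, and each chunk writes only to its own output slot, padded so that two slots never share a cache line. The 64-byte line size is an assumption (common, but not universal), and `PaddedSum` and `parallel_sum` are hypothetical names. It spawns one thread per chunk only to stay short; in a pool, each chunk would be one task, and `chunk_len` is the knob for how much work each callback gets.

```rust
use std::thread;

// Assumption: 64-byte cache lines. The alignment forces each slot onto
// its own line, so workers never write to the same line.
#[repr(align(64))]
#[derive(Default)]
pub struct PaddedSum(pub u64);

// Sums `data` in chunks of `chunk_len` elements. `data` is read-only and
// shared; each worker writes exactly once, to its own padded slot.
pub fn parallel_sum(data: &[u32], chunk_len: usize) -> u64 {
    let chunk_len = chunk_len.max(1);
    let chunk_count = (data.len() + chunk_len - 1) / chunk_len;
    let mut partials: Vec<PaddedSum> =
        (0..chunk_count).map(|_| PaddedSum::default()).collect();

    thread::scope(|scope| {
        for (chunk, slot) in data.chunks(chunk_len).zip(partials.iter_mut()) {
            scope.spawn(move || {
                // Accumulate in a local first, then publish once.
                let mut local = 0u64;
                for &value in chunk {
                    local += u64::from(value);
                }
                slot.0 = local;
            });
        }
    });

    // Combine the per-chunk results on the calling thread.
    partials.iter().map(|p| p.0).sum()
}
```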

That’s more or less what I had in mind. I am looking into atomics and lock- and wait-free ways to code that. Memory barriers still seem a bit weird, so gotta read about that.
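
For the memory barrier part, the pattern that usually comes up first is release/acquire publishing. Below is a minimal sketch of it, written in Rust only to match the snippet earlier in the thread; C11's `<stdatomic.h>` has the same orderings as `memory_order_release` and `memory_order_acquire`. The function name `publish_and_read` and the value 42 are made up for illustration.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

// One thread writes a payload and then raises a flag; the other waits for
// the flag and then reads the payload.
pub fn publish_and_read() -> u32 {
    let payload = AtomicU32::new(0);
    let ready = AtomicBool::new(false);

    thread::scope(|scope| {
        scope.spawn(|| {
            payload.store(42, Ordering::Relaxed);
            // Release: writes made before this store become visible to any
            // thread that later observes `true` with an Acquire load.
            ready.store(true, Ordering::Release);
        });

        // Acquire: once this load sees `true`, the payload store above is
        // guaranteed to be visible here as well.
        while !ready.load(Ordering::Acquire) {
            hint::spin_loop();
        }
        payload.load(Ordering::Relaxed)
    })
}
```

With both sides relaxed, the reader could in principle see the flag set and still read a stale payload; the release/acquire pair is what rules that out.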

As for the syscalls, especially malloc, I am gonna pre-allocate a few gigs of memory at the start of the game, and then manage it myself so I don’t need to do syscalls.
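
A minimal sketch of that idea is a bump (arena) allocator: grab one block up front, hand out offsets into it, and reset it all at once. It is written in Rust only to match the snippet earlier in the thread; the same bookkeeping carries over to C. The `Arena` type and its methods are hypothetical, and alignment here is relative to the start of the buffer, not an absolute address.

```rust
// Single-threaded bump allocator over one pre-allocated buffer.
pub struct Arena {
    buffer: Vec<u8>,
    used: usize,
}

impl Arena {
    // The only heap allocation happens here, at startup.
    pub fn with_capacity(bytes: usize) -> Arena {
        Arena { buffer: vec![0u8; bytes], used: 0 }
    }

    // Reserves `size` bytes at an offset that is a multiple of `align`
    // (a power of two). Returns the offset, or None if the arena is full.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        assert!(align.is_power_of_two());
        // Round the current offset up to the next multiple of `align`.
        let start = (self.used + align - 1) & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.buffer.len() {
            return None;
        }
        self.used = end;
        Some(start)
    }

    // Access to a previously reserved range.
    pub fn bytes_mut(&mut self, offset: usize, size: usize) -> &mut [u8] {
        &mut self.buffer[offset..offset + size]
    }

    // Frees everything at once, e.g. at the end of a frame.
    pub fn reset(&mut self) {
        self.used = 0;
    }

    pub fn used(&self) -> usize {
        self.used
    }
}
```

Giving each worker thread its own arena would keep allocation lock-free without needing any atomics in the allocator itself.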


About Rust, I didn’t really like it that much. It’s a great concept, but in practice I spent too much time wrestling with the compiler to get it to do what I want. Maybe I should give it another go (since I only played with it a bit). But this is not the project to do that.

The engine is in C btw. And the rewrite will be in C too.


Thanks y’all for the answers.