[Devember2020] Brawl AI - cockfights but for AI. A way to capture better AI for computer games?

Edit: here is the website: https://brawl.ai

Alright, I’m new here. I saw the announcement last week on YouTube. Greetings and salutations!

I started this side project a couple of months ago, soon after I was made to work from home due to corona (the virus, not the beer). I’m hoping to get a proper first version out before the end of the year, and I thought this challenge came at an opportune time.

The project is a little meta for this challenge:

  • A website for programming AI bots for a turn-based game.
  • The bots compete on leaderboards for the best ranking.
  • Source code for all the competing bots will be published after a period of time, likely in “seasons”.

Motivation for the project:

  • Some computer games just have a terrible AI. How could this situation be improved?
  • Learning AI can be challenging and there are so many different ways of doing it. Learning from other people’s code in a shared environment is a great way to learn.
  • Just getting started with writing and practising AI requires a lot to be set up first. For computer game bots a lot gets in the way, such as interpreting the screen and mimicking user input. How could the barrier to entry be lowered?

Personal motivations:

  • I like writing bots for computer games, especially the grindy ones, but they can get you banned.
  • Learn and use, in this small project, some OTT tech that is normally reserved for large-scale systems.

Tech being used:

  • Mainly Java Spring, Go, Python and JavaScript. (More Go in the future and less Java!)
  • Currently about 15 microservices in 2 separate compute clusters.
  • The web-facing cluster is using Kubernetes.
  • ReactJS SPA for the front end.
  • Mixing x86 and ARM.

I’ve worked quite hard in the past week or so to get a beta version of the website up and running. I thought it would be much more interesting to share if I can actually show something. I’ve signed up to Linode and I’m using their 60-day trial deal. There are a number of parts still missing from the whole project, such as (the objectives for Devember2020):

  • A proper CI/CD chain; not all services are properly tested yet (partially done - some CI/CD is in place and this is currently being worked on).
  • A lot of the services are running “blind” without automated scaling and proper log collection (partially done - got metrics and some log collection, but no automated scaling, yet).
  • There is no email feature yet, such as password recovery for accounts. What would be a great and affordable email service provider for transactional emails?
  • Automated database backups. I’m considering streaming replication + nightly full backup for offline. The site has just been put up and there isn’t really any data on there yet. (fully done)
  • There are some really simplistic placeholder graphics. This is not currently a high priority, as nobody wants to use an incomplete buggy mess even if it looks great.
  • Bug fixes! I’m already aware of a couple that need fixing.

I’m probably not going to be able to improve the graphics before the end of December, but I can hopefully finish up the software and the pipeline. The plan is to put in a couple of hours on some weekday evenings and more on the weekends, averaging out to about an hour a day. This is a hobby project outside my current employment.

Bonus objectives for Devember2020:

  • .onion service for Tor users. A Tor microservice in kubernetes?

I’m open to (and looking for) feedback, criticism, advice etc. Please don’t let this distract you from your own project! I’m also looking for competition in the AI space. I’m not an expert and I’ve posted three very simple bots on the website to start with (they’re just over 100 lines each). They should be easy to beat, but competing alone isn’t all that fun.

Also, any questions are welcome too!

The proof is in the pudding:
https://brawl.ai

Edit: 7th of January, 2021 - Automatic backups are up. Metrics are done. CI/CD is fully up and running, but not yet fully integrated (it is being integrated into the downscaled single-node cluster).

Currently working on downscaling and rewriting Java programs in Golang. New tools that have been introduced since the start of Devember: Concourse-CI + Vault, k0s + Traefik + MetalLB, Terraform, and very soon: cert-manager and external-dns. Currently being removed: Java Spring Cloud Gateway and custom certbot.

Once the downscaling efforts are done, then I can finally do the email feature and start looking into improving the graphics. And maybe that onion service and vscode plugin. I have also received some user feedback on areas where I could make the service friendlier and more intuitive to use. I’m looking forward to tackling these next.


This seems like a really fun game! Once it opens for public access I’d love to give it a shot.

There’s a couple of things I’m curious about:

  • 15 microservices? That seems like a lot. Are you deploying BaaS? :wink:
  • How are you planning on isolating the user-submitted programs? Running user-provided code can be dangerous. Even Google has fucked it up many times.
  • Are there performance requirements? I imagine the best-performing solutions might be pretty heavy pytorch models.
  • Are you considering installing common packages like pytorch / tf?
  • Why ARM? I’ve heard good things about ARM for the datacenter, but did you explicitly pick ARM for your VPSes? Why?

Am looking forward to following this project. :man_technologist:

Oh and for the record: I think the graphics look great!
I wouldn’t think too much about having to replace them.

If this is a game for programmers, I think “programmer art” will be perfectly acceptable to most people :wink:

Hey. Thanks!

Feel free to give it a shot! I’m hoping to keep the interface for the bot interaction as frozen as I can. While it is under development and stuff may break, please do feel free to participate! It is definitely open to the public. One of my challenges is to add features “transparently” with CI/CD, without breaking anything and while keeping the website fully functional.

It is mainly due to running two clusters. I’m also practicing running many different “nano” services and updating those individually. If this project were only about getting an MVP, I would have done a monolith in a fraction of the time it has taken to build all these individual services. I wanted more experience in managing lots of services. And it can now theoretically scale horizontally to ridiculous sizes.
Let’s list them (I may forget some). First cluster:

  1. Gateway service
  2. Login service
  3. Public results access service
  4. Private/Secured (for authenticated access) files/data service
  5. Game host service (for the second cluster access)
  6. Database service
  7. Custom certbot service for the TLS keys.
  8. Database backup service (under development)

Second cluster:

  1. NFS service
  2. TFTP service
  3. Docker container registry service (read-only)
  4. Docker container registry update service. Updates the images on the read-only version and removes old/unused layers.
  5. Master service for worker workload distribution
  6. Worker service for obtaining workloads from Master and running them (the game host for a single bot match)
  7. Individual workload service (runs the actual bots and communicates the results to the game host)
  8. Database service (holds only a duplicated subset of information from the first cluster and won’t need backups)

Then there are a number of smaller helper programs and scripts for updating the various files and patching the OS, which I hope to put into the CI/CD pipeline. Currently I run a couple of scripts after I change source code, which compile artifacts and update services. I haven’t installed Jenkins yet…

Just to get the disclaimer out of the way: securing something against a determined and resourceful enemy is impossible.

However, I hope to have done enough. I heard in a talk on security practices at Netflix that you should never disclose your full architecture to the public. Since Netflix has money and a lot of personal information, it is a valuable target. This website, on the other hand, is likely never going to hit the mainstream and is targeted at a couple of geeks like myself.

There is a separate cluster for the internet-facing services, which also contains the sensitive information: an optional email address and a hashed password. This cluster is separate from the cluster that handles the untrusted, user-submitted programs. Sensitive information, or even information such as project/bot names or usernames, is never passed along from the internet-facing cluster.

The sandboxing layers that I’ve built for this:

  1. Non-root user runs the untrusted program inside docker.
  2. Docker container runs in an isolated namespace
  3. The file system within docker is read-only
  4. No network access is passed into docker
  5. Host machine file-system is mounted as read-only and in RAM only.
  6. Host machine has no storage space/volume.
  7. Host machine has no software-writeable EEPROM/BIOS.
  8. Host machine boots from a read-only TFTP and NFS volume (PXE) and only into RAM.
  9. User programs are loaded into a fixed-size tmpfs and passed as read-only to docker.
  10. The user’s filenames are checked for allowed characters before they are written to the tmpfs volume. The tmpfs volume is not shared between docker containers and is mounted as read-only; each container gets its own tmpfs volume for each user’s untrusted source files.
  11. A custom startup program validates the non-root user, network and read-only conditions within docker before it executes the python executable with the user’s files.
  12. TFTP, NFS and docker registry services have read-only access to the master node’s storage.
  13. Host machine has network access only to the master node and within the cluster.
  14. The host machines (worker nodes) run in an isolated network and the only connection between them and the outside world is the master node. (Master has multiple NICs, one for the worker cluster and other for internet/outside world).
  15. The master node does not forward any network connections outside and does not provide ICMP forwarding either and no DNS.
  16. The master node provides the API which the worker node (host machine) accesses to get workloads and to submit results to. The API also runs within docker on the master node and the database is never exposed to the worker nodes.
  17. The read-only docker registry which hosts the python container only has that single container and the associated layers only.
  18. I’ll be running regular updates and hopefully keep up with security updates.

There are a few other small things in place, but those are the highlights. I’ve tried to minimise the attack surface as much as possible, but as a single developer I’m sure I’ve made mistakes. Hopefully not enough of them to allow breaking through all the security layers.
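To make layer 11 above a bit more concrete, here is a minimal sketch of the idea behind such a startup validator in Go. This is not the real program; the interpreter and entry-point paths are placeholders.

```go
package main

import (
	"net"
	"os"
	"syscall"
	"time"
)

func main() {
	// Layer 1: must not be running as root.
	if os.Geteuid() == 0 {
		os.Exit(1)
	}
	// Layer 3: the filesystem must be read-only, so a probe write should fail.
	if err := os.WriteFile("/probe", []byte("x"), 0o600); err == nil {
		os.Remove("/probe")
		os.Exit(1)
	}
	// Layer 4: no network access, so dialing out should fail.
	if conn, err := net.DialTimeout("tcp", "1.1.1.1:53", 2*time.Second); err == nil {
		conn.Close()
		os.Exit(1)
	}
	// All checks passed: replace this process with the python
	// interpreter running the user's files (paths are placeholders).
	err := syscall.Exec("/usr/local/bin/python3",
		[]string{"python3", "/bots/main.py"}, os.Environ())
	panic(err) // Exec only returns on failure.
}
```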

I also keep copies of all the untrusted software in a separate data centre in case things go really bad, so I can identify which software caused it… and maybe write an interesting blog post once all my data is wiped.

If there is something else I should think of, or if I have overlooked something, I’m all ears for improving the security.

I was looking into security, and x86 has had some problems with Meltdown and Spectre. Even running VMs on those could allow escaping the sandbox. There is an affordable candidate which I decided to go with: Why Raspberry Pi isn’t vulnerable to Spectre or Meltdown

However, the new Raspberry Pi 4s have software-writable EEPROM. The older models like the 3B+ have software read-only EEPROM and load their firmware at boot. The 3B+ is rather weak and I’d like to use the faster Pi 4s with more memory, but they are potentially vulnerable to Spectre and they have writable EEPROM. The Pi Foundation said that the EEPROM could be made read-only on the Pi 4 with a small physical modification to the board, but last I checked the details had not been released yet.

The way I could think of making the Pi 4 work would be to “airgap” the Pi cluster from my home network. I have thought of getting a 4G LTE modem and having only the Pi cluster attached to it, nothing else. If something breaks out onto the internet, it isn’t that much of a problem. The worst it could do is contribute to a botnet until it is flagged, and then I’ll patch up the security hole.

I think pytorch and tf are interesting. I haven’t personally ventured much into machine learning, but it is definitely the future, no doubt. I don’t think many games (yet!) make much use of machine learning in their AI. This will definitely catch up. I’d be very interested in trying to include these libraries into the software stack that is available for the user’s program.

The current limitations are imposed by the chosen platform (Raspberry Pi). I’ve placed an artificial memory limit of 100MB and restricted each container to a single processor core (docker uses cgroups for these limits). This way the computer can run two programs that cannot easily interfere with each other’s resources.

As an initial step, I haven’t actually included any external libraries in the python image. It is just pure Python with all of its standard libraries. The python:slim image takes about 160MB or so of memory to run, which comes on top of each user program’s 100MB limit, plus the OS and the tmpfs. The poor Pi only has 1GB in total. The Pi 4 would totally change this with 8GB! I figured I could run 4 programs on the quad-core Pi at once, which allows each user to have a dedicated processor core.
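To tie the limits back to the sandboxing list above, here’s a hedged sketch of how a worker could launch a match container. The docker flags are real options, but the image name and paths are placeholders, not my actual deployment values.

```go
package main

import (
	"log"
	"os/exec"
)

func main() {
	// Flags map to the sandbox layers and cgroup limits described above.
	cmd := exec.Command("docker", "run", "--rm",
		"--read-only",      // read-only container filesystem
		"--network=none",   // no network namespace
		"--user=1000:1000", // non-root user
		"--memory=100m",    // 100MB memory cap (cgroups)
		"--cpus=1",         // a single dedicated core (cgroups)
		"-v", "/run/bots/match-1234:/bots:ro", // tmpfs with user code, read-only
		"registry.local/python-sandbox:latest")
	out, err := cmd.CombinedOutput()
	if err != nil {
		log.Fatalf("match failed: %v\n%s", err, out)
	}
	log.Printf("match output:\n%s", out)
}
```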

Another limitation I put in place is up to 60 seconds of execution time per bot. I took this idea from chess where you start with a bunch of time and earn up to a fixed amount of time over a number of turns. So, you start with 15 seconds for your first turn and get an additional second for each of the following 45 turns. You can have more turns than that, but the bot does not receive additional execution time. Any unused execution time carries over to the next turn, like in chess.
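A minimal sketch of that time-bank rule (15 seconds up front, one extra second per turn for the next 45 turns, unused time carrying over):

```go
package main

import (
	"fmt"
	"time"
)

// TimeBank implements the chess-clock style budget described above.
type TimeBank struct {
	remaining time.Duration
	turn      int
}

func NewTimeBank() *TimeBank {
	// 15 seconds available on the first turn.
	return &TimeBank{remaining: 15 * time.Second}
}

// StartTurn credits the one-second increment on turns 2 through 46,
// for a total earnable budget of 60 seconds.
func (b *TimeBank) StartTurn() {
	b.turn++
	if b.turn > 1 && b.turn <= 46 {
		b.remaining += time.Second
	}
}

// Spend deducts the time the bot just used; false means timeout.
// Whatever is left simply stays in the bank for the next turn.
func (b *TimeBank) Spend(used time.Duration) bool {
	b.remaining -= used
	return b.remaining >= 0
}

func main() {
	b := NewTimeBank()
	b.StartTurn()
	fmt.Println(b.remaining) // 15s on the first turn
}
```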

I hope I haven’t given you more information than you were asking for; if I was unclear about anything or didn’t say enough, just let me know! And thank you for your interest!

Thanks!

I’m hoping to revisit the graphics at some point, but as long as they aren’t a deterrent for anyone, they are a lower priority than the other functionality I’d like to have on the site.

It is definitely “programmer art” :smile:

I hope I haven’t given you more information than you were asking for,

Not at all! This was a fun read :smiley:

if I was unclear about anything or didn’t say enough, just let me know!

I just might take you up on that one :wink:
This stuff is super interesting to me.

I wanted more experience in managing lots of services.

I’ve definitely built projects like this and gotten a lot out of it. Once I rewrote a 50-line python script into 8 “nanoservices” with a couple of friends. Although I wouldn’t necessarily recommend it, it was a good lesson in managing complexity.

Also, I thought you meant 15 original codebases - but it seems like a portion of your services are actually off-the-shelf containers.

  1. Gateway service
  2. Custom certbot service for the TLS keys.

This sounds like you might have invented Traefik or Caddy. Why did you make a custom certbot service?

On a related note: are you using any of the other spicy new CNCF tools? I feel like there is a lot of fun (and profit?) to be had from that type of microservice-native service.

[security]

I’m impressed! You are like 10 steps ahead of the few things I was thinking of re: security. :sweat_smile: Good work!

I was looking into security, and x86 has had some problems with Meltdown and Spectre.

Damn, that is in-depth security awareness. I love it!

I have thought of getting a 4G LTE modem and having only the Pi cluster attached to it, nothing else.

I’ve done a single-pi deployment like that a couple of times before. It was kind of a pain to set up, but once I got it working it was pretty great. I did it for portability though, not security.

I heard in a talk on security practices at Netflix that you should never disclose your full architecture to the public.

Security by obscurity is a tough one… I’ll be quick to spout the usual line about it not being security at all, but then again I think it might actually be underrated in a lot of cases. It definitely can be a huge practical hindrance to the casual attacker. I’m torn.

On a related note: Are you planning on open-sourcing (parts of) the application code? Security pros/cons aside, I think there are significant development advantages (and developer goodwill) to be had.

I think pytorch and tf are interesting. I’d be very interested in trying to include these libraries into the software stack that is available for the user’s program.

An unfortunate thing about both pytorch and tf is that they’re both kind of huge. It can get real heavy to juggle multi-gigabyte docker images :frowning:

Here’s a crazy idea: one could let players expose a bot API on their own host, which brawl.ai connects to in order to play. That would let people use whichever software stack and dependencies they want.
Of course that’s a huge undertaking - just wanted to put it out there.

I took this idea from chess where you start with a bunch of time and earn up to a fixed amount of time over a number of turns. So, you start with 15 seconds for your first turn and get an additional second for each of the following 45 turns.

I love this chess-clock idea!

Currently I run a couple of scripts after I change source code, which compile artifacts and update services. I haven’t installed Jenkins yet…

Oh my. I have only had piss-poor (to say it nicely) experiences with Jenkins.
:man_technologist: Switching away from Jenkins has been the best thing for my productivity since docker.
Have you tried Gitlab CI? :fox_face:
My own CI/CD stack consists of a Docker registry, Ansible, Watchtower and Gitlab CI (DAWG stack?) - and it’s pretty awesome.

please do feel free to participate! It is definitely open to the public.

I guess I wasn’t paying enough attention. That is awesome!
I will definitely give it a shot at some point. Right now I’m putting most of my dev time into my own devember project though :wink:

Definitely! There is a project called No Code - the ideal way of writing applications. Just provide a config and you’re done. Even better if they were self-configuring. It may be a bit tongue in cheek, but the less you write the less you have to maintain.

But then there is the other extreme: if 100 lines of code can keep you from importing 100MB of software libraries, it is probably better to write those 100 lines. Adding software libraries also adds complexity and vulnerabilities. It is a fine balance.

There are a couple of factors that have affected these decisions.

Ingress with GKE vs Linode
I did originally deploy this website on GKE. I figured the people who wrote kubernetes would probably have the best support for it. While GKE is very easy to use, and with just a click of a button you can have clusters on all continents, load balancing and replicated databases… it isn’t very cost effective. Well, it depends. If you start counting people’s salaries and maintenance costs, then throwing that maintenance money towards GKE is probably a decent idea. However, this site is not made to generate money; it is a hobby project that will likely have a very small userbase (currently just me) forever, so I wanted to keep the costs down.

What really turned me away from GKE was their external IP configuration for the clusters. You really have to use their external load balancers with their managed kubernetes clusters; you cannot just use kubernetes’ own built-in load balancing with a single external IP. While GCP advertises free ingress traffic, it isn’t free for load balancers, even for their cheapest (regional) option. Fair enough if egress costs, but ingress as well? You could use NodePorts, but then you might as well not use kubernetes, as it defeats the whole point.

GKE has great support for ingress controllers. Linode doesn’t (correct me if I’m wrong!). From what I’ve gathered from the docs, for Linode it is either Kubernetes’ NGINX Ingress Controller or nothing. While Traefik, NGINX and other controllers can provide a neat way to get TLS certs (such as using cert-manager) and do load balancing, it effectively adds an extra layer of routing and latency to the service. The gateway service that I use is a Spring Cloud Gateway with a bit of custom logic that handles JWT for OAuth2. I’m not sure if this logic is available in some of the ingress controllers or not, but it isn’t just a dummy router; it actually handles access validation for the resource servers. I’m likely going to revisit the gateway at some point. I’ve heard good things about Kong.

TLS with Linode
I had a bit of trouble getting ingress controllers other than NGINX, the one Linode supports officially, to work there. I already have all the routing functionality via the Spring Cloud Gateway, and adding an NGINX just for TLS adds unnecessary latency (an extra service/network hop). Besides, the NodeBalancer (Linode’s load balancer) can already handle TLS, which effectively leaves the ingress controller with nothing to do: no routing and no TLS handling. It would just be a waste of memory, CPU cycles, latency and bandwidth. I might revisit this if the site suddenly gets a bit of traffic and I need to actually start caching results. The ingress controller could perform the caching step quite nicely.

Now the problem became: how do I get a TLS cert for Linode’s load balancer? I looked into using an automated system like cert-manager, but cert-manager works with ingress controllers such as NGINX-Ingress. My domain name registrar, which also provides a free external DNS service, doesn’t support the DNS validation required by cert-manager. For the HTTP validation, cert-manager starts randomly named services. I added some custom routing to the gateway with a kubernetes client (fabric8) to find the correct service, but cert-manager’s ACME validation software checks the originating hostname, which is set in the ingress controller. Routing traffic to the randomly named service wasn’t enough. At this point I was writing more custom code than it would have taken to write my own certbot service.

So, I wrote my own certbot service.

The custom certbot service works without an ingress controller; it does the ACME HTTP validation and generates the cert secrets in kubernetes. It only took me an afternoon, and the longest part was parsing the issued x509 certificate and extracting the expiry date. It turns out Golang’s x509 Certificate doesn’t work with Let’s Encrypt’s certificate, as it expects a different layout of attributes. The x509 Certificate code is apparently also on life support and mostly frozen.

I wish Linode’s NodeBalancer could issue TLS certs automatically for kubernetes, like GKE can. This would ease deployment in their managed cluster. I’m not sure if the source code is available anywhere so I could look into contributing this feature (I haven’t really looked). After all, they charge quite a bit for the load balancer, it isn’t optional for kubernetes, and it doesn’t do much more than provide an external IP address for the cluster. That isn’t really an issue for a large cluster with beefy, expensive nodes, but for tiny clusters it costs as much as a single node. Automatic TLS management would be so useful, especially as they already handle the TLS handshake for you as long as you provide the certs. This should really be just an annotation in the service yaml, like:
service.beta.kubernetes.io/linode-loadbalancer-automatic-tls-cert: "true"

I hope someone will correct me on this and say: “there is a better way…”.

OAuth2 Authorization server with Spring Security
OAuth2 is a feature that I dropped (for now), but I kept the JWT part. I really wanted SSO, which is a bit silly for a single site/domain, and I wrote an authorization server using the Spring Security OAuth library. I knew it was deprecated, but I implemented it anyway. It looked like the best candidate short of using something like Keycloak.

Mixing Spring’s modern resource server and client with the old authorization server requires a few workarounds. Also, the old authorization server doesn’t support the new reactive Java tools and uses Tomcat instead of Netty. Pivotal wasn’t going to continue the OAuth authorization server, but they recently announced that they have reverted that decision and now have an experimental community-driven version being developed (by the community?).

I decided to drop OAuth as it made the login process a bit clunky with the SPA that I had written. The Spring authorization server framework was built around generating static pages (the old-school way), whereas ReactJS just modifies your current page. I lost control over transition animations between pages. It also introduced an extra template language, Thymeleaf. I’d like to revisit the OAuth authorization server once the new rewrite becomes stable. Or I might just look into options with Golang.

OAuth would come in handy for supporting third-party apps. I don’t know if there will be any interest in third-party apps. I now have this OAuth authorization server that I could just deploy, but I’ve removed it for now.

I’m definitely looking into istio and grafana, but I haven’t got that far yet. I’m also interested in terraform, but I don’t have a lot to manage yet. Currently it is mainly some bash scripts, a bit of gradle and some yaml files. I’ll look into this once I get to tackle the CI/CD pipeline. Helm is also very popular, but I haven’t come across anything that it would solve in this application over what is available by default from kubernetes.

Doing the onion service could provide an opportunity to try out some of the cloud native tools.

Don’t be. Obscurity definitely has its value. Providing a map for the enemy helps them out; relying for your security on the enemy not having a map, however, is foolish.

There are plans, but they depend on some external factors at the moment.

I was under a strict IP contract until February this year, which prevented me from having personal projects like this. I still have some open-source-related restrictions with my current employer/line manager; breaking them would affect my employment negatively. I hope to have these resolved soon™, clean up the code, put in some tests for regressions and share some code :slightly_smiling_face: No promises yet, but I hope to get there.

I’ve had some similar ideas and considerations.

There are a couple of problems with supporting external programs:

  • One of the main motivations for the website is learning. Publishing all the source code for each bot after a period of time allows others, like myself, to learn from other people’s code. There is no guarantee that an external program connecting to the competition is actually using the source code its author provides. If you rank at the top with an external program but don’t provide its source code, then others cannot learn from how the program is written, or reuse the code for their own bots.
  • When you submit a program for the competition, if it gets ranked, it will need to defend its position against any new programs that want to get ranked too. Even if it were possible, with webhooks or otherwise, to ask an external program to run so it can have a match against a new challenger… it is difficult to guarantee the availability of such external programs over weeks or months.

In short: source-code and availability are the main problems.

What I’d like to do is provide an API for testing your own bots against already submitted bots. Even if you cannot qualify for ranking this way, you could run your bot locally and test it against the best-ranking bot to ensure that it performs adequately before submitting it. Once you are happy with your tests, you can submit it for ranking and everyone will have access to it later on. This does impose the risk that nobody will submit their bot unless it can beat the top-ranking bot first.
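Purely to illustrate the idea (the endpoint, parameter and response below are all made up; nothing like this exists yet), testing a local bot could look something like:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"os"
)

func main() {
	// Read the local bot's source code.
	src, err := os.ReadFile("mybot.py")
	if err != nil {
		panic(err)
	}
	// Hypothetical endpoint: run an unranked test match against
	// the current top-ranked bot.
	resp, err := http.Post(
		"https://brawl.ai/api/test-match?opponent=top-ranked",
		"text/x-python", bytes.NewReader(src))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status) // the body would carry the match result
}
```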

Something like a VSCode plugin is definitely on my TODO list. First I need to finish up the website before working on additional tools.

I have some experience with Jenkins, but I haven’t really tried Gitlab CI. I have Gitlab running privately where I host my source code. I was thinking of moving to something lighter like Gitea and then separating the CI/CD pipeline out into an external program, like Jenkins.

I currently dislike Gitlab for reasons similar to those I have with IntelliJ: they provide a community edition for free and deliberately leave a lot of features out in favour of the enterprise editions. While it is nice to have a community edition, some features will never be included, even if there is a pull request for them, because they would compete with the enterprise edition. I’m trying to move to more open systems and support those. Note: I do use IntelliJ IDEA Ultimate, but only for Java. VSCode support for Java is just not there yet.

I haven’t really started tackling the CI/CD part yet, so I don’t have a lot of strong opinions on what will or won’t work well for this application. The automated database backups are the first priority. I’ll likely look into CI/CD or password recovery via email after the backups are sorted. The last thing I want is to get another programmer interested in the website and then lose them to data loss.

I’m definitely open to all CI/CD suggestions and I will try them out as soon as I get that far! I’ll check out Gitlab CI in much more detail too.

I did it!!

I am proud to announce that I now hold the top two spots on the Brawl.AI leaderboards :grin:

It was a lot of fun to string a coherent bot together.
The strategy it’s using is not sophisticated at all: basically just patrol the map until it finds an enemy, then stop and shoot.

But hey, it performs better than @Player1’s three bots :tada:


That’s awesome!

I need to pull my socks up and try to reclaim the top spot.


:1st_place_medal: :smirk:

Backups are finally done!

It is surprisingly easy to back up a postgresql database. The pg_basebackup tool is all that I needed in the end. Its capabilities have changed a lot even in the past 3-4 years. The database is now backed up in two ways:

  • The database is stream-replicated to my home network in real time. This allows less than one second of data loss in the event that the persistent disk in Linode fails.
  • Full copies of the database are taken periodically, which allows recovery in case of a bad SQL statement. E.g. if a table gets dropped by mistake.

I had to figure out how to run pg_basebackup against the Kubernetes pod that runs the postgresql database without exposing ports or allowing a direct connection to the database from the internet. So I ended up writing two auxiliary programs that create a TCP proxy with TLS between my home network and the k8s cluster. Now I can just run pg_basebackup from my home backup server and it connects directly to the postgresql instance inside the cluster without any exposed database ports. A reverse shell would have worked too, but this was only about 500 lines of Golang across two programs.
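The home-network half of that proxy pair boils down to something like the sketch below (trimmed way down; the endpoint name is a placeholder, the connection direction is one plausible arrangement, and the real programs also authenticate both ends):

```go
package main

import (
	"crypto/tls"
	"io"
	"log"
	"net"
)

func main() {
	// pg_basebackup on this machine connects to localhost:5432
	// as if it were the cluster's postgres instance.
	ln, err := net.Listen("tcp", "127.0.0.1:5432")
	if err != nil {
		log.Fatal(err)
	}
	for {
		local, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go func(local net.Conn) {
			defer local.Close()
			// Forward the connection over TLS to the peer program,
			// which relays it to the postgres pod inside the cluster.
			remote, err := tls.Dial("tcp", "tunnel.example.com:9443", &tls.Config{})
			if err != nil {
				log.Print(err)
				return
			}
			defer remote.Close()
			go io.Copy(remote, local) // client -> cluster
			io.Copy(local, remote)    // cluster -> client
		}(local)
	}
}
```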

This backup work took way longer than expected. My main backup server, which runs ZFS, ran into some problems. I had an Arch Linux installation from around 2014 on it, which I updated regularly. However, somehow a file got corrupted on the server and the whole machine started throwing segmentation faults, even with simple commands like ls and cp. This ended up corrupting all the metadata on the ZFS volume. Even zpool import -fFXm failed after 9 and a half days; I let it finish twice, on two separate computers with clean installations. It turned out that all the file data was still intact, and I managed to import the pool by disabling the spa_load_verify_metadata ZFS kernel-module parameter. I learned that I need to make backups of my backups, and I’ve now set that up too. It would have been painful to start re-backing up everything. Now things are 3 to 4 backups deep in different physical locations (depending on the importance of the data).

I still haven’t figured out a great way to perform offline backups that don’t require manual labour. Does anyone know a good way to automate offline backups? This might require a raspberry pi project with a servo that plugs in a usb cable and unplugs it after the backup :thinking: but is that then really offline?

You are still king on the leaderboards. I’ve been spending some time writing a better algorithm for positioning the bots. One bot that I’m writing will try to hide and run away, and win by time. The other one will try to use some cover.

I’m pretty sure that would defeat my bot!

I was looking at your bot and thought it might :grinning: It depends on how well I can get it to work. It will be interesting if it ends up being the best bot. Then I need to make a bot that hunts it down :thinking:


I previously attempted to create a workable network game engine using Cython, and even then it had serious performance issues; having to optimise everything slowed development to a crawl, to the point where it would have been faster to just code it in C in the first place (with less headache, too). I would use straight python if it’s a calculate-later deal, but go or C++ if you want realtime performance. I would try go for the frontend, python for the website, and c++ for the actual game engine, with a micro language for programming the bots that translates into c++ class interfaces.

Thanks for your interest, yasin!

The only place where python is currently used is the AI part. It is the language the bots use to interact with the system/host. I chose Python purely for its popularity, because it is easy to learn, and because it doesn’t need to be compiled ahead of time. It is the language for the user.

The rest of the system is written in a combination of Go and Java. Most of the web-facing services are in Java, with a couple of small auxiliary Go programs. The python runtime handling and the “game server” (the many small services that make up the platform) are written in Go. Go has a much smaller memory footprint than Java, so I have more memory left for the runtimes in a memory-constrained environment.

I haven’t run into any performance issues that would warrant the use of C++, and I hope Go and Java will be performant enough for this project. I don’t think there is currently a problem using Python for the users’ code, as that puts everyone on the same page in terms of performance. I might need to reconsider this if I add more language options for the user.

I’m currently working on finishing some of the missing features. I’ve just recently migrated from Gitlab to Gitea and I’ve got Jenkins running. I’m implementing the CI/CD pipeline. Then the email service + .onion address!

Well, I was thinking in terms of scale: perhaps in the future you might want to add multiple bots per AI, perhaps hundreds or thousands, and/or want to simulate many AIs at once. Python is perfectly fine for what you’re doing though, I’m just throwing angles at you.

I get what you mean. Currently the setup is to do 1vs1 matches, but it would be cool to do larger battles with multiple bots.

I’d also really like to venture beyond turn-based at some point. The first-person bots I’ve written for commercial games have all worked by interpreting the screen: capturing screenshots, scanning for specific pixels and patterns, and then sending keyboard and mouse inputs based on those. It gets rather complicated very quickly when you try to make your bot more clever. I know the hardcore bots scan through the program’s memory to get extra information, or rewrite parts of the software’s binary or plugins to inject their own code/helper functions. Going down that path is something I’d like to avoid, as it sets the barrier to entry too high for some. Once I’ve got this 1vs1 turn-based platform polished, I’m definitely looking at the next steps. Whether it is first-person or other game types, or multiple bots in the same battle, the speed of python may very well become an issue and I may need to introduce language options beyond just Python.

I haven’t put much thought into the first person approach yet, but I’d like to keep it as simple as possible. I’d also really like to simplify the current bot programming even further… I’m very much open to suggestions!

One of the current reasons for supporting only python is the memory available on my cluster computers; the python environment takes a big chunk of the RAM. Unless I’m running compiled binaries, adding more virtual machines/interpreters for other languages currently requires too much memory. I’m considering loading and unloading different interpreters to and from memory to support more languages, ideally without a reboot. I haven’t looked into this much yet.

I’ve managed to claim the number one spot. :partying_face:

I’m looking forward to your next move.

:tada: Awesome!

I’m not sure when I’ll get around to it, since my days are split pretty evenly between my own Devember project and Advent of Code.

But I’ll be sure to come back and kick your ass sometime - don’t you worry :wink: