A CPU simulator where you program the instruction set

Medic5700 · November 3, 2020, 4:36pm

I’m new here so… neat?

I took an interest in low level algorithms and tried to show/display how they work, but realized that doing that individually for each algorithm would be haphazard and inconsistent. So I decided a better approach would be to make a simulation of a CPU I could use to consistently show the algorithms.

I’m essentially making a CPU simulator/assembly interpreter that ‘runs’ assembly-like strings. The key point is that you can configure the instruction set or the CPU to suit the algorithm you’re running, while it tracks various stats to enable meaningful comparisons between algorithms. I’m also aiming for it to be extensible enough that in the future I could, say, swap out a branch predictor, add floating point instructions, etc.

EX: r0 * 9 = r1
CPU(2 registers, 8-bit)
multiply(r0, 9, r1)
energy used: 64 bit-flips

CPU(2 registers, 8-bit)
shiftL(r0, 3, r1)
add(r0, r1, r1)
energy used: 16 bit-flips
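A minimal sketch of why the two programs above are interchangeable, assuming 8-bit registers that simply drop overflow bits. The function names are made up for illustration and are not from the simulator, and the bit-flip energy figures are not modelled here.

```python
BIT_LENGTH = 8
MASK = (1 << BIT_LENGTH) - 1  # keeps results inside an 8-bit register


def times9_multiply(r0: int) -> int:
    """One multiply instruction: r1 = r0 * 9, truncated to 8 bits."""
    return (r0 * 9) & MASK


def times9_shift_add(r0: int) -> int:
    """shiftL(r0, 3, r1) then add(r0, r1, r1), each truncated to 8 bits."""
    r1 = (r0 << 3) & MASK  # r1 = r0 * 8
    r1 = (r0 + r1) & MASK  # r1 = r0 + r0 * 8
    return r1


# Both programs agree for every possible 8-bit input, including overflow cases.
all_match = all(
    times9_multiply(value) == times9_shift_add(value)
    for value in range(1 << BIT_LENGTH)
)
```

Because both versions wrap modulo 2^8, they still agree when r0 * 9 no longer fits in the register.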

Mastic_Warrior · November 4, 2020, 9:34am

Nice, we did something like this in college with the SicSim “architecture”. We ended up building our own assembler and compiler for the spec from scratch as our final project.

max1220 · November 15, 2020, 9:06pm

Nice, I’ve toyed around with this as well. Do yourself a favor, and actually “compile” a bytecode for your machine - directly “interpreting” assembler is slow and error-prone. Plus real computers work that way, and you can learn something about compilers! Plus the emulator code itself will most likely become more compact as well.
Might I ask in what language you’re doing this?

Medic5700 · November 18, 2020, 6:16pm

I’m writing it in Python3

github.com

Medic5700/Book-of-Algorithms/blob/master/CPUsimulator.py

"""
By: Medic5700
An implementation of a CPU simulator to allow a better and standardized way to illustrate bitwise instructions in low-level algorithms.
This project is geared towards demonstrating algorithms, and therefore generalizes a lot of stuff. IE: bitLength is settable, instruction words are one memory element big, etc

IE: I needed something for a super dumb/special use case of demonstrating how a low level operation/algorithm works (memory error correction)
    without using a weird workaround (having a separate 8-bit memory and 1-bit parity array vs creating a cpu with 9-bit memory)
    in a reliable and extensible way (making a cpu simulator that can be used for multiple algorithms)
"""

import sys
version = sys.version_info
assert version[0] == 3 and version[1] >= 8 #asserts python version 3.8 or greater

class CPUsimulatorV2:
    """An implementation of a generic and abstract ALU mainly geared towards illustrating algorithms

    Issues:
    should allow for adding arbitrary amount of arbitrary sized registers
        -> registers that are not bitLength sized should be able to be added with a method instead of with the constructor.

This file has been truncated.

And you aren’t wrong about compiling. I am setting it up so that the code is ‘kindof’ compiled in a very obtuse kind of way, because flexibility =P

max1220 · November 19, 2020, 4:31am

I’m not a Python programmer, but from what I can tell, the lowest representation of instructions in your program is still strings. By “compile” I mean translating your “higher-level representation” (in this code, your assembler code) to a lower-level representation, an actual bytecode. (This is missing in your implementation, as far as I can tell.)
You would then load that bytecode into an emulated memory region, then start decoding instructions read from memory as words. Your PC would then be an actual pointer to some memory. This is important if you want to test certain algorithms that need this bytecode specified, for example self-modifying code, or jump tables etc.
Am I correct that the “native” integer format (word) is Python’s int? The Add instruction doesn’t seem to respect the bitLength.

Medic5700 · December 1, 2020, 11:42am

You are mostly correct in your understanding of my code.
I haven’t actually gotten around to the lower level representation of stuff yet (also I’m still figuring it out).
All the ‘registers’ and ‘memory’ are arrays of python ints, yes. (hopefully to be changed out to something more complex once I get more stuff working).
The Add instruction is intentionally generic, because prototype; however, it does respect the bitlength of the destination register (it truncates the output to the bitlength of whatever register it’s outputting to)
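A sketch of what truncating to the destination’s bitlength might look like with plain Python ints. The function name and signature are hypothetical, not the actual code in the repository.

```python
def add(a: int, b: int, dest_bit_length: int) -> int:
    """Add two Python ints, keeping only the low dest_bit_length bits."""
    mask = (1 << dest_bit_length) - 1
    return (a + b) & mask


narrow = add(200, 100, 8)   # 300 does not fit in 8 bits, so it wraps to 44
wide = add(200, 100, 16)    # 300 fits in 16 bits and is kept as-is
```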

Medic5700 · December 9, 2020, 2:46am

Random update [2020-12-08]:
You can add/configure registers/memory/flags.
You can inject data into the registers/memory.
Currently working on a Parser.
That’s about it so far since I want the parser working before implementing more stuff.

Medic5700 · December 16, 2020, 8:59am

Random update [2020-12-16]
The Parser almost works the way I want it, AND it’s modular.
Downside, it’s super fragile and assumes no typos, errors, etc.

Mastic_Warrior · December 17, 2020, 11:59am

Writing the parser is essentially like writing the compiler but with tolerance for errors. It is a mighty fine line.
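One way to get some of that tolerance is to report bad lines instead of crashing on them. This is only a sketch: the `mnemonic(arg, arg, ...)` line format is guessed from the examples at the top of the thread, and `parse_line`, `parse_program` and `KNOWN_OPS` are invented names, not the project’s parser.

```python
import re

# Hypothetical line format, modelled on the examples earlier in the thread:
#   mnemonic(arg, arg, ...)
LINE_PATTERN = re.compile(r"^\s*([A-Za-z_]\w*)\s*\(([^()]*)\)\s*$")


def parse_line(line, line_number, known_ops):
    """Return (mnemonic, args) or raise ValueError naming the offending line."""
    match = LINE_PATTERN.match(line)
    if match is None:
        raise ValueError(f"line {line_number}: cannot parse {line!r}")
    mnemonic = match.group(1)
    if mnemonic not in known_ops:
        raise ValueError(f"line {line_number}: unknown instruction {mnemonic!r}")
    args = [arg.strip() for arg in match.group(2).split(",")]
    expected = known_ops[mnemonic]
    if len(args) != expected:
        raise ValueError(
            f"line {line_number}: {mnemonic} expects {expected} arguments, got {len(args)}"
        )
    return mnemonic, args


def parse_program(source, known_ops):
    """Parse every non-blank line, collecting errors instead of stopping at the first."""
    instructions, errors = [], []
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            instructions.append(parse_line(line, number, known_ops))
        except ValueError as error:
            errors.append(str(error))
    return instructions, errors


KNOWN_OPS = {"shiftL": 3, "add": 3}  # mnemonic -> expected argument count
source = "shiftL(r0, 3, r1)\nadd(r0, r1)\nad(r0, r1, r1)"
instructions, errors = parse_program(source, KNOWN_OPS)
```

Here the first line parses, while the missing argument on line 2 and the typo on line 3 each produce a message with the line number attached.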

Medic5700 · December 23, 2020, 5:51pm

Random update [2020-12-23]
Welp, I got a prototype working after being stuck for a week on how to implement the execution engine.
It’s still fairly fragile, but it can run code.

At most right now, it can run code like the following

Medic5700 · January 6, 2021, 8:39am

Random update [2021-01-06]
After having a working prototype to test against, the long battle with a bug army began.
…
It can now reliably run some rudimentary source code (the multiplication via shift and add from last update), but there are still a number of things not implemented (assembler directives, system calls, load instructions, etc.)

And some super prototype stuff: a couple hundred lines implementing/describing a small fraction of a RISC-V CPU (RV32I). Mainly used to figure out what’s missing, how my program could be used, any design flaws, etc.
(In the image below is some of the instruction set being mapped to instruction strings. Notice how the ‘beq’ and ‘bne’ instructions use the same underlying function ‘self.opJump’ but with different arguments and the input arguments shifted around)

and it running some RISC-V-like assembly code (the multiplication via shift and add)
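For readers without the image, the idea of ‘beq’ and ‘bne’ sharing one function can be sketched with functools.partial. This is an illustration under assumptions, not the repository’s code: `op_jump`, the `state` layout and `INSTRUCTION_SET` are stand-ins for ‘self.opJump’ and whatever the simulator really uses.

```python
import functools
import operator


def op_jump(compare, state, a, b, target):
    """Set the program counter to target when compare(reg[a], reg[b]) is true."""
    if compare(state["registers"][a], state["registers"][b]):
        state["pc"] = target
    return state


# Two mnemonics, one underlying function, different bound arguments.
INSTRUCTION_SET = {
    "beq": functools.partial(op_jump, operator.eq),
    "bne": functools.partial(op_jump, operator.ne),
}

state = {"pc": 0, "registers": {"r0": 7, "r1": 7}}
INSTRUCTION_SET["beq"](state, "r0", "r1", 12)  # equal, so the branch is taken
pc_after_beq = state["pc"]
INSTRUCTION_SET["bne"](state, "r0", "r1", 40)  # equal, so the branch is not taken
pc_after_bne = state["pc"]
```

Adding another branch (say, a hypothetical ‘blt’) would then be one more dictionary entry with `operator.lt` bound in.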

Devember was fun, I learned a lot, and possibly went down a couple too many rabbit holes. Tis the season to get lost in technical document details =P
Here is the source code github link if anyone is interested:

github.com

Medic5700/Book-of-Algorithms/blob/master/CPUsimulator.py

"""
By: Medic5700
An implementation of a CPU simulator to allow a better and standardized way to illustrate bitwise instructions in low-level algorithms.
This project is geared towards demonstrating algorithms, and therefore generalizes a lot of stuff. IE: bitLength is settable, instruction words are one memory element big, etc

IE: I needed something for a super dumb/special use case of demonstrating how a low level operation/algorithm works (memory error correction)
    without using a weird workaround (having a separate 8-bit memory and 1-bit parity array vs creating a cpu with 9-bit memory)
    in a reliable and extensible way (making a cpu simulator that can be used for multiple algorithms)

Development Stack:
    Python 3.8 or greater
    A terminal that supports ANSI (IE: default Ubuntu Terminal or the "Windows Terminal" app for Windows)
"""

import sys
version = sys.version_info
assert version[0] == 3 and version[1] >= 8 #asserts python version 3.8 or greater

import copy #copy.deepcopy() required because state['flag'] contains a dictionary which needs to be copied
import functools #used for partial functions when executing 'instruction operations'

This file has been truncated.