Handling two video streams and processing the picture in hardware is hugely complex. Think GPU, ASIC or FPGA levels of complex.
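To put a rough number on that, here is a back-of-the-envelope sketch of the raw data rate involved. The resolution, frame rate and colour depth (1920x1080, 60 fps, 24-bit) are my own assumptions for illustration, not something from the question, and `raw_video_rate` is just a helper name made up for this example.

```python
def raw_video_rate(width, height, fps, bytes_per_pixel=3):
    """Uncompressed data rate in bytes per second for one video stream."""
    return width * height * bytes_per_pixel * fps


# Assumed example figures: two 1920x1080 streams at 60 fps, 24-bit colour.
one_stream = raw_video_rate(1920, 1080, 60)
two_streams = 2 * one_stream

print(f"one stream:  {one_stream / 1e6:.0f} MB/s")
print(f"two streams: {two_streams / 1e6:.0f} MB/s")
```

With those assumed figures that is several hundred megabytes per second of pixels to move and process before any scaling or overlaying happens, which is why dedicated silicon usually ends up doing it.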
The easiest/cheapest way I can see is a KM switch plus a capture device, something like an Elgato Cam Link or Blackmagic DeckLink, on the machine that runs the main screen. Then open the captured feed in software like OBS.