A “unique” set of challenges with no deadline and no actual direction.
This is a working document, so it’s littered with errors, omissions and unknowns. Sorry about that.
Managing the in-house infrastructure has become more and more difficult over time, and because it "works" there's been a resistance to altering it.
1 Specific Challenges
1.1 Power
The electricity supply in South Africa is unstable, unpredictable and unavailable for hours at a time.
Protecting equipment from power surges when the grid comes back up is of critical importance. Voltage levels can also fluctuate to a large degree during "normal" operation: the LED lights in my house dim for brief periods and the UPSes beep briefly, while devices that aren't on a battery backup stay powered.
https://en.wikipedia.org/wiki/South_African_energy_crisis
For higher "level" outages, the window in which power is restored before the next scheduled outage begins can be too short to fully recharge the UPSes.
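To get a rough feel for the recharge problem, here is a minimal sketch of battery charge across back-to-back outages. It assumes linear discharge and recharge, which real UPS batteries only approximate, and every number in it is a made-up placeholder rather than a measurement from my equipment. The function name `simulate_charge` and its parameters are my own invention for illustration.

```python
def simulate_charge(cycles, outage_h, restored_h, runtime_h, recharge_h, start=1.0):
    """Return the battery charge fraction at the end of each outage.

    Assumes linear discharge (full to empty in runtime_h hours) and
    linear recharge (empty to full in recharge_h hours).
    """
    charge = start
    history = []
    for _ in range(cycles):
        # Discharge during the outage, never below empty.
        charge = max(0.0, charge - outage_h / runtime_h)
        history.append(charge)
        # Recharge while the grid is up, never above full.
        charge = min(1.0, charge + restored_h / recharge_h)
    return history


# Hypothetical numbers: 2.5 h outages with 3.5 h of mains in between,
# 4 h of runtime on battery and 8 h for a full recharge.
print(simulate_charge(cycles=4, outage_h=2.5, restored_h=3.5,
                      runtime_h=4.0, recharge_h=8.0))
```

With these placeholder numbers the battery starts each outage a little emptier than the last and is flat by the end of the third one, which is the pattern I see in practice on the worse schedules.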
1.2 Internet Access
Though my fiber connection is stable during most two to three hour outages, it does eventually go down if the outage persists for more than five hours.
Additionally, cell coverage degrades over time, falling back to 3G within about an hour of the outage starting and dropping to no network connectivity at all after two hours. Calls are sometimes still possible from outside my home, but the signal doesn't penetrate our brick walls.
I have a cell internet backup for when the fiber is down outside of power outages.
1.3 Working from home
To further add to the challenges, I have been working from a home office since 2012-odd. The power grid has been a problem since 2007, with outages gradually increasing in severity and duration.
My work primarily involves software development and support. Computer access is an absolute must, and internet access is necessary for a large chunk of what I do.
Excessive downtime causes knock-on effects in project delivery and time to resolve issues, and it interferes with "normal" life as the downtime needs to be made up somewhere.
Much of the information I work with is confidential and protected by various laws across the world. Leaking information isn't simply a breach of contract; it could be criminal in some instances.
Running a separate VPN for work that is isolated from the VPNs used for services in the home is necessary.
2 State of Affairs
2.1 Current
The current state of affairs is a mixture of computer hardware, network devices and power backup devices.
This makes it difficult to ensure that essential devices remain up and running and that non-essential items shut down, preferably safely.
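What I have in mind is something like the sketch below: tag each device as essential or not, give the non-essential ones a grace period, and work out which should be shut down at a given point in the outage. The device names, grace periods and the `due_for_shutdown` helper are all hypothetical placeholders, and the sketch only decides what is due; actually triggering a safe shutdown on each device is a separate problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    essential: bool
    grace_min: int  # minutes it may stay on battery before shutdown


# Hypothetical inventory; names and grace periods are placeholders.
DEVICES = [
    Device("router", essential=True, grace_min=0),
    Device("poe-switch", essential=True, grace_min=0),
    Device("media-server", essential=False, grace_min=10),
    Device("workstation", essential=False, grace_min=5),
    Device("tv", essential=False, grace_min=60),
]


def due_for_shutdown(devices, minutes_on_battery):
    """Names of non-essential devices whose grace period has expired."""
    return [
        d.name
        for d in devices
        if not d.essential and minutes_on_battery >= d.grace_min
    ]


# Twelve minutes into an outage.
print(due_for_shutdown(DEVICES, 12))
```

Essential devices are never returned, no matter how long the outage runs, so the list can be handed straight to whatever ends up doing the shutting down.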
2.1.1 Storage
Two separate HPE MicroServers run Unraid to provide services to the network.
One hosts important data on a redundant array (Btrfs at the moment) and the other hosts virtual appliances including Plex, a second Unifi controller and at one stage an IoT management service.
A BlackArmor NAS offers some overflow storage options, but is overdue for retirement.
Network storage shares a UPS with the network devices. It is a 3000VA unit with no extended battery at present, although the UPS does make provision for one.
2.1.2 Network Services
My current network runs on Ubiquiti Unifi hardware. A DreamMachine is used as the controller, router and primary DNS, and it must remain up for the duration of an outage if I wish to maintain fiber connectivity.
The fiber connection is managed by a Calix ONT that is bridged to the DreamMachine's WAN port. The ONT is powered by a mini-UPS that can run it for roughly 12 hours uninterrupted, possibly more if the Wi-Fi on the Calix is fully disabled.
Additionally, an 8-port 150W PoE switch provides power to a Flex and then two wireless APs. These will ideally remain on to ensure wireless access to the fiber connection.
Additional TP-Link switches are used, but they can be disconnected from the network on power loss, as no essential devices connect to them.
All the network equipment requiring "mains" power (AC) input is connected to a 3000VA online UPS.
2.1.3 Workstations
My primary workstation is connected to its own dedicated UPS: a 3000VA online unit with an extended battery, typically delivering around two hours of work time.
This workstation does not need to remain online for work and is probably better off being shut down as soon as possible.
Additionally, I frequently work on a MacBook Pro 16" and a Dell XPS13. These become my primary systems during extended outages.
Several other computers are connected to the network at any one time. None are critical, and each has its own 1000VA online UPS to get it to a safe shutdown.
2.1.4 Additional Hardware and IoT
The network also hosts two Apple TVs, an Nvidia Shield, three Alexas (more on this later), some audio equipment, a Philips Hue Hub, and several phones, consoles and tablets at any one time.
One TV and Apple TV remain on for about an hour on their current UPS to provide entertainment for outages outside of working hours.
The other devices can and do shut down at failure.
2.1.5 Power Generation
We have provision for a generator with failover should the mains go down. Unfortunately, the only location for a generator is not safe (CO2 build-up) and is noisy beyond reason.
The cost of generators has gone up substantially since 2007. "Silent" generators are now prohibitively expensive for units large enough that the house wouldn't need rewiring to separate high-current devices from the rest while still having a single switching point. Additionally, solar power is at an absolute high point in terms of cost at the moment. Having failover to battery backup or generation for the entire house would be nice, but isn't on the cards at the moment.
2.2 Issues
There are a number of things I'd like to improve on over the current setup.
- IoT devices aren't separated into their own VLAN. Confidential information is kept behind password-protected shares, but better segregation is prudent. I have booted devices that engage in network scanning off the network, but the wife would like some of them back.
- Inefficient use of backup power. Several devices that are not critical remain powered on during outages. The largest-capacity UPS is currently powering a workstation that should be shut down immediately. Several devices share the networking UPS, limiting the time I have internet access throughout the house. I can still get wired connectivity in my office, but there are challenges with that connection being made directly to the Calix router and not from behind the DreamMachine.
- Internet failover. Cell-based connectivity is massively expensive in South Africa, especially for data-hungry services like game downloads, OS updates and Netflix. Having the backup cell service permanently online and sharing the network load is not practical. Manual "failover" is currently required. I would like to automate the process and block non-essential services when fiber is not available. This may not be possible with the DreamMachine.
- Nothing is ever charged. With announcements of power outages occasionally made by turning the power off and others coming with mere hours of notice, there is a high probability that some devices won't be charged. This is less of an issue for tablets and is mitigated for phones by the availability of power banks (again, possibly not charged). It is a larger issue for my work laptops.
- PLACEHOLDER add comments on IoT device - what I wanted them for and how they fail to deliver
- PLACEHOLDER add comments on overcapitalization - Immigration, change of residence, change of job
- PLACEHOLDER add comments on proposed legislation - Cloud/Powergen
- PLACEHOLDER add comments on limited availability of hardware
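For the internet failover item in the list above, the decision logic I would want looks roughly like the sketch below: prefer fiber, fall back to the metered cell link, and only let essential traffic through while on cell. This is just the policy written out as code. The category names and the `choose_uplink` and `is_allowed` helpers are hypothetical, and nothing here talks to the DreamMachine or assumes it can enforce such rules.

```python
# Hypothetical traffic categories that may use the metered cell link.
ESSENTIAL = {"work-vpn", "dns", "email"}


def choose_uplink(fiber_up, cell_up):
    """Prefer fiber; fall back to the cell link; otherwise offline."""
    if fiber_up:
        return "fiber"
    if cell_up:
        return "cell"
    return None


def is_allowed(category, uplink):
    """Allow everything on fiber, only essential categories on cell."""
    if uplink == "fiber":
        return True
    if uplink == "cell":
        return category in ESSENTIAL
    return False


# Fiber is down and the cell backup is up.
uplink = choose_uplink(fiber_up=False, cell_up=True)
for category in ("work-vpn", "game-downloads"):
    print(category, is_allowed(category, uplink))
```

Keeping the policy separate from whatever device enforces it means the same rules could be moved to a different router if the DreamMachine turns out not to support this.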
2.3 Future
2.3.1 Scaling back on equipment and improved management
Ideally I will be able to consolidate some of the infrastructure into a single host, eliminating the need to run multiple systems with a larger cumulative power draw (at idle).
Removing old, unused or lesser-used devices from the network and backup power system will pay dividends on "online" time in future.
Centralized management and monitoring would be a great addition. Currently there are some hoop-jumping exercises required for management.
2.3.2 Space optimization
Our home is by no means small, and may even be large by most standards, but space is still limited, so freeing some up and reducing clutter is appealing.
I find that I enjoy entropy and anything that was organized an hour ago will devolve to chaos if I get busy. Reducing the shit I can throw around will make for a happier wife lol.
PLACEHOLDER I'm not sure where this is actually going, so I'll leave a placeholder here for progress made (or lack thereof)