Phaselockedloopable- PLL's continued exploration of networking, self-hosting and decoupling from big tech

Reminder to make a systemd service to run this at boot:

echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
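
A minimal sketch of that unit via a heredoc (the name cpu-governor.service is just my placeholder; adjust to taste):

sudo tee /etc/systemd/system/cpu-governor.service <<'EOF'
[Unit]
Description=Set CPU frequency scaling governor to performance

[Service]
Type=oneshot
# same one-liner as above, wrapped in sh so the glob expands
ExecStart=/bin/sh -c 'echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor'

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now cpu-governor.service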

Easy Copy Pasta:

Install Fedora first

Install dev libraries

sudo dnf groupinstall "Development Tools" "Development Libraries"

Make sure you have git and what not

sudo dnf install git curl wget rsync

Install the damn plugins that should be default

sudo dnf -y install dnf-plugins-core

Install DKMS

sudo dnf install dkms

Enable the COPR repo and then install the LTS kernel

sudo dnf copr enable kwizart/kernel-longterm-5.15

sudo dnf install kernel-longterm kernel-longterm-devel

Reboot and monitor for performance issues, then proceed.

Install NVIDIA via negativo17 and other negativo17 goodies

dnf config-manager --add-repo=https://negativo17.org/repos/fedora-multimedia.repo
dnf install ffmpeg
dnf config-manager --add-repo=https://negativo17.org/repos/fedora-nvidia.repo
dnf -y install nvidia-driver nvidia-driver-cuda cuda-devel

Install ZFS

dnf install -y https://zfsonlinux.org/fedora/zfs-release$(rpm -E %dist).noarch.rpm
dnf install -y zfs
modprobe zfs

Note to self: if the module didn't get built, execute this loop:

# Rebuild the ZFS module for every installed kernel
for directory in /lib/modules/*; do
  kernel_version=$(basename "$directory")
  dkms autoinstall -k "$kernel_version"
done

Set to always load on boot

echo zfs > /etc/modules-load.d/zfs.conf
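
Quick sanity check that the module actually built and loads (nothing fancy, just the obvious commands):

dkms status
modinfo zfs | head -n 5
lsmod | grep zfs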

Install SSH, Docker and Cockpit

sudo dnf config-manager \
    --add-repo \
    https://download.docker.com/linux/fedora/docker-ce.repo

sudo dnf install docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo dnf install openssh-server cockpit*

Remember firewalld and SELinux contexts can mess things up
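
For example, a rough sketch of the sort of thing I mean (the ssh and cockpit firewalld services ship with Fedora; adjust if you've changed the default zone):

sudo systemctl enable --now sshd docker cockpit.socket
sudo firewall-cmd --permanent --add-service=ssh --add-service=cockpit
sudo firewall-cmd --reload
# if SELinux gets in the way, read the denials before reaching for setenforce 0
sudo ausearch -m avc -ts recent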

Don't forget Steam

dnf config-manager --add-repo=https://negativo17.org/repos/fedora-steam.repo
dnf -y install steam kernel-modules-extra

Note to self for 98-user.conf

# Increase size of file handles and inode cache, if default not enough
fs.file-max = 2097152  

# Do less swapping
vm.swappiness = 5
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2

# Assume RTT in a data center with a 10GbE network is 1~100 ms; BDP = 0.1 s * 10 Gbps / 8 ≈ 128 MiB (134217728 bytes)
# Increase the maximum amount of option memory buffers to 256 MiB
net.core.optmem_max = 268435456

# Maximum socket send and receive buffers: close to 128 MiB
net.core.rmem_default = 212992
net.core.wmem_default = 212992
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728

# Increase backlog for 10G NICs (per an Oracle white paper)
net.core.netdev_max_backlog = 300000
net.core.somaxconn = 8192

# Controls the SYN backlog queue length; costs 64 bytes per entry, max length 65535
net.ipv4.tcp_max_syn_backlog = 8192

# Allow TCP Fast Open (client side) and shorten the FIN timeout
net.ipv4.tcp_fastopen = 1
net.ipv4.tcp_fin_timeout = 10

# F-RTO is an enhanced recovery algorithm for TCP retransmission timeouts; useful on unstable networks where RTT changes, but it can conflict with MTU probing
net.ipv4.tcp_frto = 2
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_time = 600

# Maximum number of TIME_WAIT sockets; depends on network conditions, set it smaller if you don't need many TIME_WAIT entries on a good network
net.ipv4.tcp_max_tw_buckets = 5000

# Minimum number of segments per TSO frame
net.ipv4.tcp_min_tso_segs = 2

# Enable receive buffer auto-tuning
net.ipv4.tcp_moderate_rcvbuf = 1

# Orphan sockets: no longer attached to any process (application closed but TCP hasn't finished closing).
# The kernel warns past half this limit; at 65536 orphans this can cost ~2 GB of RAM, so raise it only if you see warnings.
net.ipv4.tcp_max_orphans = 65536
net.ipv4.tcp_orphan_retries = 1 

# TCP Retry
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_retries2 = 5

# SYN and SYN-ACK retry limits
net.ipv4.tcp_syn_retries = 3
net.ipv4.tcp_synack_retries = 3
 
# Controls how much of the TCP congestion window a single TSO frame can consume
net.ipv4.tcp_tso_win_divisor = 8

net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_tw_reuse = 1 

# IP fragmentation memory thresholds
net.ipv4.ipfrag_low_thresh=393216
net.ipv4.ipfrag_high_thresh=544288

# Ephemeral (user-space) socket port range
net.ipv4.ip_local_port_range=5536 65535
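
To load these without a reboot and spot-check a couple of values (assuming the file was dropped into /etc/sysctl.d/ as 98-user.conf):

sudo sysctl --system
sysctl net.core.rmem_max net.ipv4.tcp_fastopen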

Unbound and Pi-hole live on a Proxmox VM.

The VM has a WireGuard connection running from the VM instance to my main Linode.

nginx is used as my reverse proxy to front my DNS.
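
Roughly what that tunnel looks like on the VM side; a minimal sketch with made-up keys, addresses and endpoint, not my actual config:

# /etc/wireguard/wg0.conf on the pihole/unbound VM (placeholder values)
[Interface]
Address = 10.10.10.2/24
PrivateKey = <vm-private-key>

[Peer]
# the Linode end of the tunnel
PublicKey = <linode-public-key>
Endpoint = linode.example.com:51820
AllowedIPs = 10.10.10.1/32
PersistentKeepalive = 25

Bring it up with systemctl enable --now wg-quick@wg0.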

pihole-unbound.conf (stole your config)

server:
    # If no logfile is specified, syslog is used
    # logfile: "/var/log/unbound/unbound.log"
    verbosity: 3

    # IP TCP/UDP
    interface: 127.0.0.1
    interface: ::1
    port: 5335
    do-ip4: yes
    do-udp: yes
    do-tcp: yes
    do-ip6: no
    prefer-ip6: no
    so-reuseport: yes
    max-udp-size: 3072
    udp-upstream-without-downstream: yes

    # Root Hints
    root-hints: "/etc/unbound/root.hints"

    # DNSSEC and HARDEN
    harden-short-bufsize: yes
    harden-large-queries: yes
    harden-glue: yes
    harden-dnssec-stripped: yes
    harden-below-nxdomain: yes
    harden-referral-path: yes
    target-fetch-policy: "0 0 0 0 0"
    harden-algo-downgrade: no
    use-caps-for-id: no # Set no if you plan to use DNSSEC
    edns-buffer-size: 1472 # Set MTU of network
    hide-identity: yes
    hide-version: yes
    qname-minimisation: yes
    aggressive-nsec: yes
    unwanted-reply-threshold: 10000000
    deny-any: no
    rrset-roundrobin: yes
    minimal-responses: yes
    module-config: "validator iterator"
    root-key-sentinel: yes
    val-clean-additional: yes
    val-log-level: 2
    trust-anchor-signaling: yes
    prefetch-key: yes
    auto-trust-anchor-file: "/var/lib/unbound/root.key"

    # OPTIMIZE
    prefetch: yes
    cache-min-ttl: 0
    serve-expired: yes
    serve-expired-ttl: 14400
    so-reuseport: yes
    msg-cache-slabs: 8
    rrset-cache-slabs: 8
    infra-cache-slabs: 8
    key-cache-slabs: 8
    outgoing-range: 4096
    msg-cache-size: 256m
    rrset-cache-size: 512m
    num-threads: 4
    so-rcvbuf: 8m
    so-sndbuf: 8m

    tls-cert-bundle: /etc/ssl/certs/ca-certificates.crt # Only necessary for DoT Lookups

    # Ensure privacy of local IP ranges
    private-address: 192.168.0.0/16
    private-address: 169.254.0.0/16
    private-address: 172.16.0.0/12
    # private-address: 10.0.0.0/8
    private-address: ff00::/8


remote-control:
    # Enable Remote Control
    control-enable: yes
    control-use-cert: "no"
    # what interfaces are listened to for remote control.
    control-interface: 127.0.0.1
    control-port: 8953

    # unbound server key file.
    server-key-file: "/etc/unbound/unbound_server.key"

    # unbound server certificate file.
    server-cert-file: "/etc/unbound/unbound_server.pem"

    # unbound-control key file.
    control-key-file: "/etc/unbound/unbound_control.key"

    # unbound-control certificate file.
    control-cert-file: "/etc/unbound/unbound_control.pem"
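
Quick checks after dropping the config in: validate it, restart, and make sure DNSSEC validation actually happens (the ad flag should come back on a signed domain):

unbound-checkconf
sudo systemctl restart unbound
dig @127.0.0.1 -p 5335 fedoraproject.org +dnssec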

nginx.conf

load_module /usr/lib/nginx/modules/ngx_stream_module.so;
user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;

events {
        worker_connections 768;
        # multi_accept on;
}

stream {
        include /etc/nginx/tcp.d/*.conf;
}


http {

        ##
        # Basic Settings
        ##
        client_max_body_size 0;
        sendfile on;
        tcp_nopush on;
        types_hash_max_size 2048;
        # server_tokens off;

        # server_names_hash_bucket_size 64;
        # server_name_in_redirect off;

        include /etc/nginx/mime.types;
        default_type application/octet-stream;

        ##
        # SSL Settings
        ##

        ssl_protocols TLSv1.2 TLSv1.3; # Dropping SSLv3, ref: POODLE
        ssl_prefer_server_ciphers on;

        ##
        # Logging Settings
        ##

        access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;

        ##
        # Gzip Settings
        ##

        gzip on;
        # gzip_vary on;
        # gzip_proxied any;
        # gzip_comp_level 6;
        # gzip_buffers 16 8k;
        # gzip_http_version 1.1;
        # gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

        ##
        # Virtual Host Configs
        ##

        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;
}


#mail {
#       # See sample authentication script at:
#       # http://wiki.nginx.org/ImapAuthenticateWithApachePhpScript
#
#       # auth_http localhost/auth.php;
#       # pop3_capabilities "TOP" "USER";
#       # imap_capabilities "IMAP4rev1" "UIDPLUS";
#
#       server {
#               listen     localhost:110;
#               protocol   pop3;
#               proxy      on;
#       }
#
#       server {
#               listen     localhost:143;
#               protocol   imap;
#               proxy      on;
#       }
#}



002-dns.conf (nginx stream)

    upstream dns {
        zone dns 64k;
        server 10.0.0.5:53;
    }

    # DoT server (TLS termination)
    server {
        listen 853 ssl;
        listen [::]:853 ssl;
        ssl_certificate /etc/letsencrypt/live/mysite.com-0004/fullchain.pem; # managed by Certbot
        ssl_certificate_key /etc/letsencrypt/live/mysite.com-0004/privkey.pem; # managed by Certbot
        proxy_pass dns;
        proxy_connect_timeout   30s;
        preread_timeout         50s;
        ssl_session_tickets on;
        ssl_session_timeout   4h;
        ssl_handshake_timeout 30s;

    }

    # Regular DNS
    server {
        listen 53;
        listen 53 udp;
        listen [::]:53;
        listen [::]:53 udp;
        proxy_pass dns;
        proxy_connect_timeout   30s;
        proxy_responses 1;
        preread_timeout         50s;
    }
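
To check that both listeners are actually answering (substitute the real hostname for mysite.com; kdig comes from knot-dnsutils):

# plain DNS through the stream proxy
kdig @mysite.com example.com A
# DNS over TLS on 853, with certificate validation
kdig @mysite.com +tls-ca +tls-hostname=mysite.com example.com A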

@PhaseLockedLoop

1 Like

I know what it is. You have a cert name mismatch. You cannot use the automated cert mode with LE to do what I did. It's why I moved to a paid cert, but it is possible to just remind yourself every 90 days to renew. The automatic mode only issues a wildcard cert for the subdomains. You must have a cert that is created for BOTH the TLD and *.tld, which yours is not. Try running just the TLD through SSL Labs, or do a curl or a kdig yourself.

It will throw the mismatch, but if you hit any subdomain it works. Your cert is messed up.
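
For reference, one way to see exactly which names a served cert covers is to pull the SAN list off the DoT port (swap in the real hostname):

openssl s_client -connect mysite.com:853 -servername mysite.com </dev/null 2>/dev/null \
  | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'

If the bare domain isn't in that list, that's the mismatch.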

1 Like

I am moving to the lounge. I didn't realize I was still in your thread.

1 Like

You know, I sadly haven't updated this in a while. Maybe I will when I add a ZFS SLOG.

3 Likes

It's been a long time, fellas. @redocbew @Argone @Dynamic_Gravity @HaaStyleCat but I may just be embracing a certain set of cloud-native ways of doing things.

7 of these

4 pictured

Will kit them out with 32 GB of RAM and a 1 TB SSD each.

The SSDs will just contain the databases of the things I run.

@wendell this video sold me on Longhorn. It can replicate stuff across the nodes AND still use my ZFS storage server as a backup for the things replicated across the HA cluster storage. A lot of data will still just go over the network to the storage server running ZFS. I'll have it as an NFS target.
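
The NFS target side is just ZFS's built-in sharing; a rough sketch on the storage server, assuming a hypothetical dataset tank/backups and a 10.0.0.0/24 cluster subnet (exact export options to taste):

# on the ZFS storage server
sudo dnf install -y nfs-utils
sudo systemctl enable --now nfs-server
sudo zfs set sharenfs="rw=@10.0.0.0/24,no_root_squash" tank/backups
sudo firewall-cmd --permanent --add-service=nfs --add-service=rpc-bind --add-service=mountd
sudo firewall-cmd --reload
showmount -e localhost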

Hopefully my first foray into Kubernetes goes well

The OptiPlexes each have i7-4790s in them.

9 Likes

Hmmm, the cluster shall be named Nordrljos, the Old Norse word for “Northern Lights”.

Since for all intents and purposes Bifrost will be the reverse proxy into the HA ingress controller. Bifrost, the burning bridge, stretches through the northern lights into the realm of the gods according to the mythology. Every machine has a Norse god name :smiling_imp:

3 Likes

I'll also finally have truly separated Testing and Prod environments.

1 Like

Will be documenting here for now until it's not a shit show and I feel like writing a proper guide.

2 Likes

Longhorn is cool. You can’t trust version upgrades yet though. Pain. Suffering. Sometimes.

4 Likes

Easy to install but resource intensive. You'll eventually notice you're spending more system resources on storage than you should. I gave up.

Piraeus/DRBD is faster and less resource-intensive. DRBD has been around forever and is very good at detecting and handling split-brain situations.

But if I wanted to send a message to my old self, I would just say to go with NFS. Life is too short for on-prem Kubernetes storage solutions.

3 Likes

That's really good to know. I'm taking a look at versions right now. That will dissuade me from messing with the v2.0 previews.

This guy's videos are incredibly helpful when looking for basic overviews and installation ideas.

1 Like

Well, the nice thing is I'm not using Longhorn storage for anything I'm too concerned about. Plus it will be regularly backed up over NFS to a ZFS storage server, so I may not feel the same pains. We will see what mileage I get.

1 Like

HA NFS on InfiniBand/Omni-Path is dead easy too… and not significantly slower than gen3 NVMe.

2 Likes

I honestly considered this as a potential path. The cards are cheap. The modules are out there.

Now I severely doubt that the 3900X in the storage server is up to the task of saturating that, BUT the ZFS storage config is not exactly performant. It will likely be the bottleneck of the two.

Config:
6x 4TB Toshiba MG04SCA40EE
2x SSD mirrored L2ARC
2x Optane mirrored SLOG
RAIDZ2

I'm sadly running up against space limitations soon because I've got about 3 TB usable left. Upgrading it isn't out of the realm of possibility. Nothing at this point is duplicated. The snapshots are trimmed up. The compression is all LZ4. The services themselves are buttoned up so they don't use space unnecessarily. I may have undersized the drives for what I was going to do.

It's fast unless you saturate the SSDs (hard to do, admittedly).

The network backbone is still just a Netgear managed switch with 24 gigabit ports.

1 Like

It would be nice if a good, well-tuned mini PC on the level of quality of a Minisforum unit came with 2 or 3 10GbE ports. I'd love to make that the firewall, with a solid Ryzen 7 chip in it so it could handle IPS/IDS at that speed.

1 Like

What size L2ARC and SLOG are you aiming for?

I run 5x 6TB IronWolf 7200 RPM drives (yes, they sell two speeds for whatever reason) in RAIDZ2. Currently no L2ARC or ZIL/SLOG… I have the drives in there for both devices, but there's just sooooo much conflicting info on best practice for sizing them, RAM overhead, etc. that I never implemented them. I think it's a 16 GB Optane and a 480 GB Samsung DCT.

Love to see how the speeds turn out. Wondering if I have something set wrong, whether it's block sizes, encryption, compression or whatever. Looking forward to seeing how it shapes up. I have a 3900XT in my rig (since I upgraded). I have 10G NICs on three machines to test with and see how mine compares. One PC has all-NVMe SSD storage so it can crank out files as fast as the drives can write :slight_smile:

1 Like

Hmm, I haven’t really looked into this before. Might have to give it a look. :thinking:

3 Likes

32 GB Optanes for ZIL/SLOG. You really don't need much.
120 GB L2ARC SSDs are enough when mirrored.
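
For the record, bolting those onto an existing pool is just a couple of zpool add commands (hypothetical pool name and device paths):

# mirrored SLOG from the two Optanes
zpool add tank log mirror /dev/disk/by-id/nvme-optane-1 /dev/disk/by-id/nvme-optane-2
# the two L2ARC SSDs (added as cache devices)
zpool add tank cache /dev/disk/by-id/ata-ssd-1 /dev/disk/by-id/ata-ssd-2
zpool status tank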

2 Likes