Nx2l's Home Lab blog

My libvirtd host has a NAT bridge by default


I set up an isolated bridge to mimic an empty LAN.


Then I have an external bridge on the host that mimics the WAN (which is really just my home LAN).
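
For reference, the isolated network is basically just a libvirt network definition with no <forward> element. A minimal sketch (the okd-lan / virbr-okd names are made up):

cat > okd-lan.xml <<'EOF'
<network>
  <name>okd-lan</name>
  <!-- no <forward> element = isolated, guest-to-guest traffic only -->
  <bridge name="virbr-okd" stp="on" delay="0"/>
</network>
EOF
virsh net-define okd-lan.xml
virsh net-start okd-lan
virsh net-autostart okd-lan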

So I think I'll reset… and start with pfSense as the first guest/VM.

2 Likes

So when I was looking at it, I think I had planned to run the following on the gateway (pfSense or whatever you want):

  • dhcp with pxe/kickstart/etc
  • dns
  • reverse proxy
  • wildcard cert

Although I don't know to what degree you can automate config on pfSense, which you'd kind of want to…
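
For the DHCP + PXE piece, this is roughly the shape of it if it ended up on plain ISC dhcpd instead of pfSense (subnet, TFTP server IP, and boot filename are placeholders):

# /etc/dhcp/dhcpd.conf (fragment)
subnet 192.168.7.0 netmask 255.255.255.0 {
  range 192.168.7.100 192.168.7.150;
  option routers 192.168.7.1;
  option domain-name-servers 192.168.7.1;
  # PXE: point clients at a TFTP server and a boot file
  next-server 192.168.7.2;
  filename "pxelinux.0";
}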

Also on the fence about whether or not to spin up a domain controller first.

3 Likes

I don't really want to use pfSense as the router… (it kinda feels like cheating) but I figure that would just mimic an organization's real network…

And then there's the OKD cluster.

Something like MetalLB (or whatever the docs say works) is what I can go with, and DNS can just be updated to point to that instead of pfSense?

1 Like

You could just use another RHEL box or VM as your gateway. I'm pretty sure there's some automation to set that up with the OKD stuff. I remember it vaguely from a YouTube video…

2 Likes

I'm barely into the docs, but I thought you're supposed to set up some kind of automated load balancer for the pods.

1 Like

I think that's the reverse proxy, which is typically HAProxy.
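
Something like this is the usual shape of it for a UPI install: front the API (6443), the machine config server (22623), and the ingress routers (80/443). A rough haproxy.cfg sketch (IPs are placeholders):

# haproxy.cfg fragment - TCP passthrough for the cluster API;
# the 22623, 80, and 443 frontends/backends look the same
frontend okd-api
    bind *:6443
    mode tcp
    default_backend okd-api
backend okd-api
    mode tcp
    balance roundrobin
    server bootstrap 192.168.7.200:6443 check
    server master0   192.168.7.210:6443 check
    server master1   192.168.7.211:6443 check
    server master2   192.168.7.212:6443 check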

2 Likes

Well, the NUC8 fan just started whining…

I can't really feel air coming out of it… so either it's dead or about to overheat and die.

Temps are slowly climbing.


This may be the end.

Somehow it recovered…


2 Likes

At this point…

even if nothing else was worth it…

this BIND/DHCP setup with DDNS is worth it (since I use Fedora, I had to make some tweaks),
but it's fucking awesome.
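
The core of it is a TSIG key shared between dhcpd and named, so dhcpd can push A/PTR records as it hands out leases. A rough sketch (key, zone names, and addresses are placeholders):

# /etc/dhcp/dhcpd.conf (fragment)
ddns-update-style standard;
key "ddns-key" { algorithm hmac-sha256; secret "BASE64SECRET=="; }
zone lab.example. { primary 127.0.0.1; key ddns-key; }
zone 7.168.192.in-addr.arpa. { primary 127.0.0.1; key ddns-key; }

# /etc/named.conf (fragment)
key "ddns-key" { algorithm hmac-sha256; secret "BASE64SECRET=="; };
zone "lab.example" IN {
    type master;
    file "dynamic/lab.example.db";
    allow-update { key "ddns-key"; };
};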

4 Likes

So I redid some networking stuff to work around the DNS forwarding not working consistently for the rest of the home LAN (the usual forwarder issues).

Now DNS is working better.

I also put the 10Gb SFP+ card in, so I had to redo the networking mostly because of that… and change things over to VLAN IDs.

Seems happy now.
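
For the record, the VLAN changeover on a Fedora host is basically NetworkManager work, something along these lines (interface name, VLAN ID, and addressing are placeholders, not necessarily exactly what I ran):

# tag a VLAN on top of the 10Gb interface and give it a static address
nmcli con add type vlan con-name lan-vlan10 ifname enp1s0.10 dev enp1s0 id 10 \
  ipv4.method manual ipv4.addresses 192.168.7.2/24 ipv4.gateway 192.168.7.1
nmcli con up lan-vlan10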

Now I can get back to building the internal mirror next chance I get.

1 Like

The mirror is set up again and populated with the image I will use.

I just have to figure out what I need to do…

I ran this:

$ oc adm release extract -a ${LOCAL_SECRET_JSON} --command=openshift-install "${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}"

Now I just need to figure out what to do next…
I think I need to use the extracted openshift-install binary next… but not sure how yet.
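
I think the rough shape of the next step is: put an install-config.yaml in a working directory, then have the extracted binary turn it into manifests and Ignition configs, something like this (the directory name is made up):

mkdir okd-install
# install-config.yaml goes into okd-install/ first; these steps consume it
./openshift-install create manifests --dir okd-install
./openshift-install create ignition-configs --dir okd-install
# the resulting *.ign files then get served over HTTP for the nodes to fetch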

2 Likes

I have the MACs for the VMs set up in DHCP now… just have to figure out this install YAML that I need next.

Lots of reading on this page so I can figure out the install-config.yaml:

https://docs.okd.io/4.10/installing/installing_bare_metal/installing-bare-metal.html
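
From that page, the install-config.yaml looks roughly like this (every value below is a placeholder for my lab; the imageContentSources block is what points it at the local mirror, and the exact source/mirror pairs come out of the oc adm release mirror output):

apiVersion: v1
baseDomain: lab.example
metadata:
  name: okd
compute:
- name: worker
  replicas: 0
controlPlane:
  name: master
  replicas: 3
networking:
  networkType: OVNKubernetes
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
pullSecret: '{"auths": { ... }}'
sshKey: 'ssh-ed25519 AAAA...'
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  (local registry CA)
  -----END CERTIFICATE-----
imageContentSources:
- mirrors:
  - mirror.lab.example:5000/okd4/okd
  source: quay.io/openshift/okd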

1 Like

OK, so I fumbled my way through the ISO install method…

Installing a user-provisioned cluster on bare metal - Installing on bare metal | Installing | OKD 4.10
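
For anyone following along, the ISO method boils down to booting each VM off the Fedora CoreOS live ISO and running the installer against the right Ignition file, roughly like this (disk, web server address, and file names are placeholders for my setup):

# from the live ISO environment on a control plane VM, for example:
sudo coreos-installer install /dev/vda \
  --ignition-url http://192.168.7.5:8080/master.ign \
  --insecure-ignition
# then reboot off the disk; bootstrap/worker VMs get their own .ign files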

1 Like

OK, so I guess the next step is approving these pending CSRs:

oc get csr -ojson | jq -r '.items[] | select(.status == {}) | .metadata.name' | xargs oc adm certificate approve
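
CSRs can show up in a couple of batches (client certs first, then serving certs), so something like this to keep an eye on it and re-run the approve loop as needed:

watch -n 30 "oc get csr | grep -c Pending"
oc get nodes -o wide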

Now the workers are showing up as nodes.

1 Like

I guess I have to wait for the console to start working…
so either I messed up and will have to start over, or I just have to wait.

I might have been too slow approving CSRs… IDK if those errors will resolve themselves…

1 Like

Cleared the OKD VM drives to do it again…

It's taking longer for the bootstrap to say it's done this time… so I think that's a good thing.

I'm really hoping some more green shows up soon.

1 Like

OK…

So the API service was not coming up on the control plane nodes, which needs to happen before the bootstrap can go away… IDK why.

But I rebooted master0 and the API service finally started on it… and then the bootstrap went red…

So I tried the same tactic on master1, and after a few minutes the API service went green for it also…

Trying the same on master2… but it's still red at the moment.

I think I have to get an account on Red Hat to pull some missing components… because the master nodes appear to be trying to reach the Red Hat registry… and that might explain some of the errors.
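
Here's the operator status right now, from:

oc get clusteroperators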

NAME                                       VERSION                          AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-0.okd-2022-03-07-131213   False       True          True       58m     WellKnownAvailable: The well-known endpoint is not yet available: kube-apiserver oauth endpoint https://192.168.7.210:6443/.well-known/oauth-authorization-server is not yet served and authentication operator keeps waiting (check kube-apiserver operator, and check that instances roll out successfully, which can take several minutes per instance)
baremetal                                  4.10.0-0.okd-2022-03-07-131213   True        False         False      57m     
cloud-controller-manager                   4.10.0-0.okd-2022-03-07-131213   True        False         False      62m     
cloud-credential                           4.10.0-0.okd-2022-03-07-131213   True        False         False      62m     
cluster-autoscaler                         4.10.0-0.okd-2022-03-07-131213   True        False         False      57m     
config-operator                            4.10.0-0.okd-2022-03-07-131213   True        False         False      58m     
console                                    4.10.0-0.okd-2022-03-07-131213   False       True          False      20m     DeploymentAvailable: 0 replicas available for console deployment...
csi-snapshot-controller                    4.10.0-0.okd-2022-03-07-131213   True        True          False      57m     Progressing: Waiting for Deployment to deploy csi-snapshot-controller pods
dns                                        4.10.0-0.okd-2022-03-07-131213   True        True          False      57m     DNS "default" reports Progressing=True: "Have 4 available DNS pods, want 5."
etcd                                       4.10.0-0.okd-2022-03-07-131213   True        True          True       56m     InstallerPodContainerWaitingDegraded: Pod "installer-6-master2" on node "master2" container "installer" is waiting since 2022-04-09 19:06:23 +0000 UTC because ContainerCreating...
image-registry                             4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
ingress                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      15m     
insights                                   4.10.0-0.okd-2022-03-07-131213   True        False         False      52m     
kube-apiserver                             4.10.0-0.okd-2022-03-07-131213   True        True          True       21m     GuardControllerDegraded: Missing operand on node master2...
kube-controller-manager                    4.10.0-0.okd-2022-03-07-131213   True        True          True       54m     InstallerPodContainerWaitingDegraded: Pod "installer-8-master1" on node "master1" container "installer" is waiting since 2022-04-09 19:04:53 +0000 UTC because ContainerCreating...
kube-scheduler                             4.10.0-0.okd-2022-03-07-131213   True        False         False      54m     
kube-storage-version-migrator              4.10.0-0.okd-2022-03-07-131213   False       True          False      6m28s   KubeStorageVersionMigratorAvailable: Waiting for Deployment
machine-api                                4.10.0-0.okd-2022-03-07-131213   True        False         False      57m     
machine-approver                           4.10.0-0.okd-2022-03-07-131213   True        False         False      57m     
machine-config                                                              True        True          True       47m     Unable to apply 4.10.0-0.okd-2022-03-07-131213: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 3)
marketplace                                4.10.0-0.okd-2022-03-07-131213   True        False         False      57m     
monitoring                                                                  False       True          True       42m     Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
network                                    4.10.0-0.okd-2022-03-07-131213   True        True          True       58m     DaemonSet "openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - last change 2022-04-09T19:05:21Z
node-tuning                                4.10.0-0.okd-2022-03-07-131213   True        False         False      57m     
openshift-apiserver                        4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
openshift-controller-manager               4.10.0-0.okd-2022-03-07-131213   True        False         False      53m     
openshift-samples                          4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
operator-lifecycle-manager                 4.10.0-0.okd-2022-03-07-131213   True        False         False      57m     
operator-lifecycle-manager-catalog         4.10.0-0.okd-2022-03-07-131213   True        False         False      57m     
operator-lifecycle-manager-packageserver   4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
service-ca                                 4.10.0-0.okd-2022-03-07-131213   True        False         False      58m     
storage                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      58m     

So I guess I'll have to reset again and do some more reading on these errors/issues.

2 Likes

After 3 attempts…

I think that going slower and giving the control plane VMs some more RAM might help…

Starting over again tomorrow.

2 Likes

10GB of RAM on the control plane VMs definitely helped, but I think they need more…

I think if all the errors don't clear up this time, I'll throw my Optane drive in the host and configure it as swap… that might allow me to throw more RAM at the master nodes.
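
If it comes to that, the swap part on the host is simple enough, something like this (the device name is a guess at where the Optane would show up):

sudo mkswap /dev/nvme1n1
sudo swapon /dev/nvme1n1
swapon --show
# and a line in /etc/fstab to make it stick across reboots:
# /dev/nvme1n1 none swap defaults 0 0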

NAME                                       VERSION                          AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-0.okd-2022-03-07-131213   False       True          False      18m     WellKnownAvailable: The well-known endpoint is not yet available: kube-apiserver oauth endpoint https://192.168.7.212:6443/.well-known/oauth-authorization-server is not yet served and authentication operator keeps waiting (check kube-apiserver operator, and check that instances roll out successfully, which can take several minutes per instance)
baremetal                                  4.10.0-0.okd-2022-03-07-131213   True        False         False      17m     
cloud-controller-manager                   4.10.0-0.okd-2022-03-07-131213   True        False         False      23m     
cloud-credential                           4.10.0-0.okd-2022-03-07-131213   True        False         False      23m     
cluster-autoscaler                         4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
config-operator                            4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
console                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      2m31s   
csi-snapshot-controller                    4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
dns                                        4.10.0-0.okd-2022-03-07-131213   True        False         False      17m     
etcd                                       4.10.0-0.okd-2022-03-07-131213   True        False         False      16m     
image-registry                             4.10.0-0.okd-2022-03-07-131213   True        False         False      7m58s   
ingress                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      6m27s   
insights                                   4.10.0-0.okd-2022-03-07-131213   True        False         False      12m     
kube-apiserver                             4.10.0-0.okd-2022-03-07-131213   True        True          False      10m     NodeInstallerProgressing: 1 nodes are at revision 7; 2 nodes are at revision 9
kube-controller-manager                    4.10.0-0.okd-2022-03-07-131213   True        False         False      15m     
kube-scheduler                             4.10.0-0.okd-2022-03-07-131213   True        False         False      14m     
kube-storage-version-migrator              4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
machine-api                                4.10.0-0.okd-2022-03-07-131213   True        False         False      17m     
machine-approver                           4.10.0-0.okd-2022-03-07-131213   True        False         False      17m     
machine-config                                                              True        True          True       7m47s   Unable to apply 4.10.0-0.okd-2022-03-07-131213: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 3)
marketplace                                4.10.0-0.okd-2022-03-07-131213   True        False         False      17m     
monitoring                                 4.10.0-0.okd-2022-03-07-131213   True        False         False      4m30s   
network                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      19m     
node-tuning                                4.10.0-0.okd-2022-03-07-131213   True        False         False      17m     
openshift-apiserver                        4.10.0-0.okd-2022-03-07-131213   True        False         False      10m     
openshift-controller-manager               4.10.0-0.okd-2022-03-07-131213   True        False         False      16m     
openshift-samples                          4.10.0-0.okd-2022-03-07-131213   True        False         False      11m     
operator-lifecycle-manager                 4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
operator-lifecycle-manager-catalog         4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
operator-lifecycle-manager-packageserver   4.10.0-0.okd-2022-03-07-131213   True        False         False      12m     
service-ca                                 4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
storage                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     

Well, the only one that didn't resolve itself was:

machine-config                                                              True        True          True       11m     Unable to apply 4.10.0-0.okd-2022-03-07-131213: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 3)
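
Next round I'll dig into the machine config pool itself with something like:

oc get mcp
oc describe mcp master
oc -n openshift-machine-config-operator get pods -o wide
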
1 Like

OMG

I think it's going to finish this time…

As long as the auth fixes itself… this is looking good.

NAME                                       VERSION                          AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-0.okd-2022-03-07-131213   False       True          False      14m     WellKnownAvailable: The well-known endpoint is not yet available: kube-apiserver oauth endpoint https://192.168.7.210:6443/.well-known/oauth-authorization-server is not yet served and authentication operator keeps waiting (check kube-apiserver operator, and check that instances roll out successfully, which can take several minutes per instance)
baremetal                                  4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
cloud-controller-manager                   4.10.0-0.okd-2022-03-07-131213   True        False         False      15m     
cloud-credential                           4.10.0-0.okd-2022-03-07-131213   True        False         False      15m     
cluster-autoscaler                         4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
config-operator                            4.10.0-0.okd-2022-03-07-131213   True        False         False      14m     
console                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      26s     
csi-snapshot-controller                    4.10.0-0.okd-2022-03-07-131213   True        False         False      4m42s   
dns                                        4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
etcd                                       4.10.0-0.okd-2022-03-07-131213   True        True          False      12m     NodeInstallerProgressing: 3 nodes are at revision 5; 0 nodes have achieved new revision 6
image-registry                             4.10.0-0.okd-2022-03-07-131213   True        False         False      4m33s   
ingress                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      4m20s   
insights                                   4.10.0-0.okd-2022-03-07-131213   True        False         False      8m7s    
kube-apiserver                             4.10.0-0.okd-2022-03-07-131213   True        True          False      4m14s   NodeInstallerProgressing: 1 nodes are at revision 4; 2 nodes are at revision 6
kube-controller-manager                    4.10.0-0.okd-2022-03-07-131213   True        True          False      11m     NodeInstallerProgressing: 1 nodes are at revision 5; 2 nodes are at revision 7
kube-scheduler                             4.10.0-0.okd-2022-03-07-131213   True        False         False      10m     
kube-storage-version-migrator              4.10.0-0.okd-2022-03-07-131213   True        False         False      4m22s   
machine-api                                4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
machine-approver                           4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
machine-config                             4.10.0-0.okd-2022-03-07-131213   True        False         False      7m59s   
marketplace                                4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
monitoring                                                                  Unknown     True          Unknown    13m     Rolling out the stack.
network                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      14m     
node-tuning                                4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
openshift-apiserver                        4.10.0-0.okd-2022-03-07-131213   True        False         False      7m14s   
openshift-controller-manager               4.10.0-0.okd-2022-03-07-131213   True        False         False      11m     
openshift-samples                          4.10.0-0.okd-2022-03-07-131213   True        False         False      6m26s   
operator-lifecycle-manager                 4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
operator-lifecycle-manager-catalog         4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
operator-lifecycle-manager-packageserver   4.10.0-0.okd-2022-03-07-131213   True        False         False      7m18s   
service-ca                                 4.10.0-0.okd-2022-03-07-131213   True        False         False      14m     
storage                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      14m     
1 Like

IT WORKED

NAME                                       VERSION                          AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-0.okd-2022-03-07-131213   True        False         False      3m15s   
baremetal                                  4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
cloud-controller-manager                   4.10.0-0.okd-2022-03-07-131213   True        False         False      20m     
cloud-credential                           4.10.0-0.okd-2022-03-07-131213   True        False         False      21m     
cluster-autoscaler                         4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
config-operator                            4.10.0-0.okd-2022-03-07-131213   True        False         False      19m     
console                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      5m36s   
csi-snapshot-controller                    4.10.0-0.okd-2022-03-07-131213   True        False         False      9m52s   
dns                                        4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
etcd                                       4.10.0-0.okd-2022-03-07-131213   True        False         False      17m     
image-registry                             4.10.0-0.okd-2022-03-07-131213   True        False         False      9m43s   
ingress                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      9m30s   
insights                                   4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
kube-apiserver                             4.10.0-0.okd-2022-03-07-131213   True        False         False      9m24s   
kube-controller-manager                    4.10.0-0.okd-2022-03-07-131213   True        False         False      16m     
kube-scheduler                             4.10.0-0.okd-2022-03-07-131213   True        False         False      15m     
kube-storage-version-migrator              4.10.0-0.okd-2022-03-07-131213   True        False         False      9m32s   
machine-api                                4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
machine-approver                           4.10.0-0.okd-2022-03-07-131213   True        False         False      19m     
machine-config                             4.10.0-0.okd-2022-03-07-131213   True        False         False      13m     
marketplace                                4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
monitoring                                 4.10.0-0.okd-2022-03-07-131213   True        False         False      3m38s   
network                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      19m     
node-tuning                                4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
openshift-apiserver                        4.10.0-0.okd-2022-03-07-131213   True        False         False      12m     
openshift-controller-manager               4.10.0-0.okd-2022-03-07-131213   True        False         False      16m     
openshift-samples                          4.10.0-0.okd-2022-03-07-131213   True        False         False      11m     
operator-lifecycle-manager                 4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
operator-lifecycle-manager-catalog         4.10.0-0.okd-2022-03-07-131213   True        False         False      18m     
operator-lifecycle-manager-packageserver   4.10.0-0.okd-2022-03-07-131213   True        False         False      12m     
service-ca                                 4.10.0-0.okd-2022-03-07-131213   True        False         False      19m     
storage                                    4.10.0-0.okd-2022-03-07-131213   True        False         False      19m     

The host is using swap since RAM is at 100%, but it worked, so I don't care.

@oO.o

4 Likes