Ansible Adventures

oO.o · March 14, 2022, 6:11pm

I am creating this thread to catalog my trials, errors and triumphs as I work through my Ansible collection.

Today I came across an issue while testing. I use Vagrant to test a variety of operating systems, and I take a snapshot of each VM just after I initialize it so I can return it to a clean state for testing. Unfortunately, on some operating systems, this means that the system clock is stuck at the time of the snapshot after a restore. This becomes an issue when I go to install anything with a package manager because the repository certificates appear to come from the future, which causes the installation or cache update to fail. This is even more problematic on OSes that don’t come with Python installed, since that limits me to using only the script and raw modules to configure the host.

So, this is my time sync task file. It relies only on the raw module to configure the host and assumes that the time on localhost is correct. It handles disparate time zones by doing everything in GMT/UTC (without actually changing the time zone of localhost or the target host), and it is check_mode compliant.

- name: Get platform
  ansible.builtin.raw: uname
  register: uname_reg
  changed_when: false

- name: Get time from the host
  ansible.builtin.raw: date -u '+%Y-%m-%d %H:%M:%S'
  register: remote_date_reg
  changed_when: false

- name: Calculate difference in dates
  ansible.builtin.set_fact:
    date_diff: "{{  ((  local_date_var | to_datetime
                        - remote_date_var | to_datetime ).total_seconds() / 60)
                    | int
                    | abs }}"
  vars:
    local_date_var: "{{ now(true, '%Y-%m-%d %H:%M:%S') }}"
    remote_date_var: "{{remote_date_reg['stdout_lines'][0]}}"

- name: Sync time between remote host and localhost
  block:

  - name: Times are in sync within 5 minutes
    ansible.builtin.assert:
      that: date_diff | int < 5
      quiet: true

  rescue:

  - name: Use date format mmddHHMMccyy.ss
    ansible.builtin.set_fact:
      date_format: '%m%d%H%M%Y.%S'
    when: uname_reg['stdout_lines'][0] | lower in ['linux', 'darwin']

  - name: Use date format ccyymmddHHMM.ss
    ansible.builtin.set_fact:
      date_format: '%Y%m%d%H%M.%S'
    when: uname_reg['stdout_lines'][0] | lower in ['freebsd', 'openbsd']

  - name: Set date
    ansible.builtin.raw: >
      TZ=GMT {{ ansible_become_method | default('') }} date {{local_date_var}}
    vars:
      local_date_var: "{{ now(true, date_format) }}"
    when: not ansible_check_mode

oO.o · March 14, 2022, 6:34pm

I want to be able to use service names in my inventory variables but couldn’t figure out a good way to parse /etc/services. There are also some minor differences in /etc/services between operating systems. So I thought maybe I should just pull from the source, which of course is here:

https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml

Ideally I’d have this information in a dictionary so I can use it easily in my Ansible roles. I tried processing the XML file with jc but it was throwing some errors I didn’t understand. Luckily the CSV version worked.

I came up with this terrible solution that took literally a day to run on my MacBook Air, but it works. My absolute stubbornness about maintaining an 80-character width squeaked by. Readability was sacrificed, alas.

github.com/o0-o/ansible_collection_site/blob/main/roles/inventory/tasks/conv_iana-srv-defs_to_yaml.yml

---
# vim: ts=2:sw=2:sts=2:et:ft=yaml.ansible
#
# Convert the IANA Service Name and Transport Protocol Port Number
# Registry to YAML and save in /defaults/main.yml
#
# It shouldn't be necessary to run this as IANA service definitions
# don't change often. It is provided here out of transparency to show
# how we convert the IANA registry to the iana_srv_defs dictionary. It
# will be updated on every major release.
#
# This can take several hours to run.
#
########################################################################

# Use csv because xml version doesn't convert well with jc
- name: Download service definitions CSV from the IANA website
  ansible.builtin.uri:
    url: "https://www.iana.org\
      /assignments\

This file has been truncated.

I stored the resulting yaml file here if it’s useful to anyone.

github.com/o0-o/ansible_collection_site/blob/main/roles/inventory/defaults/main.yml

---
# vim: ts=2:sw=2:sts=2:et:ft=yaml.jinja2
#
# See README.md for documentation:
# github.com/o0-o/ansible_collection_site/tree/main/roles/inventory
#
########################################################################

# Comment Formatting
default_comment_prefix: "# vim: ts=8:sw=8:sts=8:noet\n#"
ansible_comment_prefix: "# vim: ts=2:sw=2:sts=2:et:ft=yaml.ansible\n#"
yaml_comment_prefix: "# vim: ts=2:sw=2:sts=2:et:ft=yaml\n#"
default_comment_postfix: |
  #
  ########################################################################

# Current Working Directory
cwd: "{{lookup('env', 'PWD')}}"

# Service Definitions

This file has been truncated.

Note that protocols with multiple port assignments look like this:

The syntax for port ranges differs between different programs (pf, firewalld, nmcli, etc), so I thought this was easiest. Not many have large ranges like X11…

oO.o · March 14, 2022, 6:36pm

LOOP VARIABLES IN ANSIBLE ARE READ-ONLY AND NO ERROR OR WARNING IS THROWN IF YOU TRY TO CHANGE THEM, THEY JUST SILENTLY GO UNCHANGED

The forum doesn’t like all caps, so I have to type this here…

oO.o · March 14, 2022, 6:53pm

Sometimes it’s difficult to distinguish between physical and virtual interfaces when using tools like ifconfig, ip, nmcli, etc. Often, the most authoritative source for this information is dmesg. Here is how I pull MAC addresses from dmesg, which I can later correlate to interfaces using whichever network utility is available.

{{  dmesg_reg['stdout']
    | regex_findall('(?:[0-9a-fA-F]{2}[:-]){5}[0-9a-fA-F]{2}')
    | map('lower')
    | map('replace', '-', ':')
    | unique
    | difference(['ff:ff:ff:ff:ff:ff']) }}

I haven’t run into this yet, but this could catch part of a weirdly formatted UUID or something else, because - is widely used outside of MAC addresses. I’m not sure how best to solve this, or whether it even needs solving. On Linux and BSD, I’ve never seen a MAC address with dashes in it (it’s a Windows thing, IIRC), but I’m not certain about switches, and I’d like to maintain a consistent MAC address regex across the whole project. I have yet to dive into switch config, so if it’s not an issue there, I might just stick with colons.
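One way to reduce false positives would be to wrap the pattern in lookarounds so a match cannot touch another hex digit or separator. The following is only a sketch, not what the collection uses. The names mac_addrs and mac_re_var are made up for illustration, and dmesg_reg is assumed to be the registered dmesg output. The trade-off is that an address immediately followed by a colon or dash would be skipped as well.

```yaml
- name: Extract MAC addresses that are not part of a longer hex string
  ansible.builtin.set_fact:
    # The lookbehind and lookahead reject candidates that are directly
    # adjacent to another hex digit, ':' or '-', for example an 8-byte
    # identifier such as aa-bb-cc-dd-ee-ff-00-11.
    mac_addrs: >-
      {{  dmesg_reg['stdout']
          | regex_findall(mac_re_var)
          | map('lower')
          | map('replace', '-', ':')
          | unique
          | difference(['ff:ff:ff:ff:ff:ff']) }}
  vars:
    mac_re_var: >-
      (?<![0-9a-fA-F:-])(?:[0-9a-fA-F]{2}[:-]){5}[0-9a-fA-F]{2}(?![0-9a-fA-F:-])
```

Because the pattern still has no capturing groups, regex_findall keeps returning whole matches, so the rest of the filter chain is unchanged.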

oO.o · March 15, 2022, 4:36pm

If you use listen: to notify multiple handlers and think you’re clever and use the same listen: value across multiple roles, note that handlers across all roles are loaded in advance, so when you use notify:, it will trigger handlers in roles that have not been run yet.

In my case, I use listen: save host vars to dump host variables into inventory/hostvars/host.yml. More and more variables are included with each role which means that if the handlers are flushed early, they will run into a lot of undefined variables. Easy enough to fix with default but worth noting.
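A minimal sketch of what such a handler could look like with default applied. The destination path, the host_vars_var name and the two example variables (fqdn, phy_ifaces) are assumptions for illustration, not the actual handler from the collection.

```yaml
# handlers/main.yml (illustrative)
- name: Save host variables to the inventory
  listen: save host vars
  ansible.builtin.copy:
    # Assumed location; the target directory must already exist
    dest: "{{ playbook_dir }}/inventory/hostvars/{{ inventory_hostname }}.yml"
    content: "{{ host_vars_var | to_nice_yaml }}"
  delegate_to: localhost
  vars:
    # default() keeps an early flush from failing on variables that are
    # only defined by roles that have not run yet.
    host_vars_var:
      fqdn: "{{ fqdn | default('') }}"
      phy_ifaces: "{{ phy_ifaces | default([]) }}"
```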

oO.o · March 15, 2022, 4:46pm

If you use Ansible with Vagrant a lot like I do, maintaining inventory variables for the SSH connection parameters becomes annoying. This will automagically get them from Vagrant if the first SSH attempt fails:

- name: Configure SSH connection
  block:

  - name: SSH host address is configured
    ansible.builtin.set_fact:
      host_addr: "{{host_addr_var}}"
    vars:
      host_addr_var: "{{ ansible_host | default(inventory_hostname) }}"

  - name: Host is reachable
    ansible.builtin.shell:
      cmd: "ping -c 2 {{host_addr}}"
    changed_when: false
    delegate_to: 127.0.0.1
    when:
    - ansible_host | default(inventory_hostname) != 'localhost'
    - host_addr != '127.0.0.1'

  # Must quote 'true' with raw module
  - name: SSH is functional
    ansible.builtin.raw: 'true'
    changed_when: false
    when: ansible_connection | default('ssh') == 'ssh'

  # If connection to host isn't functional it may be a Vagrant VM in
  # which case, we can retrieve SSH parameters.
  rescue:

  # Look for Vagrant SSH configuration under FQDN but fall back to short
  # hostname.
  - name: Get SSH configuration from Vagrant
    ansible.builtin.shell:
      chdir: "{{lookup('env', 'PWD')}}"
      cmd: >
        vagrant ssh-config {{inventory_hostname}} ||
        vagrant ssh-config {{ inventory_hostname | split('.') | first }}
    changed_when: false
    register: vagrant_ssh_cfg_reg
    delegate_to: 127.0.0.1

  # Must quote 'true' with raw module
  - name: Vagrant SSH connection is functional
    ansible.builtin.raw: 'true'
    changed_when: false
    vars:
      ansible_user: "{{ vagrant_ssh_cfg_reg['stdout']
                        | regex_search('User .*$', multiline=true)
                        | split(' ')
                        | last }}"
      ansible_host: "{{ vagrant_ssh_cfg_reg['stdout']
                        | regex_search('HostName .*$', multiline=true)
                        | split(' ')
                        | last }}"
      ansible_port: "{{ vagrant_ssh_cfg_reg['stdout']
                        | regex_search('Port .*$', multiline=true)
                        | split(' ')
                        | last }}"
      ansible_ssh_private_key_file: "{{ vagrant_ssh_cfg_reg['stdout']
                                        | regex_search( 'IdentityFile .*$',
                                                        multiline=true )
                                        | split(' ')
                                        | last }}"

  - name: Use Vagrant SSH configuration
    ansible.builtin.set_fact:
      ansible_host: "{{ssh_host_var}}"
      ansible_port: "{{ssh_port_var}}"
      ansible_user: "{{ssh_user_var}}"
      ansible_ssh_private_key_file: "{{ssh_key_var}}"
    changed_when: true
    vars:
      ssh_user_var: "{{ vagrant_ssh_cfg_reg['stdout']
                        | regex_search('User .*$', multiline=true)
                        | split(' ')
                        | last }}"
      ssh_host_var: "{{ vagrant_ssh_cfg_reg['stdout']
                        | regex_search('HostName .*$', multiline=true)
                        | split(' ')
                        | last }}"
      ssh_port_var: "{{ vagrant_ssh_cfg_reg['stdout']
                        | regex_search('Port .*$', multiline=true)
                        | split(' ')
                        | last }}"
      ssh_key_var: "{{  vagrant_ssh_cfg_reg['stdout']
                        | regex_search('IdentityFile .*$', multiline=true)
                        | split(' ')
                        | last }}"

oO.o · March 15, 2022, 7:50pm

Putting host-scoped variables in task names seems like a great idea when you’re testing against a single host, but then you quickly realize how useless it is when it runs against multiple hosts.

Exceptions to this are tasks that use run_once or if the tasks will always be run serially.

nx2l · March 17, 2022, 2:49am

Remember that util update fact module the other day?..

It helped me create a new clone of a json variable but with edited fields of what I needed changed.

oO.o · March 17, 2022, 10:39am

Nice.

That’s a weird one (ansible.utils.update_fact). I used it the other day for the first time and didn’t realize that it doesn’t change the variable; it just spits out a new one, so you have to capture it with register:.

In my case, I had a variable in a dict that was a list unless it only got one argument, then it was a string. Because Python is so helpful, foo[0] treats a string as a character array and prints the first character of the string instead of erroring out. So if that variable was a string, I updated it to be a list with just one value.

Here it is if anyone is interested:

- name: Treat an assigned IP as a pool of one to simplify logic later
  block:

  - name: If one address exists for the interface, convert it to a list
    ansible.utils.update_fact:
      updates:
      - path: "current_net['addr']"
        value: "{{[current_net['addr']]}}"
    register: update_net_reg
    changed_when: false

  - name: Apply address list to interface network variable
    ansible.builtin.set_fact:
      current_net: "{{update_net_reg['current_net']}}"

  when:
  - current_net['addr'] is defined
  - current_net['addr'] | type_debug != 'list'
  - current_net != {}
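For comparison, a sketch of the string-indexing gotcha and of the same normalization done with set_fact and the combine filter instead of update_fact. This is an alternative under the assumption that current_net is a flat dictionary like the one above. The addr_var name is hypothetical.

```yaml
- name: Show the gotcha
  ansible.builtin.debug:
    # Indexing a string yields its first character, not the whole address
    msg: "{{ addr_var[0] }}"
  vars:
    addr_var: '10.0.0.5'

- name: Wrap a single address in a list
  ansible.builtin.set_fact:
    # combine() returns a copy of current_net with 'addr' replaced
    current_net: "{{ current_net | combine({'addr': [current_net['addr']]}) }}"
  when:
  - current_net['addr'] is defined
  - current_net['addr'] is string
```

The is string test narrows the condition to the one case that causes trouble, whereas the type_debug check above also wraps any other non-list value.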

nx2l · March 17, 2022, 11:59am

ditto

i had to read the doc page to realize that

oO.o · March 24, 2022, 10:03pm

Today’s frustration:

If you use the ansible.netcommon.hwaddr filter to format/validate a MAC address, the documentation indicates that passing 'unix' or 'linux' will result in a familiar-looking MAC address with 6 pairs of hexadecimal digits separated by colons. HOWEVER, if you use 'unix', leading zeros for each pair are removed! Passing 'linux' will not remove leading zeros.
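A quick way to see the difference side by side. This is a sketch: mac_var is a made-up sample address chosen to contain leading zeros, and the filter needs the netaddr Python library on the controller.

```yaml
- name: Compare hwaddr output formats
  ansible.builtin.debug:
    msg:
      # Per the behavior described above, 'unix' drops the leading zero
      # of each pair while 'linux' keeps it.
      unix: "{{ mac_var | ansible.netcommon.hwaddr('unix') }}"
      linux: "{{ mac_var | ansible.netcommon.hwaddr('linux') }}"
  vars:
    mac_var: '08:00:27:0E:0A:05'
```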

oO.o · March 25, 2022, 6:07pm

I chuckle to myself whenever I see this in the exception traceback:

ANSIBALLZ

There’s a bug in the hostname module that causes it to fail on OpenBSD.

  - name: Configure hostname
    ansible.builtin.hostname:
      name: "{{fqdn}}"
    become: yes

TASK [o0_o.site.network : Configure hostname] ****************************************************
changed: [debian11.hq.example.com]
changed: [fedora35.hq.example.com]
changed: [rocky8.hq.example.com]
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: TypeError: can only concatenate str (not "list") to str
fatal: [openbsd7.hq.example.com]: FAILED! => {"changed": false, "module_stderr": "Connection to 127.0.0.1 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/home/vagrant/.ansible/tmp/ansible-tmp-1648228732.094858-70241-265717247399649/AnsiballZ_hostname.py\", line 107, in <module>\r\n    _ansiballz_main()\r\n  File \"/home/vagrant/.ansible/tmp/ansible-tmp-1648228732.094858-70241-265717247399649/AnsiballZ_hostname.py\", line 99, in _ansiballz_main\r\n    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\r\n  File \"/home/vagrant/.ansible/tmp/ansible-tmp-1648228732.094858-70241-265717247399649/AnsiballZ_hostname.py\", line 47, in invoke_module\r\n    runpy.run_module(mod_name='ansible.modules.hostname', init_globals=dict(_module_fqn='ansible.modules.hostname', _modlib_path=modlib_path),\r\n  File \"/usr/local/lib/python3.8/runpy.py\", line 207, in run_module\r\n    return _run_module_code(code, init_globals, run_name, mod_spec)\r\n  File \"/usr/local/lib/python3.8/runpy.py\", line 97, in _run_module_code\r\n    _run_code(code, mod_globals, init_globals,\r\n  File \"/usr/local/lib/python3.8/runpy.py\", line 87, in _run_code\r\n    exec(code, run_globals)\r\n  File \"/tmp/ansible_ansible.builtin.hostname_payload_wqc942d6/ansible_ansible.builtin.hostname_payload.zip/ansible/modules/hostname.py\", line 983, in <module>\r\n  File \"/tmp/ansible_ansible.builtin.hostname_payload_wqc942d6/ansible_ansible.builtin.hostname_payload.zip/ansible/modules/hostname.py\", line 977, in main\r\nTypeError: can only concatenate str (not \"list\") to str\r\n", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
changed: [arch-current.hq.example.com]
changed: [centos7.hq.example.com]
changed: [ubuntu2004.hq.example.com]

oO.o · March 29, 2022, 2:52pm

So nmcli output when showing a connection can print one connection per line with each field separated by a colon:

[vagrant@fedora35 ~]$ nmcli --get-values uuid,device connection show
4774acbf-b200-45df-8b6b-a3c90c3b349a:eth0
3e3b0ef1-7d23-443b-a35d-5242f85a0946:eth1
48431e6c-ed1a-4bd3-8d26-5d7fc9b4e5af:eth2
07c8398d-4b85-3df0-8eb8-0c51afc0c787:

But it will only produce multiline output when showing a device. Each device is separated by a blank line:

[vagrant@fedora35 ~]$ nmcli --get-values 'GENERAL.HWADDR,GENERAL.DEVICE' --escape no device show
08:00:27:FE:97:BA
eth0

8A:AB:BA:DA:14:10
eth1

FE:93:9D:B9:25:00
eth2

00:00:00:00:00:00
lo

The former is very easy to process into a matrix:

output['stdout_lines'] | map('split', ':')

But doing the equivalent to the multiline output is trickier:

  - command: nmcli --get-values 'GENERAL.HWADDR,GENERAL.DEVICE' --escape no device show
    register: nm_reg

  - debug:
      msg: "{{  nm_reg['stdout_lines']
                | default([])
                | map('lower')
                | map('regex_replace', '^$', '%')
                | join(' ')
                | split('%')
                | map('trim')
                | map('split', ' ')
                | difference([['00:00:00:00:00:00', 'lo']]) }}"

TASK [debug] *********************************************************************************************************************************************************************************************************************************
ok: [fedora35] => {
    "msg": [
        [
            "08:00:27:fe:97:ba",
            "eth0"
        ],
        [
            "8a:ab:ba:da:14:10",
            "eth1"
        ],
        [
            "fe:93:9d:b9:25:00",
            "eth2"
        ]
    ]
}
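A possibly simpler variant, assuming every device prints exactly two non-empty lines (address, then name): drop the blank separators and group what is left into pairs with batch. nm_reg is the registered result from the task above. A device with an empty hardware address would shift the pairing, so treat this as a sketch rather than a drop-in replacement.

```yaml
- name: Pair up hardware addresses and device names
  ansible.builtin.debug:
    msg: "{{  nm_reg['stdout_lines']
              | default([])
              | map('lower')
              | reject('equalto', '')
              | batch(2)
              | list
              | difference([['00:00:00:00:00:00', 'lo']]) }}"
```

This avoids the placeholder character trick, at the cost of relying on the two-lines-per-device layout.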

oO.o · March 29, 2022, 4:18pm

oO.o:

"{{  nm_reg['stdout_lines']
                | default([])
                | map('lower')
                | map('regex_replace', '^$', '%')
                | join(' ')
                | split('%')
                | map('trim')
                | map('split', ' ')
                | difference([['00:00:00:00:00:00', 'lo']]) }}"

@sgtawesomesauce if you came across something like this at work or in an open source project, are you just like “wtf” or is it comprehensible? Should I just put some comments above the task outlining what it does? It feels arbitrary to just break it into multiple set_facts when no intermediate stage is of any use.

candybar · March 29, 2022, 4:35pm

Is that missing a ' between $ & ,?

oO.o · March 29, 2022, 4:59pm

Fixed! Discourse had a glitch formatting the quote for some reason so I retyped that section.

oO.o · March 30, 2022, 12:19am

lol, I need some oversight…

This will parse a list of physical addresses on the host from dmesg output. It provides true hardware MAC addresses and skips over VLANs, virtual interfaces, etc. Also, if a device was plugged in but was not present when Ansible gathered facts (device name not included in ansible_interfaces), it is not included in the list.


  - name: Get boot log (dmesg)
    ansible.builtin.command:
      cmd: dmesg
    become: true
    changed_when: false
    register: dmesg_reg

  - name: Parse hardware addresses and device names from boot log
    ansible.builtin.set_fact:
      phy_ifaces: >-
        {{  phy_ifaces
            | default([])
            | union(  [ { 'dev':  iface_lines_item
                                  | map('regex_replace', ':', '')
                                  | intersect(ansible_interfaces)
                                  | join,
                          'hw_addr':  iface_lines_item
                                      | select( 'match',
                                                '^' + hw_addr_re_var + '$' )
                                      | join } ] ) }}
    vars:
      hw_addr_re_var: '([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}'
    when:
    - iface_lines_item
      | map('regex_replace', ':', '')
      | intersect(ansible_interfaces)
      != []
    - iface_lines_item
      | select( 'match', '^' + hw_addr_re_var + '$' )
      != []
    loop: "{{ dmesg_reg['stdout_lines']
              | map('lower')
              | select('match', '^.*' + hw_addr_re_var + '.*$')
              | map('split', ' ') }}"
    loop_control:
      loop_var: iface_lines_item

  - debug:
      var: phy_ifaces

ok: [fedora35] => {
    "phy_ifaces": [
        {
            "dev": "eth0",
            "hw_addr": "08:00:27:fe:97:ba"
        },
        {
            "dev": "eth1",
            "hw_addr": "08:00:27:ab:7d:a6"
        },
        {
            "dev": "eth2",
            "hw_addr": "0a:5d:08:6d:fe:2f"
        }
    ]
}
ok: [debian11] => {
    "phy_ifaces": [
        {
            "dev": "eth0",
            "hw_addr": "08:00:27:fe:b5:aa"
        },
        {
            "dev": "eth1",
            "hw_addr": "08:00:27:4e:fb:3d"
        },
        {
            "dev": "eth2",
            "hw_addr": "08:00:27:0d:a6:90"
        }
    ]
}
ok: [arch-current] => {
    "phy_ifaces": [
        {
            "dev": "eth0",
            "hw_addr": "08:00:27:42:78:de"
        },
        {
            "dev": "eth1",
            "hw_addr": "08:00:27:02:cd:8d"
        }
    ]
}
ok: [openbsd7] => {
    "phy_ifaces": [
        {
            "dev": "em0",
            "hw_addr": "08:00:27:4d:a2:90"
        },
        {
            "dev": "em1",
            "hw_addr": "0a:5d:08:6d:00:01"
        },
        {
            "dev": "em2",
            "hw_addr": "0a:5d:08:6d:00:02"
        },
        {
            "dev": "em3",
            "hw_addr": "0a:5d:08:6d:00:03"
        }
    ]
}

This would be a universal solution to discovering physical interfaces on a host, except on Arch and macOS, the dmesg buffer tends to get full of garbage and you lose the early boot messages pretty quickly. On Arch, this is mostly attributable to auditd and on macOS, it was primarily the wifi spamming the kernel log.

I was planning to use nmcli to do this if it was available, but I realized that what it calls the hardware address does not report the true hardware address of the NIC if it’s configured to be randomized. However, the better solution appears to be the ip command which will specify the “permanent” hardware address if it differs from the configured one in the output of ip link (haven’t tested this on all distros yet though).
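A rough, untested sketch of pulling those permanent addresses out of the ip output. It assumes an iproute2 version that prints a permaddr field, and ip_link_reg and perm_hw_addrs are made-up names. Interfaces whose configured address matches the hardware address have no permaddr field and would not show up in this list.

```yaml
- name: Get link information
  ansible.builtin.command:
    cmd: ip -oneline link show
  changed_when: false
  register: ip_link_reg

- name: Collect permanent hardware addresses reported by ip
  ansible.builtin.set_fact:
    # With a single capture group, regex_findall returns only the
    # captured address, not the leading "permaddr " keyword.
    perm_hw_addrs: >-
      {{  ip_link_reg['stdout']
          | regex_findall('permaddr ((?:[0-9a-f]{2}:){5}[0-9a-f]{2})') }}
```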

oO.o · March 31, 2022, 6:46pm

I am working through an issue where a portion of my role needs to run serially to avoid address collisions. Officially, there is no way to switch to serial execution in the middle of a role, but you can effectively achieve it with run_once and delegation. There are several gotchas, though. I was able to work around hostvars pretty easily using delegate_facts: true and/or simply navigating through the hostvars dictionary. What has stopped me from using run_once is that handlers are not delegated: if you use notify on a run_once task, the handler only runs on the first host, despite delegation on the task itself. Also of note, task results (failed, changed, ok) are not delegated either, so if you delegate a config change from Host A to Host B, Host A will be marked as changed.

So I am working on achieving the same result but with a host loop and when: ansible_host == item. We’ll see if that works out.

oO.o · March 31, 2022, 9:02pm

This appears to be working. It looks like this:

- name: Configure the network
  ansible.builtin.include_tasks: cfg_net.yml
  loop: "{{ansible_play_hosts}}"
  loop_control:
    loop_var: host_item

# cfg_net.yml
- name: Configure interfaces
  ansible.builtin.include_tasks: cfg_iface.yml
  when: inventory_hostname == host_item
  loop: "{{ phy_ifaces | default([]) }}"
  loop_control:
    loop_var: phy_iface_item

A nice side benefit of running tasks in serial is that you can put variables into the task names which makes debugging easier and generally gives you a better idea of what’s going on.

- name: "Interface {{iface['dev']}} has an IPv4 address of {{ip_var}}"
  ansible.builtin.set_fact:
    iface_ip4: "{{ip_var}}"
  vars:
    iface_var: "{{ vars[ 'ansible_' + iface['dev'] ] | default }}"
    ip_var: "{{ iface_var['ipv4'][0]['address']
                | default(iface_var['ipv4']['address'])
                | default('') }}"

oO.o · April 1, 2022, 3:37am

One thing that can be frustrating in Ansible is that variables cannot be unset. Once you declare a variable, it exists forever. You can set it to an empty string, empty list or whatever, but it’s never completely gone.

However, there are two good ways to scope variables… well, it’s really one way applied in two different ways.

I think most people using Ansible are aware that you can add a vars: section to any task to declare some variables that will only be used within that task. But you can also do this on blocks and include_tasks which will make those variables available to a series of sub-tasks and afterwards that variable will be undefined. It is the closest you can get to a “local” variable.
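A minimal sketch of that scoping. scoped_var is a hypothetical name and is assumed not to be defined anywhere else (inventory, facts, extra vars).

```yaml
- name: Tasks that share a scoped variable
  vars:
    scoped_var: foo   # only visible to the tasks inside this block
  block:

  - name: The variable is available inside the block
    ansible.builtin.debug:
      var: scoped_var

- name: The variable is undefined again outside the block
  ansible.builtin.assert:
    that: scoped_var is not defined
```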

Additionally, while most examples I’ve come across online structure blocks like this:

- name: The block
  block:
    - task1
    - task2
    - task3
  when: block conditional
  vars:
    block_var: foo

I have found it much more readable to contain all of the block information at the beginning of the block, so:

- name: The block
  when: block conditional
  vars:
    block_var: foo
  block:
    - task1
    - task2
    - task3

This prevents me from having to scroll down scanning for an indentation change to find the conditionals and/or variables for the block.