Ansible Adventures

Related to this post:

I have Ansible generating standardized IAID/DUIDs for DHCP across several platforms.

# RFC 4361 advocates for IAID/DUID use in DHCPv4. No specific method for
# generating the IAID is specified, other than it be unique, persistent
# and 32 bits. We use a truncated SHA256 hash of the hardware MAC address,
# or if the interface is virtual and doesn't have a hardware address,
# we hash the interface name (dev).
- name: "The IAID for {{iface['dev']}} is {{iaid_var}}"
  ansible.builtin.set_fact:
    iface: "{{  iface
                | combine( { 'iaid': iaid_var } ) }}"
  vars:
    iaid_var: "{{ ( iface['hw_addr']
                    | default(iface['dev'])
                    | hash('sha256') )[:8]
                    | regex_findall('..')
                    | join(':') }}"

- name: "The DUID for {{iface['dev']}} is {{duid_var}}"
  ansible.builtin.set_fact:
    iface: "{{  iface
                | combine( { 'duid': duid_var } ) }}"
  vars:
    duid_var: "{{ ( '0004'
                    + ( ( ansible_system_vendor
                          + ansible_product_name
                          + ansible_product_uuid
                          + ansible_product_version
                          + ansible_product_serial )
                        | hash('sha256') )[:32] )
                  | regex_findall('..')
                  | join(':') }}"
TASK [ : The IAID for eth0 is 52:e6:d6:1e] *****************************************************************
ok: []

TASK [ : The DUID for eth0 is 00:04:d4:d6:5f:0d:ae:79:ea:72:b7:97:91:84:2d:d1:b4:c8] ***********************
ok: []

It turned out (imo) that generating my own IAID and DUID and supplying them to NetworkManager or networkd was easier than trying to get the stock values out of Linux.*

At least it’s all standardized now.

* I have not yet fully implemented this, so knock on wood…


macOS is to Ansible what Internet Explorer was to CSS.

# Ansible fails to collect certain hardware information on macOS
- name: Get system hardware information (macOS)
  ansible.builtin.command:
    cmd: system_profiler -json SPHardwareDataType
  changed_when: false
  register: sys_prof_hw_reg

- name: Define missing Ansible hardware facts (macOS)
  ansible.builtin.set_fact:
    ansible_system_vendor: 'Apple Inc.'
    ansible_product_version: "{{  ansible_product_name
                                  | regex_replace('^[A-Za-z]*', '') }}"
    ansible_product_uuid: "{{ hw_var['platform_UUID'] }}"
    ansible_product_serial: "{{ hw_var['serial_number'] }}"
  vars:
    hw_var: "{{ ( sys_prof_hw_reg['stdout']
                  | from_json )['SPHardwareDataType'][0] }}"
The error was: error while evaluating conditional (carp_var is undefined)

…what? How does that error?

If you are building a collection of handlers across multiple roles, be aware that all handlers are loaded per play. So if you, like me, have a catch-all listen: save host vars topic and add lineinfile handlers to it from multiple roles, you may also need a when: var is defined on each handler, because flushing handlers will even call handlers from roles that have not yet executed.
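
A minimal sketch of that guard, assuming a handler that lives in one role and writes carp_var to a host vars file. The file path and the handler name are hypothetical; only carp_var and the listen topic come from the post.

```yaml
# roles/<role>/handlers/main.yml (hypothetical location)
- name: Save carp_var to host vars
  ansible.builtin.lineinfile:
    # Hypothetical target file next to the inventory.
    path: "{{ inventory_dir }}/host_vars/{{ inventory_hostname }}.yml"
    regexp: '^carp_var:'
    line: "carp_var: {{ carp_var }}"
    create: true
  delegate_to: localhost
  listen: save host vars
  # Every handler on this topic runs on a flush, even if its role has not
  # executed yet, so skip when the variable does not exist.
  when: carp_var is defined
```

The condition is checked before the module arguments are templated, so the undefined variable in line: should not be reached when the handler is skipped.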


Decided to be clever with this one:

# The nmcli module seems insistent that certain parameters are set in
# the presence of others, so in order to gradually build the
# configuration, we have to build the command and then execute it. This
# is not recommended by Ansible because set_fact can be overridden in
# variable precedence. We can protect against that with the assert task
# below.

- name: >-
    Test that we have control of the nm_task variable by assigning it a
    benign value
  ansible.builtin.set_fact:
    nm_task: {}

- name: Confirm nm_task has that value
  ansible.builtin.assert:
    that: nm_task == {}
    quiet: true
    fail_msg: >
      The variable nm_task has been overridden (potentially maliciously).

- name: >-
    Define the basic community.general.nmcli task for {{ nm_name_pretty }}
  ansible.builtin.set_fact:
    nm_task:
      conn_name: "{{ nm_con_name }}"
      type: "{{ nm_iface_type }}"
      method6: disabled
      autoconnect: true
      state: "{{ present }}"

- name: Confirm nm_task has changed
  ansible.builtin.assert:
    that: nm_task != {}
    quiet: true
    fail_msg: >
      The variable nm_task has been overridden (potentially maliciously).

For reference:

An attack on a set_fact declaration seems unrealistic to me and even if it was, it would be a much larger threat than is described in the link above, but whatever. Just doing my due diligence…


I thought you might be interested in reading this.


Currently, I’m targeting bare metal deployment across several OSes, including OpenBSD, so Vagrant is the better option for me, but when I start working on services further down the line, I might switch to Docker.

My thought process was that it would be easier to use Docker to test your mock deploys, because then you can use GitLab or GitHub pipelines to increase your testing.


[Screenshot: Screen Shot 2022-04-08 at 18.34.52]



Isn’t the set_fact module setting it after you tried to debug it?

The IP in the task name doesn’t match the IP in the set_fact. It’s defined below in the vars section (but it’s kind of long so I didn’t include it). There’s no pre-existing value, so I don’t understand.

Oh wait, I see it! lol

  ip_var: "{{ available_ips_var | random }}"

So I reasonably assumed that this meant that it would pick a random IP and save it in the ip_var variable for re-use in both name and set_fact. BUT, it recalculates it each time it’s referenced! Why?!

Simple example:

  - debug:
      msg: "{{ rand }}-{{ rand }}-{{ rand2 }}-{{ rand2 }}"
    vars:
      rand: "{{ [0,1,2,3,4,5,6,7,8] | random }}"
      rand2: "{{ rand }}"

ok: [localhost] => {
    "msg": "6-6-5-5"
}

So it will re-use the value within an action but if it’s referenced in multiple places within a task, it’s completely re-evaluated each time.
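
One way to pin the value, as a rough sketch: set_fact templates its arguments once when the task runs and stores the result, so later references read the stored value instead of re-running the random filter. The name rand_pinned is hypothetical.

```yaml
- name: Pick one random value and store the result
  ansible.builtin.set_fact:
    rand_pinned: "{{ [0,1,2,3,4,5,6,7,8] | random }}"

- name: Every reference now sees the same stored value
  ansible.builtin.debug:
    msg: "{{ rand_pinned }}-{{ rand_pinned }}"
```

The catch is that the value only becomes stable in tasks that run after the set_fact, which is why a task name that references the lazy variable directly can still show a different value.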

Here’s the entire task from above if anyone is curious. It’s how I assign IP addresses at random.

    # We use the integer representation of the IP address so that we can
    # use range to enumerate all usable host addresses which is then
    # differenced against the used IP list and finally a random IP is
    # chosen. However, if an address pool is present, an IPv4 address is
    # chosen at random from that list instead.
    - name: "An IPv4 address has been selected at random"
      ansible.builtin.set_fact:
        ip4_candidate: "{{ ip_var }}"
      vars:
        first_ip_int_var: "{{ iface['subnet_addr']
                              | ansible.netcommon.ipv4('next_usable')
                              | ansible.netcommon.ipv4('int') }}"
        last_ip_int_var: "{{  iface['subnet_addr']
                              | ansible.netcommon.ipv4('last_usable')
                              | ansible.netcommon.ipv4('int') }}"
        available_ips_var: "{{  iface['addr_pool']
                                | default(  range(  first_ip_int_var | int,
                                                    last_ip_int_var | int )
                                            | list )
                                | ansible.netcommon.ipv4
                                | difference( used_ip4s | default([]) ) }}"
        ip_var: "{{ available_ips_var | random }}"

Are you checking DNS or pinging it to see if it’s in use before using it?

  1. Used IPs are pulled from Ansible facts and inventory and excluded from the available IP list.

  2. A random IP is selected.

  3. Both the remote host and local host try to ping it.

  4. If either is successful, the task file adds the selected IP to the used list and calls the random assignment recursively until it finds a good IP or runs out and fails.

# groupvars inventory file
# site_ips:
#   subnet/vlan:
#     dhcp-client-id:
#       ip4:
#       ip6: #future use

- run_once: true
  block:

    - name: Get IPv4 addresses from all hosts
      # Module and filter key are assumed; those lines are missing from the post.
      ansible.builtin.setup:
        filter:
          - ansible_all_ipv4_addresses
      delegate_to: "{{ play_host_item }}"
      delegate_facts: true
      loop: "{{ ansible_play_hosts }}"
      loop_control:
        loop_var: play_host_item

    - name: Define a list of used IPv4 addresses
      ansible.builtin.set_fact:
        used_ip4s: "{{ ansible_used_ip4s_var | union(site_used_ip4s_var) }}"
      vars:
        ansible_used_ip4s_var: >-
          {{  ansible_play_hosts
              | map('extract', hostvars, 'ansible_all_ipv4_addresses')
              | select('defined')
              | flatten
              | unique }}
        site_used_ip4s_var: >-
          {{  ( site_ips | default({}) ).values()
              | default({})
              | map('dict2items')
              | flatten
              | selectattr('value.ip4', 'defined')
              | map(attribute='value.ip4') }}
    - name: Address collision tests (will re-attempt on failure)
      block:

        # NOTE: ping on some platforms formats this as 100%, others as 100.0%
        - name: "Ping {{ ip4_candidate }} from the host (collision test)"
          ansible.builtin.command:
            cmd: "ping -c 2 {{ ip4_candidate }}"
          register: host_ping_reg
          changed_when: false
          failed_when: not  host_ping_reg['stdout']
                            | regex_search('100.*% packet loss')

        - name: "Ping {{ ip4_candidate }} from localhost (collision test)"
          ansible.builtin.command:
            cmd: "ping -c 2 {{ ip4_candidate }}"
          # Assumed: the task name implies this runs on the control node.
          delegate_to: localhost
          register: localhost_ping_reg
          changed_when: false
          failed_when: not  localhost_ping_reg['stdout']
                            | regex_search('100.*% packet loss')

      # If a collision is detected, recursively call this tasks file
      # until all available IPs are exhausted.
      rescue:

        - name: >-
            Collision was detected, adding {{ ip4_candidate }} to used IPv4
          ansible.builtin.set_fact:
            used_ip4s: "{{ used_ip4s | union( [ip4_candidate] ) }}"

        - name: Increment recursion counter
          ansible.builtin.set_fact:
            def_iface_ip4_rec_count: "{{  def_iface_ip4_rec_count
                                          | default(1)
                                          | int
                                          + 1 }}"

        - name: >-
            Begin attempt {{ def_iface_ip4_rec_count }} to assign an IPv4
            address to {{ iface['dev'] }}
          ansible.builtin.include_tasks: def_iface_ip4.yml
Oh and IP assignment through configuration is run serially to avoid collisions.


Ansible is randomly detecting this VLAN on my RHEL VMs when it doesn’t exist (Fedora below but also happening on Rocky):

I seem to remember something about facts caching? That is a VLAN in my config, it just isn’t on that host at the moment. I am constantly popping snapshots on these though so if it’s getting cached, that could be why…

Adding ansible.builtin.meta: clear_facts at the beginning of the play to see if that helps. Hard to test because it’s intermittent.
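
A sketch of what that could look like at the top of a play, assuming facts are then gathered explicitly so nothing stale is reused. The hosts: all target is a placeholder, not the actual play.

```yaml
- hosts: all                 # placeholder target
  gather_facts: false        # gather explicitly after clearing
  pre_tasks:
    - name: Drop any gathered or cached facts for these hosts
      ansible.builtin.meta: clear_facts

    - name: Gather fresh facts
      ansible.builtin.setup:
```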


Finally decided to take a crack at using Ansible with Mikrotik’s RouterOS. I had a feeling it would be a pain, and yes it was.

First thing, you’ll need to make an account at Mikrotik’s site. Once that’s done, log in and select Make a demo key from the left menu.

Next, go grab whichever vagrant box you want or I guess download the CHR image from Mikrotik and spin up your VM. SSH into it and it should let you know that you have 24 hours to enter a key. Hit enter to get a prompt. Then, literally paste the whole multiline key into the command prompt. Like magic, it will accept this and prompt you to reboot. Do so and then snapshot the VM for your convenience.

Note that none of this is necessary on actual Mikrotik hardware which comes pre-licensed.

Anyway, so then you might try to add the VM to your inventory and run a basic command with community.routeros.command which will of course fail. You will want to add the following variables to the RouterOS host, or if you like, to a group:

ansible_connection: ansible.netcommon.network_cli

At this point, it might work, but it probably won’t. First thing, Mikrotik uses some sketchy console detection/colors that I already knew about because it would always crash minicom on login. The solution to this is to append +cet1024w to the username, so in my case:

ansible_user: vagrant+cet1024w

In addition to the color issue, the 1024w (console width) circumvents a problem if the username and/or hostname are too long.

At this point, I could get facts out of the RouterOS VM, but community.routeros.command would crash, with "msg": "encountered RSA key, expected OPENSSH key".

Long story short, by default ansible.netcommon.network_cli uses paramiko to connect via ssh. I honestly don’t know what paramiko is but it just does not work for me. Luckily, we can switch to libssh:

ansible_network_cli_ssh_type: libssh

But wait! We need ansible-pylibssh. On Linux, or in any properly managed Python environment, that should merely be a pip install away, but unfortunately on macOS, this is not the case. For some reason, while using the correct version of pip3 from the Ansible dependency installed by Homebrew, import reverts to macOS’s stock version of Python3 when attempting to import toml.

Quick solve for this is /usr/bin/python3 -m pip install toml. No idea if these version mismatches will byte me later, but at this point I am finally able to issue commands to the RouterOS VM. Huzzah.
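
Pulling the pieces together, a sketch of the variables as they might sit in a group vars file. The file name and group are hypothetical, and ansible_network_os is not shown in the post; the value below is the one the community.routeros collection documents for network_cli, included here as an assumption.

```yaml
# group_vars/routeros.yml (hypothetical group name)
ansible_connection: ansible.netcommon.network_cli
ansible_network_os: community.routeros.routeros   # assumed, not in the post
ansible_network_cli_ssh_type: libssh              # needs ansible-pylibssh
ansible_user: vagrant+cet1024w                    # +cet1024w: no colors, 1024-column console
```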




I am using this formulation to replicate check_mode functionality in the RouterOS command module. I wish there was a “do nothing” module like meta: noop that I could set changed_when: true on instead of doing this weird thing with assert. Thoughts?

- name: "Configure VLAN filtering on {{ bridge_dev }}"
  block:

    - name: "VLANs are enabled on {{ bridge_dev }}"
      community.routeros.command:
        commands:
          - "put [/interface bridge get {{ bridge_dev }} vlan-filtering]"
      register: ros_bridge_vlans_reg
      changed_when: false
      failed_when: ros_bridge_vlans_reg['stdout_lines'][0][0] == false

  rescue:

    - name: "Enable VLANs on {{ bridge_dev }}"
      community.routeros.command:
        commands:
          - "/interface bridge set {{ bridge_dev }} vlan-filtering=yes"
      when: not ansible_check_mode

    # Use tautology (that true) to indicate change in check mode.
    - name: "Change on previous task (Check Mode)"
      ansible.builtin.assert:
        that: true
        quiet: true
      changed_when: true
      when: ansible_check_mode

If it’s not clear, the issue here is that in check mode, a task that would result in a change is marked as changed but doesn’t actually affect the host. To replicate that, we first check whether a change would occur. Then, if we are not in check mode, we perform the change; if we are in check mode, we indicate a change without actually doing anything.
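
On the “do nothing” question, one possible stand-in is ansible.builtin.debug, which only prints a message on the controller and, as far as I know, still honors changed_when. A sketch of the check-mode task written that way, not tested against RouterOS:

```yaml
# Possible replacement for the assert-based task above.
- name: "Change on previous task (Check Mode)"
  ansible.builtin.debug:
    msg: "vlan-filtering would be enabled on {{ bridge_dev }}"
  changed_when: true
  when: ansible_check_mode
```

It reads a little more honestly than a tautological assert, and the message doubles as a note about what would have changed.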

So on systemd systems, this comes out of /etc/machine-id and can be created with the systemd-machine-id-setup command. But much like you’ve done, I also just chucked a hash of my own creation in that file. Remember you’ve also got uuidgen on Linux.


Remember the -t flag in nmcli for making the output script parseable.
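
For example, a read-only task along these lines would return one colon-separated line per connection. The register name is hypothetical, and note that nmcli escapes literal colons inside values in terse mode, so a naive split on ':' is not always safe.

```yaml
- name: List NetworkManager connections in terse form
  ansible.builtin.command:
    cmd: nmcli -t -f NAME,DEVICE connection show
  register: nm_con_reg        # hypothetical variable name
  changed_when: false

- name: Show the raw terse lines
  ansible.builtin.debug:
    var: nm_con_reg['stdout_lines']
```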


That stuff looks way more complicated than just doing it via a normal shell script. Programming in ansible/yaml? Interesting

If I had an apprentice and he/she came up with that stuff, I would make him/her write it in C as punishment :slight_smile:

I think I’m getting old :wink:
