Ansible Adventures

oO.o · November 5, 2022, 3:25am

I have used import_playbook a couple of times while trying some stuff out, but I’m not really familiar with it offhand. BUT, inherently, each playbook has its own list(s) of hosts, and it should only run on those hosts. The play hosts in the playbook that calls import_playbook should have no effect on which hosts the imported playbook runs on (or how many times it’s executed). So I think the behavior is already what you want when you call it with run_once.

oO.o · November 19, 2022, 8:19pm

My Network Variables

In my Ansible inventory, I define host networking across 4 different variables (net, subnets, hosts and ifaces):

The net variable defines the overall network.

The subnets variable defines each subnet in the network.

The hosts variable defines interfaces per host.

Each network has a corresponding group. net, subnets and hosts are defined in the network’s group_vars file.

Additionally, interfaces may be defined by the ifaces variable in host_vars files.

In general, interface definitions inherit default values from parent entities via combine. So the effective definition of an interface is:

[ net, subnet, host, interface ] | combine(recursive=true)
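For anyone following along, the inheritance above can be approximated outside Ansible in a few lines of Python. This is only a sketch of how a recursive combine behaves with dictionaries (nested dicts are merged, everything else, lists included, is replaced by the later value). The helper names combine_recursive and effective_iface are made up for illustration, and the empty host layer for gw1 is an assumption; the sample values are taken from the Example Variables further down.

```python
from functools import reduce


def combine_recursive(base, override):
    """Merge two dicts: nested dicts are merged, any other value is replaced."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = combine_recursive(merged[key], value)
        else:
            merged[key] = value
    return merged


def effective_iface(net, subnet, host, iface):
    """Apply the layers in order, so later layers win over earlier ones."""
    return reduce(combine_recursive, [net, subnet, host, iface], {})


# Sample layers for em0 on gw1 (the host layer is assumed to be empty here).
net = {"etld": "example.com", "net_id": "157", "net_name": "hq", "dhcp4": True}
subnet = {
    "dhcp4": False,
    "ip4": "192.0.0.2",
    "prefix4": 24,
    "routes": {"0.0.0.0/0": "192.0.0.1"},
}
host = {}
iface = {"iface_name": "em0", "subnet": "wan", "link_local": True}

effective = effective_iface(net, subnet, host, iface)
```

Here effective keeps net_name from net, but dhcp4 ends up false because the wan subnet overrides the network-wide default.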

The Problem

There are some complexities to this. For instance, when configuring a physical interface with VLANs, the MTU should not be inherited as above. Instead, it should be the highest value of all configured VLANs on that interface. This requires building each VLAN definition before determining the max MTU value and applying that to the interface.
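To make the MTU rule concrete, here is a rough Python sketch of the "highest MTU wins" step. It assumes each VLAN definition has already been built and carries an optional mtu key; that key name, the vlan names, the physical_mtu helper and the 1500 default are assumptions for illustration, not part of my actual variables.

```python
def physical_mtu(vlan_ifaces, default_mtu=1500):
    """Return the MTU a physical interface needs to carry all of its VLANs.

    vlan_ifaces is a list of already-built VLAN definitions; a VLAN without
    an explicit "mtu" key (hypothetical key name) is treated as default_mtu.
    """
    mtus = [vlan.get("mtu", default_mtu) for vlan in vlan_ifaces]
    return max(mtus, default=default_mtu)


# Three VLANs on one physical interface; one of them uses jumbo frames.
vlans = [
    {"iface_name": "vlan10", "mtu": 1500},
    {"iface_name": "vlan20", "mtu": 9000},
    {"iface_name": "vlan30"},
]

parent_mtu = physical_mtu(vlans)
```

The point is only the ordering: every VLAN definition has to be fully built before the physical interface's MTU can be computed.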

Interface definitions include enabled services, so if I have a host time-srv-1 that is serving NTP on eth1, I need to build time-srv-1's interface definitions to determine what IP to use for local NTP configuration.

Currently, I have an interfaces role that builds these definitions and it works fine. BUT this means that in the NTP example, all local NTP servers need to be included in the play and run the interfaces role whenever NTP is configured on any host. Of course, this is not ideal.

Solution?

I think this means I should write a vars plugin that builds the interface definitions for all inventory hosts at runtime and puts those definitions in hostvars. This way I can query complete interface definitions of inventory hosts that are not necessarily targeted by the current playbook.

I may need to invoke some expert advice here, because the documentation on Ansible plugin development is a little sparse… @geerlingguy have you written any vars plugins?

It would be great to look at something well-written (and commented) other than the native host_group_vars.py. I get the gist and will start experimenting, but some more examples would be helpful. I’m also unsure of how to affect the order in which vars plugins run. In my case, I’d need mine to always run after the host_group_vars plugin. Does that mean I should extend it?

Alternatively, is using a facts cache viable here? Is that a better solution?

Sidenote

The hosts variable may seem unnecessary since usually all host-specific variables should be defined in host_vars, but it allows me to:

Define CARP/VRRP interfaces once instead of duplicated across member hosts.
Define PXE-enabled interfaces for hosts that are waiting for netinstall.

Example Variables

---
# inventory/hq.example.com.yml

all:
  children:
    hq_example_com:
      hosts:
        gw1.hq.example.com:
        gw2.hq.example.com:
        #www1.prod.hq.example.com:

---
# inventory/group_vars/hq_example_com.yml

net:
  etld: example.com
  net_id: '157'
  net_name: hq
  dhcp4: true

subnets:
  wan:
    dhcp4: false
    ip4: 192.0.0.2
    prefix4: 24
    routes:
      0.0.0.0/0: 192.0.0.1
  prod:
    subnet_id: '123'

hosts:
  # gw is the short hostname associated with CARP interfaces running on gw1 and gw2
  gw:
    wan:
      member_hosts:
        - gw1
        - gw2
    prod:
      member_hosts:
        - gw1
        - gw2
  # www1.prod.hq.example.com is waiting for pxe boot
  www1:
    prod:
      hw_addr: 0a:74:65:62:5b:37
      os:
        name: debian
        vers: 11
        arch: amd64
      srv:
        www: true

---
# inventory/host_vars/gw1.hq.example.com.yml

ifaces:
  - iface_name: em0
    subnet: wan
    link_local: true
  - iface_name: em1
    subnet: prod

oO.o · November 20, 2022, 12:31am

This will not work because it appears that host variables/facts are not available to plugins as a rule. But I believe they are available to modules, so maybe the answer is a module?

After reviewing the debug and set_fact action plugins, I believe I can accomplish what I need with a custom iface_facts module and corresponding action plugin that will build out the interface definitions for a target host. In cases where I need to query interfaces on inventory hosts that are not included in the current playbook, I will need to use delegate_to to build the interface definitions for those hosts.

Based on the debug action plugin, this is how I can access host variables in an action plugin:

self._templar.template('var_name', convert_bare=True, fail_on_undefined=True)

I’m not entirely sure what is going on here. I think some or most of this has to do with jinja2. Of course, fail_on_undefined is self-explanatory…

There also appears to be some magic going on with results['ansible_facts'] as any dictionary values set there are combined with host variables automagically. Is this what is being referred to in Ansible’s set_fact module where it says:

attributes:
    action:
        details: While the action plugin does do some of the work it relies on the core engine to actually create the variables, that part cannot be overriden
        support: partial

After poring over various parts of the core Ansible plugins and modules today, I feel that I have a much better idea of what’s going on under the hood and how to begin writing my own plugins and modules. Humble beginnings for sure, but I feel something that was completely opaque to me is now at least somewhat malleable.

oO.o · November 22, 2022, 5:46pm

After sifting through the Ansible source for a couple of hours, I am resorting to Stack Overflow. I’ll probably push a couple of questions up there this week and link them here.

cotton · November 22, 2022, 6:36pm

I just want to make sure I understand the problem correctly.

The play is to configure network interfaces on some set of network node(s).
There are certain components of this task which require knowing information about a superset of network nodes, which are not included in the set that is being configured.
- Your example was configuring a few systems in a play, but needing to know the greatest MTU across the superset of machines on that VLAN.

You also said you’re having trouble where you have to include all the NTP servers in the play when you want to configure NTP on a network node?

Does this mean they have to be included in the inventory of the playbook, and you run the “interfaces” role against the target systems and the NTP servers?

I need to clarify that because I might be able to help you out here.

cotton

oO.o · November 22, 2022, 6:58pm

I have a role that does that and part of that role consolidates service and network definitions that are spread out between host and group variables. For instance, if DHCP will generally be used on a subnet, then it is defined as part of the subnet in a group_vars file. An interface will inherit DHCP enablement based on subnet assignment. This prevents having to set DHCP for each interface on each host.

Kind of, but limited to a single physical interface. If I’m running 4 VLANs on an interface, the MTU for the physical device needs to be the highest among the 4 VLANs or it becomes a choke point. The way this is handled varies between operating systems, but in general I think this is best practice. In any case, it’s just an example of how getting a fully functional definition of an interface is somewhat complicated.

Currently, the only way to differentiate which hosts serve NTP is by looping through their interface definitions and seeing if item['srv']['ntp'] == true. Then I know item['ip4'] is the address I should use for NTP. But because the interface definitions need to be built up, all possible NTP server hosts have to run the interfaces role before the complete interface definitions are available. This is why I want to move what the interfaces role does into a plugin or module.
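Roughly, the lookup described above looks like this when written out in Python. It is only a sketch and it assumes the interface definitions have already been built for every candidate host, which is exactly the part that currently requires running the interfaces role. The ntp_addresses helper, the built_ifaces structure and the 192.0.2.x addresses are made up for illustration; srv, ntp and ip4 are the keys mentioned above.

```python
def ntp_addresses(built_ifaces_by_host):
    """Collect the IPv4 address of every interface that serves NTP.

    built_ifaces_by_host maps a hostname to its list of fully built
    interface definitions (a hypothetical structure for this sketch).
    """
    addresses = []
    for hostname, ifaces in built_ifaces_by_host.items():
        for iface in ifaces:
            serves_ntp = iface.get("srv", {}).get("ntp") is True
            if serves_ntp and "ip4" in iface:
                addresses.append(iface["ip4"])
    return addresses


# Example: time-srv-1 serves NTP on eth1 only; gw1 serves no NTP at all.
built_ifaces = {
    "time-srv-1": [
        {"iface_name": "eth0", "ip4": "192.0.2.9"},
        {"iface_name": "eth1", "ip4": "192.0.2.10", "srv": {"ntp": True}},
    ],
    "gw1": [
        {"iface_name": "em1", "ip4": "192.0.2.1", "srv": {"www": True}},
    ],
}

servers = ntp_addresses(built_ifaces)
```

The loop itself is trivial; the cost is that built_ifaces has to exist for every possible NTP server before any client can be configured.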

I believe delegation works on include_role, so I don’t think they need to be in the play explicitly. In fact, they don’t even need to be reachable since all of this is simply manipulating facts coming from host_vars and group_vars files.

cotton · November 22, 2022, 9:23pm

Can we assume that there are NTP DNS SRV records for the network?

In other words, can you offload figuring out the IP address of the NTP servers to the local DNS servers rather than your definitions?

oO.o · November 23, 2022, 12:57am

This is an interesting solution that did occur to me. I may implement this when I get to DNS. This would essentially mean that all servers would need to run the interfaces role when local DNS is configured, but not in general.

I believe my assumptions about what is possible with plugins and modules may have been incorrect and there may not be an elegant way to implement this there. I am not seeing a good way to use a plugin to process facts provided by the inventory into more useful structures. In general, I have been trying to make the inventory, host_vars and group_vars files very user-friendly to edit, but I need to strike a balance between that and keeping the complexity of my roles manageable.

My fallback approach is to consolidate service configuration into the hosts dictionary, which is supplied by group_vars. This is slightly less than ideal because it means there could be redundant configuration between host_vars and group_vars, but it’s not the end of the world.