DO374 - ch05s05

Managing Inventory Variables

Objectives

Structure host and group variables by using multiple files per host or group, and use special variables to override the host, port, or remote user that Ansible uses for a specific host.

The Basic Principles of Variables

You can use variables to write flexible and reusable tasks, roles, and playbooks. You can also use them to specify differences in configuration between different systems. You can set variables in many places, including:

In the defaults/main.yml and vars/main.yml files for a role.
In the inventory file, either as a host variable or a group variable.
In a variable file in the group_vars/ or host_vars/ subdirectories of the playbook or inventory.
In a play, role, or task.

As you define and manage variables in a project, plan to follow these principles:

Keep it simple

Even though you can define Ansible variables in many ways, try to define them by using only a couple of different methods and in only a few places.

Do not repeat yourself

If a set of systems has a common configuration, then organize them into a group and set inventory variables for them in a file in the group_vars/ directory. This way, you do not have to define the same settings on a host-by-host basis, and when you do have to modify a variable for that group of systems you can do it by updating the variable file only once.

Organize variables in small, readable files

If you have a large project with many host groups and variables, then split the variable definitions into multiple files. To make it easier to find particular variables, group related variables into a file and give the file a meaningful name.

Remember that you can use subdirectories instead of files in the host_vars/ and group_vars/ directories. For example, you could have a group_vars/webserver/ directory for the webserver group, and that directory could contain a file called firewall.yml, which contains only variables related to firewall configuration. That directory could also contain other group variable files for configuring other components of the servers in the group.
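
As a minimal sketch, the group_vars/webserver/firewall.yml file described above might contain only firewall-related settings. The variable names and values here are hypothetical and only illustrate the layout:

```yaml
# group_vars/webserver/firewall.yml
# Hypothetical firewall settings for hosts in the webserver group.
firewall_zone: public
firewall_services:
  - http
  - https
```

Other files in the same group_vars/webserver/ directory would hold variables for other components, and Ansible applies all of them to the webserver group.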

Variable Merging and Precedence

When you define the same variable in multiple ways, Ansible uses precedence rules to choose a value for that variable. After processing all the variable definitions, Ansible generates a set of merged variables for each host at the start of each task.

When Ansible merges variables, if there are two definitions for the same variable in different places, then it uses the value that has the highest precedence.

Precedence order is covered in the Variable Precedence: Where Should I Put a Variable? documentation at https://docs.ansible.com/ansible/6/user_guide/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable. The following discussion breaks this down into more detail, from the lowest to the highest precedence.

Determining Command-line Option Precedence

Options that you pass to ansible-navigator run on the command line, other than --extra-vars (or -e) for extra variables, have the lowest precedence. For example, you can set the remote user with the --user (or -u) option to override the configuration file, but setting ansible_user at a higher precedence overrides it.

Determining Role Default Precedence

Variables set by a role in the rolename/defaults/main.yml file have a low precedence so that they are easy to override. These variables provide some reasonable default values and users often override them to configure the role.
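
The following sketch shows what such a defaults file might look like. The role name (webapp) and the variable names are assumptions made for illustration only:

```yaml
# roles/webapp/defaults/main.yml (hypothetical role)
# Low-precedence defaults that users are expected to override.
webapp_port: 8080
webapp_user: webapp
```

Because these values have a low precedence, defining webapp_port in a group variable file for a host group would replace the default for the hosts in that group.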

Determining Host and Group Variable Precedence

You can set host-specific and group-specific variables in a number of places. You can set them relative to the location of your inventory or your playbook, by collecting host facts, or by reading facts from a cache.

The precedence order for these variables is as follows, from lowest to highest:

Group variables (in ascending order of precedence):
- Set directly in an inventory file or by a dynamic inventory script
- Set for the all group in the inventory group_vars/all file or subdirectory
- Set for the all group in the playbook group_vars/all file or subdirectory
- Set for other groups in the inventory group_vars/ subdirectory
- Set for other groups in the playbook group_vars/ subdirectory
Host variables (in ascending order of precedence):
- Set directly in an inventory file or by a dynamic inventory script
- Set in the inventory host_vars/ subdirectory
- Set in the playbook host_vars/ subdirectory
Host facts and cached facts

The biggest source of confusion here is the distinction between the group_vars/ and host_vars/ subdirectories relative to the inventory and those relative to the playbook. This difference does not exist if you use a static inventory file in the same directory where you run the playbook.

If you have group_vars/ and host_vars/ subdirectories in the same directory as your playbook, then Ansible automatically includes those group and host variables.

When you use a flat inventory file in a directory other than the one that contains the playbook, Ansible also automatically includes the group_vars/ and host_vars/ directories in the inventory file directory. However, if there is a conflict, the variables that Ansible includes from the playbook directory override them. For example, if you use /etc/ansible/hosts as your inventory, then Ansible uses the /etc/ansible/group_vars/ directory as another group variable directory.

When you use an inventory directory that contains multiple inventory files, then Ansible includes the group_vars/ and host_vars/ subdirectories of your inventory directory.

Consider the following tree structure:

.
├── ansible.cfg
├── group_vars/
│   └── all
├── inventory/
│   ├── group_vars/
│   │   └── all
│   ├── phoenix-dc
│   └── singapore-dc
└── playbook.yml

The playbook.yml file is the playbook. The ansible.cfg file configures the inventory directory as the inventory source, and phoenix-dc and singapore-dc are two static inventory files.

The inventory/group_vars/all file is inside the inventory directory and defines variables for all the hosts.

The group_vars/all file, in the same directory as the playbook, also loads variables for all the hosts. In the case of a conflict, these settings override inventory/group_vars/all.

Note

The value of this distinction is that you can set default values for variables that are bundled with the inventory files in a location that is shared by all your playbooks. You can still override the settings for those inventory variables in individual playbook directories.
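
To make the override concrete, the two all files in the preceding tree could define the same variable, as in the following sketch. The ntp_server variable and its values are hypothetical. The two files are shown as two YAML documents separated by ---:

```yaml
# inventory/group_vars/all
# Shared default that is bundled with the inventory files.
ntp_server: ntp.example.com
---
# group_vars/all (in the same directory as playbook.yml)
# Project-specific setting; on conflict, this value takes precedence.
ntp_server: ntp.lab.example.com
```

With these files in place, plays in playbook.yml would see the value from the playbook directory, while other projects that share the inventory directory would still get the inventory default.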

You can set host or group variables directly in inventory files (such as phoenix-dc in the preceding example), but it is generally not a good practice. It is easier to find all the variable settings that apply to a host or group if you group them in files that only contain settings for that host or group. If you have to also examine the inventory files, then that can be more time-consuming and error prone, especially for a large inventory.

Determining Play Variable Precedence

The next category consists of variables that you set in the playbook as part of a play, task, or role parameter, or that you include or import. These have higher precedence than host or group variables, role defaults, and command-line options other than --extra-vars (or -e).

The precedence order for these variables is as follows, from lowest to highest:

Set by the vars section of a play
Set by prompting the user with a vars_prompt section in a play
Set from a list of external files by the vars_files section of a play
Set by a role rolename/vars/main.yml file
Set for the current block with a vars section of that block
Set for the current task with a vars section of that task
Loaded dynamically with the include_vars module
Set for a specific host by using either the set_fact module or by using register to record the result of task execution on a host
Parameters set for a role in the playbook when loaded by the role section of a play or by using the include_role module
Set by a vars section on tasks included with the include_tasks module

Notice that the normal vars section in a play has the lowest precedence in this category. There are many ways to override those settings if necessary. Variables set here override host-specific and group-specific settings in general.

Using vars_prompt is not a recommended practice. For it to operate correctly, you must configure ansible-navigator to allow interaction with the ansible-navigator run command while it is running. The vars_prompt directive is also not compatible with automation controller.

The vars_files directive is useful for organizing large lists of variables that are not host or group specific. These variables are organized by function into separate files. It can also help you separate sensitive variables into a separate file that you can encrypt with Ansible Vault. You can keep this separate from variables that are not sensitive and that you do not need to encrypt.
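
A play that uses vars_files in this way might resemble the following sketch. The file names under vars/ and the web_packages variable are hypothetical:

```yaml
- name: Configure web servers
  hosts: web_servers
  vars_files:
    # Non-sensitive variables, organized by function.
    - vars/packages.yml
    # Sensitive variables, kept in their own file so that
    # the file can be encrypted with Ansible Vault.
    - vars/secrets.yml
  tasks:
    - name: Show a variable that vars/packages.yml is assumed to define
      ansible.builtin.debug:
        var: web_packages
```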

You can set variables that apply only to a specific block or task. These values override the play variables and inventory variables. You should use these sparingly, because they can make the playbook more complex.

- name: Task with a local variable definition
  vars:
    taskvariable: task
  ansible.builtin.debug:
    var: taskvariable

Notice that variables that you load by using include_vars have a high precedence, and can override variables set for roles and specific blocks and tasks. Often, you might want to use vars_files instead if you do not want to override those values with your external variable files.
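
The following sketch illustrates that ordering. It assumes a hypothetical vars/greeting.yml file that defines a greeting variable:

```yaml
- name: Demonstrate include_vars precedence
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Load variables from a hypothetical file that defines greeting
      ansible.builtin.include_vars:
        file: vars/greeting.yml

    - name: Display greeting while a task-level definition is also present
      vars:
        greeting: set on the task
      ansible.builtin.debug:
        var: greeting
```

According to the precedence order listed earlier, the second task is expected to display the value from vars/greeting.yml rather than the task-level value.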

The set_fact module and the register directive both set host-specific information, either a fact or the results of task execution on that host. Note that set_fact sets a high precedence value for that variable for the rest of the playbook run, but if you cache that fact then Ansible stores it in the fact cache at normal fact precedence (lower than play variables).
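
As an illustration, the set_fact module accepts a cacheable option that also stores the fact in the fact cache. The deployment_tier fact below is hypothetical, and the sketch assumes that a fact cache is configured for the project:

```yaml
- name: Record a host-specific fact
  hosts: web_servers
  tasks:
    - name: Set a fact for this run and also store it in the fact cache
      ansible.builtin.set_fact:
        deployment_tier: production   # hypothetical fact name and value
        cacheable: true
```

For the rest of this playbook run, deployment_tier has the high set_fact precedence. In a later run that reads it from the cache, it has only normal fact precedence.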

Determining the Precedence of Extra Variables

Extra variables that you set by using the --extra-vars (or -e) option of the ansible-navigator run command always have the highest precedence. This is useful so that you can override the global setting for a variable in your playbook from the command line without editing any of the Ansible project files.
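
The following sketch shows a play-level variable that you could override from the command line. The app_env variable and its values are hypothetical, and the command is shown in a comment:

```yaml
# playbook.yml
# Run with, for example:
#   ansible-navigator run --mode stdout playbook.yml -e "app_env=staging"
- name: Show the effect of extra variables
  hosts: localhost
  gather_facts: false
  vars:
    app_env: production   # hypothetical variable; overridden by -e
  tasks:
    - name: Display the value in effect
      ansible.builtin.debug:
        var: app_env
```

Without the -e option, the task displays the play-level value. With it, the extra variable takes precedence and no project file needs to change.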

Separating Variables from Inventory

The inventory sources define the hosts and host groups that Ansible uses, whether they are static files or a dynamic inventory script. If you are managing your inventory as static files, then you can define variables in the inventory files, in the same files in which you define your host and host group lists. However, this is not the best practice. As your environment grows in both size and variety, the inventory file becomes large and difficult to read.

In addition, you probably want to migrate from static inventory files to dynamic inventory sources to make your inventory easier to manage. Even then, you might still want to manage inventory variables statically, either separately from or in addition to the output of the dynamic inventory script.

A better approach is to move variable definitions from the inventory file into separate variable files, one for each host group. Each variable file is named after a host group, and contains variable definitions for the host group.

[user@host project]$ tree -F group_vars
group_vars/
├── db_servers.yml
├── lb_servers.yml
└── web_servers.yml

This structure makes it easier to locate configuration variables for any of the host groups: db_servers, lb_servers, or web_servers. Provided that each host group does not contain too many variable definitions, the above organizational structure is sufficient. As playbook complexity increases, however, even these files can become long and difficult to understand.

An even better approach for large, diverse environments is to create subdirectories for each host group under the group_vars/ directory. Ansible parses any YAML files in these subdirectories and associates the variables with a host group based on the parent directory.

[user@host project2]$ tree -F group_vars
group_vars/
├── db_servers/
│   ├── 3.yml
│   ├── a.yml
│   └── myvars.yml
├── lb_servers/
│   ├── 2.yml
│   ├── b.yml
│   └── something.yml
└── web_servers/
    └── nothing.yml

In the preceding example, variables in the myvars.yml file are associated with the db_servers host group, because the file is in the group_vars/db_servers/ subdirectory. The file names in this example, however, make it difficult to know where to find a particular variable.

If you use this organizational structure for variables, then group variables with a common theme into the same file, and use a file name that indicates that common theme. When a playbook uses roles, a common convention is to create variable files named after each role.

A project organized according to this convention might be as follows:

[user@host project3]$ tree -F group_vars
group_vars/
├── all/
│   └── common.yml
├── db_servers/
│   ├── mysql.yml
│   └── firewall.yml
├── lb_servers/
│   ├── firewall.yml
│   ├── haproxy.yml
│   └── ssl.yml
└── web_servers/
    ├── firewall.yml
    ├── webapp.yml
    └── apache.yml

With this organizational structure for a project, you can quickly see the types of variables that are defined for each host group.

Ansible merges all the variables present in files in directories under the group_vars/ directory with the rest of the variables. Separating variables into files grouped by functionality makes the whole playbook easier to understand and maintain.

Using Special Inventory Variables

A number of variables are available that you can use to change how Ansible connects to a host listed in the inventory. Some of these are most useful as host-specific variables, but others might be relevant to all hosts in a group or in the inventory.

ansible_connection: The connection plug-in to use to access the managed host. By default, Ansible uses the ssh plug-in for all hosts except localhost, for which Ansible uses the local plug-in.
ansible_host: The actual IP address or fully qualified domain name to use when connecting to the managed host, instead of using the name from the inventory file. By default, this variable has the same value as the inventory hostname.
ansible_port: The port that Ansible uses to connect to the managed host. For the ssh connection plug-in, the default value is 22.
ansible_user: The user that Ansible uses to connect to the managed host. The default Ansible behavior is to connect to the managed host by using the same username as the user running the Ansible Playbook on the control node.
ansible_become_user: After Ansible has connected to the managed host, it switches to this user using ansible_become_method, which is sudo by default. You might need to provide authentication credentials in some way.
ansible_python_interpreter: The path to the Python executable that Ansible should use on the managed host. For Ansible 2.8 and later, this defaults to auto_legacy (changed to auto in ansible-core 2.12), which automatically selects a Python interpreter on the managed host depending on the operating system that it is running. Consequently, you are less likely to need this setting than with earlier versions of Ansible.
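
As a sketch of how these variables might be set for a single host, the following host variable file overrides the address, port, and remote user. The file name, address, port, and user are all hypothetical:

```yaml
# host_vars/server100.example.com/connection.yml (hypothetical file name)
# Connection overrides that apply only to this host.
ansible_host: 192.0.2.10   # hypothetical management address
ansible_port: 2222         # hypothetical non-default SSH port
ansible_user: deployer     # hypothetical remote user
```

Settings that are relevant to every host in a group, such as a common ansible_user, could instead go in a file under the group_vars/ directory for that group.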

Configuring Human Readable Inventory Hostnames

When Ansible executes a task on the remote host, the output displays the inventory hostname. Because you can specify alternative connection properties by using the preceding special inventory variables, you can assign arbitrary names to inventory hosts. When you assign inventory hosts with meaningful names, you are better able to understand playbook output and diagnose playbook errors.

Consider a playbook that uses the following YAML inventory file:

web_servers:
  hosts:
    server100.example.com:
    server101.example.com:
    server102.example.com:
lb_servers:
  hosts:
    server103.example.com:

With the preceding inventory file, the Ansible output references these names. Consider the following hypothetical output from a playbook that uses this inventory file:

[user@host project]$ ansible-navigator run --mode stdout site.yml
...output omitted...
PLAY RECAP *******************************************************************
server100.example.com  : ok=4    changed=0    unreachable=0    failed=0  ...
server101.example.com  : ok=4    changed=0    unreachable=0    failed=0  ...
server102.example.com  : ok=4    changed=0    unreachable=0    failed=0  ...
server103.example.com  : ok=3    changed=0    unreachable=0    failed=1  ...

From the preceding output, you cannot easily tell that the failed host is a load balancer.

To improve that output, use an inventory file that contains descriptive names and define the necessary special inventory variables:

web_servers:
  hosts:
    webserver_1:
      ansible_host: server100.example.com
    webserver_2:
      ansible_host: server101.example.com
    webserver_3:
      ansible_host: server102.example.com
lb_servers:
  hosts:
    loadbalancer:
      ansible_host: server103.example.com

Now the output from the playbook provides descriptive names:

[user@host project]$ ansible-navigator run --mode stdout site.yml
...output omitted...
PLAY RECAP *******************************************************************
loadbalancer    : ok=3    changed=0    unreachable=0    failed=1  ...
webserver_1     : ok=4    changed=0    unreachable=0    failed=0  ...
webserver_2     : ok=4    changed=0    unreachable=0    failed=0  ...
webserver_3     : ok=4    changed=0    unreachable=0    failed=0  ...

In some situations you might want to use an arbitrary hostname in your playbook that you map to a real IP address or hostname with the ansible_host variable. For example:

You might want Ansible to connect to that host by using a specific IP address that is different from the one that resolves in DNS. For example, there might be a particular management address that is not public, or the machine might have multiple addresses in DNS but one is on the same network as the control node.
You might be provisioning cloud systems that have arbitrary names, but you want to refer to those systems in your playbook with names that make sense based on the roles that they play. If you use a dynamic inventory, then your dynamic inventory source might assign host variables automatically based on the intended role of each system.
You might be referring to the machine by a short name in the playbook, but you need to refer to it by a fully qualified domain name in the inventory to properly connect to it.

Identifying the Current Host by Using Variables

When a play is running, you can use a number of variables and facts to identify the name of the current managed host executing a task:

inventory_hostname: The name of the managed host currently being processed, taken from the inventory.
ansible_host: The actual IP address or hostname that was used to connect to the managed host, as previously discussed.
ansible_facts['hostname']: The short (unqualified) hostname collected as a fact from the managed host.
ansible_facts['fqdn']: The fully qualified domain name (FQDN) collected as a fact from the managed host.

One final variable that can be useful is ansible_play_hosts, which is a list of all the hosts that have not yet failed during the current play, and therefore are going to be used for the tasks remaining in the play.
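
The following play is a minimal sketch that displays these values for each host. It assumes that fact gathering is enabled, which is the default, and it falls back to inventory_hostname in case ansible_host is not set for a host:

```yaml
- name: Report how the current host is identified
  hosts: all
  tasks:
    - name: Show the names associated with the current host
      ansible.builtin.debug:
        msg: >-
          inventory_hostname={{ inventory_hostname }}
          ansible_host={{ ansible_host | default(inventory_hostname) }}
          hostname={{ ansible_facts['hostname'] }}
          fqdn={{ ansible_facts['fqdn'] }}

    - name: List the hosts that have not failed so far in this play
      ansible.builtin.debug:
        var: ansible_play_hosts
      run_once: true
```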

References

Ansible Documentation: Variable Precedence

Ansible Documentation: How to Build Your Inventory

Ansible Documentation: Special Variables


Revision: do374-2.2-82dc0d7