Verify that Ansible can communicate with managed network devices, and troubleshoot any problems with those connections.
Automation tasks can fail for various reasons. The following issues are the most common causes of network automation task failures:
Connectivity issues, such as routing issues from the Ansible control node to the managed nodes.
Authentication issues, such as an incorrect username, password, or SSH key credentials.
Timeout issues, where commands take longer to run than the configured timeout value. A timeout issue might also be due to an authentication or connectivity issue.
Use the platform-specific *_ping modules to verify connectivity to managed nodes.
For example, use the ios_ping module from the cisco.ios collection to ping Cisco managed nodes:
---
- name: Ping Cisco managed nodes
  hosts: ios
  gather_facts: false

  tasks:
    - name: Ping Cisco managed nodes
      cisco.ios.ios_ping:
        dest: "{{ inventory_hostname }}"
        count: 5

If a ping task fails, you see an error indicating the root cause of the issue.
For example, the following error indicates that the managed node is unreachable via the IP address or hostname provided:
TASK [Ping Cisco managed nodes] **************************************************
fatal: [iosxe1.lab.example.com]: FAILED! => {"changed": false, "msg": "ssh connection failed: ssh connect failed: Timeout connecting to iosxe1.lab.example.com"}

When writing playbooks for network automation, always verify that you are using the correct setting for each network platform option, such as ansible_connection and ansible_network_os.
Visit the network platform documentation at https://docs.ansible.com/ansible/latest/network/user_guide/platform_index.html to verify that you are using the correct settings for each network platform.
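For example, a minimal inventory sketch for a group of Juniper Junos devices managed over NETCONF (the junos group name is an assumption):

[junos:vars]
ansible_connection=ansible.netcommon.netconf
ansible_network_os=junipernetworks.junos.junos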
When you run a playbook, it might fail with the error message No authentication methods available, or another error indicating that authentication to the managed node has failed.
TASK [Change configuration on managed nodes] *************************************
fatal: [junos1.lab.example.com]: FAILED! => {"changed": false, "msg": "No authentication methods available"}
This message means that the Ansible control node where you ran the playbook cannot connect via password or SSH key to the managed node.
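As a quick manual check, verify that you can open an SSH session to the managed node from the control node with the same credentials (the admin username here is a placeholder):

[user@host ~]$ ssh admin@junos1.lab.example.com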
If you cannot connect to managed nodes using SSH keys, you can still connect by adding the -k option to the ansible-navigator run command.
This option prompts for the connection password.
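For example, running the playbook in stdout mode so that the password prompt displays (snmp.yml is the example playbook used later in this section):

[user@host ~]$ ansible-navigator run snmp.yml -m stdout -k
SSH password: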
Because the -k option only prompts for one connection password, problems can occur if the managed nodes use different passwords.
One solution is to ensure that the hosts targeted by the playbook use the same connection password. This might mean having separate playbooks for the various types of network devices.
Another solution is to use a single playbook, but use the --limit option to restrict which managed nodes are targeted by the playbook:
[user@host ~]$ ansible-navigator run snmp.yml --limit junos1.lab.example.com

You can explicitly define the location of the SSH key for an inventory host or host group.
Define the SSH key location by setting the ansible_ssh_private_key_file variable with the path to the appropriate SSH key.
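For example, a sketch of an inventory entry; the host group and key path are placeholders:

[junos]
junos1.lab.example.com ansible_ssh_private_key_file=/home/user/.ssh/lab_rsa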
Although setting the ansible_ssh_private_key_file variable works from the CLI, it does not work from automation controller if you import the inventory (and associated variables).
Automation controller can fail with a message similar to: No such file or directory: '/home/runner/.ssh/lab_rsa'.
This failure occurs because the variable points to a path on the control node's file system, which does not exist inside the automation execution environment that runs the job.
Another possible authentication error occurs when Ansible cannot enter the privileged configuration mode of a managed node.
To resolve this issue, ensure that you set the ansible_become variable to true and the ansible_become_method variable to enable.
For network automation playbooks, the only valid value for the ansible_become_method option is enable.
[user@host ~]$ cat inventory
...output omitted...
[ios:vars]
ansible_connection=ansible.netcommon.network_cli
ansible_network_os=cisco.ios.ios
ansible_become=true
ansible_become_method=enable
...output omitted...
You might see the following error if the amount of time to wait for a command from the managed node is too low for a task to succeed:
fatal: [junos1.lab.example.com]: FAILED! => {"changed": false, "msg": "command timeout triggered, timeout value is 30 secs.\nSee the timeout setting options in the Network Debug and Troubleshooting Guide."}

You also might see a timeout error due to authentication or DNS resolution errors.
To troubleshoot these types of errors, you can set the value of the ansible_command_timeout variable for a play or a task.
- name: Save the running-config
  vars:
    ansible_command_timeout: 120
  cisco.ios.ios_command:
    commands:
      - copy running-config startup-config

The related environment variable is the ANSIBLE_PERSISTENT_COMMAND_TIMEOUT variable.
This environment variable is set to 30 seconds by default.
You can use the export command to modify the value of the ANSIBLE_PERSISTENT_COMMAND_TIMEOUT environment variable in your current session:
[user@host ~]$ export ANSIBLE_PERSISTENT_COMMAND_TIMEOUT=120
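You can also persist this setting in the ansible.cfg file by using the command_timeout option in the [persistent_connection] section:

[persistent_connection]
command_timeout = 120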
Always use the long form of commands.
For example, use the command no shutdown instead of no shut.
When committing a configuration, the network OS converts the short form of commands into the long form of commands.
For example, if you specify the command shut, then the configuration line is stored as shutdown.
The *_config module compares the command in your task to the line in the configuration.
If you use the short form of the command, then the module returns changed=true every time the play runs, even though the configuration might already be correct.
This means that your task is not idempotent and the configuration is updated every time the task runs.
To avoid this problem, use the long form of commands when using *_config modules in your tasks.
You can still abbreviate the configuration statements in the play name if desired.
---
- name: Change interface configuration
  hosts: ios
  gather_facts: false

  tasks:
    - name: Shutdown interface Gig0/1
      cisco.ios.ios_config:
        lines:
          - shutdown
        parents: interface GigabitEthernet0/1
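Because this task uses the long form shutdown, rerunning the play reports ok rather than changed for interfaces that are already shut down, which keeps the task idempotent.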