Ansible: start a process and wait until a telnet check succeeds

I trigger multiple Tomcat startup scripts and then need to check, as quickly as possible, that every process is listening on its specific port across multiple hosts.

For the test case, I wrote 3 scripts that run on a single host and listen on ports 4443, 4445, and 4447 respectively, as shown below.

/tmp/startapp1.sh

while test 1 # infinite loop
sleep 10
do
    nc -l localhost 4443 > /tmp/app1.log
done

/tmp/startapp2.sh

while test 1 # infinite loop
sleep 30
do
    nc -l localhost 4445 > /tmp/app2.log
done

/tmp/startapp3.sh

while test 1 # infinite loop
sleep 20
do
    nc -l localhost 4447 > /tmp/app3.log
done

Below is my code to trigger the script and check if the telnet is successful:

main.yml

- include_tasks: "internal.yml"
  loop:
    - /tmp/startapp1.sh 4443
    - /tmp/startapp2.sh 4445
    - /tmp/startapp3.sh 4447

internal.yml

- shell: "{{ item.split()[0] }}"
  async: 600
  poll: 0

- name: DEBUG CHECK TELNET
  shell: "telnet {{ item.split()[1] }}"
  delegate_to: localhost
  register: telnetcheck
  until: telnetcheck.rc == 0
  async: 600
  poll: 0
  delay: 6
  retries: 10

- name: Result of TELNET
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _jobs
  until: _jobs.finished
  delay: 6
  retries: 10
  with_items: "{{ telnetcheck.results }}"

To run: ansible-playbook main.yml

Requirement: the above three scripts should start, and their telnet checks should pass, within about 30 seconds.

Thus, the basic check needed here is until: telnetcheck.rc == 0 on the telnet task. Because of async, however, the registered result of the shell module has no rc entry, and hence I get the below error:

"msg": "The conditional check 'telnetcheck.rc == 0' failed. The error was: error while evaluating conditional (telnetcheck.rc == 0): 'dict object' has no attribute 'rc'"

In the above code, where and how can I check that telnet succeeded (i.e. telnetcheck.rc == 0) and make sure the requirement is met?

CodePudding user response：

I am currently not aware of a solution that starts a shell script and waits for its status in a single task. It might be possible to change the shell script itself so that it performs its own checks and returns meaningful exit codes. Alternatively, you could implement two or more tasks, where one executes the shell script and the others later check for certain conditions.

Regarding your requirement

wait until telnet localhost 8076 is LISTENING (successful).

you may have a look at the wait_for module.

---
- hosts: localhost
  become: false
  gather_facts: false

  tasks:

  - name: "Test connection to local port"
    wait_for:
      host: localhost
      port: 8076
      delay: 0
      timeout: 3
      active_connection_states: SYN_RECV
    check_mode: false # because remote module (wait_for) does not support it
    register: result

  - name: Show result
    debug:
      msg: "{{ result }}"

Another approach, testing from the control node whether there is a LISTENER on the remote node's localhost, could be

---
- hosts: test.example.com
  become: true
  gather_facts: false

  vars:

    PORT: "8076"

  tasks:

  - name: "Check for LISTENER on remote localhost"
    shell:
      cmd: "lsof -Pi TCP:{{ PORT }}"
    changed_when: false
    check_mode: false
    register: result
    failed_when: result.rc != 0 and result.rc != 1

  - name: Report missing LISTENER
    debug:
      msg: "No LISTENER on PORT {{ PORT }}"
    when: result.rc == 1
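
The lsof check above runs only once. If the goal is to wait for the listener, as in the question, the same task can be retried with until. The following is a minimal sketch, not part of the original answer: it reuses the PORT variable from the play above, the retries and delay values are illustrative assumptions, and the -sTCP:LISTEN filter is an addition so that only listening sockets (not established connections on the same port) count as a match.

```yaml
  - name: "Wait for LISTENER on remote localhost"
    shell:
      # -P keeps port numbers numeric; -sTCP:LISTEN restricts the match to listening sockets
      cmd: "lsof -P -iTCP:{{ PORT }} -sTCP:LISTEN"
    register: result
    # lsof exits with 0 once at least one matching socket is found
    until: result.rc == 0
    retries: 10   # assumed value, adjust to the expected startup time
    delay: 3      # assumed value, seconds between attempts
    changed_when: false
    check_mode: false
```

If the listener does not appear within retries x delay seconds, the task fails, which is usually the desired outcome for a startup check.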

CodePudding user response：

Using an asynchronous action and an until in the same task makes almost no sense.

Either you use until, in which case each port probe blocks until the port answers, or you run the probes asynchronously, wrap the telnet in a shell until loop, and let async_status catch the return.

In your until loop, the issue is that the return code is not set until the command actually returns, so you just have to check that the rc key of the dictionary is defined.

Mind that for all the examples below, I opened the ports manually with nc -l -p <port>, which is why they open gradually; with your Tomcat server they should all open at once.

With until:

- shell: "telnet localhost {{ item.split()[1] }}"
  delegate_to: localhost
  register: telnetcheck
  until:
    - telnetcheck.rc is defined
    - telnetcheck.rc == 0
  delay: 6
  retries: 10

This will yield:

TASK [shell] *****************************************************************
FAILED - RETRYING: [localhost]: shell (10 retries left).
changed: [localhost] => (item=/tmp/startapp1.sh 4443)
FAILED - RETRYING: [localhost]: shell (10 retries left).
changed: [localhost] => (item=/tmp/startapp2.sh 4445)
FAILED - RETRYING: [localhost]: shell (10 retries left).
changed: [localhost] => (item=/tmp/startapp3.sh 4447)

With async:

- shell: "until telnet 127.0.0.1 {{ item.split()[1] }}; do sleep 2; done"
  delegate_to: localhost
  register: telnetcheck
  async: 600
  poll: 0

- async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _jobs
  until: _jobs.finished
  delay: 6
  retries: 10
  loop: "{{ telnetcheck.results }}"
  loop_control:
    label: "{{ item.item }}"

This will yield:

TASK [shell] *****************************************************************
changed: [localhost] => (item=/tmp/startapp1.sh 4443)
changed: [localhost] => (item=/tmp/startapp2.sh 4445)
changed: [localhost] => (item=/tmp/startapp3.sh 4447)

TASK [async_status] **********************************************************
FAILED - RETRYING: [localhost]: async_status (10 retries left).
changed: [localhost] => (item=/tmp/startapp1.sh 4443)
FAILED - RETRYING: [localhost]: async_status (10 retries left).
changed: [localhost] => (item=/tmp/startapp2.sh 4445)
FAILED - RETRYING: [localhost]: async_status (10 retries left).
changed: [localhost] => (item=/tmp/startapp3.sh 4447)

That said, you should seriously consider @U880D's answer, as it is the more Ansible-native approach:

- wait_for:
    host: localhost
    port: "{{ item.split()[1] }}"
    delay: 6
    timeout: 60

This will yield:

TASK [wait_for] **************************************************************
ok: [localhost] => (item=/tmp/startapp1.sh 4443)
ok: [localhost] => (item=/tmp/startapp2.sh 4445)
ok: [localhost] => (item=/tmp/startapp3.sh 4447)
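
To get close to the 30-second requirement from the question, it may help to separate the two phases: first start every script without waiting, then wait for the ports. The following is a sketch under assumptions, not the original poster's or the answerers' code: the apps variable and its script/port keys are hypothetical names introduced here to replace the "script port" strings and the split() calls, and the timeout value is illustrative.

```yaml
---
- hosts: localhost
  gather_facts: false

  vars:
    # Hypothetical structure replacing the "script port" strings from the question
    apps:
      - { script: /tmp/startapp1.sh, port: 4443 }
      - { script: /tmp/startapp2.sh, port: 4445 }
      - { script: /tmp/startapp3.sh, port: 4447 }

  tasks:

  - name: Start every script in the background without waiting
    shell: "{{ item.script }}"
    async: 600
    poll: 0
    loop: "{{ apps }}"

  - name: Wait until each port accepts connections
    wait_for:
      host: localhost
      port: "{{ item.port }}"
      timeout: 60   # assumed upper bound per port, in seconds
    loop: "{{ apps }}"
```

Because all scripts are already running before the first wait_for starts, the waits overlap with the startups, so the total time should be close to the slowest startup rather than the sum of all of them.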