The /var/log/cloud-init.log Log-File

This is the default location where the Red Hat packaged version of the cloud-init service for Enterprise Linux 6 and 7 writes all of its log output. On RHEL 8+, logging data is split across this file and the /var/log/cloud-init-output.log file. Any automation that is initiated directly through cloud-init and that emits STDOUT and/or STDERR messages will have those messages duplicated here.

With respect to watchmaker, the primary diagnostic use of this file is tracking errors emitted while preparing to execute watchmaker. If the watchmaker process fails to start (meaning that /var/log/watchmaker/watchmaker.log is never created), this is a good place to find out why.

Useful search strings for locating points of interest (“landmarks”) in the execution sequence are, ordered from most to least useful:

  • : FAIL:

  • /var/lib/cloud/instance/script

  • /var/lib/cloud/instance

  • : SUCCESS:

By far, the search for : FAIL: will be the most important in uncovering errors. The other searches will mostly be of use in progress-tracking and verifying expected event-sequencing[1].
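The landmark searches above can be sketched as a small shell helper. This is only an illustration: the function name find_landmarks is hypothetical, and the default path assumes the log lives at /var/log/cloud-init.log as described in this section.

```shell
# Hypothetical helper: print line-numbered matches for each landmark string,
# in the most- to least-useful order listed above.
find_landmarks() {
  log_file="${1:-/var/log/cloud-init.log}"
  for pattern in ': FAIL:' '/var/lib/cloud/instance/script' '/var/lib/cloud/instance' ': SUCCESS:'; do
    printf '== %s ==\n' "$pattern"
    # -n prefixes each match with its line number; -F treats the pattern as a fixed string
    grep -nF -- "$pattern" "$log_file" || printf '(no matches)\n'
  done
}
```

Calling find_landmarks with no argument searches the default log. Passing a path (for example, a copy of the log pulled from a failed instance) searches that file instead.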

Example Failure

Typically, searching for “: FAIL:” will bring the file-cursor to a logged block similar to:

2023-06-21 11:12:36,078 - subp.py[DEBUG]: Unexpected error while running command.
Command: ['/var/lib/cloud/instance/scripts/00_script.sh']
Exit code: 1
Reason: -
Stdout: -
Stderr: -
2023-06-21 11:12:36,078 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
2023-06-21 11:12:36,078 - handlers.py[DEBUG]: finish: modules-final/config-scripts-user: FAIL: running config-scripts-user with frequency once-per-instance
2023-06-21 11:12:36,078 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python3.6/site-packages/cloudinit/config/cc_scripts_user.py'>) failed
2023-06-21 11:12:36,079 - util.py[DEBUG]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python3.6/site-packages/cloudinit/config/cc_scripts_user.py'>) failed
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/cloudinit/stages.py", line 1090, in _run_modules
    run_name, mod.handle, func_args, freq=freq
  File "/usr/lib/python3.6/site-packages/cloudinit/cloud.py", line 55, in run
    return self._runners.run(name, functor, args, freq, clear_on_fail)
  File "/usr/lib/python3.6/site-packages/cloudinit/helpers.py", line 185, in run
    results = functor(*args)
  File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_scripts_user.py", line 44, in handle
    subp.runparts(runparts_path)
  File "/usr/lib/python3.6/site-packages/cloudinit/subp.py", line 426, in runparts
    % (len(failed), ",".join(failed), len(attempted))
RuntimeError: Runparts: 1 failures (00_script.sh) in 1 attempted commands

In this case, the failure happened during the execution of the userdata script, /var/lib/cloud/instance/scripts/00_script.sh. Even if the script hasn’t logged anything directly useful in this log file, or hasn’t been configured to log its own activities anywhere, knowing that the failure occurred during the execution of this file is useful:

  1. The provisioning-administrator knows where in the cloud-init automation-sequence things failed

  2. One can look in other logs for actionable diagnostic-information

  3. If there’s no such information in other log files, one can hand-execute the failing script to see if the error can be reproduced (and in a way that assists the provisioning-administrator with isolating the source of the failure)
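As a rough sketch of the first point, the failing command can be pulled out of the log without paging through the traceback. The function name failed_commands is hypothetical, and the approach assumes the log layout matches the excerpt above, where the "Command:" entry sits on the line immediately after the "Unexpected error while running command." message.

```shell
# Hypothetical helper: list the commands that cloud-init reported as failing.
failed_commands() {
  log_file="${1:-/var/log/cloud-init.log}"
  # -A1 keeps the one line after each match, which is assumed to hold the "Command:" entry
  grep -A1 -F 'Unexpected error while running command.' "$log_file" | grep -F 'Command:'
}
```

If other cloud-init versions lay out this block differently, the -A1 offset would need adjusting.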

For the third point, if the failure is in a BASH script, executing the script with the diagnostic flag set (e.g., bash -x /var/lib/cloud/instance/scripts/00_script.sh) may show where the script fails.
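Because bash -x output can scroll past quickly, it may help to save the trace to a file for later review. The following is a minimal sketch: the function name trace_script and the default trace location /tmp/script-trace.log are illustrative assumptions, not watchmaker or cloud-init conventions.

```shell
# Hypothetical helper: re-run a script under "bash -x" and keep the full trace.
trace_script() {
  script_path="$1"
  trace_file="${2:-/tmp/script-trace.log}"   # assumed location; any writable path works
  status=0
  # bash -x prints each command to stderr before running it; 2>&1 folds that
  # trace into the same file as the script's normal output
  bash -x "$script_path" > "$trace_file" 2>&1 || status=$?
  printf 'exit code: %s (trace saved to %s)\n' "$status" "$trace_file"
  return "$status"
}
```

For example, trace_script /var/lib/cloud/instance/scripts/00_script.sh re-runs the script from the example failure. The last traced command before the script exits is usually the place to start looking.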

Similarly, if hand-execution of the script succeeds, that can indicate the script is making incorrect assumptions about the cloud-init managed execution environment. These can include things like:

  • Lack of necessary environment variables

  • Improperly defined environment variables

  • Attempts to execute commands that require a controlling-TTY (i.e., an interactive-login shell)

  • Attempting to do something that the instance’s security posture blocks[2].

Note that comparing execution via cloud-init versus execution from an interactive-shell works whether the script is written in BASH or some other interpreted language.
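One way to narrow down such environment differences is to hand-execute the script in a deliberately sparse environment. The sketch below is only an approximation of what cloud-init provides: the function name run_like_cloud_init is hypothetical, the PATH value is an assumption, and the helper does not detach the controlling TTY or reproduce any security context that cloud-init runs under.

```shell
# Hypothetical helper: run a script with almost no inherited environment.
run_like_cloud_init() {
  script_path="$1"
  # env -i starts from an empty environment; the PATH value is an assumption
  # < /dev/null detaches stdin, so commands that prompt for input fail quickly
  env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin bash "$script_path" < /dev/null
}
```

If the script fails here but succeeds from a normal interactive shell, a missing or differently defined environment variable, or a command expecting terminal input, becomes a more likely suspect.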