util-linux/misc-utils/kill.1.adoc
Chris Down 30463e36ae kill: Support mandating the presence of a userspace signal handler
In production we've had several incidents over the years where a process
has a signal handler registered for SIGHUP or one of the SIGUSR signals
which can be used to signal a request to reload configs, rotate log
files, and the like. While this may seem harmless enough, what we've
seen happen repeatedly is something like the following:

1. A process is using SIGHUP/SIGUSR[12] to request some
   application-handled state change -- reloading configs, rotating a log
   file, etc;
2. This kind of request is deprecated and removed, so the signal handler
   is removed. However, a site where the signal might be sent from is
   missed (often logrotate or a service manager);
3. Because the default disposition of these signals is terminal, sooner
   or later these applications are going to be sent SIGHUP or similar
   and end up unexpectedly killed.

I know for a fact that we're not the only organistion experiencing this:
in general, signal use is pretty tricky to reason about and safely
remove because of the fairly aggressive SIG_DFL behaviour for some
common signals, especially for SIGHUP which has a particularly ambiguous
meaning. Especially in a large, highly interconnected codebase,
reasoning about signal interactions between system configuration and
applications can be highly complex, and it's inevitable that on occasion
a callsite will be missed.

In some cases the right call to avoid this will be to migrate services
towards other forms of IPC for this purpose, but inevitably there will
be some services which must continue using signals, so we need a safe
way to support them.

This patch adds support for the -r/--require-handler flag, which checks
if a userspace handler is present for the signal being sent. If it is
not, the process will be skipped.

With this flag we can enforce that all SIGHUP reload cases and SIGUSR
equivalents use --require-handler. This effectively mitigates the case
we've seen time and time again where SIGHUP is used to rotate log files
or reload configs, but the sending site is mistakenly left present after
the removal of signal handler, resulting in unintended termination of
the process.

Signed-off-by: Chris Down <chris@chrisdown.name>
2022-10-26 17:10:55 +01:00

124 lines
5.8 KiB
Text

//po4a: entry man manual
////
Copyright 1994 Salvatore Valente (svalente@mit.edu)
Copyright 1992 Rickard E. Faith (faith@cs.unc.edu)
May be distributed under the GNU General Public License
////
= kill(1)
:doctype: manpage
:man manual: User Commands
:man source: util-linux {release-version}
:page-layout: base
:command: kill
== NAME
kill - terminate a process
== SYNOPSIS
*kill* [**-**_signal_|*-s* _signal_|*-p*] [*-q* _value_] [*-a*] [*--timeout* _milliseconds_ _signal_] [*--*] _pid_|_name_...
*kill* *-l* [_number_] | *-L*
== DESCRIPTION
The command *kill* sends the specified _signal_ to the specified processes or process groups.
If no signal is specified, the *TERM* signal is sent. The default action for this signal is to terminate the process. This signal should be used in preference to the *KILL* signal (number 9), since a process may install a handler for the TERM signal in order to perform clean-up steps before terminating in an orderly fashion. If a process does not terminate after a *TERM* signal has been sent, then the *KILL* signal may be used; be aware that the latter signal cannot be caught, and so does not give the target process the opportunity to perform any clean-up before terminating.
Most modern shells have a builtin *kill* command, with a usage rather similar to that of the command described here. The *--all*, *--pid*, and *--queue* options, and the possibility to specify processes by command name, are local extensions.
If _signal_ is 0, then no actual signal is sent, but error checking is still performed.
== ARGUMENTS
The list of processes to be signaled can be a mixture of names and PIDs.
_pid_::
Each _pid_ can be expressed in one of the following ways:
_n_;;
where _n_ is larger than 0. The process with PID _n_ is signaled.
*0*;;
All processes in the current process group are signaled.
*-1*;;
All processes with a PID larger than 1 are signaled.
**-**__n__;;
where _n_ is larger than 1. All processes in process group _n_ are signaled. When an argument of the form '-n' is given, and it is meant to denote a process group, either a signal must be specified first, or the argument must be preceded by a '--' option, otherwise it will be taken as the signal to send.
_name_::
All processes invoked using this _name_ will be signaled.
== OPTIONS
*-s*, *--signal* _signal_::
The signal to send. It may be given as a name or a number.
*-l*, *--list* [_number_]::
Print a list of signal names, or convert the given signal number to a name. The signals can be found in _/usr/include/linux/signal.h_.
*-L*, *--table*::
Similar to *-l*, but it will print signal names and their corresponding numbers.
*-a*, *--all*::
Do not restrict the command-name-to-PID conversion to processes with the same UID as the present process.
*-p*, *--pid*::
Only print the process ID (PID) of the named processes, do not send any signals.
*-r*, *--require-handler*::
Do not send the signal if it is not caught in userspace by the signalled process.
*--verbose*::
Print PID(s) that will be signaled with *kill* along with the signal.
*-q*, *--queue* _value_::
Send the signal using *sigqueue*(3) rather than *kill*(2). The _value_ argument is an integer that is sent along with the signal. If the receiving process has installed a handler for this signal using the *SA_SIGINFO* flag to *sigaction*(2), then it can obtain this data via the _si_sigval_ field of the _siginfo_t_ structure.
*--timeout* _milliseconds signal_::
Send a signal defined in the usual way to a process, followed by an additional signal after a specified delay. The *--timeout* option causes *kill* to wait for a period defined in _milliseconds_ before sending a follow-up _signal_ to the process. This feature is implemented using the Linux kernel PID file descriptor feature in order to guarantee that the follow-up signal is sent to the same process or not sent if the process no longer exists.
+
Note that the operating system may re-use PIDs and implementing an equivalent feature in a shell using *kill* and *sleep* would be subject to races whereby the follow-up signal might be sent to a different process that used a recycled PID.
+
The *--timeout* option can be specified multiple times: the signals are sent sequentially with the specified timeouts. The *--timeout* option can be combined with the *--queue* option.
+
As an example, the following command sends the signals *QUIT*, *TERM* and *KILL* in sequence and waits for 1000 milliseconds between sending the signals:
+
....
kill --verbose --timeout 1000 TERM --timeout 1000 KILL \
--signal QUIT 12345
....
== EXIT STATUS
*kill* has the following exit status values:
*0*::
success
*1*::
failure
*64*::
partial success (when more than one process specified)
== NOTES
Although it is possible to specify the TID (thread ID, see *gettid*(2)) of one of the threads in a multithreaded process as the argument of *kill*, the signal is nevertheless directed to the process (i.e., the entire thread group). In other words, it is not possible to send a signal to an explicitly selected thread in a multithreaded process. The signal will be delivered to an arbitrarily selected thread in the target process that is not blocking the signal. For more details, see *signal*(7) and the description of *CLONE_THREAD* in *clone*(2).
Various shells provide a builtin *kill* command that is preferred in relation to the *kill*(1) executable described by this manual. The easiest way to ensure one is executing the command described in this page is to use the full path when calling the command, for example: */bin/kill --version*
== AUTHORS
mailto:svalente@mit.edu[Salvatore Valente],
mailto:kzak@redhat.com[Karel Zak]
The original version was taken from BSD 4.4.
== SEE ALSO
*bash*(1),
*tcsh*(1),
*sigaction*(2),
*kill*(2),
*sigqueue*(3),
*signal*(7)
include::man-common/bugreports.adoc[]
include::man-common/footer.adoc[]
ifdef::translation[]
include::man-common/translation.adoc[]
endif::[]