# Catching application SIGNALs (kill, term, stop, etc.) in KERNEL space (driver)



## JasonZ (Dec 21, 2021)

Is there a way to catch, in the driver, all signals sent to an application with a certain PID?


----------



## Alain De Vos (Dec 21, 2021)

An application has to have signal handlers. It's not related to the PID of the application. Otherwise you need to hack the kernel.


----------



## gpw928 (Dec 21, 2021)

JasonZ said:


> Is there a way to catch, in the driver, all signals sent to an application with a certain PID?



The short answer is no.  The kernel does not work like that.

Signals are managed by the kernel.  Their impact is always on a user process.  User processes operate in user (unprivileged) mode.

A driver operates strictly inside the kernel (in privileged mode).  It may be entered in the "top half of the kernel" (via system call -- where the "current PID" is the service requestor) or in the "bottom half" (via an interrupt -- where the "current PID" is most usually completely unrelated to the interrupt).

A user process, with appropriate privilege, may ask the kernel to send a signal via the kill(2) facility (sigaction(2), by contrast, is how a process registers a handler).  This will cause "flags" to be set in the kernel metadata of the target process(es), indicating a signal pending for the process(es).

Any part of the kernel may also set the signal pending "flags" for any number of user processes (or groups of user processes).

So the kernel metadata signal pending "flags" for a process may either be set by another process (inter-process communication), or unilaterally by the kernel (usually for some sort of bad news).

When the scheduler is about to run a user process, it first checks to see if there are any "flags" set that indicate a signal pending for that process.  If there are, the signal is dispatched to the process.  i.e. the process will usually* be terminated unless a signal handler is registered, in which case the process resumes execution inside the signal handler.

[* usually is a strong simplification, see signal(3) for more details.]
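
The dispatch step described above can be observed from user space. The following is a minimal sketch, assuming a single-threaded POSIX process; the names `on_signal` and `deliver_to_handler` are illustrative and do not come from the thread. It registers a handler with sigaction(2), raises the signal at the calling process, and reports which signal the handler saw:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Written by the handler, read by the caller afterwards. */
static volatile sig_atomic_t last_signal = 0;

static void on_signal(int signo)
{
    last_signal = signo;
}

/*
 * Install a handler for signo, send signo to ourselves, and return the
 * signal number the handler recorded (or -1 if no handler could be set).
 */
int deliver_to_handler(int signo)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, &old) != 0)
        return -1;

    last_signal = 0;
    raise(signo);                 /* the handler runs before raise() returns */

    sigaction(signo, &old, NULL); /* restore the previous disposition */
    return (int)last_signal;
}
```

With a handler registered, `deliver_to_handler(SIGTERM)` returns `SIGTERM` and the process keeps running, instead of taking the default action of terminating.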


----------



## JasonZ (Dec 22, 2021)

gpw928, thanks for the comprehensive answer. Actually, the application part with "sigaction" works quite straightforwardly. The problem appears when SIGKILL or SIGSTOP is sent to the process.
I have to handle the following logic:
1) Application started;
2) Application does SYSCALL to the driver;
3) Driver allocating resources;
4) Application received SIGKILL;
5) Application was killed and did not do SYSCALL to driver;
6) Driver did not release allocated resources.
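
Steps 4 and 5 follow from the fact that SIGKILL and SIGSTOP cannot be caught, blocked, or ignored, so the application never gets a chance to make its cleanup call. A small sketch that probes this from user space; `can_install_handler` and `noop_handler` are hypothetical helper names:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static void noop_handler(int signo)
{
    (void)signo;
}

/*
 * Returns 1 if the kernel accepts a handler for signo, 0 if it refuses.
 * The previous disposition is restored before returning.
 */
int can_install_handler(int signo)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);

    if (sigaction(signo, &sa, &old) != 0)
        return 0;                 /* refused, e.g. EINVAL for SIGKILL */

    sigaction(signo, &old, NULL);
    return 1;
}
```

`can_install_handler(SIGTERM)` succeeds, while the same call for `SIGKILL` or `SIGSTOP` is rejected by the kernel.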

This logic cannot be changed due to the complexity of the overall architecture.

Catching signals on the application side works fine until we come to this use case:
1) App#1 starts to execute, PID #1000;
2) App#2 starts to execute after a while, PID #5000;
3) App#2 killed;
4) App#3 starts to execute after a while, PID #10000;
5) App#3 completed and did SYSCALL, driver released resources;
6) App#N starts to execute after a while, PID #N;
7) App#N completed and did SYSCALL, driver released resources;
8) App#1 completed and did SYSCALL, driver released resources;

The question remains: is there any way to subscribe to scheduler events, to be notified that the user application with a certain PID has terminated and its resources can be released?


----------



## gpw928 (Dec 22, 2021)

Normally, when a process terminates (due to a signal, or otherwise), the close routine will be automatically called for all "files" it still has open as part of the kernel implementation of _exit(2).

For any special files (devices), the device driver close() routine will be called, and any resources allocated for the specific use of the process, by the device driver, should be released.
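
The automatic close on process termination can be demonstrated from user space with an ordinary pipe instead of a device. This is only a sketch of the general descriptor behaviour, not of any driver; the function name `killed_child_releases_descriptor` is made up for illustration. The child never calls close(2) itself, yet the parent sees end-of-file once the child has been killed:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the kernel closed the child's write end of the pipe after
 * the child was killed with SIGKILL, 0 if not, -1 on a setup error.
 */
int killed_child_releases_descriptor(void)
{
    int fds[2], status = 0;
    char byte;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: hold the write end open and never close it voluntarily. */
        close(fds[0]);
        for (;;)
            pause();
    }

    /* Parent: drop our copy so the child holds the only write end. */
    close(fds[1]);
    kill(pid, SIGKILL);

    /* read() returns 0 (EOF) only once every write end has been closed. */
    do {
        n = read(fds[0], &byte, 1);
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return n == 0 && WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```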

I'm not aware of any mechanism to achieve the outcome you want in the way you describe, but expect that you may get more interest and feedback in one of the mailing lists.


----------



## ralphbsz (Dec 22, 2021)

This whole thread belongs in user development, not FreeBSD development, since we're discussing the API for user-space programs here. I presume you're talking about user space processes ... if not, please explain what you're talking about.



JasonZ said:


> 1) Application started;


In Unix, people typically call them "programs" or "processes", not "applications". But that means the same thing: the user-space process that has one PID.



> 2) Application does SYSCALL to the driver;


What do you mean by driver? You don't syscall into a "driver", you syscall into the kernel.
If you actually mean driver (in the sense of a device-specific part of the kernel, like the ada disk driver or some USB driver), which one? There are dozens.
And: which system call are you worried about? There are dozens or hundreds. Usually, people identify system calls by their name, like "read" or "brk" or "mmap".



> 3) Driver allocating resources;


For lack of knowing what you mean by "driver", and which system call you mean, this is a bit hard to answer.
As gpw928 already hinted at: Any resource allocated to a user process will be automatically freed when the process exits.



> 4) Application received SIGKILL;


Well, that happens, and there is nothing you can do about it; SIGKILL will definitely kill the program. That's its purpose.



> 5) Application was killed and did not do SYSCALL to driver;


I don't understand. In 2 you said that the system call was done, now you're saying it was not. Maybe you are talking about a different call here?



> 6) Driver did not release allocated resources.


See above. A process that is gone does not hold any resources.



> The question remains: is there any way to subscribe to scheduler events, to be notified that the user application with a certain PID has terminated and its resources can be released?


Actually, there is a way to get notified when a process exits. Start the process from another process, then in the parent process put a handler on SIGCHLD, which is sent whenever the "status" (running or not running, exit code, ...) of a child process changes. Then, reap the child process and collect its exit code with a wait() call. But if you are just worried about releasing resources, that's just not necessary.
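
A minimal sketch of that parent/child scheme, assuming a POSIX system; the names `on_sigchld` and `watch_child_until_killed` are illustrative. The parent installs a SIGCHLD handler, kills its child with SIGKILL to simulate the problem case, waits for the notification, and reaps the child with waitpid(2):

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t child_changed = 0;

static void on_sigchld(int signo)
{
    (void)signo;
    child_changed = 1;
}

/*
 * Returns the signal that terminated the child, 0 if it exited normally,
 * or -1 on error.
 */
int watch_child_until_killed(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int status = 0, result = -1;
    pid_t pid;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) != 0)
        return -1;

    /* Block SIGCHLD so it cannot slip in between the check and the wait. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);

    child_changed = 0;
    pid = fork();
    if (pid == 0) {
        for (;;)
            pause();                  /* child: idle until killed */
    }
    if (pid > 0) {
        kill(pid, SIGKILL);
        while (!child_changed)
            sigsuspend(&wait_mask);   /* atomically unblock SIGCHLD and sleep */
        if (waitpid(pid, &status, 0) == pid)
            result = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    }

    /* Restore the caller's signal mask and SIGCHLD disposition. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return result;
}
```

In a real supervisor, the point after `waitpid()` is where the parent would ask the driver to release whatever the dead child had allocated.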


----------



## JasonZ (Dec 22, 2021)

Thanks for response ralphbsz.
Let me clarify:
1) Driver means a kernel-space module, loaded by "kldload my_driver.ko";
2) Syscall means "ioctl" on "/dev/my_dev", the char device created by "my_driver.ko";
3) Application means a process/program in user space;
4) Every process does a SYSCALL to the driver, and the driver allocates resources for this process only;
5) On completion, every process does a SYSCALL to the driver to release the resources allocated for this process only.

I have a case where multiple programs are started simultaneously and "talk" to the "driver" in kernel space via "ioctl". When SIGKILL/SIGSTOP is sent to any of these processes, it cannot be intercepted in the process itself to perform a proper exit. Even though "_exit" is called for the given process and "All of the descriptors open in the calling process are closed", close(2) says "...If this is the last reference to the underlying object, the object will be deactivated.", so ".d_close" is not called for my char device, since other processes/programs still hold valid open descriptors.

The parent/child scheme could work for this case, but I am looking for logic that can be implemented in kernel space only.


----------



## JasonZ (Dec 22, 2021)

gpw928 said:


> Normally, when a process terminates (due to a signal, or otherwise), the close routine will be automatically called for all "files" it still has open as part of the kernel implementation of _exit(2).
> 
> For any special files (devices), the device driver close() routine will be called, and any resources allocated for the specific use of the process, by the device driver, should be released.
> 
> I'm not aware of any mechanism to achieve the outcome you want in the way you describe, but expect that you may get more interest and feedback in one of the mailing lists.


Could you tell me a bit more about "may get more interest and feedback in one of the mailing lists"? I am quite new to FreeBSD development.


----------



## unitrunker (Dec 22, 2021)

dtrace might do what you want.






dtrace_proc(4) (www.freebsd.org)
				




     The proc:::signal-send() probe fires when a signal is about to be sent to
     a process.  The proc:::signal-discard() probe fires when a signal is sent
     to a process that ignores it.  This probe will fire after the
     proc:::signal-send() probe for the signal in question.  The arguments to
     these probes are the thread and process to which the signal will be sent,
     and the signal number of the signal.  Valid signal numbers are defined in
     the signal(3) manual page.  The proc:::signal-clear() probe fires when a
     pending signal has been cleared by one of the sigwait(2),
     sigtimedwait(2), or sigwaitinfo(2) system calls.  Its arguments are the
     signal number of the cleared signal, and a pointer to the corresponding
     signal information.  The siginfo_t for the signal can be obtained from
     args[1]->ksi_info.


----------



## Andriy (Dec 22, 2021)

JasonZ said:


> 1) Application started;
> 2) Application does SYSCALL to the driver;
> 3) Driver allocating resources;
> 4) Application received SIGKILL;
> ...


I suggest that you change your interface from a syscall to a character device and ioctls on it.
Or provide a character device as an auxiliary channel to track application liveness.
You can then use the character device's close handler to free resources allocated for the application.
The close handler would get called regardless of whether the process calls close(2) or exits or crashes or gets killed.
See devfs_set_cdevpriv(9), etc
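
A minimal kernel-side sketch of this suggestion, assuming a loadable module that creates `/dev/my_dev` as described earlier in the thread. The structure `my_ctx`, the ioctl `MYDRV_ALLOC`, and all `my_*` function names are hypothetical placeholders, and the actual resource handling is left as comments. It attaches per-open state in the open handler with devfs_set_cdevpriv(9) and relies on the registered destructor for cleanup:

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/module.h>
#include <sys/conf.h>
#include <sys/malloc.h>
#include <sys/ioccom.h>

MALLOC_DEFINE(M_MYDRV, "my_driver", "my_driver per-open state");

/* Hypothetical per-open state; replace with the real resources. */
struct my_ctx {
	int allocated;
};

#define MYDRV_ALLOC	_IO('M', 1)	/* hypothetical "allocate" ioctl */

static struct cdev *my_dev;

/* Destructor registered via devfs_set_cdevpriv(9); runs on file close. */
static void
my_ctx_dtor(void *data)
{
	struct my_ctx *ctx = data;

	/* Release whatever MYDRV_ALLOC acquired for this open file here. */
	free(ctx, M_MYDRV);
}

static int
my_open(struct cdev *dev, int oflags, int devtype, struct thread *td)
{
	struct my_ctx *ctx;
	int error;

	ctx = malloc(sizeof(*ctx), M_MYDRV, M_WAITOK | M_ZERO);
	error = devfs_set_cdevpriv(ctx, my_ctx_dtor);
	if (error != 0)
		free(ctx, M_MYDRV);
	return (error);
}

static int
my_ioctl(struct cdev *dev, u_long cmd, caddr_t data, int fflag,
    struct thread *td)
{
	struct my_ctx *ctx;
	int error;

	/* Look up the state attached to the caller's open file. */
	error = devfs_get_cdevpriv((void **)&ctx);
	if (error != 0)
		return (error);

	switch (cmd) {
	case MYDRV_ALLOC:
		ctx->allocated = 1;	/* acquire the real resource here */
		return (0);
	default:
		return (ENOTTY);
	}
}

static struct cdevsw my_cdevsw = {
	.d_version =	D_VERSION,
	.d_open =	my_open,
	.d_ioctl =	my_ioctl,
	.d_name =	"my_dev",
};

static int
my_modevent(module_t mod, int what, void *arg)
{
	switch (what) {
	case MOD_LOAD:
		my_dev = make_dev(&my_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600,
		    "my_dev");
		return (0);
	case MOD_UNLOAD:
		destroy_dev(my_dev);
		return (0);
	default:
		return (EOPNOTSUPP);
	}
}

DEV_MODULE(my_driver, my_modevent, NULL);
```

Because the private data belongs to the open file rather than to the device, each process that opens `/dev/my_dev` gets its own `my_ctx`, and the destructor runs when that file is closed, including when the process is killed. A descriptor shared through fork(2) or dup(2) refers to the same open file, so the destructor runs only after the last copy is closed.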


----------



## ralphbsz (Dec 22, 2021)

I think I'm beginning to understand. Allow me to rewrite your question in more standard terminology; maybe that will help us understand it better.

You have written your own driver, which lives in the lower half of the kernel, and is loaded as a module (that's actually not important). It can be reached from user processes by ioctl.

User processes perform the first ioctl, which I will give the nickname "allocate". It changes the internal state of something to use some resource. Then perhaps they do some work, perhaps not, that's not important. The important thing is that the "allocate" ioctl has changed the internal state of the system, in such a way that a limited resource has been consumed. To fix that, user programs are supposed to perform a second ioctl, which I will give the nickname "free". Until the "free" ioctl has been called, the resource is bound.

Importantly, it seems to me that there is nothing in the upper half of the kernel that knows (keeps track) whether the "allocate" ioctl has been called and the "free" ioctl call has not been called.

Your complaint is that the "free" ioctl will never happen if the user program dies before making that call. There are many ways a user program can die, and receiving SIGKILL is one of them. Given the above design, that statement is sadly true. And given the way the universe works and Unix is written, that's not trivial to fix. The lower half of the kernel (your driver = loadable module) can't get notifications on user-space process status changes. And as we said above, there is no kernel data structure in the upper half that tracks whether the resource is in use.

How to fix this? You need to rethink this design. One way would be to change the interface of your new design, and not use opaque ioctl's to get from user space through the upper half of the kernel to your module. My first suggestion would be to not use ioctl, but to implement a full device (which creates entries in /dev/something1 through /dev/something999, if there are enough resources to handle 999 things). Then each user process can open one such device. The beauty of this approach is: now you have an open file (a device file) in the upper half of the kernel, and when the user process exits, that device will get closed. You can then perform the "allocate" ioctl in the setup (open) phase of the device, and the "free" ioctl in the close handler of the device.

All my other suggestions are NASTY hacks. For example: Give the "allocate" ioctl another parameter, which is the PID of the calling process. Then keep those PIDs stored in the data structures of your device. Create a new ioctl, which I will call "list_all_used", which returns which PIDs have performed allocate but not yet free. Then write a watcher process which runs every second, calls the "list_all_used" ioctl, sees whether the corresponding PID is still in use, and if not, performs the "free" ioctl on their behalf. Nasty hack, error prone, what if the watcher process dies, and what if a PID is recycled.
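
For illustration only, the liveness check such a watcher would need can be sketched with kill(2) and signal number 0, which performs the existence and permission checks without sending anything. The helper name `pid_is_alive` is hypothetical, and the sketch inherits the weakness already noted: a recycled PID looks exactly like the original process.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a process with this PID currently exists, 0 otherwise. */
int pid_is_alive(pid_t pid)
{
    if (pid <= 0)
        return 0;               /* 0 and negatives address process groups */
    if (kill(pid, 0) == 0)
        return 1;               /* exists and we are allowed to signal it */
    return errno == EPERM;      /* exists, but belongs to another user */
}
```

Note that a zombie that has not yet been reaped by its parent still counts as existing for this check.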

Here is another idea: Instead of consuming whatever resource is being used in the kernel driver, make the kernel driver stateless. If it has to store some data structures, demand that callers provide a memory buffer when calling the "allocate" ioctl, and leave the data structures in that user-provided memory. That way the problem of resource tracking is moved to the user-space callers, and solves itself when their processes exit.

EDIT: I see that Andriy said pretty much the same thing, but a little sooner, and more concisely.


----------



## JasonZ (Dec 22, 2021)

Andriy said:


> I suggest that you change your interface from a syscall to a character device and ioctls on it.
> Or provide a character device as an auxiliary channel to track application liveness.
> You can then use the character device's close handler to free resources allocated for the application.
> The close handler would get called regardless of whether the process calls close(2) or exits or crashes or gets killed.
> See devfs_set_cdevpriv(9), etc


devfs_set_cdevpriv(9) does what I was looking for. Thanks for the advice.


----------



## gpw928 (Dec 22, 2021)

JasonZ said:


> Could you tell me a bit more about "may get more interest and feedback in one of the mailing lists"? I am quite new to FreeBSD development.


The starting point is in Appendix C of the handbook. However, the freebsd-drivers list looks appropriate.


----------



## gpw928 (Dec 22, 2021)

JasonZ said:


> 2) Syscall means "ioctl" on "/dev/my_dev", the char device created by "my_driver.ko";


It really helps to understand that you are using ioctls in that way...

Using ioctls on a pseudo device used to be a very common way of adding functionality to the kernel, when you didn't have access to the kernel source code, but could re-link the kernel with an extra device driver.

You didn't even need to implement device-specific open, close, read, and write device driver functions.  All you needed was a generic file descriptor for a pseudo device and the rest was done with ioctls.

However, the minute you start attaching resources to a process, you have to deal with detaching those resources, and, as indicated above by myself and Andriy, the correct place to do that, on a per-process basis, is as a side effect of _exit(2).

That means you must open at least one pseudo device per process, and you must implement garbage collection for the process in the pseudo device driver's close() routine.

That approach does not preclude the broad approach you intimated -- using another, different, over-arching pseudo device which furnishes a file descriptor upon which ioctls can be used to conduct and co-ordinate kernel activity (often with a daemon co-ordinating things in user space).


----------



## JasonZ (Dec 23, 2021)

gpw928 said:


> That means you must open at least one pseudo device per process, and you must implement garbage collection for the process in the pseudo device driver's close() routine.


Looks like it does not work. When multiple processes open the same device file, why is d_close() not called when one of them exits?


----------



## gpw928 (Dec 23, 2021)

I expect it's the open file table paradigm.  The device close() will be called when the reference count to the file drops to zero.


----------



## gpw928 (Dec 23, 2021)

I have not actively worked inside a Unix kernel for a long time.  People with current involvement in the kernel will be able to help more, particularly with FreeBSD specifics.  I commend to you Design and Implementation of the FreeBSD Operating System and the mailing list mentioned above.


----------



## Andriy (Dec 23, 2021)

JasonZ said:


> Looks like it does not work. When multiple processes open the same device file, why is d_close() not called when one of them exits?


Because that's how it works: it gets called only when the very last descriptor that references it gets closed, globally.
Or if D_TRACKCLOSE is set, then it gets called on every close(2) call, regardless of whether it's a last close (even within a process) or not.
That's how it worked historically and that's how it still works.
devfs_set_cdevpriv is the right tool for the job and it was invented because d_close was not useful for per-process resource tracking.


----------

