# Trigger an interrupt when the value of a memory location is modified in FreeBSD



## Avk (Aug 5, 2022)

Is it possible to generate an interrupt when the value of a variable or memory location gets modified in a FreeBSD or Linux environment, from a C program?

In a C application there is a dynamically allocated array which is used and modified from multiple locations. The application is large and complex, and it is difficult to trace all the places where the array is used or modified. The problem is that in some condition/flow the array[2] element becomes 0, which this application does not expect. I can't run the application using gdb to debug this issue (because of some constraint). The only way to debug it is to modify the source code and run the binary where the issue is happening.

Is it possible to generate an interrupt when the array[2] element is modified and print the backtrace to know which part of the codebase has modified it?

Thanks!!!


----------



## zirias@ (Aug 5, 2022)

Avk said:


> I can't run the application using gdb to debug this issue (because of some constraint).


I guess it would make sense to elaborate on that first. Not that I don't believe you, but it's possible you overlooked something. I mean, that's exactly what debuggers are designed for.


----------



## jbo (Aug 5, 2022)

This thread seems more suitable for the _"Userland programming & scripting"_ category.


----------



## VladiBG (Aug 5, 2022)

GDB - Conditional Breakpoints (Debugging documentation)


----------



## Andriy (Aug 5, 2022)

I think that this would be a better suggestion given the problem: https://undo.io/resources/gdb-watchpoint/watchpoints-more-than-watch-and-continue/
And, of course, https://sourceware.org/gdb/onlinedocs/gdb/Set-Watchpoints.html


----------



## reddy (Aug 5, 2022)

From what I understand, the problem of the author is that because of a number of business constraints, he is not in a position to run an interactive debugging session in the environment where the problem occurs. His only option is therefore to try to log the issue, thus his question. Apparently they are not trying to debug something at development time on their local machine, they are trying to chase down a production issue. It is very common not to be able to attach a debugger in production, personally even if I could I probably wouldn't.

Since this is not a crash that would leave a helpful stack trace with debugging symbols, they are asking for a way to log the changes made to a variable, potentially using interrupts. The situation is made difficult by the fact that the variable is modified by many parts of their complex program.

In terms of solution, my 2 cents is that unless people familiar with C programming can propose an interrupt-based trick, the best approach may be to wrap the array in a method that does the logging you need before reading or changing the value. Even if you do not use an IDE that makes it convenient to find the places using the variable, just wrap the variable in a method: this will break the build, and the compiler errors will show you all the places using the variable so that you can update the data access code. Encapsulating access to shared state is a best practice anyway.

Edit: since C does not have classes or namespaces, just rename the variable to break the build, and moving forward ensure that all data access is done through your method.
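The wrapping idea could look roughly like the sketch below. It is only an illustration: `queue_slots`, `SLOT_SET`, `slot_set_at` and `null_store_count` are hypothetical names, and the real array in the application is certainly shaped differently. The macro passes `__FILE__` and `__LINE__` so that each write reports its own call site. As noted elsewhere in the thread, this catches only writes that go through the wrapper, not stores made through stray pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SLOT_COUNT 8

/* Hypothetical stand-in for the application's shared array of addresses. */
static void *queue_slots[SLOT_COUNT];

/* Number of NULL stores to slot 2 seen so far (illustrative bookkeeping). */
static unsigned long null_store_count;

/* Every write goes through this function, so it can record the caller. */
static void slot_set_at(size_t idx, void *value, const char *file, int line)
{
    assert(idx < SLOT_COUNT);
    if (idx == 2 && value == NULL) {
        null_store_count++;
        fprintf(stderr, "slot[2] set to NULL at %s:%d\n", file, line);
    }
    queue_slots[idx] = value;
}

/* Call sites use the macro; the preprocessor fills in their location. */
#define SLOT_SET(idx, value) slot_set_at((idx), (value), __FILE__, __LINE__)

/* Reads go through an accessor too, so the array name can stay private. */
static void *slot_get(size_t idx)
{
    assert(idx < SLOT_COUNT);
    return queue_slots[idx];
}
```

A call site then changes from `array[2] = p;` to `SLOT_SET(2, p);`, and the log line names the file and line of whichever caller stored the NULL.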


----------



## ralphbsz (Aug 5, 2022)

It can be done, and I've worked in a group that had such a tool. It is difficult. In a nutshell, you end up implementing your own debugger.

Here is what we did: Modify the OS kernel (which we had control of) to add a special debugging hook. That hook is given the address range of the variables you want to protect against modification. The kernel then takes the VM page(s) containing that address range and write-protects them. Anytime someone tries to modify anything in those pages, the page fault handler will start. The page fault handler has been modified to identify the pages that are being "watched": it looks at the address the page fault happened at, and checks whether it belongs to the variable(s) being watched. If not, it manually performs the write, and then lets the program continue. If yes, it logs the write to a kernel trace log (including the address of the instruction that caused the trap, and the call stack), then performs the write and restarts the program.

The problem with this (other than the sheer complexity, and the need to have experts who understand kernel, VM subsystem, and processor architecture) is that it destroys the program's performance and leaves very large log files. One way to thin out the log files in advance is to decorate the call paths that are "allowed" to write to the variable with an "unlock/lock" pair (which goes to a separate kernel hook that temporarily disables the page protection).

With a team of good people (perhaps a half dozen), and with existing infrastructure (trace collection, kernel configuration), this could be implemented in a few weeks.

No, I don't know of an existing solution.


----------



## jbo (Aug 6, 2022)

reddy Wow! Please teach me the skills of extracting this much detail from as little information as provided by OP's initial post! I am impressed!

ralphbsz If you're constrained to an environment not allowing to attach a debugger to a userland application, how is modifying the kernel an option (i.e. "allowed")? Surely you wouldn't want to modify the kernel of your production environment either, right? I'm honestly asking/interested.


----------



## ralphbsz (Aug 6, 2022)

Our problem was not that we were not allowed to use a debugger. Debuggers just make the program run too slow. Most debuggers implement watchpoints by running the program one instruction at a time, then checking the value of the watched variable after each instruction. That reduces performance by a large factor (10 or 100), uniformly for all parts of the program. That slowdown may make testing impossible. The technique of using the page protection mechanism to look for unauthorized changes only slows the program down when the area that contains the watched variable is being written to. In many cases, a program first initializes lots of data structures, then uses them. You can turn the "watching" on only after initialization, and then run at reasonable performance.


----------



## kpedersen (Aug 6, 2022)

Possibly mprotect(2) the page (mmap(2) that dynamically allocated array). Handle the signal, and check that memory location on each access?


----------



## ralphbsz (Aug 6, 2022)

Can you restart the code after the signal? Meaning, if you find that the access was valid, unprotect the page, run the write instruction again, then re-protect it? Doing that in user space means you end up implementing the core of the debugger inside your program.

Actually, I saw this morning that gdb can do memory watch points without single-stepping the code, but only on HP-UX with PA-RISC, and on Linux with x86. I bet it uses a page protection technique, probably with help from kernel-based debugging aids. Don't know whether that extends to other FOSS Unixes like FreeBSD, and to amd64.


----------



## _martin (Aug 6, 2022)

The guts of gdb is the ptrace syscall. If you can't use gdb for administrative reasons (i.e. tracing is prohibited), you won't be able to watch over it.
To add to Andriy's link: gdb internals: watchpoints. HW watchpoints are HW dependent (kind of stating the obvious, I know).

If you can ptrace, you'd fork (or create a thread) and call PT_TRACE_ME (Linux: PTRACE_TRACEME) within the code you control. But then, if you can ptrace, gdb is the way to go.

Now can you hack around it? As kpedersen said: mprotect the page. Catch the SIGSEGV signal and analyze what is trying to write there. If the conditions are met, restart (continue) the code. If not, you get the answer you were looking for. setjmp(3) and sigsetjmp(3) (and friends) are very helpful. But if you go this route I bet it would be easier to analyze the actual application than to do this.
Or enable tracing on the host.


----------



## Avk (Aug 8, 2022)

zirias@ said:


> I guess it would make sense to elaborate on that first. Not that I don't believe you, but it's possible you overlooked something. I mean, that's exactly what debuggers are designed for.


In production environment we are not allowed to run using gdb as correctly mentioned by reddy.


----------



## zirias@ (Aug 8, 2022)

That's why you _should_ have an identical testing environment. Identical (virtual) machines, network configuration, operating systems, libraries and services used, and so on. If data is involved, it must be cloned (and, if necessary, anonymized) from production. I know many don't have that, but IMO, it's the only sane way. Trying to debug something "in production" is often attempted with all sorts of trickery, it's sometimes successful, sometimes not, nothing you should ever rely on...


----------



## Avk (Aug 8, 2022)

Unfortunately we don't have repro in local environment ...


----------



## Avk (Aug 8, 2022)

reddy said:


> From what I understand, the problem of the author is that because of a number of business constraints, he is not in a position to run an interactive debugging session in the environment where the problem occurs. His only option is therefore to try to log the issue, thus his question. Apparently they are not trying to debug something at development time on their local machine, they are trying to chase down a production issue. It is very common not to be able to attach a debugger in production, personally even if I could I probably wouldn't.
> 
> Since this is a not a crash that would leave an helpful stacktrace with debugging symbol, they are asking for a way to log the changes made to a variable, potentially using interrupts. The situation is made difficult by the fact that the variable is modified by many parts of their complex program.
> 
> ...


As far as I understand, if we rename the variable/array, every use is detected as an error at compile time.
The limitations are:
1> If the array value is modified through some pointer operation (a pointer points to that array element and then modifies it at run time), it won't be detected by the compiler. Of course the compiler would tell us which pointer points to that array element or that array, but we would need to trace all such pointers separately.
2> In our case the array is defined as a macro and it is referred to/used by multiple queues. If we rename it, compilation errors come from all those places, including the queue we are concerned with.

I know there is no straightforward way to debug this.
Thanks !!!


----------



## Crivens (Aug 8, 2022)

You want the code to stop when array[2] gets written, yes? Maybe drop a core dump for analysis?


----------



## Avk (Aug 8, 2022)

Crivens said:


> You want the code to stop when array[2] gets written, yes? Maybe drop a core dump for analysis?


Not exactly. For that we need to know where in the code array[2] is modified to 0 .... we don't know the place. And it is modified from multiple places. Also any value (address) other than 0 (null) is not an issue.


----------



## elgrande (Aug 8, 2022)

One possibility would be to create a setter/wrapper method for array elements with a trace option and change to using the setter instead of changing the array element directly.


----------



## Crivens (Aug 8, 2022)

That is not a planned program flow. You should.. no, you HAVE to fix that. Find where these wild pointers are placed and stamp them out.

You may have success with watch registers in the CPU core, you may also place the array on a page boundary so that [1] is on page A and [2] is on page A+1. Then protect A+1 against write, and collect the core dumps. All this is debugging, you need to get that code into gear. It can't be a permanent thing in the program.


----------



## Andriy (Aug 8, 2022)

Avk said:


> Unfortunately we don't have repro in local environment ...


You can cheat and emulate whatever the debugger would do using ptrace(2) interface. I think that if you are on x86 you should be able to set a hardware watchpoint on the memory location that interests you.


----------



## Avk (Aug 9, 2022)

Andriy said:


> You can cheat and emulate whatever the debugger would do using ptrace(2) interface. I think that if you are on x86 you should be able to set a hardware watchpoint on the memory location that interests you.


But to set a h/w watchpoint, I first need to run the application under 'gdb' .... correct ? We need some mechanism that doesn't cost us the performance of the box/application in the production environment.


----------



## _martin (Aug 9, 2022)

Avk said:


> But to set a h/w watchpoint, I first need to run the application under 'gdb' .... correct ? We need some mechanism that doesn't cost us the performance of the box/application in the production environment.


Yes, that's the benefit of the HW watchpoints, you don't lose performance. 
It doesn't make sense to create your own debugger within the code (using ptrace and logic around HW debug registers). If you need to do that you can simply use gdb instead.

Note if you attach debugger your application will stop. You could create gdb script and attach to application with it (script will have commands to set watchpoint and continue). But still once the watchpoint is hit application will stop. So it's not only about performance but you must keep in mind that application will stop when you hit the watchpoint. It's worth mentioning that watch command in gdb will tell you if HW watchpoint is in place when you set it.

Have you considered printf debugging first? Log with printf anywhere before array is modified, either directly or indirectly with the pointer.


----------



## Crivens (Aug 9, 2022)

I consider writing to an array w.o. being able to find out/know where it is done a bug. Refactor early. Refactor often. So you don't end up in spaghetti code hell.


----------



## Avk (Aug 9, 2022)

_martin said:


> Yes, that's the benefit of the HW watchpoints, you don't lose performance.
> It doesn't make sense to create your own debugger within the code (using ptrace and logic around HW debug registers). If you need to do that you can simply use gdb instead.
> 
> Note if you attach debugger your application will stop. You could create gdb script and attach to application with it (script will have commands to set watchpoint and continue). But still once the watchpoint is hit application will stop. So it's not only about performance but you must keep in mind that application will stop when you hit the watchpoint. It's worth mentioning that watch command in gdb will tell you if HW watchpoint is in place when you set it.
> ...


Sorry, probably I didn't understand your comment completely.

When you talk about HW watchpoint, do you mean to set a HW watchpoint I need gdb, but at the same time it will not lose overall performance ?

I am using the following HW model with FreeBSD 11.2 running.
Intel(R) Xeon(R) CPU

Thanks !!!


----------



## jbo (Aug 9, 2022)

Avk said:


> I am using the following HW model with FreeBSD 11.2 running.


FreeBSD 11.2 has been EOL for a while.


----------



## kpedersen (Aug 9, 2022)

Crivens said:


> I consider writing to an array w.o. being able to find out/know where it is done a bug. Refactor early. Refactor often. So you don't end up in spaghetti code hell.


I am purely guessing but I do suspect that it might not even be due to awkward code but instead some sort of memory error that is overflowing elsewhere "nearby" and writing all over the OP's dynamic array.

AddressSanitizer and Valgrind can only reliably detect access to canaries / bounds padding, whereas if the code is really broken, it could jump right past them!


----------



## Crivens (Aug 9, 2022)

Avk said:


> When you talk about HW watchpoint, do you mean to set a HW watchpoint I need gdb, but at the same time it will not lose overall performance ?


Not necessarily. There are interfaces to the special registers of modern CPUs, such as performance counters. You may be able to utilize them directly. Gdb may use them for you, just check if the performance drops at all or unacceptably.


----------



## Crivens (Aug 9, 2022)

jbodenmann said:


> FreeBSD 11.2 has been EOL for a while.


To reference a long running insider joke : what game is this about?


----------



## jbo (Aug 9, 2022)

Crivens said:


> To reference a long running insider joke : what game is this about?


I'm most likely not around for long enough to get this - please enlighten me.

In any case, the intention was to simply warn about this (while also aligning with the forum guidelines).
Given that this topic is seemingly unrelated to "a problem with FreeBSD" I don't see an issue with this. However, it might still be worth pointing out as OP appears to run a _production_ server running an OS that is EOL for almost three years.


----------



## kpedersen (Aug 9, 2022)

jbodenmann said:


> I'm most likely not around for long enough to get this - please enlighten me.


Ah, I didn't link the two together before Crivens mentioned it.

Is this the specific version of FreeBSD required? Did the OP just turn into a suspect?

(I especially love that sticky. Asking for help for the illicit software on the very post that is warning against doing so is... art; as is the self referential link to the no-METIN sticky)


----------



## _martin (Aug 9, 2022)

Avk said:


> When you talk about HW watchpoint, do you mean to set a HW watchpoint I need gdb, but at the same time it will not lose overall performance ?


Please check out the link I mentioned above about gdb's internals. When using gdb's _watch_ command, gdb will try to use a HW-assisted watchpoint for the condition. If this can be set you won't have any performance degradation, as the HW (CPU) is assisting. If not, a software watchpoint is used. For watch, that means going step by step (instruction by instruction) and verifying whether the condition is met. That is very slow.

Now you said you can't use gdb in production but you can modify the code. The thing is even in the code you'd do what gdb is doing and most likely it won't be that good.

And to reiterate, gdb will stop the application once breakpoint is hit. That's why I mentioned the printf debugging, just to see what is modifying the code and what with.


----------



## Andriy (Aug 9, 2022)

Avk said:


> But to set a h/w watchpoint, I first need to run the application under 'gdb' .... correct ? We need some mechanism that doesn't cost us the performance of the box/application in the production environment.


Did you read the manual page I linked?


----------



## Avk (Aug 10, 2022)

kpedersen said:


> Possibly mprotect(2) the page (mmap(2) that dynamically allocated array). Handle the signal, and check that memory location on each access?


So far mprotect() looks like the most effective approach for debugging my issue.
With this I am able to trace (using a signal handler) most of the read/write operations on the memory location I am interested in.

Whenever someone tries to write some data to that memory location, a signal is generated. As a result the application is now flooded with signals.
Is there any way to generate the signal only when someone tries to write the value 0 to that memory location?
Or, when the signal is generated, is there any way to know (from the signal handler) what value was being written?

Thanks.


----------



## kpedersen (Aug 10, 2022)

Glad to hear you successfully have this (potentially fiddly to implement) debugging code in place.

With the mprotect I am assuming that you only allow read access (*PROT_READ*). The signal only gets triggered during a write
So from your signal handler, can you simply read the value (for now just make the dynamic array pointer a global and use something like extern to access it from the signal handler) and ignore if 0 or abort() if not (and then grab the coredump for a stacktrace).


----------



## Avk (Aug 12, 2022)

kpedersen said:


> Glad to hear you successfully have this (potentially fiddly to implement) debugging code in place.
> 
> With the mprotect I am assuming that you only allow read access (*PROT_READ*). The signal only gets triggered during a write
> So from your signal handler, can you simply read the value (for now just make the dynamic array pointer a global and use something like extern to access it from the signal handler) and ignore if 0 or abort() if not (and then grab the coredump for a stacktrace).


Here is a sample program that I tested.  Based on the output it is observed that the interrupt is generated multiple times although I try to write only once (ptr[0] = 0). I was expecting interrupt to be generated only once.

    mprotect(ptr, size, PROT_READ); // Only read permission is provided
    ptr[0] = 0;

Also from the interrupt handler it is printing the value 9 although I tried to write 0. Is there any way to know that SIGSEGV (11) interrupt is generated because application tries to write 0 at specified memory location ?

*1 : Got 11 for address location : 0x801016640 : 0x801016640 => ptr[0] = 9*



```
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

static int size;
int *ptr;
static unsigned long cnt;

/* SIGSEGV handler: report the faulting address and the value currently stored. */
void handler(int sig_num, siginfo_t *sig, void *unused) {
        (void)unused;
        cnt++;
        /* printf() is not async-signal-safe; tolerated here for a quick test only */
        printf("%lu : Got %d for address location : 0x%lx : 0x%lx => ptr[0] = %d\n", cnt, sig_num, (unsigned long)sig->si_addr, (unsigned long)&ptr[0], ptr[0]);
        if (5 == cnt)  // when count is 5, provide  READ/WRITE permission
                mprotect(ptr, size, PROT_READ | PROT_WRITE);
}

int main(void)
{
        struct sigaction s;
        memset(&s, 0, sizeof(s));
        s.sa_flags = SA_SIGINFO;
        sigemptyset(&s.sa_mask);
        s.sa_sigaction = handler;
        if (sigaction(SIGSEGV, &s, NULL) == -1)
        {
                perror("sigaction");
                return 1;
        }

        size = sysconf(_SC_PAGE_SIZE) * sizeof(int);
        /* malloc() does not guarantee page alignment, which mprotect() may require */
        ptr = malloc(size);
        printf("Starting ....0x%lx : size = %d\n", (unsigned long)ptr, size);
        sleep(5);
        ptr[0] = 9;

        mprotect(ptr, size, PROT_READ); // Only read permission is provided
        ptr[0] = 0;  // faults; the store is retried each time the handler returns
        printf("All completed...\n");
        return 0;
}
```

Output:

    [aadhya@dut078-client02 ~]$ ./mprotect
    Starting ....0x801016640 : size = 16384
    1 : Got 11 for address location : 0x801016640 : 0x801016640 => ptr[0] = 9
    2 : Got 11 for address location : 0x801016640 : 0x801016640 => ptr[0] = 9
    3 : Got 11 for address location : 0x801016640 : 0x801016640 => ptr[0] = 9
    4 : Got 11 for address location : 0x801016640 : 0x801016640 => ptr[0] = 9
    5 : Got 11 for address location : 0x801016640 : 0x801016640 => ptr[0] = 9
    All completed...


----------



## covacat (Aug 12, 2022)

see "How to write a signal handler to catch SIGSEGV?" (stackoverflow.com):

> I want to write a signal handler to catch SIGSEGV.  I protect a block of memory for read or write using  char *buffer; char *p; char a; int pagesize = 4096;  mprotect(buffer,pagesize,PROT_NONE) This


----------



## Paul Floyd (Aug 18, 2022)

kpedersen said:


> I am purely guessing but I do suspect that it might not even be due to awkward code but instead some sort of memory error that is overflowing elsewhere "nearby" and writing all over the OP's dynamic array.
> 
> AddressSanitizer and Valgrind can only reliably detect access to canaries / bounds padding, whereas if the code is really broken, it could jump right past them!



I don't know about asan, but this is a runtime configurable for Valgrind

    --redzone-size=<number>   set minimum size of redzones added before/after
                              heap blocks (in bytes). [16]


----------



## Avk (Aug 19, 2022)

covacat said:


> see
> 
> 
> 
> ...


From the signal handler if we print backtrace, it doesn't give the trace of the code that caused the signal, rather it gives something like this :

    0x400c55 <print_trace+0x1f> at /data/home/user/mprotect
    0x400d2d <handler+0x6a> at /data/home/user/mprotect


----------



## Paul Floyd (Aug 19, 2022)

Avk said:


> From the signal handler if we print backtrace, it doesn't give the trace of the code that caused the signal, rather it gives something like this :
> 
>     0x400c55 <print_trace+0x1f> at /data/home/user/mprotect
>     0x400d2d <handler+0x6a> at /data/home/user/mprotect



That's to be expected. The return address from the signal handler is the "retpoline" (portmanteau word for "return trampoline"). The retpoline is an assembler stub function that just calls the sigreturn syscall. Sigreturn will get the original instruction address from the mcontext that was synthesized when the signal triggered.

Multithreaded code is a bit different: the user signal handler is not called directly. Instead 'thr_sighandler' gets called. This calls the user sighandler (plus other stuff like masking and under some conditions locking), and on return from the user routine calls sigreturn itself.

It should be possible to detect the retpoline and work out the return address so that more of the stack can be displayed (I believe that this is what lldb/gdb do).


----------



## Avk (Aug 22, 2022)

Paul Floyd said:


> That's to be expected. The return address from the signal handler is the "retpoline" (portmanteau word for "return trampoline"). The retpoline is an assembler stub function that just calls the sigreturn syscall. Sigreturn will get the original instruction address from the mcontext that was synthesized when the signal triggered.
> 
> Multithread code is a bit different, the user signal handler is not called directly. Instead 'thr_sighandler' gets called. This calls the user sighandler (plus other stuff like masking and under some conditions locking), and on return from the user routine calls sigreturn itself.
> 
> It should be possible to detect the retpoline and work out the return address so that more of the stack can be displayed (I believe that this is what lldb/gdb do).



As per my understanding, whenever a signal is generated for a process, code inside the trampoline page is executed to move control into kernel mode. Then it jumps back to user mode to execute the user-defined signal handler, as shown in the diagram below. Hence if I print a backtrace within the signal handler, it doesn't show the location where the signal was generated.

Now, to find the location of the code that caused the signal, I probably need to hack the trampoline code and print the backtrace there. Is that really possible?


----------



## Andriy (Aug 22, 2022)

Avk how do you print the backtrace? You never showed that code.


----------



## Avk (Aug 23, 2022)

Andriy  Here is the code :


```
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <execinfo.h>

static int size;
int *ptr;
static unsigned long cnt;

/* Print the call stack of the caller using backtrace(3). */
void print_trace(void) {
    char **strings;
    size_t i, depth;
    enum Constexpr { MAX_SIZE = 1024 };
    void *array[MAX_SIZE];
    depth = backtrace(array, MAX_SIZE);
    strings = backtrace_symbols(array, depth);
    if (strings == NULL)
        return;
    for (i = 0; i < depth; i++)
        printf("%s\n", strings[i]);
    puts("");
    free(strings);
}

/* SIGSEGV handler: report the fault, then dump the handler's own call stack. */
void handler(int sig_num, siginfo_t *sig, void *unused) {
        (void)unused;
        //ucontext_t *u = (ucontext_t *)unused;
        //unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_RIP];
        cnt++;
        /* printf() is not async-signal-safe; tolerated here for a quick test only */
        printf("%lu : Got %d for address location : 0x%lx : 0x%lx => ptr[0] = %d\n", cnt, sig_num, (unsigned long)sig->si_addr, (unsigned long)&ptr[0], ptr[0]);
        print_trace();
        if (5 == cnt)
                mprotect(ptr, size, PROT_READ | PROT_WRITE);
}

/* Reference trace taken outside of any signal handler. */
void test(void)
{
        printf("-- %s --\n", __func__);
        print_trace();
}

int main(void)
{
        struct sigaction s;
        memset(&s, 0, sizeof(s));
        s.sa_flags = SA_SIGINFO;
        s.sa_sigaction = handler;
        sigemptyset(&s.sa_mask);

        if (sigaction(SIGSEGV, &s, NULL) == -1)
        {
                perror("sigaction");
                return 1;
        }

        size = sysconf(_SC_PAGE_SIZE) * sizeof(int);
        /* malloc() does not guarantee page alignment, which mprotect() may require */
        ptr = malloc(size);
        printf("Starting ....0x%lx : size = %d\n", (unsigned long)ptr, size);
        sleep(5);
        ptr[0] = 9;

        mprotect(ptr, size, PROT_READ);

        ptr[0] = 0;  // faults; the store is retried each time the handler returns

        printf("All completed...\n");
        test();
        return 0;
}
```


----------



## Andriy (Aug 23, 2022)

I think that it's a problem with backtrace(3) (or rather libunwind) then. It should be able to walk across a signal frame.
See PR 243746.

How do you build the program?
Maybe try using libunwind from devel/libunwind


----------



## Paul Floyd (Aug 23, 2022)

Avk said:


> Now to know the location of the code that caused signal generated, probably I need to hack the trampoline code and print the backtrace there. Is it really possible ?



As I said, yes it is possible (some gdb on Linux output):

    Breakpoint 2, handle_vtalrm (sig=26) at 452274.c:13
    13      ticks++;
    (gdb) bt
    #0  handle_vtalrm (sig=26) at 452274.c:13
    #1  <signal handler called>
    #2  0x00007ffff7afcfd0 in __write_nocancel () from /lib64/libc.so.6
    #3  0x000000000040124e in main (argc=1, argv=0x7fffffffcdf8) at 452274.c:31

Take a look at the lldb or gdb source to try to see how they do it. I don't know exactly how it works, but my guess is that you need to do 2 things:

1. detect the retpoline or thr_sigreturn functions
2. dig out the mcontext pointer (it's somewhere on the stack) and from that you can get 'addr'
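For the second step there may be a shortcut: a handler installed with SA_SIGINFO receives a pointer to the saved context as its third argument, so the interrupted program counter can be read from there without walking the stack. The sketch below is an assumption-laden illustration, not what gdb or lldb do. The field names are the ones I know for FreeBSD/amd64 (`mc_rip`) and Linux with glibc (`gregs[REG_RIP]` on x86-64, `pc` on aarch64); other platforms fall through to NULL. `context_pc`, `record_pc`, `install_pc_recorder` and `interrupted_pc` are illustrative names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* glibc only exposes REG_RIP when this is defined */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

#if defined(__linux__) && defined(__x86_64__) && !defined(REG_RIP)
#define REG_RIP 16      /* fallback: index of RIP in glibc's x86-64 gregs array */
#endif

/* Program counter of the interrupted code, or NULL on platforms not handled here. */
static void *context_pc(void *uctx)
{
    ucontext_t *uc = (ucontext_t *)uctx;
    assert(uc != NULL);
#if defined(__FreeBSD__) && defined(__amd64__)
    return (void *)uc->uc_mcontext.mc_rip;
#elif defined(__linux__) && defined(__x86_64__)
    return (void *)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__linux__) && defined(__aarch64__)
    return (void *)uc->uc_mcontext.pc;
#else
    return NULL;
#endif
}

/* Address of the instruction that was executing when the last signal arrived. */
static void *volatile interrupted_pc;

static void record_pc(int signo, siginfo_t *info, void *uctx)
{
    (void)signo;
    (void)info;
    interrupted_pc = context_pc(uctx);
}

/* Install record_pc for signo (e.g. SIGSEGV). Returns 0 on success. */
static int install_pc_recorder(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = record_pc;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

For a fault caused by a store, the recorded address should be that of the faulting instruction itself, which a tool such as addr2line can map back to a source line when the binary has debug information. This yields one frame only, not the full call chain.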


----------



## _martin (Aug 23, 2022)

_martin said:


> The thing is even in the code you'd do what gdb is doing and most likely it won't be that good.





Paul Floyd said:


> Take a look at the lldb or gdb source to try to see how they do it.


That's what I've mentioned earlier. While this does make an interesting issue in general, if the focus is to fix the application that is in prod then using gdb is really the way to go. Anything else will cause more issues in prod.

Avk Why is it a problem to replicate this in your own environment ? E.g. do a copy of the physical box to a VM and try there?

One irrelevant point from my side but I couldn't unsee this  


Avk said:


> if (5 == cnt) mprotect(ptr, size, PROT_READ | PROT_WRITE);


This drove me crazy when I read code from my French colleagues. I thought it's a "French thing". My brain is not able to process the code like this, my hemispheres were fighting over this. I always had to manually redo it to `if(cnt==5)` in my local copy to make the code readable for me.


----------



## Paul Floyd (Aug 23, 2022)

_martin said:


> This drove me crazy when I read code from my French colleagues. I thought it's a "French thing". My brain is not able to process the code like this, my hemispheres were fighting over this. I always had to manually redo it to `if(cnt==5)` in my local copy to make the code readable for me.



That's an old-timers' thing, to guard against accidental typos causing errors because of assignment in if statements. If you type = instead of == then

    if (cnt = 5)

is legal and will always be true, whereas

    if (5 = cnt)

will not compile.

Most compilers will warn about this now and ask you to add parens to make it clear that it is deliberate.


----------



## _martin (Aug 23, 2022)

Never thought about it that way. Anyway it's just terrible, even worse compared to switching from intel to att asm syntax and back (which is an interesting problem too but I kinda got used to that). Condition statements written as that one above make me lose focus and interrupt my flow in the head, if that makes sense to others.


----------



## Avk (Aug 23, 2022)

Andriy said:


> I think that it's a problem with backtrace(3) (or rather libunwind) then. It should be able to walk across a signal frame.
> See PR 243746.
> 
> How do you build the program?
> Maybe try using libunwind from devel/libunwind


I used this to build the program:
`gcc mprotect.c -rdynamic -fno-omit-frame-pointer -g -std=c99 -fno-inline -lexecinfo -o mprotect1`

Also modified the program to include both - dump_trace() and print_trace() functions. 


```
static void dump_trace() {
        size_t max_frames = 1024;
        void *buffer[max_frames];
        size_t calls = backtrace(buffer, max_frames);
        printf("##### %s #####\n", __FUNCTION__);
        fprintf(stderr, "dump_trace - have %zu frames\n", calls);
        backtrace_symbols_fd(buffer, calls, 2);
        //_Exit(EXIT_FAILURE);
}
```

With this I see the same kind of output for backtrace. 

##### print_trace #####
0x400e05 <print_trace+0x1f> at /data/home/aadhya/mprotect1
0x400fcd <handler+0x6a> at /data/home/aadhya/mprotect1

##### dump_trace #####
dump_trace - have 2 frames
0x400f0c <dump_trace+0x85> at /data/home/aadhya/mprotect1
0x400fd7 <handler+0x74> at /data/home/aadhya/mprotect1

Note that I have not tried that patch (PR 243746) or library (devel/libunwind) yet.


----------



## Avk (Aug 23, 2022)

_martin said:


> Avk Why is it a problem to replicate this in your own environment ? E.g. do a copy of the physical box to a VM and try there?


Actually we tried hard to reproduce this issue in our local environment, but it is sometimes really difficult to replicate a customer environment (with huge volumes of network traffic of different types and in different sequences). If we had a local repro of this issue, it would have been a lot easier to debug it using gdb.


----------



## _martin (Aug 23, 2022)

Does it mean you did run it in debug environment but you were not able to trigger the issue due to lack of the conditions (traffic) that could cause this ?

Maybe one more angle to look at it - what if the arr[] in the code is victim of the different issue? Is arr[] global variable? Are there any other variables defined in the same scope as arr[] is? Print addresses of those to see which ones end up next to each other. Could the variable next to it be a problem? Such as off-by-one in string operation,etc.


----------



## _martin (Aug 23, 2022)

Btw. you started to get the information from the handler yourself but for some reason you hashed it out. I did a small redo on your handler to demonstrate:

```
void handler(int sig_num, siginfo_t *sig, void *unused) {
        ucontext_t *uc = (ucontext_t *)unused;
        printf("returning to: %p\n", (void *)uc->uc_mcontext.mc_rip);

        // or let's be evil ..
        uc->uc_mcontext.mc_rip = 0xcafec0de;

        if (cnt == 10) {
                printf("let's just end this..\n");
                _exit(1);
        }

        cnt++;
        printf("signum: %d, count: %lu\n", sig_num, cnt);

        if (cnt == 5) {
                mprotect(ptr, size, PROT_READ | PROT_WRITE);
        }
}
```

Also if you read the siginfo(3) man page you'll see this:


> In addition, the following signal-specific information is available:
> ..
> ..
> SIGSEGV    si_addr    address of faulting memory reference


This means the meaning of `sig->si_addr` changes depending on the signal.

In my handler above I'm modifying the saved rip to jump somewhere else. Normally one would use setjmp() and friends for cleaner code but this is just a demonstration.


----------



## bakul (Aug 23, 2022)

breakpoint - related to code. You can always replace the instruction at a breakpoint with something that will cause a trap and allow a debugger to take control, and when you continue or single-step, it will put back the original instruction and continue.

watchpoint - related to data. You can (indirectly via mprotect) fiddle with the TLB entry for a page to cause a trap, but this is not precise enough. This is why Intel added 4 watchpoint address registers. The h/w must watch every data access, and when it matches one of these addresses it does what the associated control bits say.

See `/usr/include/x86/reg.h` for the definition of `__dbreg64` (or 32) and https://en.wikipedia.org/wiki/X86_debug_register
I think you will have to use the ptrace(2) syscall to interface with this. There may be some online tutorial that shows how to use it.


----------



## bakul (Aug 24, 2022)

Avk said:


> In production environment we are not allowed to run using gdb as correctly mentioned by reddy.


You will end up wasting a lot of time, as you will have to implement a bunch of things that gdb already provides. Talk to your manager or the right muckymuck and point out the time and money cost of avoiding gdb. It is better to implement a privilege / permission system as well as logging, so that gdb is used only when absolutely required and under tight control.


----------



## _martin (Aug 24, 2022)

Sometimes it makes sense to read the whole thread before answering. All of this has already been said.


----------



## Avk (Aug 24, 2022)

_martin said:


> Does it mean you did run it in debug environment but you were not able to trigger the issue due to lack of the conditions (traffic) that could cause this ?
> 
> Maybe one more angle to look at it - what if the arr[] in the code is victim of the different issue? Is arr[] global variable? Are there any other variables defined in the same scope as arr[] is? Print addresses of those to see which ones end up next to each other. Could the variable next to it be a problem? Such as off-by-one in string operation,etc.


Once this issue happens, a core file is generated and the process restarts. In our local environment we are not able to generate a core file (and there is no process restart) even with traffic, which means we are not able to reproduce this locally. Maybe the kind of traffic we run for reproduction purposes in the local environment doesn't match that of the customer environment.

Regarding arr[], it is basically part of a big structure and it is dynamically allocated. The parent variable is global and there are many structures involved; this array of pointers, arr[], is part of one of them.

The issue always happens specifically for the arr[1] element, while arr[0] and arr[2] are intact. Hence there is little chance that some other array has overflowed into this one.


----------



## Bobi B. (Aug 25, 2022)

Did you consider running a new thread that basically does

```
volatile uint8_t *ptr = ...;
while (true)
  if (!*ptr)
    force_core_dump();
```
One core will be fully utilised, but with some luck, after a couple of crashes you might be able to catch the culprit by analysing the core dump.
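
A slightly fuller sketch of that watcher thread, assuming pthreads and C11 atomics are available. The `struct watcher` type and the function names are hypothetical; `abort()` stands in for `force_core_dump()` and only produces a core if core dumps are enabled for the process.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

struct watcher {
    volatile uint8_t *addr;   /* byte being watched */
    int abort_on_zero;        /* nonzero: abort() to force a core dump */
    atomic_int stop;          /* set by watcher_stop() */
    atomic_int tripped;       /* set once a zero has been observed */
    pthread_t tid;
};

static void check(struct watcher *w)
{
    if (*w->addr == 0) {
        atomic_store(&w->tripped, 1);
        if (w->abort_on_zero)
            abort();          /* raises SIGABRT in the watched process */
    }
}

static void *watch_loop(void *arg)
{
    struct watcher *w = arg;

    /* Busy-poll until a zero shows up or a stop is requested. */
    while (!atomic_load(&w->stop) && !atomic_load(&w->tripped))
        check(w);
    check(w);                 /* one final look after a stop request */
    return NULL;
}

/* Starts the polling thread; returns 0 on success (pthread_create result). */
int watcher_start(struct watcher *w, volatile uint8_t *addr, int abort_on_zero)
{
    w->addr = addr;
    w->abort_on_zero = abort_on_zero;
    atomic_init(&w->stop, 0);
    atomic_init(&w->tripped, 0);
    return pthread_create(&w->tid, NULL, watch_loop, w);
}

/* Stops the thread; returns 1 if a zero was observed, 0 otherwise. */
int watcher_stop(struct watcher *w)
{
    atomic_store(&w->stop, 1);
    pthread_join(w->tid, NULL);
    return atomic_load(&w->tripped);
}
```

Keep in mind that the core shows the other threads at the moment the zero was noticed, not at the moment it was written, so the culprit may already have moved on by a few instructions or more.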


----------



## _martin (Aug 25, 2022)

Avk said:


> Once this issue happen a core file is generated and process restarts.


Ok, understood. But I see you do have a short downtime of the service, as the process has to be restarted. We don't know what the justification against gdb in your live environment is, but it seems you do have a case for it here.

To elaborate a bit further on my code above: the saved context (the 3rd parameter of the handler) is the context the handler will use for sigreturn (i.e. to return from the handler). There you can examine and possibly control the flow. But this does require you to know what the program is doing, i.e.:


```
./mprotect
Starting ....0x800a09700 : size = 16384
signum: 11, count: 1, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
signum: 11, count: 2, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
signum: 11, count: 3, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
signum: 11, count: 4, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
signum: 11, count: 5, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
ptr: 0
All completed...
```

Let's see what that is with `objdump -d mprotect`:

```
  201c69:    48 8b 04 25 d8 3f 20     mov    rax,QWORD PTR ds:0x203fd8
  201c70:    00
  201c71:    c7 00 00 00 00 00        mov    DWORD PTR [rax],0x0
```
I've compiled the binary with debug symbols so I can easily verify that with `readelf -Wa mprotect | grep 203fd8`

```
40: 0000000000203fd8     8 OBJECT  GLOBAL DEFAULT   24 ptr
```
and confirmed that it's the code:

```
ptr[0] = 0;
```
From the asm output you can see the value 0 is used as immediate, i.e. you won't find this value in saved context.

If you are using threads you can have a race condition in the handler, and you need some logic (a mutex) to control the mprotect. With this handler you should also take care of other possible causes of SIGSEGV. The simplest approach would be to compare si_addr to ptr and exit if they don't match.

The handler I used in my code is pretty much the same; pasting it for completeness:

```
void handler(int sig_num, siginfo_t *sig, void *unused) {
        cnt++;
        ucontext_t *uc = (ucontext_t *)unused;

        printf("signum: %d, count: %lu, returning to: 0x%lx\n", sig_num, cnt, uc->uc_mcontext.mc_rip);

        if (sig->si_addr == ptr) {
                printf("something tried to write to ptr @ 0x%lx\n", uc->uc_mcontext.mc_rip);
        }

        if (cnt == 5) {
                mprotect(ptr, size, PROT_READ | PROT_WRITE);
        }
}
```
I like this problem as an idea. I went through this thread again but didn't find the answer: why is it a problem to just do printf debugging first, to see what is being written where? You know all the locations in the code where arr is used and written to. Just put a printf before each of them to print which function is writing what.


----------



## Avk (Aug 26, 2022)

_martin said:


> Ok, understood. But I see you do have a short downtime of the service, as the process has to be restarted. We don't know what the justification against gdb in your live environment is, but it seems you do have a case for it here.
> 
> To elaborate a bit further on my code above: the saved context (the 3rd parameter of the handler) is the context the handler will use for sigreturn (i.e. to return from the handler). There you can examine and possibly control the flow. But this does require you to know what the program is doing, i.e.:
> 
> ...


This is an *excellent post*, though I need a few clarifications.
In your case it is returning to address 0x201cc1.  ==>   "signum: 11, count: 4, returning to: 0x201cc1"
Ideally it should point to the address that caused the interrupt. But in your case it is address 0x201c71 that contains the instruction that does the 0 assignment.

201c71:    c7 00 00 00 00 00        mov    DWORD PTR [rax],0x0

In my case the return address points to exactly the position where I do the 0 assignment.

2 : signum -> 11,  returning to address ->* 0x401137*, ptr[0] = 9
something tried to write to ptr @ 0x401137

  401130:       48 8b 05 d9 05 20 00    mov    0x2005d9(%rip),%rax        # 601710 <ptr>
*401137:*       c7 00 00 00 00 00       movl   $0x0,(%rax)
  40113d:       bf 81 12 40 00          mov    $0x401281,%edi

The readelf output for my case is this :

$ readelf -Wa mprotect1 | grep 601710
    13: 0000000000601710     8 OBJECT  GLOBAL DEFAULT   23 ptr
    85: 0000000000601710     8 OBJECT  GLOBAL DEFAULT   23 ptr

What is the meaning of the 23 before ptr above?
Thanks !!!


----------



## _martin (Aug 26, 2022)

Avk said:


> Ideally it should point to the address where it is causing the interrupt.


It's exactly that -- the instruction that caused the fault, and hence the interrupt. It can't get any better than this: you have the exact address of the fault.

Addresses can differ as we're most likely using different compilers (and I have slightly different code). Your faulting address is logically the same one (assigning 0 to *ptr), it's just a different virtual address. As you're compiling the code, you can choose pretty much whatever address you like (with some exceptions). If you use this Makefile

```
CFLAGS=-g -O0 -Wall -Wpedantic -Ttext 0xcafe000

mprotect:    mprotect.c
	clang $(CFLAGS) -o mprotect mprotect.c

clean:
	rm -f *.o mprotect
```
Then the faulting address will be somewhere around `0xcafe4a1`.

23 is the section index (the `Ndx` column of the readelf output), i.e. the index of the ELF section in which the symbol is defined. It's not relevant to anything here; it's ELF-specific.


----------



## Avk (Aug 28, 2022)

Probably one last challenge with this approach is this :

When your signal handler returns (assuming it doesn't call exit or longjmp or something that prevents it from actually returning), the code will continue at the point the signal occurred, re-executing the same instruction. Since at this point, the memory protection has not been changed, it will just throw the signal again, and you'll be back in your signal handler in an infinite loop.

That's the reason I used this statement in my signal handler. The signal is generated 5 times before I set read/write permission.

```
if (5 == cnt)
            mprotect(ptr, size, PROT_READ | PROT_WRITE);
```

Now the challenge is that in the field we can't block program execution by generating a signal whenever arr[] is written to. If a signal is generated we need to unblock execution immediately, maybe like this:


```
void handler(int sig_num, siginfo_t *sig, void *unused) {
        // inside signal handler
        ucontext_t *uc = (ucontext_t *)unused;
        cnt++;
        printf("\n\n%d : signum -> %d,  returning to address -> %p, ptr[0] = %d\n", cnt, sig_num, (void *)uc->uc_mcontext.mc_rip, ptr[0]);

        if (sig->si_addr == ptr) {
           printf("something tried to write to ptr @ 0x%lx\n", uc->uc_mcontext.mc_rip);
        }
        mprotect(ptr, size, PROT_READ | PROT_WRITE);
}
```

After the signal handler has executed and returned, control would re-execute the same instruction (the write to arr[]) and no signal would be generated this time. Once the write instruction has completed, I need to protect the memory again somewhere with mprotect(ptr, size, PROT_READ); so that the next time something tries to write to arr[], the signal handler executes again.

Maybe I should start a timer from the signal handler, so that after a few milliseconds the timer handler executes this code and re-enables the memory protection?

```
mprotect(ptr, size, PROT_READ);
```

I am just guessing this could be one approach; there may be a better way to achieve this.


----------



## _martin (Aug 28, 2022)

That timer approach would cause you headaches and be a source of serious performance degradation.

The safest thing that I could think of right now is what was actually suggested here by two people -- use a wrapper function around write to array. Something like:

```
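/* size is the length of the protected region, as in the code above */
void wrapper_write(int *array, int pos, int val) {
    *(array + pos) = val;               /* faults; the handler logs it and unprotects */
    mprotect(array, size, PROT_READ);   /* re-arm the protection for the next write */
}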
```
But if you go through the trouble of changing the code to this yet again, printf debugging will save you a lot of trouble.


----------



## Avk (Sep 1, 2022)

From the signal handler we can tell what is the returning address using : *sig_num, uc->uc_mcontext.mc_rip*


```
void handler(int sig_num, siginfo_t *sig, void *unused) {
        // inside signal handler
        ucontext_t *uc = (ucontext_t *)unused;
        cnt++;
        printf("\n\n%d : signum -> %d,  returning to address -> %p, ptr[0] = %d\n", cnt, sig_num, (void *)uc->uc_mcontext.mc_rip, ptr[0]);

        if (sig->si_addr == ptr) {
           printf("something tried to write to ptr @ 0x%lx\n", uc->uc_mcontext.mc_rip);
        }
        mprotect(ptr, size, PROT_READ | PROT_WRITE);
}
```

If we know the returning address, can we not fetch the whole instruction from that address and print within the signal handler ?


----------



## _martin (Sep 1, 2022)

It's a PITA to decode x86 instructions. Their size is variable (up to 15 bytes), and decoding them is a challenge on its own.
You could, though, print the first 15 bytes in the handler (starting at `uc->uc_mcontext.mc_rip`) and manually decide what to do with them.

Note you said that you have problem with arr[1] only. So you could focus on this in handler. Check if `sig->si_addr == (ptr+1)` and then do deeper actions. Or, once again, do the printf debugging first to see what is being modified when.
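
A rough sketch of such a byte dump, written without stdio so it can be called from a handler. `hex_bytes` is a hypothetical helper, not an existing API; it only formats the raw bytes and does no decoding.

```c
#include <stddef.h>

#define MAX_INSN_BYTES 15   /* longest possible x86 instruction */

/* Formats n bytes starting at pc as "c7 00 ..." into out (NUL-terminated).
 * Returns the string length. Uses no stdio, so it is usable in a handler. */
size_t hex_bytes(const void *pc, size_t n, char *out, size_t outlen)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *p = pc;
    size_t w = 0;
    size_t i;

    /* Each byte takes two hex digits plus a separator (or the final NUL). */
    for (i = 0; i < n && w + 3 <= outlen; i++) {
        out[w++] = digits[p[i] >> 4];
        out[w++] = digits[p[i] & 0x0f];
        out[w++] = ' ';
    }
    if (w > 0)
        w--;                /* the last separator becomes the terminator */
    if (outlen > 0)
        out[w] = '\0';
    return w;
}
```

In the handler you would call something like `hex_bytes((const void *)uc->uc_mcontext.mc_rip, MAX_INSN_BYTES, buf, sizeof(buf))` and emit `buf` with write(2). Reading 15 bytes can run past the end of the faulting instruction, and in the worst case past the end of a mapped page, so treat the tail of the output with care.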


----------



## _martin (Sep 15, 2022)

Avk: I wonder - were you able to find the bug?


----------



## Avk (Sep 16, 2022)

_martin said:


> Avk: I wonder - were you able to find the bug?


Not yet. This bug was kind of off focus for last few weeks ... but need to actively work on this. Looks like many customers have started reporting this recently.


----------



## Avk (Sep 21, 2022)

One quick question: what happens once we protect memory and then free it?
Something like this:
`ptr = (int*)malloc(size);`
`mprotect(ptr, size, PROT_READ);  // enable the protection`

Now, instead of disabling the protection, what happens if we free the memory?
`free(ptr);`

Does it keep generating the interrupt when someone tries to access the memory from the same process?
It is possible that if we call malloc() again, the OS might allocate the same memory which was under protection.
In that case, will the memory protection remain in effect, or would freeing the memory disable the protection?


----------



## _martin (Sep 21, 2022)

Avk said:


> what happen once we protect a memory and then free the memory


It depends on whether the memory was already written to before (and possibly on the system's jemalloc options in malloc.conf).

Generally SIGSEGV would occur, as jemalloc (FreeBSD's heap allocator) is not able to write its metadata to the chunk.
Note, though, that mprotect() granularity is PAGE_SIZE (or even the whole region, according to mprotect(2)), while a heap chunk can be way smaller (and most likely will not start on a page boundary). It was assumed you're getting your memory region from mmap. You should not touch malloc chunks with mprotect.



Avk said:


> Does it keep generating the interrupt once someone tries to access the memory from the same process ?


Yes, SIGSEGV is generated, i.e. your handler is called.



Avk said:


> It is possible that if we call the malloc() again, OS might allocate the same memory which was under protection.


As mentioned above, it depends. No, if a write was done after malloc. Yes, if you did malloc(), mprotect(), and free() without any write to ptr (and no other malloc was done in between) -- the chunk would be returned to the allocator and a subsequent malloc() of a similar size would return the same pointer, but now pointing to read-only memory. A subsequent write would cause SIGSEGV.


----------

