# FreeBSD perf stat equivalent



## eatonphil (Dec 9, 2015)

Is there a FreeBSD equivalent to the Linux command `perf stat -r 10 <your app and arguments>` that will run a command 10 times and give timing data?


----------



## junovitch@ (Dec 12, 2015)

hwpmc(4) or dtrace(1).  See also https://wiki.FreeBSD.org/DTrace.


----------



## diizzy (Dec 13, 2015)

You're probably looking for this:


----------



## Jimmy (Dec 15, 2015)

Is there a video which goes with this presentation?


----------



## tobik@ (Dec 15, 2015)

Jimmy said:


> Is there a video which goes with this presentation?


----------



## Atsuri (Dec 15, 2015)

diizzy: Thanks for the graph. That's exactly what I need!

tobik: Thank you for the share. Makes me proud I'm using such a great OS as FreeBSD.

~Andy


----------



## junovitch@ (Dec 16, 2015)

Thanks all. There is also a good treasure trove of information at Brendan's site (http://www.brendangregg.com/index.html). It's well worth bookmarking that site. Atsuri, I went ahead and marked this "solved" for you.


----------



## Atsuri (Dec 17, 2015)

junovitch@ said:


> Thanks all. There is also a good treasure trove of information at Brendan's site (http://www.brendangregg.com/index.html). It's well worth bookmarking that site. Atsuri, I went ahead and marked this "solved" for you.



Thank you, though I am not the OP. Didn't mean to hijack the thread, sorry. Still, I think this answers the original question.


----------



## junovitch@ (Dec 19, 2015)

Atsuri said:


> Thank you, though I am not the OP. Didn't mean to hijack the thread, sorry. Still, I think this answers the original question.


Indeed. I didn't look closely and assumed from your wording that you were the original poster. I'll leave it to eatonphil to say whether everything that was needed has been answered.


----------



## Paul Floyd (Apr 27, 2022)

And is there a sampling profiler available (like perf record or Oracle collect)?

I know about callgrind and cachegrind, which are a bit different.


----------



## ajs (Apr 27, 2022)

The original question seemed to be about timing the performance of user-land programs with a benchmarking program. There's a good cross-platform tool for that in the benchmarks/hyperfine port: https://github.com/sharkdp/hyperfine
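For the pure timing part of the question (run a command 10 times and summarize), the idea can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not how hyperfine or `perf stat` work internally: the function names `time_runs` and `summarize` are made up for illustration, and it measures wall-clock time only, with no hardware counters.

```python
import statistics
import subprocess
import sys
import time


def time_runs(argv, runs=10):
    """Run argv `runs` times; return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output so terminal I/O does not skew the timing.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return basic statistics over a list of durations."""
    mean = statistics.mean(durations)
    # The sample standard deviation needs at least two data points.
    stdev = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return {"runs": len(durations), "mean": mean, "stdev": stdev,
            "min": min(durations), "max": max(durations)}


if __name__ == "__main__":
    # Placeholder command: start the Python interpreter and exit immediately.
    stats = summarize(time_runs([sys.executable, "-c", "pass"], runs=10))
    print("{runs} runs: mean {mean:.6f} s, stdev {stdev:.6f} s, "
          "min {min:.6f} s, max {max:.6f} s".format(**stats))
```

Replace the placeholder command with your own program and arguments. The per-run overhead of spawning a process from Python is included in each measurement, so very short commands will look slower than they are.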


----------



## Paul Floyd (Apr 28, 2022)

I would not call "perf stat" a timing benchmark tool.

It generates output like this:


```
          0.546392      task-clock:u (msec)       #    0.443 CPUs utilized            ( +-  3.87% )
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
               159      page-faults:u             #    0.291 M/sec
            327450      cycles:u                  #    0.599 GHz                      ( +-  7.86% )  (77.75%)
           1653670      stalled-cycles-frontend:u #  505.01% frontend cycles idle     ( +-  4.26% )
           1585953      stalled-cycles-backend:u  #  484.33% backend cycles idle      ( +-  4.29% )
            187607      instructions:u            #    0.57  insn per cycle
                                                  #    8.81  stalled cycles per insn  ( +-  0.00% )
             41232      branches:u                #   75.463 M/sec                    ( +-  0.00% )
              2965      branch-misses:u           #    7.19% of all branches          ( +- 37.10% )  (22.25%)

       0.001233042 seconds time elapsed                                          ( +- 11.59% )
```
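The values after the `#` in that output are derived from the raw counts on the left. As a rough check, the arithmetic below recomputes several of them from the numbers shown; the dictionary keys and the function name `derived_metrics` are my own labels, not perf's. The results should agree with the printed column to within rounding.

```python
# Raw values copied from the perf stat output above.
COUNTERS = {
    "task_clock_ms": 0.546392,
    "elapsed_s": 0.001233042,
    "page_faults": 159,
    "cycles": 327450,
    "stalled_cycles_frontend": 1653670,
    "stalled_cycles_backend": 1585953,
    "instructions": 187607,
    "branches": 41232,
    "branch_misses": 2965,
}


def derived_metrics(c):
    """Recompute the right-hand column from the raw counters."""
    task_clock_s = c["task_clock_ms"] / 1000.0
    return {
        # CPU time consumed divided by wall-clock time.
        "cpus_utilized": task_clock_s / c["elapsed_s"],
        # Cycles per second of CPU time, expressed in GHz.
        "ghz": c["cycles"] / task_clock_s / 1e9,
        # Page faults per second of CPU time, in millions.
        "page_faults_m_per_s": c["page_faults"] / task_clock_s / 1e6,
        "insn_per_cycle": c["instructions"] / c["cycles"],
        "stalled_cycles_per_insn": c["stalled_cycles_frontend"] / c["instructions"],
        "frontend_idle_pct": 100.0 * c["stalled_cycles_frontend"] / c["cycles"],
        "backend_idle_pct": 100.0 * c["stalled_cycles_backend"] / c["cycles"],
        "branch_miss_pct": 100.0 * c["branch_misses"] / c["branches"],
    }


if __name__ == "__main__":
    for name, value in derived_metrics(COUNTERS).items():
        print(f"{name:26s} {value:10.3f}")
```

Note that the rates are computed against task-clock (CPU time), not against the elapsed wall-clock time.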


Tools like this\* basically do one or both of two things:

1. Use Performance Monitoring Counters (PMCs). These are hardware counters that can monitor events such as cache misses, branch misses, etc. The counts can either be aggregated into statistics or recorded and later viewed as a flamegraph or call tree.
2. Use sampling. In this mode, the tool runs an extra thread with a timer (say 99 Hz) that takes a snapshot of the call stack on every tick. Again, with postprocessing this can generate flamegraphs and call trees.

The OP asked about `perf stat`, which corresponds to point 1 in aggregation mode.
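To make the sampling mode concrete, here is a toy sketch in Python of the "extra thread with a timer" idea described above. It is only an illustration under stated assumptions: the class name `StackSampler` and the helper `busy` are invented for this example, it samples a Python thread from inside the same process, and it says nothing about how perf, collect, or pmcstat actually collect their samples. Each sample is stored as a semicolon-joined stack, which is the folded format commonly fed to flamegraph scripts.

```python
import collections
import sys
import threading
import time


class StackSampler:
    """Sample one thread's Python call stack from a helper thread at a fixed rate."""

    def __init__(self, target_thread_id, hz=99):
        self.target_thread_id = target_thread_id
        self.interval = 1.0 / hz
        self.counts = collections.Counter()  # folded stack -> number of samples
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait() returns False on timeout, so this loop ticks once per interval
        # until stop() sets the event.
        while not self._stop_event.wait(self.interval):
            frame = sys._current_frames().get(self.target_thread_id)
            names = []
            # Walk from the innermost frame outwards.
            while frame is not None:
                names.append(frame.f_code.co_name)
                frame = frame.f_back
            if names:
                # Store root-first, e.g. "<module>;main;busy".
                self.counts[";".join(reversed(names))] += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop_event.set()
        self._thread.join()


def busy(seconds):
    """Hypothetical workload: spin for roughly `seconds` seconds."""
    deadline = time.perf_counter() + seconds
    total = 0
    while time.perf_counter() < deadline:
        total += 1
    return total


if __name__ == "__main__":
    sampler = StackSampler(threading.get_ident(), hz=99)
    sampler.start()
    busy(0.5)
    sampler.stop()
    for stack, samples in sampler.counts.most_common():
        print(stack, samples)
```

Functions that appear in many samples are where the time is going. The number of samples collected depends on scheduling, so it will vary from run to run.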

hyperfine looks interesting, but it seems to be just a benchmarking framework for timing multiple runs.

I need to look at pmcstat more closely, I'm not sure th

\* Linux perf, Oracle collect, and Intel VTune are the ones I know of. There is also gprof, but that needs the application to be recompiled.


----------

