Commit Graph

405 Commits

Author SHA1 Message Date
Linus Torvalds eeee78cf77 Some clean ups and small fixes, but the biggest change is the addition
of the TRACE_DEFINE_ENUM() macro that can be used by tracepoints.
 
 Tracepoints have helper functions for the TP_printk() called
 __print_symbolic() and __print_flags() that lets a numeric number be
 displayed as a a human comprehensible text. What is placed in the
 TP_printk() is also shown in the tracepoint format file such that
 user space tools like perf and trace-cmd can parse the binary data
 and express the values too. Unfortunately, the way the TRACE_EVENT()
 macro works, anything placed in the TP_printk() will be shown pretty
 much exactly as is. The problem arises when enums are used. That's
 because unlike macros, enums will not be changed into their values
 by the C pre-processor. Thus, the enum string is exported to the
 format file, and this makes it useless for user space tools.
 
 The TRACE_DEFINE_ENUM() solves this by converting the enum strings
 in the TP_printk() format into their number, and that is what is
 shown to user space. For example, the tracepoint tlb_flush currently
 has this in its format file:
 
      __print_symbolic(REC->reason,
         { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
         { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
         { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
         { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })
 
 After adding:
 
      TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
      TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
      TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
      TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);
 
 Its format file will contain this:
 
      __print_symbolic(REC->reason,
         { 0, "flush on task switch" },
         { 1, "remote shootdown" },
         { 2, "local shootdown" },
         { 3, "local mm shootdown" })
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVLBTuAAoJEEjnJuOKh9ldjHMIALdRS755TXCZGOf0r7O2akOR
 wMPeum7C+ae1mH+jCsJKUC0/jUfQKaMt/UxoHlipDgcGg8kD2jtGnGCw4Xlwvdsr
 y4rFmcTRSl1mo0zDSsg6ujoupHlVYN0+JPjrd7S3cv/llJoY49zcanNLF7S2XLeM
 dZCtWRLWYpBiWO68ai6AqJTnE/eGFIqBI048qb5Eg8dbK243SSeSIf9Ywhb+VsA+
 aq6F7cWI/H6j4tbeza8tAN19dcwenDro5EfCDY8ARQHJu1f6Y3+DLf2imjkd6Aiu
 JVAoGIjHIpI+djwCZC1u4gi4urjfOqYartrM3Q54tb3YWYqHeNqP2ASI2a4EpYk=
 =Ixwt
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Some clean ups and small fixes, but the biggest change is the addition
  of the TRACE_DEFINE_ENUM() macro that can be used by tracepoints.

  Tracepoints have helper functions for the TP_printk() called
  __print_symbolic() and __print_flags() that lets a numeric number be
  displayed as a a human comprehensible text.  What is placed in the
  TP_printk() is also shown in the tracepoint format file such that user
  space tools like perf and trace-cmd can parse the binary data and
  express the values too.  Unfortunately, the way the TRACE_EVENT()
  macro works, anything placed in the TP_printk() will be shown pretty
  much exactly as is.  The problem arises when enums are used.  That's
  because unlike macros, enums will not be changed into their values by
  the C pre-processor.  Thus, the enum string is exported to the format
  file, and this makes it useless for user space tools.

  The TRACE_DEFINE_ENUM() solves this by converting the enum strings in
  the TP_printk() format into their number, and that is what is shown to
  user space.  For example, the tracepoint tlb_flush currently has this
  in its format file:

     __print_symbolic(REC->reason,
        { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
        { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
        { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
        { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })

  After adding:

     TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
     TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
     TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
     TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);

  Its format file will contain this:

     __print_symbolic(REC->reason,
        { 0, "flush on task switch" },
        { 1, "remote shootdown" },
        { 2, "local shootdown" },
        { 3, "local mm shootdown" })"

* tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits)
  tracing: Add enum_map file to show enums that have been mapped
  writeback: Export enums used by tracepoint to user space
  v4l: Export enums used by tracepoints to user space
  SUNRPC: Export enums in tracepoints to user space
  mm: tracing: Export enums in tracepoints to user space
  irq/tracing: Export enums in tracepoints to user space
  f2fs: Export the enums in the tracepoints to userspace
  net/9p/tracing: Export enums in tracepoints to userspace
  x86/tlb/trace: Export enums in used by tlb_flush tracepoint
  tracing/samples: Update the trace-event-sample.h with TRACE_DEFINE_ENUM()
  tracing: Allow for modules to convert their enums to values
  tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
  tracing: Update trace-event-sample with TRACE_SYSTEM_VAR documentation
  tracing: Give system name a pointer
  brcmsmac: Move each system tracepoints to their own header
  iwlwifi: Move each system tracepoints to their own header
  mac80211: Move message tracepoints to their own header
  tracing: Add TRACE_SYSTEM_VAR to xhci-hcd
  tracing: Add TRACE_SYSTEM_VAR to kvm-s390
  tracing: Add TRACE_SYSTEM_VAR to intel-sst
  ...
2015-04-14 10:49:03 -07:00
Steven Rostedt (Red Hat) 0c564a538a tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
Several tracepoints use the helper functions __print_symbolic() or
__print_flags() and pass in enums that do the mapping between the
binary data stored and the value to print. This works well for reading
the ASCII trace files, but when the data is read via userspace tools
such as perf and trace-cmd, the conversion of the binary value to a
human string format is lost if an enum is used, as userspace does not
have access to what the ENUM is.

For example, the tracepoint trace_tlb_flush() has:

 __print_symbolic(REC->reason,
    { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
    { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
    { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
    { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })

Which maps the enum values to the strings they represent. But perf and
trace-cmd do no know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
not be able to map it.

With TRACE_DEFINE_ENUM(), developers can place these in the event header
files and ftrace will convert the enums to their values:

By adding:

 TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
 TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
 TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
 TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);

 $ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
[...]
 __print_symbolic(REC->reason,
    { 0, "flush on task switch" },
    { 1, "remote shootdown" },
    { 2, "local shootdown" },
    { 3, "local mm shootdown" })

The above is what userspace expects to see, and tools do not need to
be modified to parse them.

Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org

Cc: Guilherme Cox <cox@computer.org>
Cc: Tony Luck <tony.luck@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-04-08 09:39:56 -04:00
Steven Rostedt (Red Hat) 8434dc9340 tracing: Convert the tracing facility over to use tracefs
debugfs was fine for the tracing facility as a quick way to get
an interface. Now that tracing has matured, it should separate itself
from debugfs such that it can be mounted separately without needing
to mount all of debugfs with it. That is, users resist using tracing
because it requires mounting debugfs. Having tracing have its own file
system lets users get the features of tracing without needing to bring
in the rest of the kernel's debug infrastructure.

Another reason for tracefs is that debubfs does not support mkdir.
Currently, to create instances, one does a mkdir in the tracing/instance
directory. This is implemented via a hack that forces debugfs to do
something it is not intended on doing. By converting over to tracefs, this
hack can be removed and mkdir can be properly implemented. This patch does
not address this yet, but it lays the ground work for that to be done.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-02-03 12:48:41 -05:00
Steven Rostedt (Red Hat) c602894814 tracing: Make tracing_init_dentry_tr() static
tracing_init_dentry_tr() is not used outside of trace.c, it should
be static.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-02-02 10:22:23 -05:00
Steven Rostedt (Red Hat) cf6ab6d914 tracing: Add ref count to tracer for when they are being read by pipe
When one of the trace pipe files are being read (by either the trace_pipe
or trace_pipe_raw), do not allow the current_trace to change. By adding
a ref count that is incremented when the pipe files are opened, will
prevent the current_trace from being changed.

This will allow for the removal of the global trace_types_lock from
reading the pipe buffers (which is currently a bottle neck).

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-22 15:39:40 -05:00
Steven Rostedt (Red Hat) 0daa230296 tracing: Add tp_printk cmdline to have tracepoints go to printk()
Add the kernel command line tp_printk option that will have tracepoints
that are active sent to printk() as well as to the trace buffer.

Passing "tp_printk" will activate this. To turn it off, the sysctl
/proc/sys/kernel/tracepoint_printk can have '0' echoed into it. Note,
this only works if the cmdline option is used. Echoing 1 into the sysctl
file without the cmdline option will have no affect.

Note, this is a dangerous option. Having high frequency tracepoints send
their data to printk() can possibly cause a live lock. This is another
reason why this is only active if the command line option is used.

Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-15 10:17:38 -05:00
Steven Rostedt (Red Hat) 5f893b2639 tracing: Move enabling tracepoints to just after rcu_init()
Enabling tracepoints at boot up can be very useful. The tracepoint
can be initialized right after RCU has been. There's no need to
wait for the early_initcall() to be called. That's too late for some
things that can use tracepoints for debugging. Move the logic to
enable tracepoints out of the initcalls and into init/main.c to
right after rcu_init().

This also allows trace_printk() to be used early too.

Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos
Link: http://lkml.kernel.org/r/20141214164104.307127356@goodmis.org

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-15 10:16:50 -05:00
Byungchul Park 8e1e1df29d tracing: Add additional marks to signal very large time deltas
Currently, function graph tracer prints "!" or "+" just before
function execution time to signal a function overhead, depending
on the time. And some tracers tracing latency also print "!" or
"+" just after time to signal overhead, depending on the interval
between events. Even it is usually enough to do that, we sometimes
need to signal for bigger execution time than 100 micro seconds.

For example, I used function graph tracer to detect if there is
any case that exit_mm() takes too much time. I did following steps
in /sys/kernel/debug/tracing. It was easier to detect very large
excution time with patched kernel than with original kernel.

$ echo exit_mm > set_graph_function
$ echo function_graph > current_tracer
$ echo > trace
$ cat trace_pipe > $LOGFILE
 ... (do something and terminate logging)
$ grep "\\$" $LOGFILE
 3) $ 22082032 us |                      } /* kernel_map_pages */
 3) $ 22082040 us |                    } /* free_pages_prepare */
 3) $ 22082113 us |                  } /* free_hot_cold_page */
 3) $ 22083455 us |                } /* free_hot_cold_page_list */
 3) $ 22083895 us |              } /* release_pages */
 3) $ 22177873 us |            } /* free_pages_and_swap_cache */
 3) $ 22178929 us |          } /* unmap_single_vma */
 3) $ 22198885 us |        } /* unmap_vmas */
 3) $ 22206949 us |      } /* exit_mmap */
 3) $ 22207659 us |    } /* mmput */
 3) $ 22207793 us |  } /* exit_mm */

And then, it was easy to find out that a schedule-out occured by
sub_preempt_count() within kernel_map_pages().

To detect very large function exection time caused by either problematic
function implementation or scheduling issues, this patch can be useful.

Link: http://lkml.kernel.org/r/1416789259-24038-1-git-send-email-byungchul.park@lge.com

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-03 17:10:13 -05:00
Steven Rostedt (Red Hat) 9d9add34ec tracing: Have function_graph use trace_seq_has_overflowed()
Instead of doing individual checks all over the place that makes the code
very messy. Just check trace_seq_has_overflowed() at the end or in
strategic places.

This makes the code much cleaner and also helps with getting closer
to removing the return values of trace_seq_printf() and friends.

Link: http://lkml.kernel.org/r/20141114011410.987913836@goodmis.org

Reviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-19 15:25:42 -05:00
Steven Rostedt (Red Hat) 19a7fe2062 tracing: Add trace_seq_has_overflowed() and trace_handle_return()
Adding a trace_seq_has_overflowed() which returns true if the trace_seq
had too much written into it allows us to simplify the code.

Instead of checking the return value of every call to trace_seq_printf()
and friends, they can all be called normally, and at the end we can
return !trace_seq_has_overflowed() instead.

Several functions also return TRACE_TYPE_PARTIAL_LINE when the trace_seq
overflowed and TRACE_TYPE_HANDLED otherwise. Another helper function
was created called trace_handle_return() which takes a trace_seq and
returns these enums. Using this helper function also simplifies the
code.

This change also makes it possible to remove the return values of
trace_seq_printf() and friends. They should instead just be
void functions.

Link: http://lkml.kernel.org/r/20141114011410.365183157@goodmis.org

Reviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-19 15:25:39 -05:00
Steven Rostedt (Red Hat) 243f7610a6 tracing: Move tracing_sched_{switch,wakeup}() into wakeup tracer
The only code that references tracing_sched_switch_trace() and
tracing_sched_wakeup_trace() is the wakeup latency tracer. Those
two functions use to belong to the sched_switch tracer which has
long been removed. These functions were left behind because the
wakeup latency tracer used them. But since the wakeup latency tracer
is the only one to use them, they should be static functions inside
that code.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-11 12:43:15 -05:00
Oleg Nesterov 632537256e tracing: Kill tracing_{start,stop}_sched_switch_record() and tracing_sched_switch_assign_trace()
tracing_{start,stop}_sched_switch_record() have no callers since
87d80de280 "tracing: Remove obsolete sched_switch tracer".

The last caller of tracing_sched_switch_assign_trace() was removed
by 30dbb20e68 "tracing: Remove boot tracer".

Link: http://lkml.kernel.org/p/20140723193501.GA30214@redhat.com

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-11 12:42:23 -05:00
Stanislav Fomichev 6508fa761c tracing: let user specify tracing_thresh after selecting function_graph
Currently, tracing_thresh works only if we specify it before selecting
function_graph tracer. If we do the opposite, tracing_thresh will change
it's value, but it will not be applied.
To fix it, we add update_thresh callback which is called whenever
tracing_thresh is updated and for function_graph tracer we register
handler which reinitializes tracer depending on tracing_thresh.

Link: http://lkml.kernel.org/p/20140718111727.GA3206@stfomichev-desktop.yandex.net

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18 15:48:52 -04:00
Steven Rostedt (Red Hat) da9c3413a2 tracing: Fix check of ftrace_trace_arrays list_empty() check
The check that tests if ftrace_trace_arrays is empty in
top_trace_array(), uses the .prev pointer:

  if (list_empty(ftrace_trace_arrays.prev))

instead of testing the variable itself:

  if (list_empty(&ftrace_trace_arrays))

Although it is technically correct, it is awkward and confusing.
Use the proper method.

Link: http://lkml.kernel.org/r/87oay1bas8.fsf@sejong.aot.lge.com

Reported-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-10 13:53:50 -04:00
Yoshihiro YUNOMAE dc81e5e3ab tracing: Return error if ftrace_trace_arrays list is empty
ftrace_trace_arrays links global_trace.list. However, global_trace
is not added to ftrace_trace_arrays if trace_alloc_buffers() failed.
As the result, ftrace_trace_arrays becomes an empty list. If
ftrace_trace_arrays is an empty list, current top_trace_array() returns
an invalid pointer. As the result, the kernel can induce memory corruption
or panic.

Current implementation does not check whether ftrace_trace_arrays is empty
list or not. So, in this patch, if ftrace_trace_arrays is empty list,
top_trace_array() returns NULL. Moreover, this patch makes all functions
calling top_trace_array() handle it appropriately.

Link: http://lkml.kernel.org/p/20140605223517.32311.99233.stgit@yunodevel

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06 04:47:46 -04:00
Robert Elliott 607e3a2920 tracing: Add funcgraph_tail option to print function name after closing braces
In the function-graph tracer, add a funcgraph_tail option
to print the function name on all } lines, not just
functions whose first line is no longer in the trace
buffer.

If a function calls other traced functions, its total
time appears on its } line.  This change allows grep
to be used to determine the function for which the
line corresponds.

Update Documentation/trace/ftrace.txt to describe
this new option.

Link: http://lkml.kernel.org/p/20140520221041.8359.6782.stgit@beardog.cce.hp.com

Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-20 23:29:32 -04:00
Robert Elliott ccdb594653 tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
in trace_functions_graph.c that are already in
trace.h.

Add TRACE_GRAPH_PRINT_IRQS to trace.h, which is
the only one that is missing.

Link: http://lkml.kernel.org/p/20140520221031.8359.24733.stgit@beardog.cce.hp.com

Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-20 23:28:34 -04:00
Steven Rostedt (Red Hat) b1169cc69b tracing: Remove mock up poll wait function
Now that the ring buffer has a built in way to wake up readers
when there's data, using irq_work such that it is safe to do it
in any context. But it was still using the old "poor man's"
wait polling that checks every 1/10 of a second to see if it
should wake up a waiter. This makes the latency for a wake up
excruciatingly long. No need to do that anymore.

Completely remove the different wait_poll types from the tracers
and have them all use the default one now.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-30 08:40:05 -04:00
Jiaxing Wang 7eea4fce02 tracing/stack_trace: Skip 4 instead of 3 when using ftrace_ops_list_func
When using ftrace_ops_list_func, we should skip 4 instead of 3,
to avoid ftrace_call+0x5/0xb appearing in the stack trace:

        Depth    Size   Location    (110 entries)
        -----    ----   --------
  0)     2956       0   update_curr+0xe/0x1e0
  1)     2956      68   ftrace_call+0x5/0xb
  2)     2888      92   enqueue_entity+0x53/0xe80
  3)     2796      80   enqueue_task_fair+0x47/0x7e0
  4)     2716      28   enqueue_task+0x45/0x70
  5)     2688      12   activate_task+0x22/0x30

Add a function using_ftrace_ops_list_func() to test for this while keeping
ftrace_ops_list_func to remain static.

Link: http://lkml.kernel.org/p/1398006644-5935-2-git-send-email-wangjiaxing@insigma.com.cn

Signed-off-by: Jiaxing Wang <wangjiaxing@insigma.com.cn>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-24 13:36:03 -04:00
Steven Rostedt (Red Hat) 0b9b12c1b8 tracing: Move ftrace_max_lock into trace_array
In preparation for having tracers enabled in instances, the max_lock
should be unique as updating the max for one tracer is a separate
operation than updating it for another tracer using a different max.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 13:59:27 -04:00
Steven Rostedt (Red Hat) 6d9b3fa5e7 tracing: Move tracing_max_latency into trace_array
In preparation for letting the latency tracers be used by instances,
remove the global tracing_max_latency variable and add a max_latency
field to the trace_array that the latency tracers will now use.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 13:59:26 -04:00
Steven Rostedt (Red Hat) 4104d326b6 ftrace: Remove global function list and call function directly
Instead of having a list of global functions that are called,
as only one global function is allow to be enabled at a time, there's
no reason to have a list.

Instead, simply have all the users of the global ops, use the global ops
directly, instead of registering their own ftrace_ops. Just switch what
function is used before enabling the function tracer.

This removes a lot of code as well as the complexity involved with it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 13:59:25 -04:00
Gideon Israel Dsouza 52f5684c8e kernel: use macros from compiler.h instead of __attribute__((...))
To increase compiler portability there is <linux/compiler.h> which
provides convenience macros for various gcc constructs.  Eg: __weak for
__attribute__((weak)).  I've replaced all instances of gcc attributes
with the right macro in the kernel subsystem.

Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:11 -07:00
Steven Rostedt (Red Hat) 591dffdade ftrace: Allow for function tracing instance to filter functions
Create a "set_ftrace_filter" and "set_ftrace_notrace" files in the instance
directories to let users filter of functions to trace for the given instance.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20 12:29:07 -05:00
Steven Rostedt (Red Hat) f20a580627 ftrace: Allow instances to use function tracing
Allow instances (sub-buffers) to enable function tracing.
Each instance will have its own function tracing capability.
For now, instances will not have function stack tracing, or will
they be able to pick and choose what functions they can trace.

Picking and choosing their own functions will come later.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20 12:13:18 -05:00
Steven Rostedt (Red Hat) 50512ab576 tracing: Convert tracer->enabled to counter
As tracers will soon be used by instances, the tracer enabled field
needs to be converted to a counter instead of a boolean.
This counter is protected by the trace_types_lock mutex.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20 12:13:17 -05:00
Steven Rostedt (Red Hat) 607e2ea167 tracing: Set up infrastructure to allow tracers for instances
Currently the tracers (function, function_graph, irqsoff, etc) can only
be used by the top level tracing directory (not for instances).

This sets up the infrastructure to allow instances to be able to
run a separate tracer apart from the what the top level tracing is
doing.

As tracers need to adapt for being used by instances, the tracers
must flag if they can be used by instances or not. Currently only the
'nop' tracer can be used by all instances.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20 12:13:10 -05:00
Steven Rostedt (Red Hat) bf6065b5c7 tracing: Pass trace_array to flag_changed callback
As options (flags) may affect instances instead of being global
the flag_changed() callbacks need to receive the trace_array descriptor
of the instance they will be modifying.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20 12:13:08 -05:00
Steven Rostedt (Red Hat) 8c1a49aedb tracing: Pass trace_array to set_flag callback
As options (flags) may affect instances instead of being global
the set_flag() callbacks need to receive the trace_array descriptor
of the instance they will be modifying.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20 12:13:07 -05:00
Steven Rostedt (Red Hat) d8a30f2034 tracing: Fix rcu handling of event_trigger_data filter field
The filter field of the event_trigger_data structure is protected under
RCU sched locks. It was not annotated as such, and after doing so,
sparse pointed out several locations that required fix ups.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-02 16:17:22 -05:00
Steven Rostedt (Red Hat) 098c879e1f tracing: Add generic tracing_lseek() function
Trace event triggers added a lseek that uses the ftrace_filter_lseek()
function. Unfortunately, when function tracing is not configured in
that function is not defined and the kernel fails to build.

This is the second time that function was added to a file ops and
it broke the build due to requiring special config dependencies.

Make a generic tracing_lseek() that all the tracing utilities may
use.

Also, modify the old ftrace_filter_lseek() to return 0 instead of
1 on WRONLY. Not sure why it was a 1 as that does not make sense.

This also changes the old tracing_seek() to modify the file pos
pointer on WRONLY as well.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-02 16:17:12 -05:00
Tom Zanussi bac5fb97a1 tracing: Add and use generic set_trigger_filter() implementation
Add a generic event_command.set_trigger_filter() op implementation and
have the current set of trigger commands use it - this essentially
gives them all support for filters.

Syntactically, filters are supported by adding 'if <filter>' just
after the command, in which case only events matching the filter will
invoke the trigger.  For example, to add a filter to an
enable/disable_event command:

    echo 'enable_event:system:event if common_pid == 999' > \
              .../othersys/otherevent/trigger

The above command will only enable the system:event event if the
common_pid field in the othersys:otherevent event is 999.

As another example, to add a filter to a stacktrace command:

    echo 'stacktrace if common_pid == 999' > \
                   .../somesys/someevent/trigger

The above command will only trigger a stacktrace if the common_pid
field in the event is 999.

The filter syntax is the same as that described in the 'Event
filtering' section of Documentation/trace/events.txt.

Because triggers can now use filters, the trigger-invoking logic needs
to be moved in those cases - e.g. for ftrace_raw_event_calls, if a
trigger has a filter associated with it, the trigger invocation now
needs to happen after the { assign; } part of the call, in order for
the trigger condition to be tested.

There's still a SOFT_DISABLED-only check at the top of e.g. the
ftrace_raw_events function, so when an event is soft disabled but not
because of the presence of a trigger, the original SOFT_DISABLED
behavior remains unchanged.

There's also a bit of trickiness in that some triggers need to avoid
being invoked while an event is currently in the process of being
logged, since the trigger may itself log data into the trace buffer.
Thus we make sure the current event is committed before invoking those
triggers.  To do that, we split the trigger invocation in two - the
first part (event_triggers_call()) checks the filter using the current
trace record; if a command has the post_trigger flag set, it sets a
bit for itself in the return value, otherwise it directly invoks the
trigger.  Once all commands have been either invoked or set their
return flag, event_triggers_call() returns.  The current record is
then either committed or discarded; if any commands have deferred
their triggers, those commands are finally invoked following the close
of the current event by event_triggers_post_call().

To simplify the above and make it more efficient, the TRIGGER_COND bit
is introduced, which is set only if a soft-disabled trigger needs to
use the log record for filter testing or needs to wait until the
current log record is closed.

The syscall event invocation code is also changed in analogous ways.

Because event triggers need to be able to create and free filters,
this also adds a couple external wrappers for the existing
create_filter and free_filter functions, which are too generic to be
made extern functions themselves.

Link: http://lkml.kernel.org/r/7164930759d8719ef460357f143d995406e4eead.1382622043.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-21 22:02:17 -05:00
Tom Zanussi 7862ad1846 tracing: Add 'enable_event' and 'disable_event' event trigger commands
Add 'enable_event' and 'disable_event' event_command commands.

enable_event and disable_event event triggers are added by the user
via these commands in a similar way and using practically the same
syntax as the analagous 'enable_event' and 'disable_event' ftrace
function commands, but instead of writing to the set_ftrace_filter
file, the enable_event and disable_event triggers are written to the
per-event 'trigger' files:

    echo 'enable_event:system:event' > .../othersys/otherevent/trigger
    echo 'disable_event:system:event' > .../othersys/otherevent/trigger

The above commands will enable or disable the 'system:event' trace
events whenever the othersys:otherevent events are hit.

This also adds a 'count' version that limits the number of times the
command will be invoked:

    echo 'enable_event:system:event:N' > .../othersys/otherevent/trigger
    echo 'disable_event:system:event:N' > .../othersys/otherevent/trigger

Where N is the number of times the command will be invoked.

The above commands will will enable or disable the 'system:event'
trace events whenever the othersys:otherevent events are hit, but only
N times.

This also makes the find_event_file() helper function extern, since
it's useful to use from other places, such as the event triggers code,
so make it accessible.

Link: http://lkml.kernel.org/r/f825f3048c3f6b026ee37ae5825f9fc373451828.1382622043.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-21 22:02:16 -05:00
Tom Zanussi 93e31ffbf4 tracing: Add 'snapshot' event trigger command
Add 'snapshot' event_command.  snapshot event triggers are added by
the user via this command in a similar way and using practically the
same syntax as the analogous 'snapshot' ftrace function command, but
instead of writing to the set_ftrace_filter file, the snapshot event
trigger is written to the per-event 'trigger' files:

    echo 'snapshot' > .../somesys/someevent/trigger

The above command will turn on snapshots for someevent i.e. whenever
someevent is hit, a snapshot will be done.

This also adds a 'count' version that limits the number of times the
command will be invoked:

    echo 'snapshot:N' > .../somesys/someevent/trigger

Where N is the number of times the command will be invoked.

The above command will snapshot N times for someevent i.e. whenever
someevent is hit N times, a snapshot will be done.

Also adds a new tracing_alloc_snapshot() function - the existing
tracing_snapshot_alloc() function is a special version of
tracing_snapshot() that also does the snapshot allocation - the
snapshot triggers would like to be able to do just the allocation but
not take a snapshot; the existing tracing_snapshot_alloc() in turn now
also calls tracing_alloc_snapshot() underneath to do that allocation.

Link: http://lkml.kernel.org/r/c9524dd07ce01f9dcbd59011290e0a8d5b47d7ad.1382622043.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
[ fix up from kbuild test robot <fengguang.wu@intel.com report ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-21 22:01:22 -05:00
Tom Zanussi 85f2b08268 tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.

'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.

The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.

The event trigger functionality is built on top of SOFT_DISABLE
functionality.  It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires.  Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that.  Event trigger essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function.  Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.

The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.

The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.

The standard open, read, and release file operations are implemented
here.

The open() implementation sets up for the various open modes of the
'trigger' file.  It creates and attaches the trigger iterator and sets
up the command parser.  If opened for reading set up the trigger
seq_ops.

The read() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.

The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.

A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
call to it from the trace event initialization code.

also added are a couple trigger-specific data structures needed for
these implementations such as a trigger iterator and a struct for
trigger-specific data.

A couple structs consisting mostly of function meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations.  They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.

The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event.  It essentially coordinates adding a live trigger instance to
the event, and arming the triggering the event.

Every event_command func() implementation essentially does the
same thing for any command:

   - choose ops - use the value of param to choose either a number or
     count version of event_trigger_ops specific to the command
   - do the register or unregister of those ops
   - associate a filter, if specified, with the triggering event

The reg() and unreg() ops allow command-specific implementations for
event_trigger_op registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized.  When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite.  The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.

Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.

The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions.  This allows func()
implementations to use command-specific blobs and supports code
re-use.

The event_trigger_ops.func() command corrsponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked.  The other functions are used to list the trigger when
needed, along with a couple mundane book-keeping functions.

This also moves event_file_data() into trace.h so it can be used
outside of trace_events.c.

Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Idea-by: Steve Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-20 18:40:22 -05:00
Linus Torvalds b29c8306a3 This batch of changes is mostly clean ups and small bug fixes.
The only real feature that was added this release is from Namhyung Kim,
 who introduced "set_graph_notrace" filter that lets you run the function
 graph tracer and not trace particular functions and their call chain.
 
 Tom Zanussi added some updates to the ftrace multibuffer tracing that
 made it more consistent with the top level tracing.
 
 One of the fixes for perf function tracing required an API change in
 RCU; the addition of "rcu_is_watching()". As Paul McKenney is pushing
 that change in this release too, he gave me a branch that included
 all the changes to get that working, and I pulled that into my tree
 in order to complete the perf function tracing fix.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQEcBAABAgAGBQJSgX5SAAoJEKQekfcNnQGulUAH/jORqJrKaNAulmZ314VsAqfa
 zMtF5UAAPf7kqc3AN/jtFrhJUNEfxWOo7A4r0FsM/rKdWJF+98GA6aqYVD+XoWFt
 +36fg1enxbXUjixQ96Uh+o1+BJUgYDqljuWzqSu/oiXWfWwl8+WL4kcbhb+V9WcF
 SpdzLCWVZRfhyDiN3+0zvyQ8RSG2Pd7CWn9zroI0e4sxGo0Ki6JUnIcXtZGOBDOQ
 IIZdjXvGSfpJ+3u3XvRPXJcltRCtOsVWxYzrmvRlmHDW5QMe1+WmmrlojTePrLaJ
 xn8+3WINqetAR+ZQnazbpt1XzJzKa8QtFgpiN0kT6qL7cg3N1Owc4vLGohl7wok=
 =Nesf
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing update from Steven Rostedt:
 "This batch of changes is mostly clean ups and small bug fixes.  The
  only real feature that was added this release is from Namhyung Kim,
  who introduced "set_graph_notrace" filter that lets you run the
  function graph tracer and not trace particular functions and their
  call chain.

  Tom Zanussi added some updates to the ftrace multibuffer tracing that
  made it more consistent with the top level tracing.

  One of the fixes for perf function tracing required an API change in
  RCU; the addition of "rcu_is_watching()".  As Paul McKenney is pushing
  that change in this release too, he gave me a branch that included all
  the changes to get that working, and I pulled that into my tree in
  order to complete the perf function tracing fix"

* tag 'trace-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Add rcu annotation for syscall trace descriptors
  tracing: Do not use signed enums with unsigned long long in fgragh output
  tracing: Remove unused function ftrace_off_permanent()
  tracing: Do not assign filp->private_data to freed memory
  tracing: Add helper function tracing_is_disabled()
  tracing: Open tracer when ftrace_dump_on_oops is used
  tracing: Add support for SOFT_DISABLE to syscall events
  tracing: Make register/unregister_ftrace_command __init
  tracing: Update event filters for multibuffer
  recordmcount.pl: Add support for __fentry__
  ftrace: Have control op function callback only trace when RCU is watching
  rcu: Do not trace rcu_is_watching() functions
  ftrace/x86: skip over the breakpoint for ftrace caller
  trace/trace_stat: use rbtree postorder iteration helper instead of opencoding
  ftrace: Add set_graph_notrace filter
  ftrace: Narrow down the protected area of graph_lock
  ftrace: Introduce struct ftrace_graph_data
  ftrace: Get rid of ftrace_graph_filter_enabled
  tracing: Fix potential out-of-bounds in trace_get_user()
  tracing: Show more exact help information about snapshot
2013-11-16 12:23:18 -08:00
Steven Rostedt (Red Hat) 3a81a5210b tracing: Add rcu annotation for syscall trace descriptors
sparse complains about the enter/exit_sysycall_files[] variables being
dereferenced with rcu_dereference_sched(). The fields need to be
annotated with __rcu.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-11 11:47:06 -05:00
Peter Zijlstra e5137b50a0 ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED
Since the introduction of PREEMPT_NEED_RESCHED in:

  f27dde8dee ("sched: Add NEED_RESCHED to the preempt_count")

we need to be able to look at both TIF_NEED_RESCHED and
PREEMPT_NEED_RESCHED to understand the full preemption behaviour.

Add it to the trace output.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/20131004152826.GP3081@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-11 12:43:39 +01:00
Steven Rostedt (Red Hat) 6fc84ea70e tracing: Do not use signed enums with unsigned long long in fgragh output
The duration field of print_graph_duration() can also be used
to do the space filling by passing an enum in it:

  DURATION_FILL_FULL
  DURATION_FILL_START
  DURATION_FILL_END

The problem is that these are enums and defined as negative,
but the duration field is unsigned long long. Most archs are
fine with this but blackfin fails to compile because of it:

kernel/built-in.o: In function `print_graph_duration':
kernel/trace/trace_functions_graph.c:782: undefined reference to `__ucmpdi2'

Overloading a unsigned long long with an signed enum is just
bad in principle. We can accomplish the same thing by using
part of the flags field instead.

Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06 15:26:56 -05:00
Geyslan G. Bem 2e86421deb tracing: Add helper function tracing_is_disabled()
This patch creates the function 'tracing_is_disabled', which
can be used outside of trace.c.

Link: http://lkml.kernel.org/r/1382141754-12155-1-git-send-email-geyslan@gmail.com

Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06 11:06:00 -05:00
Tom Zanussi d562aff93b tracing: Add support for SOFT_DISABLE to syscall events
The original SOFT_DISABLE patches didn't add support for soft disable
of syscall events; this adds it.

Add an array of ftrace_event_file pointers indexed by syscall number
to the trace array and remove the existing enabled bitmaps, which as a
result are now redundant.  The ftrace_event_file structs in turn
contain the soft disable flags we need for per-syscall soft disable
accounting.

Adding ftrace_event_files also means we can remove the USE_CALL_FILTER
bit, thus enabling multibuffer filter support for syscall events.

Link: http://lkml.kernel.org/r/6e72b566e85d8df8042f133efbc6c30e21fb017e.1382620672.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05 17:48:49 -05:00
Tom Zanussi f306cc82a9 tracing: Update event filters for multibuffer
The trace event filters are still tied to event calls rather than
event files, which means you don't get what you'd expect when using
filters in the multibuffer case:

Before:

  # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
  # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
  bytes_alloc > 8192
  # mkdir /sys/kernel/debug/tracing/instances/test1
  # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
  # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
  bytes_alloc > 2048
  # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
  bytes_alloc > 2048

Setting the filter in tracing/instances/test1/events shouldn't affect
the same event in tracing/events as it does above.

After:

  # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
  # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
  bytes_alloc > 8192
  # mkdir /sys/kernel/debug/tracing/instances/test1
  # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
  # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
  bytes_alloc > 8192
  # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
  bytes_alloc > 2048

We'd like to just move the filter directly from ftrace_event_call to
ftrace_event_file, but there are a couple cases that don't yet have
multibuffer support and therefore have to continue using the current
event_call-based filters.  For those cases, a new USE_CALL_FILTER bit
is added to the event_call flags, whose main purpose is to keep the
old behavior for those cases until they can be updated with
multibuffer support; at that point, the USE_CALL_FILTER flag (and the
new associated call_filter_check_discard() function) can go away.

The multibuffer support also made filter_current_check_discard()
redundant, so this change removes that function as well and replaces
it with filter_check_discard() (or call_filter_check_discard() as
appropriate).

Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05 16:50:20 -05:00
Namhyung Kim 29ad23b004 ftrace: Add set_graph_notrace filter
The set_graph_notrace filter is analogous to set_ftrace_notrace and
can be used for eliminating uninteresting part of function graph trace
output.  It also works with set_graph_function nicely.

  # cd /sys/kernel/debug/tracing/
  # echo do_page_fault > set_graph_function
  # perf ftrace live true
   2)               |  do_page_fault() {
   2)               |    __do_page_fault() {
   2)   0.381 us    |      down_read_trylock();
   2)   0.055 us    |      __might_sleep();
   2)   0.696 us    |      find_vma();
   2)               |      handle_mm_fault() {
   2)               |        handle_pte_fault() {
   2)               |          __do_fault() {
   2)               |            filemap_fault() {
   2)               |              find_get_page() {
   2)   0.033 us    |                __rcu_read_lock();
   2)   0.035 us    |                __rcu_read_unlock();
   2)   1.696 us    |              }
   2)   0.031 us    |              __might_sleep();
   2)   2.831 us    |            }
   2)               |            _raw_spin_lock() {
   2)   0.046 us    |              add_preempt_count();
   2)   0.841 us    |            }
   2)   0.033 us    |            page_add_file_rmap();
   2)               |            _raw_spin_unlock() {
   2)   0.057 us    |              sub_preempt_count();
   2)   0.568 us    |            }
   2)               |            unlock_page() {
   2)   0.084 us    |              page_waitqueue();
   2)   0.126 us    |              __wake_up_bit();
   2)   1.117 us    |            }
   2)   7.729 us    |          }
   2)   8.397 us    |        }
   2)   8.956 us    |      }
   2)   0.085 us    |      up_read();
   2) + 12.745 us   |    }
   2) + 13.401 us   |  }
  ...

  # echo handle_mm_fault > set_graph_notrace
  # perf ftrace live true
   1)               |  do_page_fault() {
   1)               |    __do_page_fault() {
   1)   0.205 us    |      down_read_trylock();
   1)   0.041 us    |      __might_sleep();
   1)   0.344 us    |      find_vma();
   1)   0.069 us    |      up_read();
   1)   4.692 us    |    }
   1)   5.311 us    |  }
  ...

Link: http://lkml.kernel.org/r/1381739066-7531-5-git-send-email-namhyung@kernel.org

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-18 22:23:16 -04:00
Namhyung Kim 9aa72b4bf8 ftrace: Get rid of ftrace_graph_filter_enabled
The ftrace_graph_filter_enabled means that user sets function filter
and it always has same meaning of ftrace_graph_count > 0.

Link: http://lkml.kernel.org/r/1381739066-7531-2-git-send-email-namhyung@kernel.org

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-18 22:15:25 -04:00
Linus Torvalds 7eb69529cb Not much changes for the 3.12 merge window. The major tracing changes
are still in flux, and will have to wait for 3.13.
 
 The changes for 3.12 are mostly clean ups and minor fixes.
 
 H. Peter Anvin added a check to x86_32 static function tracing that
 helps a small segment of the kernel community.
 
 Oleg Nesterov had a few changes from 3.11, but were mostly clean ups
 and not worth pushing in the -rc time frame.
 
 Li Zefan had small clean up with annotating a raw_init with __init.
 
 I fixed a slight race in updating function callbacks, but the race
 is so small and the bug that happens when it occurs is so minor it's
 not even worth pushing to stable.
 
 The only real enhancement is from Alexander Z Lam that made the
 tracing_cpumask work for trace buffer instances, instead of them all
 sharing a global cpumask.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQEcBAABAgAGBQJSLJm1AAoJEOdOSU1xswtMSu0H/0/Uuh0D5VhANZRcTATY4gUO
 n3WH6sm3atOxH+cbeYQcFXxOcvRcR2n90tvCMpiFlPiC0NiNR1yjro3VLS4zWb77
 twq7gABdJf+Tdq7sOBmSzmY5vRKQVHIXvAfC27mBez38nCWZz0BjJGEsPBwoly25
 ZaiCbKlusw/QKIEy40tuKUL/rXF6yEWnQrMujhBbyNm0w7sJVdfnd+HHmCvy15H2
 IQE1g83d/dAMBjFY2BYg77J+oV6qmJxql2itvDivQWXHqFb52Jw3ZTwHwWLZlPYU
 AZcHtYGs2lSUscQLF56LejB7zZyE8taUufExFEVexXxZS5u7nNPXsPrA2LOOK70=
 =JWO6
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Not much changes for the 3.12 merge window.  The major tracing changes
  are still in flux, and will have to wait for 3.13.

  The changes for 3.12 are mostly clean ups and minor fixes.

  H Peter Anvin added a check to x86_32 static function tracing that
  helps a small segment of the kernel community.

  Oleg Nesterov had a few changes from 3.11, but were mostly clean ups
  and not worth pushing in the -rc time frame.

  Li Zefan had small clean up with annotating a raw_init with __init.

  I fixed a slight race in updating function callbacks, but the race is
  so small and the bug that happens when it occurs is so minor it's not
  even worth pushing to stable.

  The only real enhancement is from Alexander Z Lam that made the
  tracing_cpumask work for trace buffer instances, instead of them all
  sharing a global cpumask"

* tag 'trace-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace/rcu: Do not trace debug_lockdep_rcu_enabled()
  x86-32, ftrace: Fix static ftrace when early microcode is enabled
  ftrace: Fix a slight race in modifying what function callback gets traced
  tracing: Make tracing_cpumask available for all instances
  tracing: Kill the !CONFIG_MODULES code in trace_events.c
  tracing: Don't pass file_operations array to event_create_dir()
  tracing: Kill trace_create_file_ops() and friends
  tracing/syscalls: Annotate raw_init function with __init
2013-09-09 14:42:15 -07:00
Ingo Molnar 7d992feb76 Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

"
 * Update RCU documentation.  These were posted to LKML at
   https://lkml.org/lkml/2013/8/19/611.

 * Miscellaneous fixes.  These were posted to LKML at
   https://lkml.org/lkml/2013/8/19/619.

 * Full-system idle detection.  This is for use by Frederic
   Weisbecker's adaptive-ticks mechanism.  Its purpose is
   to allow the timekeeping CPU to shut off its tick when
   all other CPUs are idle.  These were posted to LKML at
   https://lkml.org/lkml/2013/8/19/648.

 * Improve rcutorture test coverage.  These were posted to LKML at
   https://lkml.org/lkml/2013/8/19/675.
"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-03 07:41:11 +02:00
Alexander Z Lam ccfe9e42e4 tracing: Make tracing_cpumask available for all instances
Allow tracer instances to disable tracing by cpu by moving
the static global tracing_cpumask into trace_array.

Link: http://lkml.kernel.org/r/921622317f239bfc2283cac2242647801ef584f2.1375980149.git.azl@google.com

Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
Cc: David Sharp <dhsharp@google.com>
Cc: Alexander Z Lam <lambchop468@gmail.com>
Signed-off-by: Alexander Z Lam <azl@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-08-22 12:45:24 -04:00
Steven Rostedt (Red Hat) 102c9323c3 tracing: Add __tracepoint_string() to export string pointers
There are several tracepoints (mostly in RCU), that reference a string
pointer and uses the print format of "%s" to display the string that
exists in the kernel, instead of copying the actual string to the
ring buffer (saves time and ring buffer space).

But this has an issue with userspace tools that read the binary buffers
that has the address of the string but has no access to what the string
itself is. The end result is just output that looks like:

 rcu_dyntick:          ffffffff818adeaa 1 0
 rcu_dyntick:          ffffffff818adeb5 0 140000000000000
 rcu_dyntick:          ffffffff818adeb5 0 140000000000000
 rcu_utilization:      ffffffff8184333b
 rcu_utilization:      ffffffff8184333b

The above is pretty useless when read by the userspace tools. Ideally
we would want something that looks like this:

 rcu_dyntick:          Start 1 0
 rcu_dyntick:          End 0 140000000000000
 rcu_dyntick:          Start 140000000000000 0
 rcu_callback:         rcu_preempt rhp=0xffff880037aff710 func=put_cred_rcu 0/4
 rcu_callback:         rcu_preempt rhp=0xffff880078961980 func=file_free_rcu 0/5
 rcu_dyntick:          End 0 1

The trace_printk() which also only stores the address of the string
format instead of recording the string into the buffer itself, exports
the mapping of kernel addresses to format strings via the printk_format
file in the debugfs tracing directory.

The tracepoint strings can use this same method and output the format
to the same file and the userspace tools will be able to decipher
the address without any modification.

The tracepoint strings need its own section to save the strings because
the trace_printk section will cause the trace_printk() buffers to be
allocated if anything exists within the section. trace_printk() is only
used for debugging and should never exist in the kernel, we can not use
the trace_printk sections.

Add a new tracepoint_str section that will also be examined by the output
of the printk_format file.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-26 13:39:44 -04:00
Oleg Nesterov 9c01fe4593 tracing: Kill trace_cpu struct/members
After the previous changes trace_array_cpu->trace_cpu and
trace_array->trace_cpu becomes write-only. Remove these members
and kill "struct trace_cpu" as well.

As a side effect this also removes memset(per_cpu_memory, 0).
It was not needed, alloc_percpu() returns zero-filled memory.

Link: http://lkml.kernel.org/r/20130723152613.GA23741@redhat.com

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-24 11:22:53 -04:00
Oleg Nesterov a644a7e958 tracing: Kill trace_array->waiter
Trivial. trace_array->waiter has no users since 6eaaa5d5
"tracing/core: use appropriate waiting on trace_pipe".

Link: http://lkml.kernel.org/r/20130719142036.GA1594@redhat.com

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-19 10:56:02 -04:00