linux

Commit Graph

Author	SHA1	Message	Date
Steven Rostedt	e74da5235c	tracing: fix seq read from trace files The buffer used by trace_seq was updated incorrectly. Instead of consuming what was actually read, it consumed the rest of the buffer on reads. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-04 20:31:11 -05:00
Steven Rostedt	2dc5d12b1f	tracing: do not return EFAULT if read copied anything Impact: fix trace read to conform to standards Andrew Morton, Theodore Tso and H. Peter Anvin brought to my attention that a userspace read should not return -EFAULT if it succeeded in copying anything. It should only return -EFAULT if it failed to copy at all. This patch modifies the check of copy_from_user and updates the return code appropriately. I also used H. Peter Anvin's short cut rule to just test ret == count. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-04 19:10:05 -05:00
Steven Rostedt	4f3640f8a3	ring-buffer: fix timestamp in partial ring_buffer_page_read If a partial ring_buffer_page_read happens, then some of the incremental timestamps may be lost. This patch writes the recent timestamp into the page that is passed back to the caller. A partial ring_buffer_page_read is where the full page would not be written back to the user, and instead, just part of the page is copied to the user. A full page would be a page swap with the ring buffer and the timestamps would be correct. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-04 19:01:45 -05:00
Steven Rostedt	e543ad7691	tracing: add cpu_file intialization for ftrace_dump Impact: fix to ftrace_dump output corruption The commit: `b04cc6b1f6` tracing/core: introduce per cpu tracing files added a new field to the iterator called cpu_file. This was a handle to differentiate between the per cpu trace output files and the all cpu "trace" file. The all cpu "trace" file required setting this to TRACE_PIPE_ALL_CPU. The problem is that the ftrace_dump sets up its own iterator but was not updated to handle this change. The result was only CPU 0 printing out on crash and a lot of "<0>"'s also being printed. Reported-by: Thomas Gleixner <tglx@linuxtronix.de> Tested-by: Darren Hart <dvhtc@us.ibm.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-04 18:32:28 -05:00
Peter Zijlstra	efed792d67	tracing: add lockdep tracepoints for lock acquire/release Augment the traces with lock names when lockdep is available: 1) \| down_read_trylock() { 1) \| _spin_lock_irqsave() { 1) \| /* lock_acquire: &sem->wait_lock / 1) 4.201 us \| } 1) \| _spin_unlock_irqrestore() { 1) \| / lock_release: &sem->wait_lock / 1) 3.523 us \| } 1) \| / lock_acquire: try read &mm->mmap_sem / 1) + 13.386 us \| } 1) 1.635 us \| find_vma(); 1) \| handle_mm_fault() { 1) \| __do_fault() { 1) \| filemap_fault() { 1) \| find_lock_page() { 1) \| find_get_page() { 1) \| / lock_acquire: read rcu_read_lock / 1) \| / lock_release: rcu_read_lock / 1) 5.697 us \| } 1) 8.158 us \| } 1) + 11.079 us \| } 1) \| _spin_lock() { 1) \| / lock_acquire: __pte_lockptr(page) / 1) 3.949 us \| } 1) 1.460 us \| page_add_file_rmap(); 1) \| _spin_unlock() { 1) \| / lock_release: __pte_lockptr(page) / 1) 3.115 us \| } 1) \| unlock_page() { 1) 1.421 us \| page_waitqueue(); 1) 1.220 us \| __wake_up_bit(); 1) 6.519 us \| } 1) + 34.328 us \| } 1) + 37.452 us \| } 1) \| up_read() { 1) \| / lock_release: &mm->mmap_sem / 1) \| _spin_lock_irqsave() { 1) \| / lock_acquire: &sem->wait_lock / 1) 3.865 us \| } 1) \| _spin_unlock_irqrestore() { 1) \| / lock_release: &sem->wait_lock */ 1) 8.562 us \| } 1) + 17.370 us \| } Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: =?ISO-8859-1?Q?T=F6r=F6k?= Edwin <edwintorok@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1236166375.5330.7209.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-04 18:49:58 +01:00
Steven Rostedt	2cadf9135e	tracing: add binary buffer files for use with splice Impact: new feature This patch creates a directory of files that correspond to the per CPU ring buffers. These are binary files and are made to be used with splice. This is the fastest way to extract data from the ftrace ring buffers. Thanks to Jiaying Zhang for pushing me to get this code fixed, and to Eduard - Gabriel Munteanu for his splice code that helped me debug my code. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-03 21:01:55 -05:00
Steven Rostedt	474d32b68d	ring-buffer: make ring_buffer_read_page read from start on partial page Impact: dont leave holes in read buffer page The ring_buffer_read_page swaps a given page with the reader page of the ring buffer, if certain conditions are set: 1) requested length is big enough to hold entire page data 2) a writer is not currently on the page 3) the page is not partially consumed. Instead of swapping with the supplied page. It copies the data to the supplied page instead. But currently the data is copied in the same offset as the source page. This causes a hole at the start of the reader page. This complicates the use of this function. Instead, it should copy the data at the beginning of the function and update the index fields accordingly. Other small clean ups are also done in this patch. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-03 20:52:27 -05:00
Steven Rostedt	e3d6bf0a07	ring-buffer: replace sizeof of event header with offsetof Impact: fix to possible alignment problems on some archs. Some arch compilers include an NULL char array in the sizeof field. Since the ring_buffer_event type includes one of these, it is better to use the "offsetof" instead, to avoid strange bugs on these archs. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-03 20:52:01 -05:00
Steven Rostedt	ef7a4a1614	ring-buffer: fix ring_buffer_read_page The ring_buffer_read_page was broken if it were to only copy part of the page. This patch fixes that up as well as adds a parameter to allow a length field, in order to only copy part of the buffer page. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-03 20:51:24 -05:00
Steven Rostedt	41be4da4e8	ring-buffer: reset write field for ring_buffer_read_page Impact: fix ring_buffer_read_page After a page is swapped into the ring buffer, the write field must also be reset. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-03 20:50:54 -05:00
Steven Rostedt	633ddaa7f4	tracing: fix return value to registering events The registering of events had the return value check backwards. A zero returned is success, the check had it as a failure. This patch also fixes a missing "\n" in the warning that the check failed. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-03 09:43:50 -05:00
Steven Rostedt	96ccd21cd1	tracing: add print format to event trace format files This patch adds the internal print format used to print the raw events to the event trace point format file. # cat /debug/tracing/events/sched/sched_switch/format name: sched_switch ID: 29 format: field:unsigned char type; offset:0; size:1; field:unsigned char flags; offset:1; size:1; field:unsigned char preempt_count; offset:2; size:1; field:int pid; offset:4; size:4; field:int tgid; offset:8; size:4; field:pid_t prev_pid; offset:12; size:4; field:int prev_prio; offset:16; size:4; field special:char next_comm[TASK_COMM_LEN]; offset:20; size:16; field:pid_t next_pid; offset:36; size:4; field:int next_prio; offset:40; size:4; print fmt: "prev %d:%d ==> next %s:%d:%d" Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-02 15:22:21 -05:00
Steven Rostedt	c5e4e19271	tracing: add trace name and id to event formats To be able to identify the trace in the binary format output, the id of the trace event (which is dynamically assigned) must also be listed. This patch adds the name of the trace point as well as the id assigned. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-02 15:10:02 -05:00
Steven Rostedt	91729ef966	tracing: add ftrace headers to event format files This patch includes the ftrace header to the event formats files: # cat /debug/tracing/events/sched/sched_switch/format field:unsigned char type; offset:0; size:1; field:unsigned char flags; offset:1; size:1; field:unsigned char preempt_count; offset:2; size:1; field:int pid; offset:4; size:4; field:int tgid; offset:8; size:4; field:pid_t prev_pid; offset:12; size:4; field:int prev_prio; offset:16; size:4; field special:char next_comm[TASK_COMM_LEN]; offset:20; size:16; field:pid_t next_pid; offset:36; size:4; field:int next_prio; offset:40; size:4; A blank line is used as a deliminator between the ftrace header and the trace point fields. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-02 15:03:01 -05:00
Steven Rostedt	981d081ec8	tracing: add format file to describe event struct fields This patch adds the "format" file to the trace point event directory. This is based off of work by Tom Zanussi, in which a file is exported to be tread from user land such that a user space app may read the binary record stored in the ring buffer. # cat /debug/tracing/events/sched/sched_switch/format field:pid_t prev_pid; offset:12; size:4; field:int prev_prio; offset:16; size:4; field special:char next_comm[TASK_COMM_LEN]; offset:20; size:16; field:pid_t next_pid; offset:36; size:4; field:int next_prio; offset:40; size:4; Idea-from: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-02 14:27:27 -05:00
Steven Rostedt	f9520750c4	tracing: make trace_seq_reset global and rename to trace_seq_init Impact: clean up The trace_seq functions may be used separately outside of the ftrace iterator. The trace_seq_reset is needed for these operations. This patch also renames trace_seq_reset to the more appropriate trace_seq_init. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-02 14:08:51 -05:00
Steven Rostedt	11a241a330	tracing: add protection around modify trace event fields The trace event objects are currently not proctected against reentrancy. This patch adds a mutex around the modifications of the trace event fields. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-02 11:49:04 -05:00
Steven Rostedt	d20e3b0384	tracing: add TRACE_FIELD_SPECIAL to record complex entries Tom Zanussi pointed out that the simple TRACE_FIELD was not enough to record trace data that required memcpy. This patch addresses this issue by adding a TRACE_FIELD_SPECIAL. The format is similar to TRACE_FIELD but looks like so: TRACE_FIELD_SPECIAL(type_item, item, cmd) What TRACE_FIELD gave was: TRACE_FIELD(type, item, assign) The TRACE_FIELD would be used in declaring a structure: struct { type item; }; And later assign it via: entry->item = assign; What TRACE_FIELD_SPECIAL gives us is: In the declaration of the structure: struct { type_item; }; And the assignment: cmd; This change log will explain the one example used in the patch: TRACE_EVENT_FORMAT(sched_switch, TPPROTO(struct rq rq, struct task_struct prev, struct task_struct *next), TPARGS(rq, prev, next), TPFMT("task %s:%d ==> %s:%d", prev->comm, prev->pid, next->comm, next->pid), TRACE_STRUCT( TRACE_FIELD(pid_t, prev_pid, prev->pid) TRACE_FIELD(int, prev_prio, prev->prio) TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN], next_comm, TPCMD(memcpy(TRACE_ENTRY->next_comm, next->comm, TASK_COMM_LEN))) TRACE_FIELD(pid_t, next_pid, next->pid) TRACE_FIELD(int, next_prio, next->prio) ), TPRAWFMT("prev %d:%d ==> next %s:%d:%d") ); The struct will be create as: struct { pid_t prev_pid; int prev_prio; char next_comm[TASK_COMM_LEN]; pid_t next_pid; int next_prio; }; Note the TRACE_ENTRY in the cmd part of TRACE_SPECIAL. TRACE_ENTRY will be set by the tracer to point to the structure inside the trace buffer. entry->prev_pid = prev->pid; entry->prev_prio = prev->prio; memcpy(entry->next_comm, next->comm, TASK_COMM_LEN); entry->next_pid = next->pid; entry->next_prio = next->prio Reported-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-02 10:53:15 -05:00
Steven Rostedt	fd99498989	tracing: add raw fast tracing interface for trace events This patch adds the interface to enable the C style trace points. In the directory /debugfs/tracing/events/subsystem/event We now have three files: enable : values 0 or 1 to enable or disable the trace event. available_types: values 'raw' and 'printf' which indicate the tracing types available for the trace point. If a developer does not use the TRACE_EVENT_FORMAT macro and just uses the TRACE_FORMAT macro, then only 'printf' will be available. This file is read only. type: values 'raw' or 'printf'. This indicates which type of tracing is active for that trace point. 'printf' is the default and if 'raw' is not available, this file is read only. # echo raw > /debug/tracing/events/sched/sched_wakeup/type # echo 1 > /debug/tracing/events/sched/sched_wakeup/enable Will enable the C style tracing for the sched_wakeup trace point. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-28 04:04:03 -05:00
Steven Rostedt	c32e827b25	tracing: add raw trace point recording infrastructure Impact: lower overhead tracing The current event tracer can automatically pick up trace points that are registered with the TRACE_FORMAT macro. But it required a printf format string and parsing. Although, this adds the ability to get guaranteed information like task names and such, it took a hit in overhead processing. This processing can add about 500-1000 nanoseconds overhead, but in some cases that too is considered too much and we want to shave off as much from this overhead as possible. Tom Zanussi recently posted tracing patches to lkml that are based on a nice idea about capturing the data via C structs using STRUCT_ENTER, STRUCT_EXIT type of macros. I liked that method very much, but did not like the implementation that required a developer to add data/code in several disjoint locations. This patch extends the event_tracer macros to do a similar "raw C" approach that Tom Zanussi did. But instead of having the developers needing to tweak a bunch of code all over the place, they can do it all in one macro - preferably placed near the code that it is tracing. That makes it much more likely that tracepoints will be maintained on an ongoing basis by the code they modify. The new macro TRACE_EVENT_FORMAT is created for this approach. (Note, a developer may still utilize the more low level DECLARE_TRACE macros if they don't care about getting their traces automatically in the event tracer.) They can also use the existing TRACE_FORMAT if they don't need to code the tracepoint in C, but just want to use the convenience of printf. So if the developer wants to "hardwire" a tracepoint in the fastest possible way, and wants to acquire their data via a user space utility in a raw binary format, or wants to see it in the trace output but not sacrifice any performance, then they can implement the faster but more complex TRACE_EVENT_FORMAT macro. Here's what usage looks like: TRACE_EVENT_FORMAT(name, TPPROTO(proto), TPARGS(args), TPFMT(fmt, fmt_args), TRACE_STUCT( TRACE_FIELD(type1, item1, assign1) TRACE_FIELD(type2, item2, assign2) [...] ), TPRAWFMT(raw_fmt) ); Note name, proto, args, and fmt, are all identical to what TRACE_FORMAT uses. name: is the unique identifier of the trace point proto: The proto type that the trace point uses args: the args in the proto type fmt: printf format to use with the event printf tracer fmt_args: the printf argments to match fmt TRACE_STRUCT starts the ability to create a structure. Each item in the structure is defined with a TRACE_FIELD TRACE_FIELD(type, item, assign) type: the C type of item. item: the name of the item in the stucture assign: what to assign the item in the trace point callback raw_fmt is a way to pretty print the struct. It must match the order of the items are added in TRACE_STUCT An example of this would be: TRACE_EVENT_FORMAT(sched_wakeup, TPPROTO(struct rq rq, struct task_struct p, int success), TPARGS(rq, p, success), TPFMT("task %s:%d %s", p->comm, p->pid, success?"succeeded":"failed"), TRACE_STRUCT( TRACE_FIELD(pid_t, pid, p->pid) TRACE_FIELD(int, success, success) ), TPRAWFMT("task %d success=%d") ); This creates us a unique struct of: struct { pid_t pid; int success; }; And the way the call back would assign these values would be: entry->pid = p->pid; entry->success = success; The nice part about this is that the creation of the assignent is done via macro magic in the event tracer. Once the TRACE_EVENT_FORMAT is created, the developer will then have a faster method to record into the ring buffer. They do not need to worry about the tracer itself. The developer would only need to touch the files in include/trace/*.h Again, I would like to give special thanks to Tom Zanussi for this nice idea. Idea-from: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-28 03:09:32 -05:00
Steven Rostedt	ef5580d0ff	tracing: add interface to write into current tracer buffer Right now all tracers must manage their own trace buffers. This was to enforce tracers to be independent in case we finally decide to allow each tracer to have their own trace buffer. But now we are adding event tracing that writes to the current tracer's buffer. This adds an interface to allow events to write to the current tracer buffer without having to manage its own. Since event tracing has no "tracer", and is just a way to hook into any other tracer. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-28 03:06:44 -05:00
Steven Rostedt	b628b3e629	tracing: make the set_event and available_events subsystem aware This patch makes the event files, set_event and available_events aware of the subsystem. Now you can enable an entire subsystem with: echo 'irq:' > set_event Note: the '' is not needed. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-28 03:05:40 -05:00
Steven Rostedt	6ecc2d1ca3	tracing: add subsystem level to trace events If a trace point header defines TRACE_SYSTEM, then it will add the following trace points into that event system. If include/trace/irq_event_types.h has: #define TRACE_SYSTEM irq at the top and #undef TRACE_SYSTEM at the bottom, then a directory "irq" will be created in the /debug/tracing/events directory. Inside that directory will contain the two trace points that are defined in include/trace/irq_event_types.h. Only adding the above to irq and not to sched, we get: # ls /debug/tracing/events/ irq sched_process_exit sched_signal_send sched_wakeup_new sched_kthread_stop sched_process_fork sched_switch sched_kthread_stop_ret sched_process_free sched_wait_task sched_migrate_task sched_process_wait sched_wakeup # ls /debug/tracing/events/irq irq_handler_entry irq_handler_exit If we add #define TRACE_SYSTEM sched to the trace/sched_event_types.h then the rest of the trace events will be put in a sched directory within the events directory. I've been playing with this idea of the subsystem for a while, but recently Tom Zanussi posted some patches to lkml that included this method. Tom's approach was clean and got me to finally put some effort to clean up the event trace points. Thanks to Tom Zanussi for demonstrating how nice the subsystem method is. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-28 02:59:43 -05:00
Steven Rostedt	eb594e45f6	tracing: move trace point formats to files in include/trace directory Impact: clean up To further facilitate the ease of adding trace points for developers, this patch creates include/trace/trace_events.h and include/trace/trace_event_types.h. The former file will hold the trace/<type>.h files and the latter will hold the trace/<type>_event_types.h files. To create new tracepoints and to have them automatically appear in the event tracer, a developer makes the trace/<type>.h file which includes <linux/tracepoint.h> and the trace/<type>_event_types.h file. The trace/<type>_event_types.h file will hold the TRACE_FORMAT macros. Then add the trace/<type>.h file to trace/trace_events.h, and add the trace/<type>_event_types.h to the trace_event_types.h file. No need to modify files elsewhere. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-28 02:58:50 -05:00
Steven Rostedt	0cfe82451d	tracing: replace kzalloc with kcalloc Impact: clean up kcalloc is a better approach to allocate a NULL array. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-27 10:51:10 -05:00
Steven Rostedt	5c6a3ae1b4	tracing: use newline separator for trace options list Impact: clean up Instead of listing the trace options like: # cat /debug/tracing/trace_options print-parent nosym-offset nosym-addr noverbose noraw nohex nobin noblock nostacktrace nosched-tree ftrace_printk noftrace_preempt nobranch annotate nouserstacktrace nosym-userobj We now list them like: # cat /debug/tracing/trace_options print-parent nosym-offset nosym-addr noverbose noraw nohex nobin noblock nostacktrace nosched-tree ftrace_printk noftrace_preempt nobranch annotate nouserstacktrace nosym-userobj Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-27 00:22:21 -05:00
Steven Rostedt	85a2f9b46f	tracing: use pointer error returns for __tracing_open Impact: fix compile warning and clean up When I first wrote __tracing_open, instead of passing the error code via the ERR_PTR macros, I lazily used a separate parameter to hold the return for errors. When Frederic Weisbecker updated that function, he used the Linux kernel ERR_PTR for the returns. This caused the parameter return to possibly not be initialized on error. gcc correctly pointed this out with a warning. This patch converts the entire function to use the Linux kernel ERR_PTR macro methods. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-27 00:12:38 -05:00
Steven Rostedt	d8e83d26b5	tracing: add protection around open use of current_tracer Impact: fix to possible race conditions There's some uses of current_tracer that is not protected by the trace_types_lock. There is a small chance that a sysadmin changes the tracer while the current_tracer is being referenced. If the race is hit, it is unlikely to cause any harm since the tracers are constant and are not freed. But some strang side effects may occur. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-26 23:55:58 -05:00
Steven Rostedt	577b785f55	tracing: add tracer dependent options to options directory This patch adds the tracer dependent options dynamically to the options directory when the tracer is activated. These options are removed when the tracer is deactivated. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-26 23:43:05 -05:00
Steven Rostedt	a825907507	tracing: add options directory and core option files This patch creates an options directory in the debugfs, that contains the available tracing options. These files contain 1 or 0, where 1 is the option is enabled and 0 it is disabled. Simply echoing in 1 will enable the option and 0 will disable it. This patch only contains the core options, not the tracer options. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-26 22:22:46 -05:00
Ingo Molnar	14131f2f98	tracing: implement trace_clock_*() APIs Impact: implement new tracing timestamp APIs Add three trace clock variants, with differing scalability/precision tradeoffs: - local: CPU-local trace clock - medium: scalable global clock with some jitter - global: globally monotonic, serialized clock Make the ring-buffer use the local trace clock internally. Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-26 18:44:06 +01:00
Jason Baron	af39241b90	tracing, genirq: add irq enter and exit trace events Impact: add new tracepoints Add them to the generic IRQ code, that way every architecture gets these new tracepoints, not just x86. Using Steve's new 'TRACE_FORMAT', I can get function graph trace as follows using the original two IRQ tracepoints: 3) \| handle_IRQ_event() { 3) \| /* (irq_handler_entry) irq=28 handler=eth0 / 3) \| e1000_intr_msi() { 3) 2.460 us \| __napi_schedule(); 3) 9.416 us \| } 3) \| / (irq_handler_exit) irq=28 handler=eth0 return=handled */ 3) + 22.935 us \| } Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-26 18:43:50 +01:00
Frederic Weisbecker	8656e7a2fa	tracing/core: make the per cpu trace files in per cpu directories Impact: restructure the VFS layout of per CPU trace buffers The per cpu trace files are all in a single directory: /debug/tracing/per_cpu. In case of a large number of cpu, the content of this directory becomes messy so we create now one directory per cpu inside /debug/tracing/per_cpu which contain each their own trace_pipe and trace files. Ie: /debug/tracing$ ls -R per_cpu per_cpu: cpu0 cpu1 per_cpu/cpu0: trace trace_pipe per_cpu/cpu1: trace trace_pipe Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-26 14:04:08 +01:00
Ingo Molnar	f4abfb8d0d	Merge branch 'tip/tracing/ftrace' of ssh://master.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace	2009-02-26 03:48:44 +01:00
Ingo Molnar	e36b1e136a	Merge branches 'tracing/ftrace', 'tracing/hw-branch-tracing' and 'linus' into tracing/core	2009-02-26 03:47:27 +01:00
Steven Rostedt	eef62a6826	tracing: rename DEFINE_TRACE_FMT to just TRACE_FORMAT There's been a bit confusion to whether DEFINE/DECLARE_TRACE_FMT should be a DEFINE or a DECLARE. Ingo Molnar suggested simply calling it TRACE_FORMAT. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-25 21:44:22 -05:00
Frederic Weisbecker	d7350c3f45	tracing/core: make the read callbacks reentrants Now that several per-cpu files can be read or spliced at the same, we want the read/splice callbacks for tracing files to be reentrants. Until now, a single global mutex (trace_types_lock) serialized the access to tracing_read_pipe(), tracing_splice_read_pipe(), and the seq helpers. Ie: it means that if a user tries to read trace_pipe0 and trace_pipe1 at the same time, the access to the function tracing_read_pipe() is contended and one reader must wait for the other to finish its read call. The trace_type_lock mutex is mostly here to serialize the access to the global current tracer (current_trace), which can be changed concurrently. Although the iter struct keeps a private pointer to this tracer, its callbacks can be changed by another function. The method used here is to not keep anymore private reference to the tracer inside the iterator but to make a copy of it inside the iterator. Then it checks on subsequents read calls if the tracer has changed. This is not costly because the current tracer is not expected to be changed often, so we use a branch prediction for that. Moreover, we add a private mutex to the iterator (there is one iterator per file descriptor) to serialize the accesses in case of multiple consumers per file descriptor (which would be a silly idea from the user). Note that this is not to protect the ring buffer, since the ring buffer already serializes the readers accesses. This is to prevent from traces weirdness in case of concurrent consumers. But these mutexes can be dropped anyway, that would not result in any crash. Just tell me what you think about it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 13:40:58 +01:00
Frederic Weisbecker	b04cc6b1f6	tracing/core: introduce per cpu tracing files Impact: split up tracing output per cpu Currently, on the tracing debugfs directory, three files are available to the user to let him extracting the trace output: - trace is an iterator through the ring-buffer. It's a reader but not a consumer It doesn't block when no more traces are available. - trace pretty similar to the former, except that it adds more informations such as prempt count, irq flag, ... - trace_pipe is a reader and a consumer, it will also block waiting for traces if necessary (heh, yes it's a pipe). The traces coming from different cpus are curretly mixed up inside these files. Sometimes it messes up the informations, sometimes it's useful, depending on what does the tracer capture. The tracing_cpumask file is useful to filter the output and select only the traces captured a custom defined set of cpus. But still it is not enough powerful to extract at the same time one trace buffer per cpu. So this patch creates a new directory: /debug/tracing/per_cpu/. Inside this directory, you will now find one trace_pipe file and one trace file per cpu. Which means if you have two cpus, you will have: trace0 trace1 trace_pipe0 trace_pipe1 And of course, reading these files will have the same effect than with the usual tracing files, except that you will only see the traces from the given cpu. The original all-in-one cpu trace file are still available on their original place. Until now, only one consumer was allowed on trace_pipe to avoid racy consuming on the ring-buffer. Now the approach changed a bit, you can have only one consumer per cpu. Which means you are allowed to read concurrently trace_pipe0 and trace_pipe1 But you can't have two readers on trace_pipe0 or trace_pipe1. Following the same logic, if there is one reader on the common trace_pipe, you can not have at the same time another reader on trace_pipe0 or in trace_pipe1. Because in trace_pipe is already a consumer in all cpu buffers in essence. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 13:40:58 +01:00
Ingo Molnar	2b1b858f69	Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace	2009-02-25 12:50:07 +01:00
Ingo Molnar	886b5b73d7	tracing: remove /debug/tracing/latency_trace Impact: remove old debug/tracing API /debug/tracing/latency_trace is an old legacy format we kept from the old latency tracer. Remove the file for now. If there's any useful bit missing then we'll propagate any useful output bits into the /debug/tracing/trace output. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 11:05:34 +01:00
Ingo Molnar	2d542cf342	tracing/hw-branch-tracing: convert bts-tracer mutex to a spinlock Impact: fix CPU hotplug lockup bts_hotcpu_handler() is called with irqs disabled, so using mutex_lock() is a no-no. All the BTS codepaths here are atomic (they do not schedule), so using a spinlock is the right solution. Cc: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 09:16:01 +01:00
Steven Rostedt	1473e4417c	tracing: make event directory structure This patch adds the directory /debug/tracing/events/ that will contain all the registered trace points. # ls /debug/tracing/events/ sched_kthread_stop sched_process_fork sched_switch sched_kthread_stop_ret sched_process_free sched_wait_task sched_migrate_task sched_process_wait sched_wakeup sched_process_exit sched_signal_send sched_wakeup_new # ls /debug/tracing/events/sched_switch/ enable # cat /debug/tracing/events/sched_switch/enable 1 # cat /debug/tracing/set_event sched_switch Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-24 21:54:08 -05:00
Steven Rostedt	f3fe8e4a38	tracing: add schedule events to event trace This patch changes the trace/sched.h to use the DECLARE_TRACE_FMT such that they are automatically registered with the event tracer. And it also adds the tracing sched headers to kernel/trace/events.c Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-24 21:54:07 -05:00
Steven Rostedt	b77e38aa24	tracing: add event trace infrastructure This patch creates the event tracing infrastructure of ftrace. It will create the files: /debug/tracing/available_events /debug/tracing/set_event The available_events will list the trace points that have been registered with the event tracer. set_events will allow the user to enable or disable an event hook. example: # echo sched_wakeup > /debug/tracing/set_event Will enable the sched_wakeup event (if it is registered). # echo "!sched_wakeup" >> /debug/tracing/set_event Will disable the sched_wakeup event (and only that event). # echo > /debug/tracing/set_event Will disable all events (notice the '>') # cat /debug/tracing/available_events > /debug/tracing/set_event Will enable all registered event hooks. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-24 21:54:05 -05:00
Markus Metzger	5e01cb695d	x86, ftrace: fix section mismatch in hw-branch-tracer Fix an invalid memory reference problem when cpu hotplug support is disabled and the hw-branch-tracer is set as current tracer. Initializing the tracer calls bts_trace_init() which has already been freed at this time. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-24 18:23:34 +01:00
Ingo Molnar	c478f87869	Merge branch 'tip/x86/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace Conflicts: include/linux/ftrace.h kernel/trace/ftrace.c	2009-02-22 18:12:01 +01:00
Steven Rostedt	4377245aa9	ftrace: break out modify loop immediately on detection of error Impact: added precaution on failure detection Break out of the modifying loop as soon as a failure is detected. This is just an added precaution found by code review and was not found by any bug chasing. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-20 14:30:20 -05:00
Steven Rostedt	000ab69117	ftrace: allow archs to preform pre and post process for code modification This patch creates the weak functions: ftrace_arch_code_modify_prepare and ftrace_arch_code_modify_post_process that are called before and after the stop machine is called to modify the kernel text. If the arch needs to do pre or post processing, it only needs to define these functions. [ Update: Ingo Molnar suggested using the name ftrace_arch_code_modify_* over using ftrace_arch_modify_* ] Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-20 13:16:18 -05:00
Frederic Weisbecker	f9349a8f97	tracing/function-graph-tracer: make set_graph_function file support ftrace regex Impact: trace only functions matching a pattern The set_graph_function file let one to trace only one or several chosen functions and follow all their code flow. Currently, only a constant function name is allowed so this patch allows the ftrace_regex functions: - matches all functions that end with "name": echo name > set_graph_function - matches all functions that begin with "name": echo name > set_graph_function - matches all functions that contains "name": echo name > set_graph_function Example: echo mutex* > set_graph_function 0) \| mutex_lock_nested() { 0) 0.563 us \| __might_sleep(); 0) 2.072 us \| } 0) \| mutex_unlock() { 0) 1.036 us \| __mutex_unlock_slowpath(); 0) 2.433 us \| } 0) \| mutex_unlock() { 0) 0.691 us \| __mutex_unlock_slowpath(); 0) 1.787 us \| } 0) \| mutex_lock_interruptible_nested() { 0) 0.548 us \| __might_sleep(); 0) 1.945 us \| } Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-20 11:36:24 +01:00
Ingo Molnar	00a8bf8593	tracing/function-graph-tracer: fix merge Merge artifact: pid got changed to ent->pid meanwhile. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-19 13:01:37 +01:00

1 2 3 4 5 ...

688 Commits