[asan] Avoid instrumenting duplicated memory access in the same basic block
Like what Address Sanitizer does in LLVM, this patch avoids instrumented duplicated memory accesses in the same basic blocks. The approach taken is very conservative, to keep the pass simple, for a start. A memory access is considered to be a pair made of an expression tree representing the beginning of the memory region that is accessed and a the size of the access, in byte. For now that size is either 1, 2, 4, 8 or 16 bytes. The patch builds a hash table of the memory accesses that have been instrumented in the current basic block. Then it walks the gimple statements of the current basic block. For each statement, it tests if the memory regions it references have already been instrumented. If not, the statement is instrumented and each memory references that are actually instrumented are added to the hash table. When a memory region is accessed (usually through builtin functions like memset), then what gets added to the hash table is actually two memory accesses: one for the beginning of the region, and the other for the its end. When the patch crosses a function call that is not a built-in function that we ought to instrument, the hash table is cleared, because that function call can possibly e.g free some memory that was instrumented. Likewise, when a new basic block is visited, the hash table is cleared. I guess we could be smarter than just unconditionally clearing the hash table in this later case, but this is what asan@llvm does, and for now, I thought starting in a conservative manner might have some value. The hash table is destroyed at the end of the pass. Bootstrapped and tested against trunk on x86-64-unknown-linux-gnu. gcc/ * Makefile.in (asan.o): Add new dependency on hash-table.h * asan.c (struct asan_mem_ref, struct mem_ref_hasher): New types. (asan_mem_ref_init, asan_mem_ref_get_end, get_mem_ref_hash_table) (has_stmt_been_instrumented_p, empty_mem_ref_hash_table) (free_mem_ref_resources, has_mem_ref_been_instrumented) (has_stmt_been_instrumented_p, update_mem_ref_hash_table) (get_mem_ref_of_assignment): New functions. (get_mem_refs_of_builtin_call): Extract from instrument_builtin_call and tweak a little bit to make it fit with the new signature. (instrument_builtin_call): Use the new get_mem_refs_of_builtin_call. Use gimple_call_builtin_p instead of is_gimple_builtin_call. (instrument_derefs, instrument_mem_region_access): Insert the instrumented memory reference into the hash table. (maybe_instrument_assignment): Renamed instrument_assignment into this, and change it to advance the iterator when instrumentation actually happened and return true in that case. This makes it homogeneous with maybe_instrument_assignment, and thus give a chance to callers to be more 'regular'. (transform_statements): Clear the memory reference hash table whenever we enter a new BB, when we cross a function call, or when we are done transforming statements. Use maybe_instrument_assignment instead of instrumentation. No more need to special case maybe_instrument_assignment and advance the iterator after calling it; it's now handled just like maybe_instrument_call. Update comment. gcc/testsuite/ * c-c++-common/asan/no-redundant-instrumentation-1.c: New test. * testsuite/c-c++-common/asan/no-redundant-instrumentation-2.c: Likewise. * testsuite/c-c++-common/asan/no-redundant-instrumentation-3.c: Likewise. * testsuite/c-c++-common/asan/inc.c: Likewise. From-SVN: r196008
This commit is contained in:
parent
a50bd22d71
commit
bdcbe80c52
@ -1,3 +1,34 @@
|
||||
2013-02-12 Dodji Seketeli <dodji@redhat.com>
|
||||
|
||||
Avoid instrumenting duplicated memory access in the same basic block
|
||||
* Makefile.in (asan.o): Add new dependency on hash-table.h
|
||||
* asan.c (struct asan_mem_ref, struct mem_ref_hasher): New types.
|
||||
(asan_mem_ref_init, asan_mem_ref_get_end, get_mem_ref_hash_table)
|
||||
(has_stmt_been_instrumented_p, empty_mem_ref_hash_table)
|
||||
(free_mem_ref_resources, has_mem_ref_been_instrumented)
|
||||
(has_stmt_been_instrumented_p, update_mem_ref_hash_table)
|
||||
(get_mem_ref_of_assignment): New functions.
|
||||
(get_mem_refs_of_builtin_call): Extract from
|
||||
instrument_builtin_call and tweak a little bit to make it fit with
|
||||
the new signature.
|
||||
(instrument_builtin_call): Use the new
|
||||
get_mem_refs_of_builtin_call. Use gimple_call_builtin_p instead
|
||||
of is_gimple_builtin_call.
|
||||
(instrument_derefs, instrument_mem_region_access): Insert the
|
||||
instrumented memory reference into the hash table.
|
||||
(maybe_instrument_assignment): Renamed instrument_assignment into
|
||||
this, and change it to advance the iterator when instrumentation
|
||||
actually happened and return true in that case. This makes it
|
||||
homogeneous with maybe_instrument_assignment, and thus give a
|
||||
chance to callers to be more 'regular'.
|
||||
(transform_statements): Clear the memory reference hash table
|
||||
whenever we enter a new BB, when we cross a function call, or when
|
||||
we are done transforming statements. Use
|
||||
maybe_instrument_assignment instead of instrumentation. No more
|
||||
need to special case maybe_instrument_assignment and advance the
|
||||
iterator after calling it; it's now handled just like
|
||||
maybe_instrument_call. Update comment.
|
||||
|
||||
2013-02-13 Richard Biener <rguenther@suse.de>
|
||||
|
||||
* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc):
|
||||
|
@ -2226,7 +2226,8 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
|
||||
asan.o : asan.c asan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
|
||||
output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) \
|
||||
tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
|
||||
$(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h
|
||||
$(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h \
|
||||
$(HASH_TABLE_H) alloc-pool.h
|
||||
tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(TREE_INLINE_H) \
|
||||
$(GIMPLE_H) $(DIAGNOSTIC_H) langhooks.h \
|
||||
$(TM_H) coretypes.h $(TREE_DUMP_H) $(TREE_PASS_H) $(CGRAPH_H) $(GGC_H) \
|
||||
|
1054
gcc/asan.c
1054
gcc/asan.c
File diff suppressed because it is too large
Load Diff
@ -1,3 +1,11 @@
|
||||
2013-02-12 Dodji Seketeli <dodji@redhat.com>
|
||||
|
||||
Avoid instrumenting duplicated memory access in the same basic block
|
||||
* c-c++-common/asan/no-redundant-instrumentation-1.c: New test.
|
||||
* testsuite/c-c++-common/asan/no-redundant-instrumentation-2.c: Likewise.
|
||||
* testsuite/c-c++-common/asan/no-redundant-instrumentation-3.c: Likewise.
|
||||
* testsuite/c-c++-common/asan/inc.c: Likewise.
|
||||
|
||||
2013-02-12 Vladimir Makarov <vmakarov@redhat.com>
|
||||
|
||||
PR inline-asm/56148
|
||||
|
21
gcc/testsuite/c-c++-common/asan/inc.c
Normal file
21
gcc/testsuite/c-c++-common/asan/inc.c
Normal file
@ -0,0 +1,21 @@
|
||||
/* { dg-options "-fdump-tree-asan0" } */
|
||||
/* { dg-do compile } */
|
||||
/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
|
||||
|
||||
void
|
||||
foo(int *a)
|
||||
{
|
||||
(*a)++;
|
||||
}
|
||||
|
||||
int
|
||||
main ()
|
||||
{
|
||||
int a = 0;
|
||||
foo (&a);
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* { dg-final { scan-tree-dump-times "__builtin___asan_report" 1 "asan0" } } */
|
||||
/* { dg-final { scan-tree-dump "__builtin___asan_report_load4" "asan0" } } */
|
||||
/* { dg-final { cleanup-tree-dump "asan0" } } */
|
@ -0,0 +1,66 @@
|
||||
/* This tests that when faced with two references to the same memory
|
||||
location in the same basic block, the second reference should not
|
||||
be instrumented by the Address Sanitizer. */
|
||||
|
||||
/* { dg-options "-fdump-tree-asan0" } */
|
||||
/* { dg-do compile } */
|
||||
/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
|
||||
|
||||
static char tab[4] = {0};
|
||||
|
||||
static int
|
||||
test0 ()
|
||||
{
|
||||
/* __builtin___asan_report_store1 called 2 times for the two stores
|
||||
below. */
|
||||
tab[0] = 1;
|
||||
tab[1] = 2;
|
||||
|
||||
/* __builtin___asan_report_load1 called 1 time for the store
|
||||
below. */
|
||||
char t0 = tab[1];
|
||||
|
||||
/* This load should not be instrumented because it is to the same
|
||||
memory location as above. */
|
||||
char t1 = tab[1];
|
||||
|
||||
return t0 + t1;
|
||||
}
|
||||
|
||||
static int
|
||||
test1 ()
|
||||
{
|
||||
/*__builtin___asan_report_store1 called 1 time here to instrument
|
||||
the initialization. */
|
||||
char foo[4] = {1};
|
||||
|
||||
/*__builtin___asan_report_store1 called 2 times here to instrument
|
||||
the store to the memory region of tab. */
|
||||
__builtin_memset (tab, 3, sizeof (tab));
|
||||
|
||||
/* There is no instrumentation for the two memset calls below. */
|
||||
__builtin_memset (tab, 4, sizeof (tab));
|
||||
__builtin_memset (tab, 5, sizeof (tab));
|
||||
|
||||
/* There are 2 calls to __builtin___asan_report_store1 and 2 calls
|
||||
to __builtin___asan_report_load1 to instrument the store to
|
||||
(subset of) the memory region of tab. */
|
||||
__builtin_memcpy (&tab[1], foo, sizeof (tab) - 1);
|
||||
|
||||
/* This should not generate a __builtin___asan_report_load1 because
|
||||
the reference to tab[1] has been already instrumented above. */
|
||||
return tab[1];
|
||||
|
||||
/* So for these function, there should be 7 calls to
|
||||
__builtin___asan_report_store1. */
|
||||
}
|
||||
|
||||
int
|
||||
main ()
|
||||
{
|
||||
return test0 () && test1 ();
|
||||
}
|
||||
|
||||
/* { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 7 "asan0" } } */
|
||||
/* { dg-final { scan-tree-dump-times "__builtin___asan_report_load" 2 "asan0" } } */
|
||||
/* { dg-final { cleanup-tree-dump "asan0" } } */
|
@ -0,0 +1,25 @@
|
||||
/* This tests that when faced with two references to the same memory
|
||||
location in the same basic block, the second reference should not
|
||||
be instrumented by the Address Sanitizer. But in case of access to
|
||||
overlapping regions we must be precise. */
|
||||
|
||||
/* { dg-options "-fdump-tree-asan0" } */
|
||||
/* { dg-do compile } */
|
||||
/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
|
||||
|
||||
int
|
||||
main ()
|
||||
{
|
||||
char tab[5];
|
||||
|
||||
/* Here, we instrument the access at offset 0 and access at offset
|
||||
4. */
|
||||
__builtin_memset (tab, 1, sizeof (tab));
|
||||
/* We instrumented access at offset 0 above already, so only access
|
||||
at offset 3 is instrumented. */
|
||||
__builtin_memset (tab, 1, 3);
|
||||
}
|
||||
|
||||
/* { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 3 "asan0" } } */
|
||||
/* { dg-final { scan-tree-dump-times "__builtin___asan_report" 3 "asan0" } } */
|
||||
/* { dg-final { cleanup-tree-dump "asan0" } } */
|
@ -0,0 +1,18 @@
|
||||
/* { dg-options "-fdump-tree-asan0" } */
|
||||
/* { dg-do compile } */
|
||||
/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
|
||||
|
||||
char
|
||||
foo (__INT32_TYPE__ *p)
|
||||
{
|
||||
/* This generates a __builtin___asan_report_load1. */
|
||||
__INT32_TYPE__ ret = *(char *) p;
|
||||
/* This generates a __builtin___asan_report_store4 depending on the. */
|
||||
*p = 26;
|
||||
return ret;
|
||||
}
|
||||
|
||||
/* { dg-final { scan-tree-dump-times "__builtin___asan_report" 2 "asan0" } } */
|
||||
/* { dg-final { scan-tree-dump-times "__builtin___asan_report_load1" 1 "asan0" } } */
|
||||
/* { dg-final { scan-tree-dump-times "__builtin___asan_report_store" 1 "asan0" } } */
|
||||
/* { dg-final { cleanup-tree-dump "asan0" } } */
|
Loading…
x
Reference in New Issue
Block a user