|
|
Created:
13 years, 5 months ago by kcc Modified:
13 years, 5 months ago CC:
gcc-patches_gcc.gnu.org Base URL:
svn://gcc.gnu.org/svn/gcc/branches/google/main/gcc/ Visibility:
Public. |
Patch Set 1 #Patch Set 2 : fix 32 bits #
Total comments: 73
Patch Set 3 : misc fixes #Patch Set 4 : misc fixes #Patch Set 5 : use tree_low_cst #Patch Set 6 : comment fix #Patch Set 7 : use build_addr (not fully functional yet) #
Total comments: 2
Patch Set 8 : remove unused code #Patch Set 9 : move asan pass higher, right after pre and sink_code #Patch Set 10 : set attribute noreturn #Patch Set 11 : support 16-byte accesses #Patch Set 12 : don't touch bitfields #Patch Set 13 : extend the avoid-bit-field hack #
Total comments: 25
Patch Set 14 : style change in toplev.c #Patch Set 15 : style and comments #Patch Set 16 : use FOR_EACH_BB #
Total comments: 1
Patch Set 17 : correct intereation inside FOR_EACH_BB #Patch Set 18 : 80 chars #Patch Set 19 : add entry into ChangeLog.google-main #Patch Set 20 : add entry to doc/invoke.texi #
Total comments: 1
Patch Set 21 : better ChangeLog #Patch Set 22 : better ChangeLog #Patch Set 23 : better ChangeLog #
Total comments: 2
Patch Set 24 : better ChangeLog #
MessagesTotal messages: 40
Index: tree-asan.c =================================================================== --- tree-asan.c (revision 0) +++ tree-asan.c (revision 0) @@ -0,0 +1,512 @@ +/* AddressSanitizer, a fast memory error detector. + Copyright (C) 2011 Free Software Foundation, Inc. + Contributed by Kostya Serebryany <kcc@google.com> + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "tree.h" +#include "tm_p.h" +#include "basic-block.h" +#include "flags.h" +#include "function.h" +#include "tree-inline.h" +#include "gimple.h" +#include "tree-iterator.h" +#include "tree-flow.h" +#include "tree-dump.h" +#include "tree-pass.h" +#include "diagnostic.h" +#include "demangle.h" +#include "langhooks.h" +#include "ggc.h" +#include "cgraph.h" +#include "gimple.h" +#include "tree-asan.h" + +/* + AddressSanitizer finds out-of-bounds and use-after-free bugs + with <2x slowdown on average. + + The tool consists of two parts: + instrumentation module (this file) and a run-time library. + The instrumentation module adds a run-time check before every memory insn. + For a 8 byte load accessing address X: + ShadowAddr = (X >> 3) + Offset + ShadowValue = (char*)ShadowAddr; + if (ShadowValue) + __asan_report_load8(X); + For a load of N bytes (N=1, 2 or 4) from address X: + ShadowAddr = (X >> 3) + Offset + ShadowValue = (char*)ShadowAddr; + if (ShadowValue) + if ((X & 7) + N - 1 > ShadowValue) + __asan_report_loadN(X); + Stores are instrumented similarly, but using __asan_report_storeN functions. + A call too __asan_init() is inserted to the list of module CTORs. + + The run-time library redefines malloc (so that redzone are inserted around + the allocated memory) and free (so that reuse of free-ed memory is delayed), + provides __asan_report* and __asan_init functions. + + Read more: + http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm + + Implementation details: + This is my first code in gcc. I wrote it by copying tree-mudflap.c, + stripping 70% of irrelevant code and modifying the instrumentation routine + build_check_stmt. The code seems to work, but I don't feel I understand it. + In particular, transform_derefs() and transform_statements() seem too complex. + Suggestions are welcome on how to simplify them. + (All I need is to traverse *all* memory accesses and instrument them). + + Future work: + The current implementation supports only detection of out-of-bounds and + use-after-free bugs in heap. + In order to support out-of-bounds for stack and globals we will need + to create redzones for stack and global object and poison them. +*/ + +/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log). + We may want to add command line flags to change these values. */ +static int asan_scale = 3; +static int asan_offset_log_32 = 29; +static int asan_offset_log_64 = 44; +static int asan_offset_log; + + +/* Construct a function tree for __asan_report_{load,store}{1,2,4,8}. */ +static tree +report_error_func (int is_store, int size) +{ + tree fn_type; + tree def; + + char name[100]; + sprintf(name, "__asan_report_%s%d\n", is_store ? "store" : "load", size); + fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE); + def = build_fn_decl (name, fn_type); + TREE_NOTHROW (def) = 1; + DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), NULL, DECL_ATTRIBUTES (def)); + DECL_ASSEMBLER_NAME (def); + return def; +} + +/* Construct a function tree for __asan_init(). */ +static tree +asan_init_func(void) +{ + tree fn_type; + tree def; + + fn_type = build_function_type_list (void_type_node, NULL_TREE); + def = build_fn_decl ("__asan_init", fn_type); + TREE_NOTHROW (def) = 1; + DECL_ASSEMBLER_NAME (def); + return def; +} + + +/* perform the instrumentation */ +static void +build_check_stmt (tree base, + gimple_stmt_iterator *instr_gsi, + location_t location, int is_store, int size) +{ + gimple_stmt_iterator gsi; + basic_block cond_bb, then_bb, join_bb; + edge e; + tree cond, t, u; + tree base_addr; + tree shadow_value; + gimple g; + gimple_seq seq, stmts; + tree char_ptr_type = build_pointer_type(char_type_node); + tree shadow_type = char_type_node; + tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, + /*unsignedp=*/true); + + /* We first need to split the current basic block, and start altering + the CFG. This allows us to insert the statements we're about to + construct into the right basic blocks. */ + + cond_bb = gimple_bb (gsi_stmt (*instr_gsi)); + gsi = *instr_gsi; + gsi_prev (&gsi); + if (! gsi_end_p (gsi)) + e = split_block (cond_bb, gsi_stmt (gsi)); + else + e = split_block_after_labels (cond_bb); + cond_bb = e->src; + join_bb = e->dest; + + /* A recap at this point: join_bb is the basic block at whose head + is the gimple statement for which this check expression is being + built. cond_bb is the (possibly new, synthetic) basic block the + end of which will contain the cache-lookup code, and a + conditional that jumps to the cache-miss code or, much more + likely, over to join_bb. */ + + /* Create the bb that contains the crash block. */ + then_bb = create_empty_bb (cond_bb); + make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE); + make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU); + + /* Mark the pseudo-fallthrough edge from cond_bb to join_bb. */ + e = find_edge (cond_bb, join_bb); + e->flags = EDGE_FALSE_VALUE; + e->count = cond_bb->count; + e->probability = REG_BR_PROB_BASE; + + /* Update dominance info. Note that bb_join's data was + updated by split_block. */ + if (dom_info_available_p (CDI_DOMINATORS)) + { + set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb); + set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb); + } + + base_addr = make_rename_temp (uintptr_type, "base_addr"); + + seq = gimple_seq_alloc (); + t = fold_convert_loc (location, uintptr_type, + unshare_expr (base)); + t = force_gimple_operand (t, &stmts, false, NULL_TREE); + gimple_seq_add_seq (&seq, stmts); + g = gimple_build_assign (base_addr, t); + gimple_set_location (g, location); + gimple_seq_add_stmt (&seq, g); + + t = build2 (RSHIFT_EXPR, uintptr_type, base_addr, + build_int_cst(uintptr_type, asan_scale) + ); + t = build2 (PLUS_EXPR, uintptr_type, t, + build2 (LSHIFT_EXPR, uintptr_type, + build_int_cst(uintptr_type, 1), + build_int_cst(uintptr_type, asan_offset_log) + ) + ); + t = build1 (INDIRECT_REF, shadow_type, + build1 (VIEW_CONVERT_EXPR, char_ptr_type, t)); + t = force_gimple_operand (t, &stmts, false, NULL_TREE); + gimple_seq_add_seq (&seq, stmts); + shadow_value = make_rename_temp (shadow_type, "__asan_shadow"); + g = gimple_build_assign (shadow_value, t); + gimple_set_location (g, location); + gimple_seq_add_stmt (&seq, g); + + t = build2 (NE_EXPR, boolean_type_node, shadow_value, + build_int_cst(char_type_node, 0)); + + if (size < 8) + { + u = build2 (BIT_AND_EXPR, uintptr_type, + base_addr, + build_int_cst (uintptr_type, 7)); + u = build1 (CONVERT_EXPR, char_type_node, u); + u = build2 (PLUS_EXPR, char_type_node, u, + build_int_cst (char_type_node, size - 1)); + u = build2 (GE_EXPR, uintptr_type, u, shadow_value); + } + else + { + u = build_int_cst (boolean_type_node, 1); + } + + t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u); + t = force_gimple_operand (t, &stmts, false, NULL_TREE); + gimple_seq_add_seq (&seq, stmts); + cond = make_rename_temp (boolean_type_node, "asan_crash_cond"); + g = gimple_build_assign (cond, t); + gimple_set_location (g, location); + gimple_seq_add_stmt (&seq, g); + + g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE, + NULL_TREE); + gimple_set_location (g, location); + gimple_seq_add_stmt (&seq, g); + + /* crash */ + gsi = gsi_last_bb (cond_bb); + gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING); + seq = gimple_seq_alloc (); + g = gimple_build_call (report_error_func(is_store, size), 1, base_addr); + gimple_seq_add_stmt (&seq, g); + + + /* Insert the check code in the THEN block. */ + gsi = gsi_start_bb (then_bb); + gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING); + + *instr_gsi = gsi_start_bb (join_bb); +} + +/* asan: do we need this */ +/* Check whether the given decl, generally a VAR_DECL or PARM_DECL, is + eligible for instrumentation. For the mudflap1 pass, this implies + that it should be registered with the libmudflap runtime. For the + mudflap2 pass this means instrumenting an indirection operation with + respect to the object. +*/ +static int +decl_eligible_p (tree decl) +{ + return ((TREE_CODE (decl) == VAR_DECL || TREE_CODE (decl) == PARM_DECL) + /* The decl must have its address taken. In the case of + arrays, this flag is also set if the indexes are not + compile-time known valid constants. */ + /* XXX: not sufficient: return-by-value structs! */ + && TREE_ADDRESSABLE (decl) + /* The type of the variable must be complete. */ + && COMPLETE_OR_VOID_TYPE_P (TREE_TYPE (decl)) + /* The decl hasn't been decomposed somehow. */ + && !DECL_HAS_VALUE_EXPR_P (decl)); +} + +/* asan: this looks too complex. Can this be done simpler? */ +static void +transform_derefs (gimple_stmt_iterator *iter, tree *tp, + location_t location, int is_store) +{ + tree type, base, addr, t; + int size; + + t = *tp; + type = TREE_TYPE (t); + + if (type == error_mark_node) + return; + + size = tree_to_double_int (TYPE_SIZE (type)).low / 8; + + switch (TREE_CODE (t)) + { + case ARRAY_REF: + case COMPONENT_REF: + { + /* This is trickier than it may first appear. The reason is + that we are looking at expressions from the "inside out" at + this point. We may have a complex nested aggregate/array + expression (e.g. "a.b[i].c"), maybe with an indirection as + the leftmost operator ("p->a.b.d"), where instrumentation + is necessary. Or we may have an innocent "a.b.c" + expression that must not be instrumented. We need to + recurse all the way down the nesting structure to figure it + out: looking just at the outer node is not enough. */ + tree var; + int component_ref_only = (TREE_CODE (t) == COMPONENT_REF); + int bitfield_ref_p = (TREE_CODE (t) == COMPONENT_REF + && DECL_BIT_FIELD_TYPE (TREE_OPERAND (t, 1))); + + if (bitfield_ref_p) + { + return; + } + /* Iterate to the top of the ARRAY_REF/COMPONENT_REF + containment hierarchy to find the outermost VAR_DECL. */ + var = TREE_OPERAND (t, 0); + while (1) + { + if (TREE_CODE (var) == ARRAY_REF) + { + component_ref_only = 0; + var = TREE_OPERAND (var, 0); + } + else if (TREE_CODE (var) == COMPONENT_REF) + var = TREE_OPERAND (var, 0); + else if (INDIRECT_REF_P (var) + || TREE_CODE (var) == MEM_REF) + { + base = TREE_OPERAND (var, 0); + break; + } + else if (TREE_CODE (var) == VIEW_CONVERT_EXPR) + { + var = TREE_OPERAND (var, 0); + if (CONSTANT_CLASS_P (var) + && TREE_CODE (var) != STRING_CST) + return; + } + else + { + gcc_assert (TREE_CODE (var) == VAR_DECL + || TREE_CODE (var) == PARM_DECL + || TREE_CODE (var) == RESULT_DECL + || TREE_CODE (var) == STRING_CST); + /* Don't instrument this access if the underlying + variable is not "eligible". This test matches + those arrays that have only known-valid indexes, + and thus are not labeled TREE_ADDRESSABLE. */ + if (! decl_eligible_p (var) || component_ref_only) + return; + else + { + base = build1 (ADDR_EXPR, + build_pointer_type (TREE_TYPE (var)), var); + break; + } + } + } + + addr = build1 (ADDR_EXPR, build_pointer_type (type), t); + + } + break; + + case INDIRECT_REF: + addr = TREE_OPERAND (t, 0); + base = addr; + break; + + case MEM_REF: + addr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (TREE_OPERAND (t, 1)), + TREE_OPERAND (t, 0), + fold_convert (sizetype, TREE_OPERAND (t, 1))); + base = addr; + break; + + case TARGET_MEM_REF: + addr = tree_mem_ref_addr (ptr_type_node, t); + base = addr; + break; + + default: + return; + } + + if (size != 1 && size != 2 && size != 4 && size != 8) + return; + build_check_stmt (base, iter, location, is_store, size); +} + +/* asan: this looks too complex. Can this be done simpler? */ +/* Transform + 1) Memory references. + 2) BUILTIN_ALLOCA calls. +*/ +static void +transform_statements (void) +{ + basic_block bb, next; + gimple_stmt_iterator i; + int saved_last_basic_block = last_basic_block; + enum gimple_rhs_class grhs_class; + + bb = ENTRY_BLOCK_PTR ->next_bb; + do + { + next = bb->next_bb; + for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i)) + { + gimple s = gsi_stmt (i); + + /* Only a few GIMPLE statements can reference memory. */ + switch (gimple_code (s)) + { + case GIMPLE_ASSIGN: + transform_derefs (&i, gimple_assign_lhs_ptr (s), + gimple_location (s), 1); + transform_derefs (&i, gimple_assign_rhs1_ptr (s), + gimple_location (s), 0); + grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s)); + if (grhs_class == GIMPLE_BINARY_RHS) + transform_derefs (&i, gimple_assign_rhs2_ptr (s), + gimple_location (s), 0); + break; + + case GIMPLE_RETURN: + if (gimple_return_retval (s) != NULL_TREE) + { + transform_derefs (&i, gimple_return_retval_ptr (s), + gimple_location (s), + 0); + } + break; + + case GIMPLE_CALL: + { + tree fndecl = gimple_call_fndecl (s); + if (fndecl && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA)) + gimple_call_set_cannot_inline (s, true); + } + break; + + default: + ; + } + } + bb = next; + } + while (bb && bb->index <= saved_last_basic_block); +} + +/* Module-level instrumentation. + - Insert __asan_init() into the list of CTORs. + - TODO: insert redzones around globals. + */ +void +asan_finish_file (void) +{ + tree ctor_statements = NULL_TREE; + append_to_statement_list (build_call_expr (asan_init_func(), 0), + &ctor_statements); + cgraph_build_static_cdtor ('I', ctor_statements, + MAX_RESERVED_INIT_PRIORITY-1); +} + +/* Instrument the current function. */ +static unsigned int +asan_instrument (void) +{ + struct gimplify_ctx gctx; + int is_64 = 1; /*TODO*/ + asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32; + push_gimplify_context (&gctx); + transform_statements (); + pop_gimplify_context (NULL); + return 0; +} + +static bool +gate_asan (void) +{ + return flag_asan != 0; +} + +struct gimple_opt_pass pass_asan = +{ + { + GIMPLE_PASS, + "asan", /* name */ + gate_asan, /* gate */ + asan_instrument, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + TV_NONE, /* tv_id */ + PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_verify_flow | TODO_verify_stmts + | TODO_dump_func | TODO_update_ssa /* todo_flags_finish */ + } +}; Index: tree-asan.h =================================================================== --- tree-asan.h (revision 0) +++ tree-asan.h (revision 0) @@ -0,0 +1,26 @@ +/* AddressSanitizer, a fast memory error detector. + Copyright (C) 2011 Free Software Foundation, Inc. + Contributed by Kostya Serebryany <kcc@google.com> + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#ifndef TREE_ASAN +#define TREE_ASAN + +extern void asan_finish_file(void); + +#endif /* TREE_ASAN */ Index: tree-pass.h =================================================================== --- tree-pass.h (revision 179959) +++ tree-pass.h (working copy) @@ -352,6 +352,7 @@ extern struct gimple_opt_pass pass_mudflap_1; extern struct gimple_opt_pass pass_mudflap_2; +extern struct gimple_opt_pass pass_asan; extern struct gimple_opt_pass pass_lower_cf; extern struct gimple_opt_pass pass_refactor_eh; extern struct gimple_opt_pass pass_lower_eh; Index: toplev.c =================================================================== --- toplev.c (revision 179959) +++ toplev.c (working copy) @@ -74,6 +74,7 @@ #include "value-prof.h" #include "alloc-pool.h" #include "tree-mudflap.h" +#include "tree-asan.h" #include "tree-pass.h" #include "gimple.h" #include "tree-ssa-alias.h" @@ -615,6 +616,10 @@ if (flag_mudflap) mudflap_finish_file (); + /* File-scope initialization for AddressSanitizer. */ + if (flag_asan) + asan_finish_file(); + output_shared_constant_pool (); output_object_blocks (); Index: common.opt =================================================================== --- common.opt (revision 179959) +++ common.opt (working copy) @@ -887,6 +887,10 @@ Common Ignore Does nothing. Preserved for backward compatibility. +fasan +Common RejectNegative Report Var(flag_asan) +Enable AddressSanitizer, a memory error detector + fasynchronous-unwind-tables Common Report Var(flag_asynchronous_unwind_tables) Optimization Generate unwind tables that are exact at each instruction boundary Index: Makefile.in =================================================================== --- Makefile.in (revision 179959) +++ Makefile.in (working copy) @@ -1418,6 +1418,7 @@ toplev.o \ tracer.o \ tree-affine.o \ + tree-asan.o \ tree-call-cdce.o \ tree-cfg.o \ tree-cfgcleanup.o \ @@ -2393,6 +2394,10 @@ $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) output.h $(RTL_H) \ $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \ $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H) +tree-asan.o : tree-asan.c tree-asan.h $(CONFIG_H) pointer-set.h \ + $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \ + output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \ + tree-pretty-print.h tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \ $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \ $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) output.h \ Index: passes.c =================================================================== --- passes.c (revision 179959) +++ passes.c (working copy) @@ -1420,6 +1420,7 @@ NEXT_PASS (pass_lower_resx); NEXT_PASS (pass_nrv); NEXT_PASS (pass_mudflap_2); + NEXT_PASS (pass_asan); NEXT_PASS (pass_cleanup_cfg_post_optimizing); NEXT_PASS (pass_warn_function_noreturn); -- This patch is available for review at http://codereview.appspot.com/5272048
Sign in to reply to this message.
fasan option also needs to be documented in doc/invoke.texi. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode54 tree-asan.c:54: ShadowValue = (char*)ShadowAddr; *(char*) ShadowAddr; http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode59 tree-asan.c:59: ShadowValue = (char*)ShadowAddr; *(char*) ShadowAddr; http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode79 tree-asan.c:79: (All I need is to traverse *all* memory accesses and instrument them). Two suggestions: 1) You only need to deal with GIMPLE_ASSIGN (lhs and rhs) for all memory references) 2) use get_base_address function to compute base address. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode89 tree-asan.c:89: We may want to add command line flags to change these values. */ two spaces. Similarly for other comments. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode90 tree-asan.c:90: static int asan_scale = 3; Need an empty line after the comment. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode91 tree-asan.c:91: static int asan_offset_log_32 = 29; const int? http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode97 tree-asan.c:97: static tree New empty line after the comment. Similarly for all other functions in the file. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode98 tree-asan.c:98: report_error_func (int is_store, int size) Document IS_STORE and SIZE in the comment. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode102 tree-asan.c:102: Extra line. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode104 tree-asan.c:104: sprintf(name, "__asan_report_%s%d\n", is_store ? "store" : "load", size); Empty line between decls and statements. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode108 tree-asan.c:108: DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), NULL, DECL_ATTRIBUTES (def)); line too long. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode128 tree-asan.c:128: /* perform the instrumentation */ Parameter documentation. s/perform/Perform/ http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode154 tree-asan.c:154: if (! gsi_end_p (gsi)) remove extra space http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode187 tree-asan.c:187: base_addr = make_rename_temp (uintptr_type, "base_addr"); May be better "__asan_base_addr" as the temp var name? http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode199 tree-asan.c:199: build_int_cst(uintptr_type, asan_scale) Missing space after function name. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode200 tree-asan.c:200: ); Do not put closing parenthesis in a separate line. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode203 tree-asan.c:203: build_int_cst(uintptr_type, 1), Missing space. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode204 tree-asan.c:204: build_int_cst(uintptr_type, asan_offset_log) Missing space. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode205 tree-asan.c:205: ) Do not start a new line. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode230 tree-asan.c:230: { { } is not needed. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode237 tree-asan.c:237: cond = make_rename_temp (boolean_type_node, "asan_crash_cond"); -> __asan_crash_cond to be consistent. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode238 tree-asan.c:238: g = gimple_build_assign (cond, t); Extra space here. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode251 tree-asan.c:251: g = gimple_build_call (report_error_func(is_store, size), 1, base_addr); Missing space. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode254 tree-asan.c:254: Extra line. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode286 tree-asan.c:286: transform_derefs (gimple_stmt_iterator *iter, tree *tp, You can use get_base_address utility function defined in gimple.c http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode414 tree-asan.c:414: do Can you use FOR_EACH_BB macro? http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode425 tree-asan.c:425: transform_derefs (&i, gimple_assign_lhs_ptr (s), Use get_base_address and then do transformation. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode436 tree-asan.c:436: if (gimple_return_retval (s) != NULL_TREE) The operand of a gimple_return should be a SSA_NAME, so handling it is not needed. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode447 tree-asan.c:447: if (fndecl && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA)) && DECL_BUILT_IN (fndecl) .. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode469 tree-asan.c:469: append_to_statement_list (build_call_expr (asan_init_func(), 0), missing space. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode472 tree-asan.c:472: MAX_RESERVED_INIT_PRIORITY-1); missing space around '-' http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode481 tree-asan.c:481: int is_64 = tree_to_double_int (TYPE_SIZE(uintptr_type)).low == 64; tree_low_cst (..)
Sign in to reply to this message.
http://codereview.appspot.com/5272048/diff/2001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode79 tree-asan.c:79: (All I need is to traverse *all* memory accesses and instrument them). Discard 2) -- it is not correct. What Asan needs is slightly different. David On 2011/10/17 22:26:55, davidxl wrote: > Two suggestions: > 1) You only need to deal with GIMPLE_ASSIGN (lhs and rhs) for all memory > references) > 2) use get_base_address function to compute base address.
Sign in to reply to this message.
PTAL Lots of style issues, sorry. Does gcc have anything like cpplint.py? transform_derefs is still a mess which I don't understand http://codereview.appspot.com/5272048/diff/2001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode54 tree-asan.c:54: ShadowValue = (char*)ShadowAddr; On 2011/10/17 22:26:55, davidxl wrote: > *(char*) ShadowAddr; Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode59 tree-asan.c:59: ShadowValue = (char*)ShadowAddr; On 2011/10/17 22:26:55, davidxl wrote: > *(char*) ShadowAddr; Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode79 tree-asan.c:79: (All I need is to traverse *all* memory accesses and instrument them). On 2011/10/17 22:33:17, davidxl wrote: > Discard 2) -- it is not correct. What Asan needs is slightly different. Yea. I actually have a test where this instrumentation does something bad. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode89 tree-asan.c:89: We may want to add command line flags to change these values. */ On 2011/10/17 22:26:55, davidxl wrote: > two spaces. Similarly for other comments. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode90 tree-asan.c:90: static int asan_scale = 3; On 2011/10/17 22:26:55, davidxl wrote: > Need an empty line after the comment. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode91 tree-asan.c:91: static int asan_offset_log_32 = 29; On 2011/10/17 22:26:55, davidxl wrote: > const int? Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode97 tree-asan.c:97: static tree On 2011/10/17 22:26:55, davidxl wrote: > New empty line after the comment. Similarly for all other functions in the file. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode98 tree-asan.c:98: report_error_func (int is_store, int size) On 2011/10/17 22:26:55, davidxl wrote: > Document IS_STORE and SIZE in the comment. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode102 tree-asan.c:102: On 2011/10/17 22:26:55, davidxl wrote: > Extra line. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode104 tree-asan.c:104: sprintf(name, "__asan_report_%s%d\n", is_store ? "store" : "load", size); On 2011/10/17 22:26:55, davidxl wrote: > Empty line between decls and statements. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode108 tree-asan.c:108: DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), NULL, DECL_ATTRIBUTES (def)); On 2011/10/17 22:26:55, davidxl wrote: > line too long. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode128 tree-asan.c:128: /* perform the instrumentation */ On 2011/10/17 22:26:55, davidxl wrote: > Parameter documentation. s/perform/Perform/ Done, although I am not sure what location is... http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode154 tree-asan.c:154: if (! gsi_end_p (gsi)) On 2011/10/17 22:26:55, davidxl wrote: > remove extra space Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode187 tree-asan.c:187: base_addr = make_rename_temp (uintptr_type, "base_addr"); On 2011/10/17 22:26:55, davidxl wrote: > May be better "__asan_base_addr" as the temp var name? Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode199 tree-asan.c:199: build_int_cst(uintptr_type, asan_scale) On 2011/10/17 22:26:55, davidxl wrote: > Missing space after function name. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode200 tree-asan.c:200: ); On 2011/10/17 22:26:55, davidxl wrote: > Do not put closing parenthesis in a separate line. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode203 tree-asan.c:203: build_int_cst(uintptr_type, 1), On 2011/10/17 22:26:55, davidxl wrote: > Missing space. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode204 tree-asan.c:204: build_int_cst(uintptr_type, asan_offset_log) On 2011/10/17 22:26:55, davidxl wrote: > Missing space. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode205 tree-asan.c:205: ) On 2011/10/17 22:26:55, davidxl wrote: > Do not start a new line. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode230 tree-asan.c:230: { On 2011/10/17 22:26:55, davidxl wrote: > { } is not needed. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode237 tree-asan.c:237: cond = make_rename_temp (boolean_type_node, "asan_crash_cond"); On 2011/10/17 22:26:55, davidxl wrote: > -> __asan_crash_cond to be consistent. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode238 tree-asan.c:238: g = gimple_build_assign (cond, t); On 2011/10/17 22:26:55, davidxl wrote: > Extra space here. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode251 tree-asan.c:251: g = gimple_build_call (report_error_func(is_store, size), 1, base_addr); On 2011/10/17 22:26:55, davidxl wrote: > Missing space. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode254 tree-asan.c:254: On 2011/10/17 22:26:55, davidxl wrote: > Extra line. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode286 tree-asan.c:286: transform_derefs (gimple_stmt_iterator *iter, tree *tp, On 2011/10/17 22:26:55, davidxl wrote: > You can use get_base_address utility function defined in gimple.c So, do I ignore this for now? http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode414 tree-asan.c:414: do On 2011/10/17 22:26:55, davidxl wrote: > Can you use FOR_EACH_BB macro? I don't know. Can I? The instrumentation code creates new BB while it runs and those should not be instrumented. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode425 tree-asan.c:425: transform_derefs (&i, gimple_assign_lhs_ptr (s), On 2011/10/17 22:26:55, davidxl wrote: > Use get_base_address and then do transformation. Ignoring... http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode436 tree-asan.c:436: if (gimple_return_retval (s) != NULL_TREE) On 2011/10/17 22:26:55, davidxl wrote: > The operand of a gimple_return should be a SSA_NAME, so handling it is not > needed. removed the code http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode447 tree-asan.c:447: if (fndecl && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA)) On 2011/10/17 22:26:55, davidxl wrote: > && DECL_BUILT_IN (fndecl) .. removed the code. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode469 tree-asan.c:469: append_to_statement_list (build_call_expr (asan_init_func(), 0), On 2011/10/17 22:26:55, davidxl wrote: > missing space. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode472 tree-asan.c:472: MAX_RESERVED_INIT_PRIORITY-1); On 2011/10/17 22:26:55, davidxl wrote: > missing space around '-' Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode481 tree-asan.c:481: int is_64 = tree_to_double_int (TYPE_SIZE(uintptr_type)).low == 64; On 2011/10/17 22:26:55, davidxl wrote: > tree_low_cst (..) Done.
Sign in to reply to this message.
There must be a style lint for gcc -- but I have not used it .. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode79 tree-asan.c:79: (All I need is to traverse *all* memory accesses and instrument them). For refs such as component ref 'ap->b[i].c', you need to create the access address expression out of it -- you may want to try to use build_addr interface. There is another interface get_ref_base_and_extent which can return the access base + offset, but it does not work for asan when there is array ref with runtime index -- this is mainly for alias analysis. On 2011/10/17 23:04:50, kcc wrote: > On 2011/10/17 22:33:17, davidxl wrote: > > Discard 2) -- it is not correct. What Asan needs is slightly different. > > Yea. > I actually have a test where this instrumentation does something bad. > http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode128 tree-asan.c:128: /* perform the instrumentation */ On 2011/10/17 23:04:50, kcc wrote: > On 2011/10/17 22:26:55, davidxl wrote: > > Parameter documentation. s/perform/Perform/ > Done, although I am not sure what location is... location is the source location. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode286 tree-asan.c:286: transform_derefs (gimple_stmt_iterator *iter, tree *tp, Look at interface build_addr in tree-nested.c, it may be what you need. On 2011/10/17 23:04:50, kcc wrote: > On 2011/10/17 22:26:55, davidxl wrote: > > You can use get_base_address utility function defined in gimple.c > > So, do I ignore this for now? http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode414 tree-asan.c:414: do Ok, in that case, may be you should use the aux field of BB to mark those new BBs and skip them? On 2011/10/17 23:04:50, kcc wrote: > On 2011/10/17 22:26:55, davidxl wrote: > > Can you use FOR_EACH_BB macro? > > I don't know. Can I? The instrumentation code creates new BB while it runs and > those should not be instrumented.
Sign in to reply to this message.
PTAL http://codereview.appspot.com/5272048/diff/2001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode79 tree-asan.c:79: (All I need is to traverse *all* memory accesses and instrument them). On 2011/10/18 06:06:48, davidxl wrote: > For refs such as component ref 'ap->b[i].c', you need to create the access > address expression out of it -- you may want to try to use build_addr interface. > There is another interface get_ref_base_and_extent which can return the access > base + offset, but it does not work for asan when there is array ref with > runtime index -- this is mainly for alias analysis. I tried using build_addr. Works for simple cases, but crashes in verifier on heavier tests. Something else should be wrong, but I don't see what. Isn't there a code somewhere that does something similar? http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode128 tree-asan.c:128: /* perform the instrumentation */ On 2011/10/18 06:06:48, davidxl wrote: > On 2011/10/17 23:04:50, kcc wrote: > > On 2011/10/17 22:26:55, davidxl wrote: > > > Parameter documentation. s/perform/Perform/ > > Done, although I am not sure what location is... > > location is the source location. Done. http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode286 tree-asan.c:286: transform_derefs (gimple_stmt_iterator *iter, tree *tp, On 2011/10/18 06:06:48, davidxl wrote: > Look at interface build_addr in tree-nested.c, it may be what you need. done, see my previous comment http://codereview.appspot.com/5272048/diff/2001/tree-asan.c#newcode414 tree-asan.c:414: do On 2011/10/18 06:06:48, davidxl wrote: > Ok, in that case, may be you should use the aux field of BB to mark those new > BBs and skip them? Tried that. The compiler goes into infinite loop.
Sign in to reply to this message.
http://codereview.appspot.com/5272048/diff/18001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/18001/tree-asan.c#newcode325 tree-asan.c:325: base = build_addr (t, current_function_decl); You need to create a temp var and build as gimple assignment. See init_tmp_var in tree-nested.c.
Sign in to reply to this message.
http://codereview.appspot.com/5272048/diff/18001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/18001/tree-asan.c#newcode325 tree-asan.c:325: base = build_addr (t, current_function_decl); There are issues with creating address expressions from TARGET_MEM_REF in gcc. Since you want compiler to do optimization on instrumented code as much as possible, asan instrumentation is better done as early as possible after ipa inlining -- and this will also solve this problem. I tried moving asan pass before loop opt, and it worked fine.
Sign in to reply to this message.
On 2011/10/18 22:52:33, davidxl wrote: > http://codereview.appspot.com/5272048/diff/18001/tree-asan.c > File tree-asan.c (right): > > http://codereview.appspot.com/5272048/diff/18001/tree-asan.c#newcode325 > tree-asan.c:325: base = build_addr (t, current_function_decl); > There are issues with creating address expressions from TARGET_MEM_REF in gcc. > Since you want compiler to do optimization on instrumented code as much as > possible, asan instrumentation is better done as early as possible after ipa Why? I would actually say that I want the instrumentation to happen as late as possible because this will instrument fewer memory accesses. For example, asan certainly needs to happen after loop invariants are moved out and common subexpressions (including loads) are eliminated. No? > inlining -- and this will also solve this problem. I tried moving asan pass > before loop opt, and it worked fine.
Sign in to reply to this message.
On Tue, Oct 18, 2011 at 3:56 PM, <konstantin.s.serebryany@gmail.com> wrote: > On 2011/10/18 22:52:33, davidxl wrote: >> >> http://codereview.appspot.com/5272048/diff/18001/tree-asan.c >> File tree-asan.c (right): > > > http://codereview.appspot.com/5272048/diff/18001/tree-asan.c#newcode325 >> >> tree-asan.c:325: base = build_addr (t, current_function_decl); >> There are issues with creating address expressions from TARGET_MEM_REF > > in gcc. >> >> Since you want compiler to do optimization on instrumented code as > > much as >> >> possible, asan instrumentation is better done as early as possible > > after ipa > > Why? > I would actually say that I want the instrumentation to happen as late > as possible because this will instrument fewer memory accesses. > For example, asan certainly needs to happen after loop invariants are > moved out and common subexpressions (including loads) are eliminated. > No? yes -- so a good choice would be after PRE and PDE (pre, sink_code) which should handle most of the loop invariant memory loads. David > >> inlining -- and this will also solve this problem. I tried moving asan > > pass >> >> before loop opt, and it worked fine. > > > > http://codereview.appspot.com/5272048/ >
Sign in to reply to this message.
> yes -- so a good choice would be after PRE and PDE (pre, sink_code) > which should handle most of the loop invariant memory loads. how about pass_lim? I think asan should be after lim.
Sign in to reply to this message.
It will be weird to put the instrumentation pass inside loop opt, besides memory loads which are loop invariants and redundant stores in loop should be handled by pre/pde. +cc Richard Guenther You may want to ask the middle-end maintainer to review your code at this point if you want it to be in trunk. thanks, David On Tue, Oct 18, 2011 at 4:12 PM, <konstantin.s.serebryany@gmail.com> wrote: >> yes -- so a good choice would be after PRE and PDE (pre, sink_code) >> which should handle most of the loop invariant memory loads. > > how about pass_lim? I think asan should be after lim. > > http://codereview.appspot.com/5272048/ >
Sign in to reply to this message.
On Tue, Oct 18, 2011 at 19:31, Xinliang David Li <davidxl@google.com> wrote: > It will be weird to put the instrumentation pass inside loop opt, > besides memory loads which are loop invariants and redundant stores in > loop should be handled by pre/pde. > > +cc Richard Guenther > > You may want to ask the middle-end maintainer to review your code at > this point if you want it to be in trunk. For trunk inclusion, we need to decide what to do wrt mudflap. Clearly, if ASAN offers the same protections and considerable better performance, then that should be an easy decision. I still have not looked at the implementation in detail, but I like the fact that it is a pure gimple pass. If the instrumentation can be statically optimized, then that is a clear advantage over mudflap, which we have never been able to optimize properly. More comments on the patch itself coming soon. Diego.
Sign in to reply to this message.
On Tue, Oct 18, 2011 at 4:49 PM, Diego Novillo <dnovillo@google.com> wrote: > On Tue, Oct 18, 2011 at 19:31, Xinliang David Li <davidxl@google.com> > wrote: > > It will be weird to put the instrumentation pass inside loop opt, > > besides memory loads which are loop invariants and redundant stores in > > loop should be handled by pre/pde. > > > > +cc Richard Guenther > > > > You may want to ask the middle-end maintainer to review your code at > > this point if you want it to be in trunk. > > For trunk inclusion, we need to decide what to do wrt mudflap. > Yes, I would like to see asan in gcc trunk. I never really understood what mudflap does and what it does not. For example, how much of leak detection and uninitialized load detection it does (asan does neither at the moment) --kcc > Clearly, if ASAN offers the same protections and considerable better > performance, then that should be an easy decision. > > I still have not looked at the implementation in detail, but I like > the fact that it is a pure gimple pass. If the instrumentation can be > statically optimized, then that is a clear advantage over mudflap, > which we have never been able to optimize properly. > > More comments on the patch itself coming soon. > > > Diego. >
Sign in to reply to this message.
I've run the current variant over cpu2006 benchmarks. The performance difference between asan-gcc and asan-llvm varies between -10% (gcc is faster) and +40% (llvm is faster). Several benchmarks failed to compile with this message: internal compiler error: in expand_expr_addr_expr_1, at expr.c:7381 which stands for gcc_assert ((bitpos % BITS_PER_UNIT) == 0); I am investigating it, but please tell me if you know what this means.
Sign in to reply to this message.
Minimized the crash to this: struct Foo { unsigned bf1:1; unsigned bf2:1; unsigned bf3:1; }; void foo (struct Foo *ob) { ob->bf2 = 1; } D.2731_4 = &ob_1(D)->bf2; __asan_base_addr.2 = (long unsigned int) D.2731_4; D.2732_5 = __asan_base_addr.2 >> 3; D.2733_6 = 1 << 44; D.2734_7 = D.2732_5 + D.2733_6; D.2735_8 = VIEW_CONVERT_EXPR<char *>(D.2734_7); # VUSE <.MEM> __asan_shadow.3 = *D.2735_8; D.2737_9 = __asan_shadow.3 != 0; D.2738_10 = __asan_base_addr.2 & 7; D.2739_11 = (char) D.2738_10; D.2740_12 = D.2739_11 + 0; D.2741_13 = D.2740_12 >= __asan_shadow.3; __asan_crash_cond.4 = D.2737_9 & D.2741_13; if (__asan_crash_cond.4 != 0) ./expand_expr_addr_expr_1_err.c: In function ‘foo’: ./expand_expr_addr_expr_1_err.c:8:11: internal compiler error: in expand_expr_addr_expr_1, at expr.c:7381 How do I avoid instrumenting bitfields?
Sign in to reply to this message.
On Wed, Oct 19, 2011 at 12:02 PM, <konstantin.s.serebryany@gmail.com> wrote: > Minimized the crash to this: > > struct Foo { > unsigned bf1:1; > unsigned bf2:1; > unsigned bf3:1; > }; > > void foo (struct Foo *ob) { > ob->bf2 = 1; > } > > > > D.2731_4 = &ob_1(D)->bf2; > __asan_base_addr.2 = (long unsigned int) D.2731_4; > D.2732_5 = __asan_base_addr.2 >> 3; > D.2733_6 = 1 << 44; > D.2734_7 = D.2732_5 + D.2733_6; > D.2735_8 = VIEW_CONVERT_EXPR<char *>(D.2734_7); > # VUSE <.MEM> > __asan_shadow.3 = *D.2735_8; > D.2737_9 = __asan_shadow.3 != 0; > D.2738_10 = __asan_base_addr.2 & 7; > D.2739_11 = (char) D.2738_10; > D.2740_12 = D.2739_11 + 0; > D.2741_13 = D.2740_12 >= __asan_shadow.3; > __asan_crash_cond.4 = D.2737_9 & D.2741_13; > if (__asan_crash_cond.4 != 0) > ./expand_expr_addr_expr_1_err.c: In function ‘foo’: > ./expand_expr_addr_expr_1_err.c:8:11: internal compiler error: in > expand_expr_addr_expr_1, at expr.c:7381 > > > > How do I avoid instrumenting bitfields? Use get_inner_reference to compute the bitpos, and check if it is multiple of bits_per_unit. David > > http://codereview.appspot.com/5272048/ >
Sign in to reply to this message.
Added code to avoid bitfields.
Sign in to reply to this message.
On 2011/10/19 20:38:35, kcc wrote: > Added code to avoid bitfields. Is there anything I should know about splitting basic blocks in the presence of EH? Currently, asan fails on 483.xalancbmk, 453.povray and 450.soplex, which are known to have EH.
Sign in to reply to this message.
what kind of failures? David On Wed, Oct 19, 2011 at 2:04 PM, <konstantin.s.serebryany@gmail.com> wrote: > On 2011/10/19 20:38:35, kcc wrote: >> >> Added code to avoid bitfields. > > Is there anything I should know about splitting basic blocks in the > presence of EH? > Currently, asan fails on 483.xalancbmk, 453.povray and 450.soplex, which > are known to have EH. > > http://codereview.appspot.com/5272048/ >
Sign in to reply to this message.
On 2011/10/19 21:51:46, davidxl wrote: > what kind of failures? Ah, I asked before actually investigating. One failure on 450.soplex looks like incorrect asan warning while accessing a bitfield which has size and offset dividable by 8. So, my current bitfield-related hack is incorrect. Two failures look like this: ./dealII_base.z447.dealII: /usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by ./dealII_base.z447.dealII) Something simple, looking.
Sign in to reply to this message.
On 2011/10/19 23:35:31, kcc wrote: > On 2011/10/19 21:51:46, davidxl wrote: > > what kind of failures? > Ah, I asked before actually investigating. > One failure on 450.soplex looks like incorrect asan warning while accessing a > bitfield which has size and offset dividable by 8. > So, my current bitfield-related hack is incorrect. I extended the hack and 450.soplex now passes. > > > Two failures look like this: > ./dealII_base.z447.dealII: /usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.15' not > found (required by ./dealII_base.z447.dealII) > Something simple, looking.
Sign in to reply to this message.
On 2011/10/19 23:52:26, kcc wrote: > On 2011/10/19 23:35:31, kcc wrote: > > On 2011/10/19 21:51:46, davidxl wrote: > > > what kind of failures? > > Ah, I asked before actually investigating. > > One failure on 450.soplex looks like incorrect asan warning while accessing a > > bitfield which has size and offset dividable by 8. > > So, my current bitfield-related hack is incorrect. > > I extended the hack and 450.soplex now passes. Problem with 447.dealII and 483.xalancbmk is solved too. Now all C/C++ benchmarks build and run. > > > > > > > Two failures look like this: > > ./dealII_base.z447.dealII: /usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.15' > not > > found (required by ./dealII_base.z447.dealII) > > Something simple, looking.
Sign in to reply to this message.
Today's data on cpu2006 (gcc -O2 vs gcc -O2 -fasan). Machine: Dell T3500 Average slowdown is 2.1x (today's llvm variant gives ~1.85x slowdown). One of the possible reasons why gcc variant is slower is because in gcc we run instrumentation before loop optimizer. But even these numbers looks quite nice, imho. 400.perlbench, 327.00, 737.00, 2.25 401.bzip2, 466.00, 1100.00, 2.36 403.gcc, 331.00, 521.00, 1.57 429.mcf, 333.00, 539.00, 1.62 445.gobmk, 414.00, 925.00, 2.23 456.hmmer, 396.00, 1880.00, 4.75 458.sjeng, 483.00, 1270.00, 2.63 462.libquantum, 458.00, 454.00, 0.99 464.h264ref, 565.00, 1770.00, 3.13 471.omnetpp, 323.00, 517.00, 1.60 473.astar, 420.00, 606.00, 1.44 483.xalancbmk, 225.00, 516.00, 2.29 433.milc, 420.00, 573.00, 1.36 444.namd, 396.00, 508.00, 1.28 447.dealII, 357.00, 682.00, 1.91 450.soplex, 236.00, 375.00, 1.59 453.povray, 186.00, 298.00, 1.60 470.lbm, 279.00, 421.00, 1.51 482.sphinx3, 564.00, 1420.00, 2.52
Sign in to reply to this message.
First round of comments. I think we should add this to google/main. It's in sufficiently good shape for it. You can keep improving it in the branch. It is now too late for 4.7's stage 1, so I think a reasonable way to proceed is to keep it in google/main and then present it for trunk inclusion when stage 1 opens for 4.8. http://codereview.appspot.com/5272048/diff/30001/toplev.c File toplev.c (right): http://codereview.appspot.com/5272048/diff/30001/toplev.c#newcode621 toplev.c:621: asan_finish_file(); + /* File-scope initialization for AddressSanitizer. */ Two spaces before '*/' + if (flag_asan) + asan_finish_file(); Space before '()' http://codereview.appspot.com/5272048/diff/30001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode80 tree-asan.c:80: (All I need is to traverse *all* memory accesses and instrument them). + Implementation details: + This is my first code in gcc. I wrote it by copying tree-mudflap.c, + stripping 70% of irrelevant code and modifying the instrumentation routine + build_check_stmt. The code seems to work, but I don't feel I understand it. + In particular, transform_derefs() and transform_statements() seem too complex. + Suggestions are welcome on how to simplify them. + (All I need is to traverse *all* memory accesses and instrument them). No need to have this in the code. These details usually go in the mail message you submit your patch with. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode140 tree-asan.c:140: 'size' is one of 1, 2, 4, 8, 16. */ +/* Instrument the memory access instruction 'base'. + Insert new statements before 'instr_gsi'. + 'location' is source code location. + 'is_store' is either 1 (for a store) or 0 (for a load). + 'size' is one of 1, 2, 4, 8, 16. */ When documenting arguments, you should refer to the arguments in CAPITALS instead of 'quoted'. The comment needs to be formatted so the lines below the first line are indented 3 spaces. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode155 tree-asan.c:155: tree shadow_type = size > 8 ? short_integer_type_node : char_type_node; + tree shadow_type = size > 8 ? short_integer_type_node : char_type_node; s/8/BITS_PER_UNIT/ http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode219 tree-asan.c:219: build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t)); + t = build2 (RSHIFT_EXPR, uintptr_type, base_addr, + build_int_cst (uintptr_type, asan_scale)); + t = build2 (PLUS_EXPR, uintptr_type, t, + build2 (LSHIFT_EXPR, uintptr_type, + build_int_cst (uintptr_type, 1), + build_int_cst (uintptr_type, asan_offset_log) + )); + t = build1 (INDIRECT_REF, shadow_type, + build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t)); It helps if this adds documentation on what expressions it's building. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode250 tree-asan.c:250: gimple_seq_add_stmt (&seq, g); [ ... ] + g = gimple_build_assign (cond, t); + gimple_set_location (g, location); + gimple_seq_add_stmt (&seq, g); + g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE, + NULL_TREE); + gimple_set_location (g, location); + gimple_seq_add_stmt (&seq, g); So, instead of open coding all the access checks, how about we created a new GIMPLE instruction? This instruction would take the same parameters, and have the semantics of the check but it would be optimizable. You could for instance make the instruction produce a pointer SSA name that is then used by the memory reference. With this, you can then allow the optimizers do their thing. These instructions could be moved around, eliminated, coalesced. Resulting in a reduced number of checks. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode251 tree-asan.c:251: /* debug_gimple_seq (seq); */ + /* debug_gimple_seq (seq); */ If you want the pass to dump debugging data, use the dump support. See other passes for examples (look for 'if (dump_file)' snippets)'. When -fdump-tree-asan is passed to the compiler, the pass manager will instantiate a 'dump_file' that you can use to emit debug info. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode253 tree-asan.c:253: /* crash */ + /* crash */ ??? http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode272 tree-asan.c:272: location_t location, int is_store) +static void +transform_derefs (gimple_stmt_iterator *iter, tree *tp, + location_t location, int is_store) Needs documentation. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode326 tree-asan.c:326: bb = ENTRY_BLOCK_PTR ->next_bb; + bb = ENTRY_BLOCK_PTR ->next_bb; No space before '->'. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode327 tree-asan.c:327: do FOR_EACH_BB (bb) http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode335 tree-asan.c:335: if (gimple_code (s) == GIMPLE_ASSIGN) + + /* Only a few GIMPLE statements can reference memory. */ + if (gimple_code (s) == GIMPLE_ASSIGN) You could also simply ask if the statement has memory references, though this would get you GIMPLE_CALLs as well that you generally don't want to deal with. Other than that, only GIMPLE_ASSIGN and GIMPLE_ASM may reference memory. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode349 tree-asan.c:349: while (bb && bb->index <= saved_last_basic_block); + bb = next; + } + while (bb && bb->index <= saved_last_basic_block); Not needed. FOR_EACH_BB does what you want.
Sign in to reply to this message.
http://codereview.appspot.com/5272048/diff/30001/toplev.c File toplev.c (right): http://codereview.appspot.com/5272048/diff/30001/toplev.c#newcode621 toplev.c:621: asan_finish_file(); On 2011/10/25 16:46:34, Diego Novillo wrote: > + /* File-scope initialization for AddressSanitizer. */ > > Two spaces before '*/' > > + if (flag_asan) > + asan_finish_file(); > > Space before '()' Done. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode80 tree-asan.c:80: (All I need is to traverse *all* memory accesses and instrument them). On 2011/10/25 16:46:34, Diego Novillo wrote: > + Implementation details: > + This is my first code in gcc. I wrote it by copying tree-mudflap.c, > + stripping 70% of irrelevant code and modifying the instrumentation routine > + build_check_stmt. The code seems to work, but I don't feel I understand it. > + In particular, transform_derefs() and transform_statements() seem too complex. > + Suggestions are welcome on how to simplify them. > + (All I need is to traverse *all* memory accesses and instrument them). > > No need to have this in the code. These details usually go in the mail message > you submit your patch with. Done. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode140 tree-asan.c:140: 'size' is one of 1, 2, 4, 8, 16. */ On 2011/10/25 16:46:34, Diego Novillo wrote: > +/* Instrument the memory access instruction 'base'. > + Insert new statements before 'instr_gsi'. > + 'location' is source code location. > + 'is_store' is either 1 (for a store) or 0 (for a load). > + 'size' is one of 1, 2, 4, 8, 16. */ > > When documenting arguments, you should refer to the arguments in CAPITALS > instead of 'quoted'. The comment needs to be formatted so the lines below the > first line are indented 3 spaces. Done. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode155 tree-asan.c:155: tree shadow_type = size > 8 ? short_integer_type_node : char_type_node; On 2011/10/25 16:46:34, Diego Novillo wrote: > + tree shadow_type = size > 8 ? short_integer_type_node : char_type_node; > > s/8/BITS_PER_UNIT/ This is different '8'. This code checks for 16-byte accesses. I changed the parameter name to size_in_bytes and made the code explicitly check for 16. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode219 tree-asan.c:219: build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t)); On 2011/10/25 16:46:34, Diego Novillo wrote: > + t = build2 (RSHIFT_EXPR, uintptr_type, base_addr, > + build_int_cst (uintptr_type, asan_scale)); > + t = build2 (PLUS_EXPR, uintptr_type, t, > + build2 (LSHIFT_EXPR, uintptr_type, > + build_int_cst (uintptr_type, 1), > + build_int_cst (uintptr_type, asan_offset_log) > + )); > + t = build1 (INDIRECT_REF, shadow_type, > + build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t)); > > It helps if this adds documentation on what expressions it's building. Done. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode250 tree-asan.c:250: gimple_seq_add_stmt (&seq, g); On 2011/10/25 16:46:34, Diego Novillo wrote: > [ ... ] > + g = gimple_build_assign (cond, t); > + gimple_set_location (g, location); > + gimple_seq_add_stmt (&seq, g); > + g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE, > + NULL_TREE); > + gimple_set_location (g, location); > + gimple_seq_add_stmt (&seq, g); > > So, instead of open coding all the access checks, how about we created a new > GIMPLE instruction? This instruction would take the same parameters, and have > the semantics of the check but it would be optimizable. You could for instance > make the instruction produce a pointer SSA name that is then used by the memory > reference. > > With this, you can then allow the optimizers do their thing. These instructions > could be moved around, eliminated, coalesced. Resulting in a reduced number of > checks. There are pros and cons. I have not enough gcc expertise to judge. But I'd expect the change (and the testing) to be far from trivial. Shall we take it off this CL? http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode251 tree-asan.c:251: /* debug_gimple_seq (seq); */ On 2011/10/25 16:46:34, Diego Novillo wrote: > + /* debug_gimple_seq (seq); */ > > If you want the pass to dump debugging data, use the dump support. See other > passes for examples (look for 'if (dump_file)' snippets)'. When > -fdump-tree-asan is passed to the compiler, the pass manager will instantiate a > 'dump_file' that you can use to emit debug info. Removed this line. Let the debug support be a separate CL, if needed. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode253 tree-asan.c:253: /* crash */ On 2011/10/25 16:46:34, Diego Novillo wrote: > + /* crash */ > > ??? Oops. Modified the comment. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode272 tree-asan.c:272: location_t location, int is_store) On 2011/10/25 16:46:34, Diego Novillo wrote: > +static void > +transform_derefs (gimple_stmt_iterator *iter, tree *tp, > + location_t location, int is_store) > > Needs documentation. Done. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode327 tree-asan.c:327: do On 2011/10/25 16:46:34, Diego Novillo wrote: > FOR_EACH_BB (bb) I tried, it did not work (went into infinite loop). Is FOR_EACH_BB supposed to work if we are adding new BBs inside the loop? http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode335 tree-asan.c:335: if (gimple_code (s) == GIMPLE_ASSIGN) On 2011/10/25 16:46:34, Diego Novillo wrote: > + > + /* Only a few GIMPLE statements can reference memory. */ > + if (gimple_code (s) == GIMPLE_ASSIGN) > > You could also simply ask if the statement has memory references, though this > would get you GIMPLE_CALLs as well that you generally don't want to deal with. > > Other than that, only GIMPLE_ASSIGN and GIMPLE_ASM may reference memory. Simplified the code. http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode349 tree-asan.c:349: while (bb && bb->index <= saved_last_basic_block); On 2011/10/25 16:46:34, Diego Novillo wrote: > + bb = next; > + } > + while (bb && bb->index <= saved_last_basic_block); > > Not needed. FOR_EACH_BB does what you want. See above.
Sign in to reply to this message.
http://codereview.appspot.com/5272048/diff/17016/tree-asan.c File tree-asan.c (right): http://codereview.appspot.com/5272048/diff/17016/tree-asan.c#newcode327 tree-asan.c:327: FOR_EACH_BB (bb) Ah, yea, I can use FOR_EACH_BB with a check for bb->index
Sign in to reply to this message.
On Tue, Oct 25, 2011 at 9:46 AM, <dnovillo@google.com> wrote: > First round of comments. > > I think we should add this to google/main. It's in sufficiently good > shape for it. You can keep improving it in the branch. > Also good to backport it to google_46 branch. David > It is now too late for 4.7's stage 1, so I think a reasonable way to > proceed is to keep it in google/main and then present it for trunk > inclusion when stage 1 opens for 4.8. > > > http://codereview.appspot.com/**5272048/diff/30001/toplev.c<http://codereview... > File toplev.c (right): > > http://codereview.appspot.com/**5272048/diff/30001/toplev.c#**newcode621<http... > toplev.c:621: asan_finish_file(); > > + /* File-scope initialization for AddressSanitizer. */ > > Two spaces before '*/' > > > + if (flag_asan) > + asan_finish_file(); > > Space before '()' > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.c<http://coderev... > File tree-asan.c (right): > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.**c#newcode80<ht... > tree-asan.c:80: (All I need is to traverse *all* memory accesses and > instrument them). > > + Implementation details: > + This is my first code in gcc. I wrote it by copying tree-mudflap.c, > + stripping 70% of irrelevant code and modifying the instrumentation > routine > + build_check_stmt. The code seems to work, but I don't feel I > understand it. > + In particular, transform_derefs() and transform_statements() seem too > complex. > + Suggestions are welcome on how to simplify them. > + (All I need is to traverse *all* memory accesses and instrument them). > > No need to have this in the code. These details usually go in the mail > message you submit your patch with. > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode140<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode140> > tree-asan.c:140: 'size' is one of 1, 2, 4, 8, 16. */ > +/* Instrument the memory access instruction 'base'. > + Insert new statements before 'instr_gsi'. > + 'location' is source code location. > + 'is_store' is either 1 (for a store) or 0 (for a load). > + 'size' is one of 1, 2, 4, 8, 16. */ > > When documenting arguments, you should refer to the arguments in > CAPITALS instead of 'quoted'. The comment needs to be formatted so the > lines below the first line are indented 3 spaces. > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode155<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode155> > tree-asan.c:155: tree shadow_type = size > 8 ? short_integer_type_node : > char_type_node; > + tree shadow_type = size > 8 ? short_integer_type_node : > char_type_node; > > s/8/BITS_PER_UNIT/ > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode219<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode219> > tree-asan.c:219: build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t)); > > + t = build2 (RSHIFT_EXPR, uintptr_type, base_addr, > + build_int_cst (uintptr_type, asan_scale)); > > + t = build2 (PLUS_EXPR, uintptr_type, t, > + build2 (LSHIFT_EXPR, uintptr_type, > + build_int_cst (uintptr_type, 1), > + build_int_cst (uintptr_type, asan_offset_log) > + )); > > + t = build1 (INDIRECT_REF, shadow_type, > + build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t)); > > It helps if this adds documentation on what expressions it's building. > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode250<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode250> > tree-asan.c:250: gimple_seq_add_stmt (&seq, g); > [ ... ] > > + g = gimple_build_assign (cond, t); > + gimple_set_location (g, location); > + gimple_seq_add_stmt (&seq, g); > + g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE, > + NULL_TREE); > + gimple_set_location (g, location); > + gimple_seq_add_stmt (&seq, g); > > So, instead of open coding all the access checks, how about we created a > new GIMPLE instruction? This instruction would take the same > parameters, and have the semantics of the check but it would be > optimizable. You could for instance make the instruction produce a > pointer SSA name that is then used by the memory reference. > > With this, you can then allow the optimizers do their thing. These > instructions could be moved around, eliminated, coalesced. Resulting in > a reduced number of checks. > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode251<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode251> > tree-asan.c:251: /* debug_gimple_seq (seq); */ > + /* debug_gimple_seq (seq); */ > > If you want the pass to dump debugging data, use the dump support. See > other passes for examples (look for 'if (dump_file)' snippets)'. When > -fdump-tree-asan is passed to the compiler, the pass manager will > instantiate a 'dump_file' that you can use to emit debug info. > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode253<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode253> > tree-asan.c:253: /* crash */ > + /* crash */ > > ??? > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode272<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode272> > tree-asan.c:272: location_t location, int is_store) > > +static void > +transform_derefs (gimple_stmt_iterator *iter, tree *tp, > + location_t location, int is_store) > > Needs documentation. > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode326<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode326> > tree-asan.c:326: bb = ENTRY_BLOCK_PTR ->next_bb; > > + bb = ENTRY_BLOCK_PTR ->next_bb; > > No space before '->'. > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode327<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode327> > tree-asan.c:327: do > FOR_EACH_BB (bb) > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode335<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode335> > tree-asan.c:335: if (gimple_code (s) == GIMPLE_ASSIGN) > > + > + /* Only a few GIMPLE statements can reference memory. */ > + if (gimple_code (s) == GIMPLE_ASSIGN) > > You could also simply ask if the statement has memory references, though > this would get you GIMPLE_CALLs as well that you generally don't want to > deal with. > > Other than that, only GIMPLE_ASSIGN and GIMPLE_ASM may reference memory. > > http://codereview.appspot.com/**5272048/diff/30001/tree-asan.** > c#newcode349<http://codereview.appspot.com/5272048/diff/30001/tree-asan.c#newcode349> > tree-asan.c:349: while (bb && bb->index <= saved_last_basic_block); > > + bb = next; > + } > + while (bb && bb->index <= saved_last_basic_block); > > Not needed. FOR_EACH_BB does what you want. > > http://codereview.appspot.com/**5272048/<http://codereview.appspot.com/5272048/> >
Sign in to reply to this message.
Diego mentioned that we can move the asan pass somewhere to the very end, just before lowering to RTL. Where would be this blessed place? Does it still have TARGET_MEM_REF?
Sign in to reply to this message.
On 11-11-01 15:11 , konstantin.s.serebryany@gmail.com wrote: > Diego mentioned that we can move the asan pass somewhere to the very > end, just before lowering to RTL. > Where would be this blessed place? > Does it still have TARGET_MEM_REF? Right before pass_expand? In init_optimization_passes(), look for NEXT_PASS (pass_expand). That's RTL generation. Somewhere before that. TARGET_MEM_REFs are converted to RTL mems during RTL expansion. Diego.
Sign in to reply to this message.
On Tue, Nov 1, 2011 at 12:16 PM, Diego Novillo <dnovillo@google.com> wrote: > On 11-11-01 15:11 , konstantin.s.serebryany@gmail.com wrote: >> >> Diego mentioned that we can move the asan pass somewhere to the very >> end, just before lowering to RTL. >> Where would be this blessed place? >> Does it still have TARGET_MEM_REF? > > Right before pass_expand? In init_optimization_passes(), look for NEXT_PASS > (pass_expand). That's RTL generation. Somewhere before that. > Why? > TARGET_MEM_REFs are converted to RTL mems during RTL expansion. > What? they will still be seen by asan which can not be handled (e.g, creating address expression out of it). David > > Diego. > >
Sign in to reply to this message.
On 11-11-01 15:34 , Xinliang David Li wrote: >> Right before pass_expand? In init_optimization_passes(), look for NEXT_PASS >> (pass_expand). That's RTL generation. Somewhere before that. >> > > Why? The idea was to experiment where to best place ASAN to avoid instrumenting too much. If we schedule it really late, then we may save ourselves some unnecessary instrumentation. Though, I still think ASAN should never open code the library calls directly. Rather, it should emit straight-code gimple that can be better understood and optimized away. >> TARGET_MEM_REFs are converted to RTL mems during RTL expansion. >> > > What? they will still be seen by asan which can not be handled (e.g, > creating address expression out of it). So, it needs to run before TMRs are introduced then. *shrug*.
Sign in to reply to this message.
On Tue, Nov 1, 2011 at 12:41 PM, Diego Novillo <dnovillo@google.com> wrote: > On 11-11-01 15:34 , Xinliang David Li wrote: > >>> Right before pass_expand? In init_optimization_passes(), look for >>> NEXT_PASS >>> (pass_expand). That's RTL generation. Somewhere before that. >>> >> >> Why? > > The idea was to experiment where to best place ASAN to avoid instrumenting > too much. If we schedule it really late, then we may save ourselves some > unnecessary instrumentation. > It needs to be balanced -- on one hand it needs to be as late as possible so that as few memory references (dynamically executed) as possible are instrumented. On the other hand, early enough so that the instrumented code can be optimized sufficiently. > Though, I still think ASAN should never open code the library calls > directly. Rather, it should emit straight-code gimple that can be better > understood and optimized away. that depends on the library function themselves -- if they are trivial, inline sequence should be generated. > > >>> TARGET_MEM_REFs are converted to RTL mems during RTL expansion. >>> >> >> What? they will still be seen by asan which can not be handled (e.g, >> creating address expression out of it). > > So, it needs to run before TMRs are introduced then. *shrug*. > yes it should be before ivopt as discussed. David
Sign in to reply to this message.
So, do you have any other suggestions before the first commit?
Sign in to reply to this message.
Ok for google/main when the option is documented in doc/invoke.texi and a Changlog file is provided. David On Tue, Nov 1, 2011 at 5:24 PM, <konstantin.s.serebryany@gmail.com> wrote: > So, do you have any other suggestions before the first commit? > > http://codereview.appspot.com/5272048/ >
Sign in to reply to this message.
On 2011/11/02 04:17:22, davidxl wrote: > Ok for google/main when the option is documented in doc/invoke.texi > and a Changlog file is provided. PTAL
Sign in to reply to this message.
The invoke.texi change looks fine. The ChangeLog entry needs some work. http://codereview.appspot.com/5272048/diff/41001/ChangeLog.google-main File ChangeLog.google-main (right): http://codereview.appspot.com/5272048/diff/41001/ChangeLog.google-main#newcode6 ChangeLog.google-main:6: 1 2011-11-02 Kostya Serebryany <kcc@google.com> 2 Introduce a new option -fasan which enables 3 AddressSanitizer, a fast memory error detector: 4 http://code.google.com/p/address-sanitizer. 5 The current implementation handles only heap accesses. 6 Not quite. A ChangeLog entry needs to have a rigid structure. For every file and function changed, you need to state what changed. The entry below this one is not really a good example. ChangeLog entries never describe how the patch works or what it provides. See examples in gcc/ChangeLog. See http://www.gnu.org/prep/standards/standards.html#Style-of-Change-Logs for details.
Sign in to reply to this message.
PTAL
Sign in to reply to this message.
OK for google/main with the nits below. http://codereview.appspot.com/5272048/diff/42003/ChangeLog.google-main File ChangeLog.google-main (right): http://codereview.appspot.com/5272048/diff/42003/ChangeLog.google-main#newcode1 ChangeLog.google-main:1: 2011-11-02 Kostya Serebryany <kcc@google.com> 1 2011-11-02 Kostya Serebryany <kcc@google.com> Need blank line below the datestamp. http://codereview.appspot.com/5272048/diff/42003/ChangeLog.google-main#newcode3 ChangeLog.google-main:3: AddressSanitizer, a fast memory error detector: 2 Introduce a new option -fasan which enables 3 AddressSanitizer, a fast memory error detector: Not strictly accepted, but it is common to add some verbiage of this nature when adding major features. OK.
Sign in to reply to this message.
|