|
|
Created:
8 years, 8 months ago by Edmund.Grimley.Evans Modified:
8 years, 8 months ago CC:
dynamorio-devs_googlegroups.com Visibility:
Public. |
DescriptionCommit log for first patchset:
---------------
i#1569 AArch64: Reimplement ELF import/export iterators.
The previous implementation assumed that all imported symbols are
excluded from the GNU hash table. This assumption seems to be unsafe.
The new implementation iterates through both parts of the symbol table
in both cases, filtering imports or exports using st_value.
Enable client.modules on ARM and AArch64 as it now passes.
---------------
Patch Set 1 #
Total comments: 4
Patch Set 2 : Committed #
MessagesTotal messages: 16
Sign in to reply to this message.
Will hasnext always return correct answer? https://codereview.appspot.com/306530043/diff/1/core/unix/module_elf.c File core/unix/module_elf.c (right): https://codereview.appspot.com/306530043/diff/1/core/unix/module_elf.c#newcod... core/unix/module_elf.c:1552: return symbol_iterator_cur_symbol(iter) != NULL; should hasnext return true only if there is a next imported symbol? Even symbol_itertor_cur_symbol(iter) not NULL, it could be export instead of import, right? https://codereview.appspot.com/306530043/diff/1/core/unix/module_elf.c#newcod... core/unix/module_elf.c:1602: return symbol_iterator_cur_symbol(iter) != NULL; ditto about hasnext.
Sign in to reply to this message.
https://codereview.appspot.com/306530043/diff/1/core/unix/module_elf.c File core/unix/module_elf.c (right): https://codereview.appspot.com/306530043/diff/1/core/unix/module_elf.c#newcod... core/unix/module_elf.c:1552: return symbol_iterator_cur_symbol(iter) != NULL; On 2016/09/22 18:05:53, zhaoqin wrote: > should hasnext return true only if there is a next imported symbol? > Even symbol_itertor_cur_symbol(iter) not NULL, it could be export instead of > import, right? I think this is correct because the protocol (instrument_api.h:3010) says: "The iterator returned is invalid until after the first call to dr_symbol_import_iterator_next()." Then each call to dr_symbol_import_iterator_next advances to the next imported symbol, if there is one. Iterators are a pain to program because there are so many alternative protocols... I tested this by using the iterators to print out all the imports and all the outputs and comparing with the output from readelf. I even tested with --hash-style=sysv. But I couldn't so easily test the corner case of there not being any symbols... :-)
Sign in to reply to this message.
LGTM https://codereview.appspot.com/306530043/diff/1/core/unix/module_elf.c File core/unix/module_elf.c (right): https://codereview.appspot.com/306530043/diff/1/core/unix/module_elf.c#newcod... core/unix/module_elf.c:1552: return symbol_iterator_cur_symbol(iter) != NULL; On 2016/09/22 18:34:11, Edmund.Grimley.Evans wrote: > On 2016/09/22 18:05:53, zhaoqin wrote: > > should hasnext return true only if there is a next imported symbol? > > Even symbol_itertor_cur_symbol(iter) not NULL, it could be export instead of > > import, right? > > I think this is correct because the protocol (instrument_api.h:3010) says: "The > iterator returned is invalid until after the first call to > dr_symbol_import_iterator_next()." Then each call to > dr_symbol_import_iterator_next advances to the next imported symbol, if there is > one. You are right, I forgot the _next() returns the current sym and move to the next. > > Iterators are a pain to program because there are so many alternative > protocols... > > I tested this by using the iterators to print out all the imports and all the > outputs and comparing with the output from readelf. I even tested with > --hash-style=sysv. But I couldn't so easily test the corner case of there not > being any symbols... :-)
Sign in to reply to this message.
Committed as https://github.com/DynamoRIO/dynamorio/commit/b92660f476eeed0bb7591cdde8c06bf... Final commit log: --------------- i#1569 AArch64: Reimplement ELF import/export iterators. The previous implementation assumed that all imported symbols are excluded from the GNU hash table. This assumption seems to be unsafe. The new implementation iterates through both parts of the symbol table in both cases, filtering imports or exports using st_value. Enable client.modules on ARM and AArch64 as it now passes. Review-URL: https://codereview.appspot.com/306530043 ---------------
Sign in to reply to this message.
Message was sent while issue was closed.
It looks like this broke Android, causing a crash in debug: http://dynamorio.org/CDash/testDetails.php?test=120665&build=11995 <Application /data/local/tmp/build_android-debug-internal-32/suite/tests/bin/client.modules (18916). Tool internal crash at PC 0xb6ee2286. Please report this at your tool's issue tracker. Program aborted. Received SIGSEGV at pc 0xb6ee2286 in thread 18916 Base: 0xb6ca6000 Registers: r0 =0x4fc42140 r1 =0x4fc40c04 r2 =0x00000000 r3 =0x00000000 r4 =0xb6fb4000 r5 =0xb6fdfb30 r6 =0xb6fdfb2c r7 =0xbe90cb40 r8 =0x00000000 r9 =0x00000000 r10 version 6.2.17067, custom build -no_dynamic_options -client_lib '/data/local/tmp/build_android-debug-internal-32/suite/tests/bin/libclient.modules.dll.so;0;' -code_api -stderr_mask 12 -stack_size 56K -max_elide_jmp 0 -max_elide_call 0 -early_inject -emulate_brk -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_in> And failure in release: http://dynamorio.org/CDash/testDetails.php?test=120682&build=11996 ERROR: didn't find imported symbol dlopen Given the crash, unless you can fix in the next few hours, it may be best to revert this.
Sign in to reply to this message.
Or one could disable the test on Android. It depends on whether Android on AArch32 is more important than Linux on AArch32 and AArch64... The crash is interesting... but I'm not likely to get set up for testing on Android before I go home for the weekend in about 40 minutes.... Any guesses as to what's different about Android? Have all the ARM Linux test bots run? It would be helpful if an ARM Linux system would hit the same problem. Another thing: would it be possible to grab the file suite/tests/bin/client.modules from the builds that failed? I think that's the file whose symbol table is being iterated through, so I might be able to spot something odd about it using readelf and od.
Sign in to reply to this message.
I don't see how disabling the test helps if this is a crash that happens if anyone uses the import/export iterators: it's presumably not a test-only issue. On Fri, Sep 23, 2016 at 11:59 AM, <edmund.grimley.evans@gmail.com> wrote: > Or one could disable the test on Android. It depends on whether Android on > AArch32 is more important than Linux on AArch32 and AArch64... > > The crash is interesting... but I'm not likely to get set up for testing > on Android before I go home for the weekend in about 40 minutes.... > > Any guesses as to what's different about Android? > > Have all the ARM Linux test bots run? It would be helpful if an ARM Linux > system would hit the same problem. > > Another thing: would it be possible to grab the file > suite/tests/bin/client.modules from the builds that failed? I think that's > the file whose symbol table is being iterated through, so I might be able > to spot something odd about it using readelf and od. >
Sign in to reply to this message.
I.e., this may crash existing tools like Dr. Memory (pegged to certain DR version, may crash on Android on next update as it does a lot of symbol examination) or other users' tools out there. On Fri, Sep 23, 2016 at 12:01 PM, Derek Bruening <bruening@google.com> wrote: > I don't see how disabling the test helps if this is a crash that happens > if anyone uses the import/export iterators: it's presumably not a test-only > issue. > > On Fri, Sep 23, 2016 at 11:59 AM, <edmund.grimley.evans@gmail.com> wrote: > >> Or one could disable the test on Android. It depends on whether Android >> on AArch32 is more important than Linux on AArch32 and AArch64... >> >> The crash is interesting... but I'm not likely to get set up for testing >> on Android before I go home for the weekend in about 40 minutes.... >> >> Any guesses as to what's different about Android? >> >> Have all the ARM Linux test bots run? It would be helpful if an ARM Linux >> system would hit the same problem. >> >> Another thing: would it be possible to grab the file >> suite/tests/bin/client.modules from the builds that failed? I think that's >> the file whose symbol table is being iterated through, so I might be able >> to spot something odd about it using readelf and od. >> > >
Sign in to reply to this message.
On Fri, Sep 23, 2016 at 11:59 AM, <edmund.grimley.evans@gmail.com> wrote: > It depends on whether Android on AArch32 is more important than Linux on > AArch32 and AArch64... > The way to look at it is, is there existing code being broken: and there is existing code for Android and A32, but A64 is new and nobody has tools on it that are released and in use.
Sign in to reply to this message.
On Fri, Sep 23, 2016 at 11:59 AM, <edmund.grimley.evans@gmail.com> wrote: > Another thing: would it be possible to grab the file > suite/tests/bin/client.modules from the builds that failed? I think that's > the file whose symbol table is being iterated through, so I might be able > to spot something odd about it using readelf and od. > Attached.
Sign in to reply to this message.
It would be interesting to check whether the symbol iterators really do work on Android, or whether people have just been lucky so far. One could try, for example, using the iterators to print out all the imports and exports and compare that with the output from objdump/readelf. Ditto for x86 Linux with recent binutils, I suppose. Could you revert b92660f for me about 26 hours after it was committed, so that all the test bots can try it first?
Sign in to reply to this message.
It seems Linux on ARM works fine. I will revert it now. On Fri, Sep 23, 2016 at 12:13 PM, <edmund.grimley.evans@gmail.com> wrote: > It would be interesting to check whether the symbol iterators really do > work on Android, or whether people have just been lucky so far. One could > try, for example, using the iterators to print out all the imports and > exports and compare that with the output from objdump/readelf. Ditto for > x86 Linux with recent binutils, I suppose. > > Could you revert b92660f for me about 26 hours after it was committed, so > that all the test bots can try it first? > > -- > You received this message because you are subscribed to the Google Groups > "DynamoRIO Devs" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to dynamorio-devs+unsubscribe@googlegroups.com. > To post to this group, send email to dynamorio-devs@googlegroups.com. > Visit this group at https://groups.google.com/group/dynamorio-devs. > For more options, visit https://groups.google.com/d/optout. >
Sign in to reply to this message.
> The way to look at it is, is there existing code being broken: and there > is existing code for Android and A32, but A64 is new and nobody has tools > on it that are released and in use. > Do you have any evidence that the symbol iterators previously worked on Android? Is there anything that failed on Android other than the one test that was enabled by the same patch? I rather suspect that b92660f merely exposed an existing, perhaps totally unrelated, bug. The code is full of safe_reads. The segfault quoted above was perhaps always happening, but hidden by a safe_read. If this is true, then a few simple changes on top of b92660f would make it work on Android just as well as it has always worked on Android: - disable the client.modules test on Android - remove some of the ASSERT_CURIOSITYs or make them conditional on !ANDROID Any use of the iterator on Android would then result in a safe_read silently failing and the iterator returning an empty list, which is what the existing code did, as far as I can tell. Thoughts?
Sign in to reply to this message.
I'm not sure: you may be right, and it is possible that DrMemory is only using direct symbol lookup and not any code used by the iterator (I don't recall how much code is shared there nor how much is touched by this CL). But, if effort is put into this, it seems much better to analyze the crash rather than guess, as for all we know it could happen on other platforms, and to understand why no symbols are found on Android. This patch was changing how the symbol iteration was done and challenging some ELF assumptions that were in the code, but if the new approach doesn't work in some cases then it seems much better to understand why and fix what may also be inaccurate ELF assumptions in the new code. On Sat, Sep 24, 2016 at 2:56 AM, <edmund.grimley.evans@gmail.com> wrote: > > The way to look at it is, is there existing code being broken: and there >> is existing code for Android and A32, but A64 is new and nobody has tools >> on it that are released and in use. >> > > Do you have any evidence that the symbol iterators previously worked on > Android? Is there anything that failed on Android other than the one test > that was enabled by the same patch? > > I rather suspect that b92660f merely exposed an existing, perhaps totally > unrelated, bug. The code is full of safe_reads. The segfault quoted above > was perhaps always happening, but hidden by a safe_read. > > If this is true, then a few simple changes on top of b92660f would make it > work on Android just as well as it has always worked on Android: > > - disable the client.modules test on Android > - remove some of the ASSERT_CURIOSITYs or make them conditional on !ANDROID > > Any use of the iterator on Android would then result in a safe_read > silently failing and the iterator returning an empty list, which is what > the existing code did, as far as I can tell. > > Thoughts? > > -- > You received this message because you are subscribed to the Google Groups > "DynamoRIO Devs" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to dynamorio-devs+unsubscribe@googlegroups.com. > To post to this group, send email to dynamorio-devs@googlegroups.com. > Visit this group at https://groups.google.com/group/dynamorio-devs. > For more options, visit https://groups.google.com/d/optout. >
Sign in to reply to this message.
Yes, I don't think there's much that can be done without testing on Android. I've created https://github.com/DynamoRIO/dynamorio/issues/2008
Sign in to reply to this message.
|