|
|
Created:
7 years, 9 months ago by Edmund.Grimley.Evans Modified:
7 years, 9 months ago CC:
dynamorio-devs_googlegroups.com Visibility:
Public. |
DescriptionCommit log for first patchset:
---------------
i#1569 AArch64: Add new API funcs dr_insert_{save,restore}_stack_pointer.
These functions are required on AArch64, where it is not possible to
directly load or save the stack pointer.
---------------
Patch Set 1 #
Total comments: 2
Patch Set 2 : with test #
Total comments: 16
MessagesTotal messages: 17
Sign in to reply to this message.
Please add a test for any new feature. However, I'm not sure what the use case is for these: wouldn't a client be using dr_swap_to_clean_stack or the various clean call functions? Are there any cases in any existing sample or tool that need this?
Sign in to reply to this message.
On 2016/06/27 17:31:42, bruening wrote: > Please add a test for any new feature. > > However, I'm not sure what the use case is for these: wouldn't a client be using > dr_swap_to_clean_stack or the various clean call functions? Are there any cases > in any existing sample or tool that need this? Kevin used these new functions for implementing some of the clean call functions, in mangle_shared.c, so they get exercised by that. I'm not sure about other uses. Kevin might know...
Sign in to reply to this message.
On 2016/06/27 17:31:42, bruening wrote: > Please add a test for any new feature. > > However, I'm not sure what the use case is for these: wouldn't a client be using > dr_swap_to_clean_stack or the various clean call functions? Are there any cases > in any existing sample or tool that need this? I have observed a series of common use cases for dr internals and clients: instr_create_save_to_dc(DR_REG_SP, DC_OFFSET) instr_create_save_to_tls(DR_REG_SP, TLS_SLOT) dr_save_reg(DR_REG_SP, TLS_SLOT) Basically all related APIs would emit code as a single store XINST_CREATE_store(dc, dst, DR_REG_SP) And this would make the AArch64 encoder to fail since we could not encode a single instruction to move the stack pointer to a memory operand directly. We need an additional spare register to temporarily store the stack pointer and perform the memory write. Due to this unfornate feature, a lot of dr extension library needs to consider AArch64 specifically. For example, drreg needs to use this API when reserving/unreserving SP. drwarp needs it when replacing native bb and from native finish to DR ibl procedure. Same for drutil if the user uses SP as a scratch register. And of course it would cause a lot harzards if users might think DR_REG_XSP is a gpr and free for load/store .. For the moment, the encoder just failed without telling anything if we create a store of SP to mem. Originally I thought I could put an ASSERT in the INSTR_CREATE macros to check SP specifically and remind people to use this API. However it seems the ASSERT fails in the scripts of copying API. Another approach is to add the assert in instr_create function so that users can immediately know which macro caused the problem. I'm using this API in the dr clean call and various dr extension libraries. All related dr and dr ext tests passed in my local repository. It might be enough exercises for tests cases.
Sign in to reply to this message.
OK, but please add a regression test to suite/tests/client-interface/ https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c File core/lib/instrument.c (right): https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode... core/lib/instrument.c:5590: "dr_insert_write_tls_field: drcontext cannot be NULL"); wrong function name https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode... core/lib/instrument.c:5629: "dr_insert_write_tls_field: drcontext cannot be NULL"); wrong function name
Sign in to reply to this message.
When I was trying to implement a regression test for dr_insert_{save,restore}_stack_pointer, I found that the current implementation doesn't support PC-relative memory opnd as the src/dst argument on AArch64. Unfortunately an additional scratch register is required. And also since this API is not useful anyway on other architectures, I'm planning to call back this change and implement a new and useful API on all archs. The proposed API is: void dr_insert_save_reg(void *drcontext, instrlist_t *ilist, instr *where, reg_id_t reg, opnd_t dst, reg_id_t scratch1, reg_id_t scratch2); void dr_insert_restore_reg(void *drcontext, instrlist_t *ilist, instr *where, reg_id_t reg, opnd_t src, reg_id_t scratch1, reg_id_t scratch2); The API is very generic, reg_id_t reg could be any register listed in opnd.h and the opnd dst/src could be a memory reference, register and immediate(src only) The API wrapper function would be in instrument.c and it would call architecture dependent implementation in arch/xx/emit_utils.c. Individual test cases covering all different combinations will be provided. I'm planning to implement the API and tests on AArch64 first and then on x86 and arm. I'm wondering if you like this idea or not before I start working on it. On Friday, 1 July 2016 01:30:00 UTC+1, Derek Bruening wrote: > > OK, but please add a regression test to suite/tests/client-interface/ > > > https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c > File core/lib/instrument.c (right): > > https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode... > > core/lib/instrument.c:5590 > <https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode...>: > "dr_insert_write_tls_field: drcontext cannot > be NULL"); > wrong function name > > https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode... > > core/lib/instrument.c:5629 > <https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode...>: > "dr_insert_write_tls_field: drcontext cannot > be NULL"); > wrong function name > > https://codereview.appspot.com/301080043/ >
Sign in to reply to this message.
Maybe I'm mis-understanding, but it does not sound like a good idea to change the API to take a second scratch reg just because of a not-yet-implemented piece of the A64 port which will presumably be finished soon -- if the second scratch reg is not needed once that piece is implemented, it does not seem worth adding to an interface, right? The proposed names are too close to dr_{save,restore}_reg() and IMHO will cause confusion. These are not "saving" and "restoring", which implies placing into spill slots -- your proposal is a general "load" and "store". I guess I'd want to see us take a step back and review all the existing code that tools use to load or store a GPR and then decide whether it's better to go replace all of that with an API routine or to instead keep it all the same and just have the earlier proposed API for the stack pointer, which will depend on several factors: how much more awkward and extra work it is for the code is to have an insertion routine with a scratch reg instead of just XINST_CREATE_load (and if there's a check for the stack pointer to avoid the scratch reg, maybe we may as well have that check instead decide whether to use one routine or another); how many uses are already behind an API routine that can handle the stack pointer (e.g., drreg in general); how many uses can guarantee they aren't touching the stack pointer today. On Fri, Jul 1, 2016 at 7:22 AM, Kevin Zhou <zhoubot@gmail.com> wrote: > When I was trying to implement a regression test for > dr_insert_{save,restore}_stack_pointer, I found that the current > implementation doesn't support PC-relative memory opnd as the src/dst > argument on AArch64. Unfortunately an additional scratch register is > required. And also since this API is not useful anyway on other > architectures, I'm planning to call back this change and implement a new > and useful API on all archs. > > The proposed API is: > void dr_insert_save_reg(void *drcontext, instrlist_t *ilist, instr *where, > reg_id_t reg, opnd_t dst, reg_id_t scratch1, reg_id_t scratch2); > void dr_insert_restore_reg(void *drcontext, instrlist_t *ilist, instr > *where, reg_id_t reg, opnd_t src, reg_id_t scratch1, reg_id_t scratch2); > The API is very generic, reg_id_t reg could be any register listed in > opnd.h and the opnd dst/src could be a memory reference, register and > immediate(src only) > > The API wrapper function would be in instrument.c and it would call > architecture dependent implementation in arch/xx/emit_utils.c. Individual > test cases covering all different combinations will be provided. I'm > planning to implement the API and tests on AArch64 first and then on x86 > and arm. > > I'm wondering if you like this idea or not before I start working on it. > > On Friday, 1 July 2016 01:30:00 UTC+1, Derek Bruening wrote: >> >> OK, but please add a regression test to suite/tests/client-interface/ >> >> >> https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c >> File core/lib/instrument.c (right): >> >> https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode... >> >> core/lib/instrument.c:5590 >> <https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode...>: >> "dr_insert_write_tls_field: drcontext cannot >> be NULL"); >> wrong function name >> >> https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode... >> >> core/lib/instrument.c:5629 >> <https://codereview.appspot.com/301080043/diff/1/core/lib/instrument.c#newcode...>: >> "dr_insert_write_tls_field: drcontext cannot >> be NULL"); >> wrong function name >> >> https://codereview.appspot.com/301080043/ >> >
Sign in to reply to this message.
On 2016/07/01 18:44:13, bruening wrote: > Maybe I'm mis-understanding, but it does not sound like a good idea to > change the API to take a second scratch reg just because of a > not-yet-implemented piece of the A64 port which will presumably be finished > soon -- if the second scratch reg is not needed once that piece is > implemented, it does not seem worth adding to an interface, right? There is a misunderstanding, I think. A second scratch register will always be required for storing the SP or PC to an absolute or PC-relative address because you'll need one register for the value to be stored and one for the address it's being stored to. You'd also need a second scratch register for storing to a register-plus-offset address if the offset is too big, but I'm not sure how important it would be to recognise that case. I don't personally have enough experience with writing clients to know what API would be most convenient...
Sign in to reply to this message.
On 2016/07/01 20:59:18, Edmund.Grimley.Evans wrote: > On 2016/07/01 18:44:13, bruening wrote: > > Maybe I'm mis-understanding, but it does not sound like a good idea to > > change the API to take a second scratch reg just because of a > > not-yet-implemented piece of the A64 port which will presumably be finished > > soon -- if the second scratch reg is not needed once that piece is > > implemented, it does not seem worth adding to an interface, right? > > There is a misunderstanding, I think. A second scratch register will always be > required for storing the SP or PC to an absolute or PC-relative address because > you'll need one register for the value to be stored and one for the address it's > being stored to. You'd also need a second scratch register for storing to a > register-plus-offset address if the offset is too big, but I'm not sure how > important it would be to recognise that case. I'm still confused: If an addressing mode is not present in the ISA, it should not be an operand type in the IR, so I don't see how such an address would be passed to these API routines? > I don't personally have enough experience with writing clients to know what API > would be most convenient... It is always more painful and complex to have to get a scratch register. Given that it's rare to use sp as a GPR, and that we can have drreg not hand it out for aarch64 at all, and many clients will use drreg for all regs they'll manipulate, it seems better to me to not replace the existing routines with a very general interface.
Sign in to reply to this message.
> I'm still confused: If an addressing mode is not present in the ISA, it > should not be an operand type in the IR, so I don't see how such an > address would be passed to these API routines? > PC-relative is present as an addressing more; it's just not available for a store instruction. I think this is about cases where the client code might have wanted to manipulate the app's SP with an INSTR_CREATE_ macro. It is always more painful and complex to have to get a scratch register. > Given that it's rare to use sp as a GPR, and that we can have drreg not > hand it out for aarch64 at all, and many clients will use drreg for all > regs they'll manipulate, it seems better to me to not replace the > existing routines with a very general interface. > I'm not sure which routines you think are candidates for being replaced... This isn't about coping with drreg having handed out SP instead of a proper GPR; this is about clients that really want to do something with the stack pointer.
Sign in to reply to this message.
On 2016/07/02 08:48:34, edmund.grimley.evans_gmail.com wrote: > It is always more painful and complex to have to get a scratch register. > > Given that it's rare to use sp as a GPR, and that we can have drreg not > > hand it out for aarch64 at all, and many clients will use drreg for all > > regs they'll manipulate, it seems better to me to not replace the > > existing routines with a very general interface. > > > > I'm not sure which routines you think are candidates for being replaced... > > This isn't about coping with drreg having handed out SP instead of a proper > GPR; this is about clients that really want to do something with the stack > pointer. It was, but the new proposal was to create general routines for loading and storing of any GPR, which implies that we'd recommend that they be used everywhere and that nobody should use XINST_CREATE_load or other routines for any GPR.
Sign in to reply to this message.
Users can have the freedom to use either XINST_CREATE_load or a general API provided by us. Some macros such as XINST_CREATE_load could be a bit flaky for A64. I've been writing x86 clients before and now I realize it is very easy to get wrong for writing clients on A64 or even different with A32 as well, such as the load/store of stack pointer, push/pop registers, PC-relative load/store, move of a pointer-sized immed, most of them required more than one instruction. We could either assume users are fully aware of these differences then it is not a necessary to have the new API, or we provide a convenience API for new users to write arch-portable client codes. For the moment probably it is better to just keep this API and forbids users to pass in PC-relative memory operand. > It is always more painful and complex to have to get a scratch register. A64 has twice number of registers than x64-64 and it is easier to pick a scratch from the least used register list. On Sunday, 3 July 2016 18:08:59 UTC+1, Derek Bruening wrote: > > On 2016/07/02 08:48:34, edmund.grimley.evans_gmail.com wrote: > > > > Given that it's rare to use sp as a GPR, and that we can have > drreg not > > > hand it out for aarch64 at all, and many clients will use drreg for > all > > > regs they'll manipulate, it seems better to me to not replace the > > > existing routines with a very general interface. > > > > > > I'm not sure which routines you think are candidates for being > replaced... > > > This isn't about coping with drreg having handed out SP instead of a > proper > > GPR; this is about clients that really want to do something with the > stack > > pointer. > > It was, but the new proposal was to create general routines for loading > and storing of any GPR, which implies that we'd recommend that they be > used everywhere and that nobody should use XINST_CREATE_load or other > routines for any GPR. > > https://codereview.appspot.com/301080043/ >
Sign in to reply to this message.
Commit log for latest patchset: --------------- i#1569 AArch64: Add new API funcs dr_insert_{save,restore}_stack_pointer. These functions are required on AArch64, where it is not possible to directly load or save the stack pointer. Review-URL: https://codereview.appspot.com/301080043 ---------------
Sign in to reply to this message.
I guess I'm still not sure about these: looking at the test, using these involves ifdef AARCH64 code to get a scratch register and to pass it in, with more ifdef code for x86_64 vs other platforms: so the API does not feel like a cross-platform solution, as the caller is basically special-casing code anyway. As there's nothing special here (vs, say, dr_swap_to_clean_stack(), which the user cannot implement on his own) and this is just for convenience, are we sure the resulting code is simpler with these routines? Are the internal uses going to be able to use something like this? They're mostly using things like instr_create_save_to_{dc,tls}. What will that look like for A64? Could we see what they would look like? > For example, drreg needs to use this API when reserving/unreserving SP. It may be best for drreg to simply never hand out SP on A64. > Same for drutil if the user uses SP as a scratch register. I don't follow this one: the user passing SP as a scratch register to a library routine that doesn't use it as a load/store base (the typical case is to not use it as a load/store base) would work fine, and for library routines that do perform a load/store, it seems like user error (restrictions should be documented of course).
Sign in to reply to this message.
https://codereview.appspot.com/301080043/diff/20001/core/lib/instrument.c File core/lib/instrument.c (right): https://codereview.appspot.com/301080043/diff/20001/core/lib/instrument.c#new... core/lib/instrument.c:5582: * be DR_REG_NULL. optional: it is arguably cleaner to not paste the header comment here (IMHO duplicating the comment just means more maintenance burden and "comment rot" as one side is changed but not the other over time) https://codereview.appspot.com/301080043/diff/20001/core/lib/instrument.c#new... core/lib/instrument.c:5596: "relative address operand is not allowed on AARCH64"); This should be in the docs. https://codereview.appspot.com/301080043/diff/20001/core/lib/instrument.c#new... core/lib/instrument.c:5626: * be DR_REG_NULL. Ditto https://codereview.appspot.com/301080043/diff/20001/core/lib/instrument.c#new... core/lib/instrument.c:5641: "relative address operand is not allowed on AARCH64"); Ditto https://codereview.appspot.com/301080043/diff/20001/core/lib/instrument.c#new... core/lib/instrument.c:5647: if(opnd_is_reg(src)) code style violation https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... File suite/tests/client-interface/move-stack.c (right): https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.c:46: if(my_stack == is_this_my_stack) code style violation https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.c:47: print("I got printed in my own stack!\n"); nit: s/in/on/ here and below https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.c:56: if(my_stack == is_this_my_stack) code style violation https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... File suite/tests/client-interface/move-stack.dll.c (right): https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.dll.c:36: #define ALIGN_FORWARD(x, alignment) \ Use client_tools.h to avoid duplicating this https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.dll.c:39: static byte new_stack[STK_SIZE+16]; Use ALIGN_VAR https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.dll.c:41: static int encounter = 0; Please add a comment explaining what "encounter" is for https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.dll.c:50: # define SCRATCH1 DR_REG_XAX Alternatively, you could use drreg and avoid hardcoding https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.dll.c:73: ptr_int_t new_stack_ptr = ALIGN_FORWARD(new_stack + STK_SIZE, 16); See above: this can be removed https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.dll.c:102: dr_restore_reg(drcontext, bb, instr, SCRATCH0, SPILL_SLOT_2); nit: restore in reverse order is typical https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.dll.c:114: encounter ++; inconsistent style https://codereview.appspot.com/301080043/diff/20001/suite/tests/client-interf... suite/tests/client-interface/move-stack.dll.c:120: encounter ++; inconsistent style
Sign in to reply to this message.
> I guess I'm still not sure about these: looking at the test, using these > involves ifdef AARCH64 code to get a scratch register and to pass it in, with > more ifdef code for x86_64 vs other platforms: so the API does not feel like a > cross-platform solution, as the caller is basically special-casing code anyway. Yes it is a problem if it is not allowed to pass two scratch regs as arguments. Probably it is easier to just leave in the documentation warning on the AArch64 restrictions. Another possible approach is to add an extra wrapper on instr_create_Ndst_Msrc such as #define XINST_CREATE_load(dcontext, reg, mem) instr_create_load(dcontext, reg, mem) In the wrapper we could put a CLIENT_ASSERT which checks the stack usage. The advantage is it could directly reflect the stack trace when debugging instead of checking it in the encoder. It would reflect which function it causes the problem, dr_insert_write_tls_field, dr_save_reg ... etc... In our local repositories, the dr internals uses this API in the {prepare_for,clean_after}_clean_call. It means a bit more ifdefs if we don't use this API.
Sign in to reply to this message.
On 2016/07/07 16:21:23, Kevin Zhou wrote: > > I guess I'm still not sure about these: looking at the test, using these > > involves ifdef AARCH64 code to get a scratch register and to pass it in, with > > more ifdef code for x86_64 vs other platforms: so the API does not feel like a > > cross-platform solution, as the caller is basically special-casing code > anyway. > > Yes it is a problem if it is not allowed to pass two scratch regs as arguments. > Probably it is easier to just leave in the documentation warning on the AArch64 > restrictions. Another possible approach is to add an extra wrapper on > instr_create_Ndst_Msrc such as > #define XINST_CREATE_load(dcontext, reg, mem) instr_create_load(dcontext, reg, > mem) > > In the wrapper we could put a CLIENT_ASSERT which checks the stack usage. The > advantage is it could directly reflect the stack trace when debugging instead of > checking it in the encoder. It would reflect which function it causes the > problem, dr_insert_write_tls_field, dr_save_reg ... etc... > > In our local repositories, the dr internals uses this API in the > {prepare_for,clean_after}_clean_call. It means a bit more ifdefs if we don't use > this API. As internal helper routines these seem fine, as I'm guessing the internal code already has an extra reg it can pass in. But to decide whether to make public API routines, where we should take more care than when adding an internal helper, seeing some examples of use of the API in ext/ or api/samples/ would help to decide.
Sign in to reply to this message.
|