Guys,
There is a new branch in the CVS with tag inline-opcode-map, where I
have redone the "inline opcode map" optimization. The purpose of this
optimization is to provide an O(1) inverse mapping from jump address
to original opcode for threaded emulation. This is achieved by
storing the original opcode just before the jump label for the code
block implementing the opcode, in such a manner that it can be
retrieved trivially and reliably using the address of the jump label.
The point of this optimization is that it speeds up GC and marshaling.
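
To illustrate the lookup in portable C, here is a minimal sketch. It is not the actual implementation, which is done in assembler. It assumes a byte array standing in for the code area and a 32-bit opcode word stored directly before each label. The names code_area, emit_handler, address_to_opcode and OPCODE_SLOT are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: one 32-bit opcode word sits directly before each
   handler's jump label. All names here are illustrative only. */
enum { OPCODE_SLOT = sizeof(uint32_t) };

/* Plain bytes standing in for the machine code of the emulator loop. */
static unsigned char code_area[64];

/* Store the opcode at `offset` and return the address just after it,
   which plays the role of the jump label for that handler. */
static const unsigned char *emit_handler(size_t offset, uint32_t opcode)
{
    assert(offset + OPCODE_SLOT <= sizeof code_area);
    memcpy(&code_area[offset], &opcode, sizeof opcode);
    return &code_area[offset + OPCODE_SLOT];
}

/* O(1) inverse mapping: read the word stored just before the label.
   No table search is needed, only the label address itself. */
static uint32_t address_to_opcode(const unsigned char *label)
{
    uint32_t opcode;
    assert(label >= code_area + OPCODE_SLOT);
    memcpy(&opcode, label - OPCODE_SLOT, sizeof opcode);
    return opcode;
}
```

Calling emit_handler(0, 7) and passing the returned address to address_to_opcode gives back 7 without searching any table, which is the property that GC and marshaling benefit from.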
As far as I can tell, Christian's implementation must never have
worked because his optimized version of CodeArea::adressToOpcode
contains something that should be a cast but is missing the parens and
is therefore a syntax error: i.e. it could never even compile :-)
In any case, the old design could not work because GCC segments the
code into small blocks, each headed by an explicit label, and freely
reorders these blocks, adding jumps to preserve the intended control
flow where necessary. As a consequence, if you put some data just
before an explicit C++ label, there is no guarantee that you will also
find it there at run time.
My new design uses explicit labels only to convince GCC that the code
blocks are reachable. The rest of the trick is all done in assembler.
At the moment, I only have support for GCC on x86 (using the AT&T
assembler syntax - which is the default for gas).
The new implementation passes the entire test suite and benchmarks
show that it is consistently faster than current devel (i.e. without
the inline opcode map optimization). A quick test suggests that GC is
now faster by at least 6.85%.
I would really appreciate a hand in porting the trick to new
platforms. This involves determining how to load the address of a
jump label. On GCC/x86, this is done using something like:
    asm("movl $LABEL,%0" : "=m" (ptr));
Another issue is the alignment of jump labels (for sparc, this is a
multiple of 4 - I also use that for x86, although I was not able to
find documentation on the issue - if anybody has accurate knowledge on
this topic, please share).
It would be nice to port it at least to sparc and ppc. Could someone
using these platforms please lend me a hand? If you don't know what
the assembler instruction is for loading an address, you can easily
find out. Write a C program with code like this:
    foo:
    ptr = &&foo;
then invoke gcc -S -fverbose-asm on the file and look at the assembly
code.
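
For anyone who wants a complete file to feed to the compiler, the following is one possible sketch. It assumes GCC's "labels as values" extension (&&label and goto *ptr). The function name jump_via_label_address and the hits counter are made up for this example and only exist to keep the label and the pointer alive.

```c
#include <assert.h>
#include <stddef.h>

/* Takes the address of the label `foo` and jumps through it once.
   Relies on the GCC extension for label addresses and computed goto. */
int jump_via_label_address(void)
{
    void *ptr;
    int hits = 0;

foo:
    ptr = &&foo;            /* look for this load in the generated .s file */
    assert(ptr != NULL);
    if (++hits < 2)
        goto *ptr;          /* computed goto back to foo, taken once */
    return hits;            /* 2: one fall-through plus one computed jump */
}
```

Saving this as, say, label.c (the file name is arbitrary) and running gcc -S -fverbose-asm label.c produces label.s, where the instruction that loads the address of foo can be read off.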
Cheers,
--
Dr. Denys Duchier
Équipe Calligramme
LORIA, Nancy, FRANCE