Dynamic Recompilation Resources
Collected by Michael König, adjusted by David Sharp and Michael König
First of all, let me note that "recompilation" isn't quite the right term,
although it is often used. Compilation is associated with high-level languages,
which are not involved here. A more precise term would be binary
code translation.
Introduction
- The interview
with Jeremy Chadwick on Archaic Ruins provides a good introductory
warning of how complex code translation is.
- And the page that shows a part of the Embra documentation gives
a good overview of how dynamic
binary translation works.
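To make the basic idea concrete, here is a minimal sketch of a translation cache, the mechanism at the heart of most dynamic binary translators: a block of guest code is translated the first time it is reached, and the result is reused on every later visit. The toy guest instruction set (`load`, `add`, `jlt`, `halt`) and all function names are invented for illustration and are not taken from Embra or any emulator listed here. Python functions stand in for generated host machine code.

```python
# Toy guest program: (opcode, operand) pairs. The instruction set is hypothetical.
PROGRAM = [
    ("load", 0),        # 0: acc = 0
    ("add", 5),         # 1: acc += 5
    ("jlt", (20, 1)),   # 2: if acc < 20 goto 1, else fall through
    ("halt", None),     # 3: stop
]


def translate_block(program, pc):
    """Translate one basic block starting at pc into a Python function.

    The generated function updates the guest state and returns the next
    guest pc (or None to stop). A block ends at the first branch or halt.
    """
    lines = ["def block(state):"]
    while True:
        op, arg = program[pc]
        if op == "load":
            lines.append(f"    state['acc'] = {arg}")
        elif op == "add":
            lines.append(f"    state['acc'] += {arg}")
        elif op == "jlt":
            limit, target = arg
            lines.append(
                f"    return {target} if state['acc'] < {limit} else {pc + 1}"
            )
            break
        elif op == "halt":
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    namespace = {}
    exec("\n".join(lines), namespace)  # stands in for emitting host code
    return namespace["block"]


def run(program):
    """Execute the guest program, translating each block at most once."""
    cache = {}            # guest pc -> translated block
    state = {"acc": 0}
    translations = 0
    pc = 0
    while pc is not None:
        block = cache.get(pc)
        if block is None:             # first visit: translate and remember
            block = translate_block(program, pc)
            cache[pc] = block
            translations += 1
        pc = block(state)             # run the translated code
    return state["acc"], translations
```

In this sketch the loop body at guest address 1 executes three times but is translated only once, which is where a translator gains over a plain interpreter. Real translators also have to deal with self-modifying code, cache eviction and direct linking of blocks, none of which is shown here.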
Applications
It's always useful to see how others solved the problems that arose
with code translation. Most of the following product whitepapers
provide only general information, but if you read between the lines you'll
be able to find some useful hints.
Those who want to have even more in-depth information, or want to design their own binary translator,
should take a look at the free source code of the emulators collected below.
Documentation
- The technotes on the DR
emulator for PowerMac make a good distinction between interpretive
and recompiling emulation, since the emulator uses both approaches for compatibility
reasons.
- BYTE Magazine features the article
Building the Virtual PC, which explains how the emulation
of a complete PC on PowerMacs was achieved, including dynamic translation of Pentium to PowerPC code.
- Different versions of
Shade
simulate and trace SPARC V8/V9 and MIPS I binaries on a SPARC V8 host. Much of today's terminology originates from
this work. The paper gives very good hints on various topics and problems connected with dynamic code translation,
but some knowledge of SPARC assembler is recommended.
- FX!32,
a hybrid interpreter/profiler combined with a static binary translator for ix86-NT code on Alpha-NT machines,
uses larger translation units for better optimisation and utilises a profiling mechanism that might be interesting
for dynamic code translation, too.
- Probably the most referenced document for this topic is that on the
Executor internals,
because of the impressive performance of Executor. In my opinion the whitepaper
is often not precise enough and provides some code that doesn't help much,
but it might be useful anyway.
- The new CPU by Transmeta called
Crusoe
uses dynamic Code Morphing to translate x86 code into its internal VLIW encoding.
The technology documentation and also the
US Patent
for the Crusoe CPU reveal some nice information about Code Morphing.
- Rather new is the Java
HotSpot Compiler. At first it interprets the code and collects profiling
information. Using that data, the JIT compiler is invoked only for those
parts that really need a speed-up. This reduces the overhead of code generation,
which is especially costly when code is produced for parts of
the program that are executed only once.
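The profile-first approach described for HotSpot can be sketched as a mixed-mode executor: every block starts out interpreted, an execution counter is kept per block, and a block is compiled only once it has proven to be hot. This is a simplified illustration and not HotSpot's actual design; the threshold value, the two-operation block format and all names below are assumptions made for the example.

```python
HOT_THRESHOLD = 3  # hypothetical value; real systems tune this carefully


def interpret(block, acc):
    """Execute a block one operation at a time (slow path)."""
    for op, arg in block:
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        else:
            raise ValueError(f"unknown op {op!r}")
    return acc


def compile_block(block):
    """Fold the whole block into a single Python function (fast path)."""
    expr = "acc"
    for op, arg in block:
        if op == "add":
            expr = f"({expr} + {arg})"
        elif op == "mul":
            expr = f"({expr} * {arg})"
        else:
            raise ValueError(f"unknown op {op!r}")
    return eval(f"lambda acc: {expr}")  # stands in for emitting host code


class MixedModeExecutor:
    """Interpret cold blocks; compile a block once it becomes hot."""

    def __init__(self):
        self.counts = {}     # block id -> number of interpreted runs
        self.compiled = {}   # block id -> compiled function

    def execute(self, block_id, block, acc):
        fn = self.compiled.get(block_id)
        if fn is not None:
            return fn(acc)   # already compiled: take the fast path
        self.counts[block_id] = self.counts.get(block_id, 0) + 1
        if self.counts[block_id] >= HOT_THRESHOLD:
            # Hot enough: pay the compilation cost once, use it next time.
            self.compiled[block_id] = compile_block(block)
        return interpret(block, acc)
```

A block that runs only once or twice never reaches the threshold, so no compilation time is spent on it, which is exactly the saving the paragraph above describes.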
Source Code
- Some documents on a dynamically recompiling Megadrive emulator called
Generator, written as a
final-year university project by James Ponder (an old Acorn coder). They
give a good overview of how to code a portable 68000 dynamic recompiler,
and full C source code is available as well.
- The Unix Playstation emulator Sope
features a dynarec for Alpha CPUs.
- FPSE is yet another Playstation emulator, which has an additional dynarec for x86, and it is the only open source PSX emulator that is still being developed.
- The code for the dynarec (from R4300i to x86) of the N64 emulator
1964 is now open source too.
- Even UAE features an x86 recompiler for its 68K core now.
- pex was meant to be a full PSX emulator for DOS,
but it still doesn't emulate much more than the R3000A.
- NEStra is a dynamically recompiling
NES emulator for Linux, which is of course available for different platforms.
Little documentation is available, but full C source code is provided, as well
as the backend for an x86 platform.
- YAE, an Apple II emulator for Unix, also has a WIP version
of a dynamic recompiling CPU core for MIPS and SPARC.
- There is also a Spectrum 48k emulator for DOS with lazy code translation, called
SpectrEm-Dr.
Dynamic Code Generation
The last step in binary translation is the generation of machine code.
This part is also used in "serious" applications, and Dawson
Engler has done extensive research on the topic.
His VCODE
shows how retargetable and fast dynamic code generation can be achieved.
But as the VCODE definition isn't totally suitable for emulation, a special
abstract machine code might have to be defined. For that purpose a comparison
between various RISC architectures can be very helpful; one is provided
by the Web Extension
for Computer Organization & Design by Patterson and Hennessy.
The New Jersey Machine-Code Toolkit
helps to write disassemblers and code generators and might therefore be interesting for code translation as well.
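As a rough illustration of what an abstract machine code for retargetable code generation could look like, the sketch below defines two abstract operations on virtual registers and lowers them to x86 and to MIPS. It is not VCODE's interface and not the New Jersey Machine-Code Toolkit's; the operation names, the register mapping and the function names are all made up for this example, and it emits assembly text instead of encoded machine code to stay short.

```python
# Virtual registers of a hypothetical abstract machine, mapped per target.
X86_REGS = {"r0": "eax", "r1": "ebx"}
MIPS_REGS = {"r0": "$t0", "r1": "$t1"}


def emit_x86(op):
    """Lower one abstract operation to x86 assembly (Intel syntax)."""
    kind = op[0]
    if kind == "mov_imm":
        _, dst, imm = op
        return f"mov {X86_REGS[dst]}, {imm}"
    if kind == "add":
        _, dst, src = op
        return f"add {X86_REGS[dst]}, {X86_REGS[src]}"   # two-operand form
    raise ValueError(f"unknown abstract op {kind!r}")


def emit_mips(op):
    """Lower one abstract operation to MIPS assembly."""
    kind = op[0]
    if kind == "mov_imm":
        _, dst, imm = op
        return f"li {MIPS_REGS[dst]}, {imm}"             # assembler pseudo-op
    if kind == "add":
        _, dst, src = op
        d, s = MIPS_REGS[dst], MIPS_REGS[src]
        return f"addu {d}, {d}, {s}"                     # three-operand form
    raise ValueError(f"unknown abstract op {kind!r}")


def generate(ops, emit):
    """Run a sequence of abstract operations through one backend."""
    return [emit(op) for op in ops]


# The same abstract sequence serves both targets.
ABSTRACT_CODE = [
    ("mov_imm", "r0", 5),
    ("mov_imm", "r1", 7),
    ("add", "r0", "r1"),
]
```

The translator front end only has to produce the abstract operations; supporting another host then means writing one more `emit_*` function. The difference between x86's two-operand `add` and MIPS's three-operand `addu` is the kind of detail an architecture comparison helps to anticipate when choosing the abstract operations.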
Other Collections
If you need further information you can try to search via the following
other collections.
Mailing List
Neil Bradley created a mailing list for discussion about dynamic recompilation.
- To subscribe to dynarec: Send a message to
[email protected]
with the word
subscribe
in the message body.
- To unsubscribe from dynarec: Send a message to
[email protected]
with the word
unsubscribe
in the message body.
- To post a message to the dynarec list, send a message to
[email protected].
The Acorn Emulation Page - David
Sharp
© Copyright David Sharp 1997,1998,1999,2000