Selected Publications

Embedded systems, as typified by modern mobile phones, are already seeing a drive toward using multi-core processors. The number of cores will likely increase rapidly in the future. Engineers and researchers need to be able to simulate systems as they are expected to be in a few generations' time, running simulations of many-core devices on today's multi-core machines. These requirements place heavy demands on the scalability of simulation engines, the fastest of which have typically evolved from just-in-time (JIT) dynamic binary translators (DBT) ...

Dynamic Binary Translation (DBT) is the key technology behind cross-platform virtualization and allows software compiled for one Instruction Set Architecture (ISA) to be executed on a processor supporting a different ISA. Under the hood, DBT is typically implemented using Just-In-Time (JIT) compilation of frequently executed program regions, also called traces. The main challenge is translating frequently executed program regions as fast as possible into highly efficient native code. As time for JIT compilation adds to the ...

Instruction set simulators (ISS) are vital tools for compiler and processor architecture design space exploration and verification. State-of-the-art simulators using just-in-time (JIT) dynamic binary translation (DBT) techniques are able to simulate complex embedded processors at speeds above 500 MIPS. However, these functional ISS do not provide microarchitectural observability. In contrast, low-level cycle-accurate ISS are too slow to simulate full-scale applications, forcing developers to revert to FPGA-based simulations. In this paper we demonstrate that it is possible to run ultra-high speed cycle-accurate instruction set simulations surpassing...

For memory-constrained embedded systems, code size is at least as important as performance. One way of increasing code density is to exploit compact instruction formats, e.g. ARM Thumb2, where the processor operates in either standard or compact instruction mode. The ARCompact ISA considered in this paper is different in that it allows freeform mixing of 16- and 32-bit instructions without a mode switch. Compact 16-bit instructions can be used anywhere in the code, provided that additional register constraints are satisfied. In this paper we present an integrated instruction selection and register allocation methodology and develop two approaches for mixed-mode code generation: a simple opportunistic ...


Recent Posts

Very happy to see that our simulation research has made it into a very successful Synopsys Inc. product that is making customers happy.

Check out the Synopsys Insight Newsletter article about the DesignWare ARC nSIM simulator, titled "DesignWare ARC nSIM: Speed, Accuracy and Visibility – Instruction Set Simulation without Compromise"!


The official case study about the research impact of The EnCore Microprocessor and the ArcSim Simulator project has been released. We are all very happy to see industry value our ideas and work.

The following is the summary of the Impact Report (you can view the full report here)


I was lucky to get an opportunity to present my PhD work at the Euro LLVM’12 conference. It seems that, at the time, we were the first to have built a production-ready concurrent JIT compiler using the LLVM framework.


The second silicon implementation of an extended EnCore processor is a test chip codenamed Castle, fabricated in a generic 90 nm CMOS process. All of the EnCore test chips are named after hills in Edinburgh; Castle is named after the rock on which Edinburgh Castle is built. The Castle chip contains an extended version of the EnCore processor, together with a 32 KB 4-way set-associative instruction cache and a 32 KB 4-way set-associative data cache.
