Processor

The 4stack processor project is a research project on high-performance, low-cost computing. The following information is available:

Market position
The processor architecture (64k gzipped PostScript), also available as a 200k PDF.
User manual including instruction set architecture and examples (80k gzipped PostScript), also available as a 230k PDF.
Simulator including assembler, debugger and sample code (44k gzipped tar file). It requires Unix (or a derivative) and the X Window System; to use the assembler, install Gforth. For Windows users, there is a precompiled self-installing archive (750k).
Compiler discussion
Diploma thesis: Implementation of the 4stack Processor Using Verilog
FAQ. Yes, there have been questions, and here are the answers.

The 4stack processor uses stack-based instructions in a four-way VLIW processor. If implemented in state-of-the-art technology, the 4stack processor would significantly outperform high-end DSPs like the TMS320C6x or the TigerSHARC, and would also allow the complete application to run on a single processor (no additional RISC core required).

The increasing performance demands of modern applications require an increasing level of parallel execution. The 4stack processor architecture focuses on fine-grained parallelism. A number of tradeoff considerations led to the architectural concept of a VLIW with four stacks and two data units:

Stack-like instructions with implicit addressing reduce memory bandwidth. Reduced memory bandwidth improves cache usage and speeds up large programs.
Multiple stacks overcome the limited parallelism of single-stack machines. Using several stack-organized register files instead of one huge multiported register file minimizes conflicts and greatly reduces the gate count of these units.
Procedure calls are cheaper on stack machines, since register spills and fills are done automatically. This makes high-level languages (especially object-oriented ones) and modern operating systems faster.
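
The idea of one instruction word driving four stacks at once, with operands addressed implicitly, can be illustrated with a toy model. This is only a sketch under stated assumptions: the operation names (lit, add, mul, dup, nop) and the functions step and run are hypothetical and are not the real 4stack instruction set, and the stacks are kept fully independent for simplicity, which says nothing about how the real design moves data between stacks.

```python
# Toy model of a four-way VLIW with one stack per slot.
# ASSUMPTION: the operation names and encoding are invented for
# illustration; they are not the real 4stack instruction set.

def step(stacks, word):
    """Execute one instruction word: one operation per stack."""
    assert len(stacks) == len(word)
    for stack, op in zip(stacks, word):
        if op == "nop":
            continue
        if isinstance(op, tuple) and op[0] == "lit":
            stack.append(op[1])              # push a literal value
        elif op == "add":
            b, a = stack.pop(), stack.pop()  # operands are implicit:
            stack.append(a + b)              # always the top two items
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "dup":
            stack.append(stack[-1])          # duplicate top of stack
        else:
            raise ValueError(f"unknown op: {op!r}")


def run(program, n_stacks=4):
    """Run a list of instruction words on n_stacks empty stacks."""
    stacks = [[] for _ in range(n_stacks)]
    for word in program:
        step(stacks, word)
    return stacks


# Each tuple is one instruction word with four slots.
program = [
    (("lit", 1), ("lit", 2), ("lit", 3), ("lit", 4)),
    (("lit", 10), ("lit", 20), ("lit", 30), ("lit", 40)),
    ("add", "mul", "add", "mul"),
    ("dup", "nop", "dup", "nop"),
    ("mul", "nop", "add", "nop"),
]
result = run(program)
print(result)  # [[121], [40], [66], [160]]
```

Note that no operation except lit carries an operand field: add, mul and dup name neither source nor destination, which is the sense in which implicit addressing keeps instructions short.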

Bernd Paysan, 21may1997, 26nov2000