Automated Code Generation using Dynamic Programming

Book - Automatic Code Generation
  • Type: Book
  • Author: Igor Böhm
  • Publisher: VDM Verlag Dr. Mueller e.K. (8 Feb 2008)
  • ISBN-10: 3836461587
  • ISBN-13: 978-3836461580
  • Abstract:

    Building compiler back ends from declarative specifications that map tree structured intermediate representations onto target machine code is the topic of this book. Although many tools and approaches have been devised to tackle the problem of automated code generation, there is still room for improvement. In this context we present HBURG, an implementation of a code generator generator that emits compiler back ends from concise tree pattern specifications written in our code generator description language. The language features attribute grammar style specifications and allows for great flexibility with respect to the placement of semantic actions. Our main contribution is to show that these language features can be integrated into automatically generated code generators that perform optimal instruction selection based on tree pattern matching combined with dynamic programming. In order to substantiate claims about the usefulness of our language we provide two complete examples that demonstrate how to specify code generators for RISC and CISC architectures. Compiler writers are the primary target audience of this book.

Automated Code Generation using Dynamic Programming Techniques

Master Thesis - Automatic Code Generation Using Dynamic Programming Techniques
  • Abstract:

    Building compiler back ends from declarative specifications that map tree structured intermediate representations onto target machine code is the topic of this thesis. Although many tools and approaches have been devised to tackle the problem of automated code generation, there is still room for improvement. In this context we present HBURG, an implementation of a code generator generator that emits compiler back ends from concise tree pattern specifications written in our code generator description language. The language features attribute grammar style specifications and allows for great flexibility with respect to the placement of semantic actions. Our main contribution is to show that these language features can be integrated into automatically generated code generators that perform optimal instruction selection based on tree pattern matching combined with dynamic programming. In order to substantiate claims about the usefulness of our language we provide two complete examples that demonstrate how to specify code generators for RISC and CISC architectures.

  • Kurzfassung:

    Diese Diplomarbeit beschreibt HBURG, ein Werkzeug das aus einer Spezifikation des abstrakten Syntaxbaums eines Programms und der Spezifikation der gewünschten Zielmaschine automatisch einen Codegenerator für diese Maschine erzeugt. Abbildungen zwischen abstrakten Syntaxbäumen und einer Zielmaschine werden durch Baummuster definiert. Für diesen Zweck haben wir eine deklarative Beschreibungssprache entwickelt, die es ermöglicht den Baummustern Attribute beizugeben, wodurch diese gleichsam parametrisiert werden können. Darüber hinaus hat man die Möglichkeit Baummuster an beliebigen Stellen mit semantischen Aktionen zu versehen. Wir zeigen, dass diese Spracheigenschaften in solchen Codegeneratoren anwendbar sind, die auf der Technik der Baummustererkennung und dynamischer Programmierung basieren. Zwei Beispiele der Codegenerator Spezifikationen für RISC und CISC Zielmaschinen sollen die Mächtigkeit der entwickelten Sprache aufzeigen.


Rule Based vs. Statistical Chunking of CoNLL Data Sets

Seminal Paper - Rule Based vs. Statistical Chunking of CoNLL Data Sets
  • Abstract:

    One of the most common operations in language processing are segmentation and labeling [7]. Chunking is a popular representative of a segmentation process which aims to segment tagged tokens into meaningful structures. This paper compares two chunking approaches, namely an approach based on regular expression rules developed by a human, and a machine based chunking approach based on a N-gram statistical tagger. Experimental results show that the performance of the machine based chunker is very similar to the results obtained by the regular expression chunker. Another interesting fact is that it was considerably harder to define regular expressions which capture noun phrases than verb phrases. Obviously this difficulty was caused by the fact that the structure of noun phrases may be very complex.


Unigram Backoff vs. TnT: Evaluating Part of Speech Taggers

Seminal Paper - Unigram Backoff vs. TnT: Evaluating Part of Speech Taggers
  • Abstract:

    Automated statistical part-of-speech (POS) tagging has been a very active research area for many years and is the foundation of natural language processing systems. This paper introduces and analyzes the performance of two part-of-speech taggers, namely the NLTK unigram backoff tagger and the TnT tagger, a trigram tagger. Experimental results show that the TnT tagger outperforms the NLTK unigram backoff tagger. However, the tagging accuracy of both taggers decreases almost by the same number when the taggers are tested on a corpus which differs from the one they were trained on.


RAIDframe & Distributed RAID

Seminal Paper - RAIDframe & Distributed RAID
  • Abstract:

    This paper is divided into two parts, both related to interesting developments in the area of computer storage. The first part deals with RAID and introduces the RAIDframe framework, a rapid tool for prototyping RAID architectures. In the second part, two distributed storage concepts, namely Storage Area Networks (SAN) and Networked Attached Storage (NAS), are introduced and compared to Direct Attached Storage.
    Over the past decade several trends in computer history have driven the design of storage subsystems towards increasing parallelism. An innovation that improves both dependability and performance of storage systems is disk arrays. Since disk arrays consist of many disk drives and, hence, many disk arms, rather than one large drive with one disk arm, potential throughput can be increased, thus improving performance.
    After having covered the basics, the need for a framework, namely RAIDframe, which enables rapid RAID prototyping will be elaborated. Many key ideas and concepts implemented in RAIDframe will be explained, in order to justify the power and flexibility of RAIDframe. Finally a section describing the most important steps which are necessary to extend the RAIDframe framework with a new architecture, shows the extensibility of the RAIDframe framework.
    While storage prices keep on dropping the cost of data remains invaluable. High speed computer networks make it possible to aggregate servers into central organization wide serverfarms. Keeping reliable storage on a per server basis can soon become a maintenance and cost problem. Distributed storage helps organizations to aggregate reliable storage into central managed places, thus reducing the amount of per server maintenance overhead. Currently there exist two approaches for doing distributed storage. Storage Area Networks and Network Attached Storage. This paper introduces the main concepts behind them, and compares distributed to local storage. Two concrete distributed storage protocols are presented to help the reader in getting a better overview.


PortBrowser: A user interface for the BSD ports system

PortBrowser: A user interface for the BSD ports system
  • Abstract:

    BSD operating systems offer a very powerful and flexible framework for installing third party packages, called the BSD ports system. This framework provides a complete environment coupled with a rich set of commands, which is able to build almost any kind of software from source code and package it. Thus any piece of software which has been incorporated into the BSD ports system, can easily be installed, deleted or upgraded. The PortBrowser, a graphical front end for the BSD ports system, is a light weight tool which offers a simple, portable, and secure environment for the most frequent tasks provided by the ports system. It is a very useful tool for novice as well as experienced users, since it allows for easy browsing through the BSD ports tree, and offers search facilities to a certain extent. This paper covers the basics and some details of various flavors of BSD ports tree implementations, followed by a description of the PortBrowser project. Implementation details as well as the GUI design of the PortBrowser are covered and described in an easy to understand manner.


Integration of Security Measures and Techniques in an Operating System (considering OpenBSD as an example)

Integration of Security Measures and Techniques in an Operating System
  • Abstract:

    This paper covers defensive security technologies and secure programming techniques employed in the OpenBSD operating system. Privilege separation, a secure design pattern which is implemented in many privileged processes, is discussed based on the network time protocol daemon OpenNTPD. Afterwards a closer look at address space and memory protection techniques like W^X and the Stack Smashing Protector is taken. Finally the process of gathering entropy in order to produce high quality random numbers is discussed based on examples found in the OpenBSD operating system.