Candidacy Exam Paper List

PhD Candidate: Marc Eaddy

Advisor: Al Aho

1. Program Transformation

1.1 General/Theory

A Survey of Rewriting Strategies in Program Transformation Systems, Eelco Visser, Institute of Information and Computing Sciences, Utrecht University, Technical Report UU-CS-2001-31 (2001)
Program transformation is used in a wide range of applications including compiler construction, optimization, program synthesis, refactoring, software renovation, and reverse engineering. In the realization of a program transformation system for a certain type of transformation, design choices must be made regarding the representation of programs and the paradigm for implementation of transformations. This paper gives a taxonomy of the application areas of program transformation, discusses considerations to be made in the implementation of program transformation systems, especially focussing on the specification of transformation strategies.
Complex program transformations are achieved through a number of consecutive modifications of a program. Basic modifications can be modeled by rewrite rules, i.e., rules that rewrite a program representation. A transformation strategy is an algorithm for choosing a path in the rewrite relation induced by a set of rules. The development of strategies in program transformation systems is illustrated by a discussion of the support for strategies in several typical systems.

Pointer Analysis for Source-to-Source Transformations, Marcio Buss, Stephen A. Edwards, Bin Yao, and Daniel Waddington, SCAM (2005)
We present a pointer analysis algorithm designed for source-to-source transformations. Existing techniques for pointer analysis apply a collection of inference rules to a dismantled intermediate form of the source program, making them difficult to apply to source-to-source tools that generally work on abstract syntax trees to preserve details of the source program.
Our pointer analysis algorithm operates directly on the abstract syntax tree of a C program and uses a form of standard dataflow analysis to compute the desired points-to information. We have implemented our algorithm in a sourceto- source translation framework and experimental results show that it is practical on real-world examples.

1.2 Aspect-Oriented Programming

An Overview of AspectJ, Gregor Kiczales, Erik Hilsdale, Jim Hugunin, Mik Kersten, Jeffrey Palm and William G. Griswold, ECOOP (2001)
AspectJ™ is a simple and practical aspect-oriented extension to Java™. With just a few new constructs, AspectJ provides support for modular implementation of a range of crosscutting concerns. In AspectJ’s dynamic join point model, join points are well-defined points in the execution of the program; pointcuts are collections of join points; advice are special method-like constructs that can be attached to pointcuts; and aspects are modular units of crosscutting implementation, comprising pointcuts, advice, and ordinary Java member declarations. AspectJ code is compiled into standard Java bytecode. Simple extensions to existing Java development environments make it possible to browse the crosscutting structure of aspects in the same kind of way as one browses the inheritance structure of classes. Several examples show that AspectJ is powerful, and that programs written using it are easy to understand.

Virtual Machine Support for Dynamic Join Points, Christoph Bockisch, Michael Haupt, Mira Mezini, and Klaus Ostermann, AOSD (2004)
A widespread implementation approach for the join point mechanism of aspect-oriented languages is to instrument areas in code that match the static part of pointcut designators, inserting dynamic checks for that part of matching that depends on run-time conditions, if needed. For performance reasons, such dynamic checks should be avoided whenever possible. One way to do so is to postpone weaving of advice calls until run-time, when conditions determining the emergence of join points hold. This calls for fluid code—code that adapts itself to the join point emergence at run-time, and suggests that AOP concepts should be integrated into the execution model underlying a VM. In this paper, we present first steps toward such an integration in Steamloom, an extension of IBM’s Jikes Research Virtual Machine. Steamloom is fairly restricted, but our initial experimental results indicate that aspect-aware VMs and fluid code are promising w.r.t performance. While the focus in this paper is on performance, there are other advantages of aspect-aware VMs to be investigated in the future.

1.3 Dynamic Software Updating

Binary Component Adaptation, Ralph Keller, Urs Hölzle, ECOOP (1998)
Binary component adaptation (BCA) allows components to be adapted and evolved in binary form and on-the-fly (during program loading). BCA rewrites component binaries before (or while) they are loaded, requires no source code access and guarantees release-to-release compatibility. That is, an adaptation is guaranteed to be compatible with a new binary release of the component as long as the new release itself is compatible with clients compiled using the earlier release. We describe our implementation of BCA for Java and demonstrate its usefulness by showing how it can solve a number of important integration and evolution problems. Even though our current implementation was designed for easy integration with Sun’s JDK 1.1 VM rather than for ultimate speed, the load-time overhead introduced by BCA is small, in the range of one or two seconds. With its flexibility, relatively simple implementation, and low overhead, binary component adaptation could significantly improve the reusability of Java components.

Dynamic Software Updating, Michael Hicks, Jonathan T. Moore, PLDI (2001)
Many important applications must run continuously and without interruption, yet must be changed to fix bugs or upgrade functionality. No prior general-purpose methodology for dynamic updating achieves a practical balance between exibility, robustness, low overhead, and ease of use.
We present a new approach for C-like languages that provides type-safe dynamic updating of native code in an extremely exible manner (code, data, and types may be updated, at programmer-determined times) and permits the use of automated tools to aid the programmer in the updating process. Our system is based on dynamic patches that both contain the updated code and the code needed to transition from the old version to the new. A novel aspect of our patches is that they consist of veri able native code (e.g. Proof-Carrying Code [17] or Typed Assembly Language [16]), which is native code accompanied by annotations that allow on-line veri cation of the code's safety. We discuss how patches are generated mostly automatically, how they are applied using dynamic-linking technology, and how code is compiled to make it updateable.
To concretely illustrate our system, we have implemented a dynamically-updateable web server, FlashEd. We discuss our experience building and maintaining FlashEd. Performance experiments show that for FlashEd, the overhead due to updating is typically less than 1%.

Mutatis Mutandis: Safe and Predicatable Dynamic Software Updating, Gareth Stoyle, Michael Hicks, Gavin Bierman, Peter Sewell, and Iulian Neamtiu, POPL (2005)
Dynamic software updates can be used to fix bugs or add features to a running program without downtime. Essential for some applications and convenient for others, low-level dynamic updating support has been used for many years. However, there is little high-level understanding that would support programmers in writing dynamic updates effectively.
In an effort to bridge this gap, we present a formal calculus called Proteus for modeling dynamic software updating in C-like languages that is flexible, safe, and predictable. Proteus supports dynamic updates to functions (even those that are active) and types, allowing on-line evolution to match source-code evolution as we have observed it in practice. All updates are provably type-safe and representation-consistent, meaning that only one version of a given type may exist in the program at any time, simplifying reasoning and avoiding unintuitive copy-based semantics. Finally, Proteus's novel and efficient static updateability analysis allows a programmer to automatically prove that an update is independent of the on-line program state, and thus predict it will not fail dynamically. Proteus admits a straightforward implementation, and we sketch how it could be extended to more advanced language features including threads.

Towards Flexible and Safe Technology for Runtime Evolution of Java Language Applications, Misha Dmitriev, Workshop on Engineering Complex Object-Oriented Systems for Evolution (in conjunction with OOPSLA) (2001)
There is a class of important computer applications that must run without interruption and yet must be changed from time to time to fix bugs or upgrade functionality. In this paper, we present the initial runtime evolution framework which we have developed for the HotSpot Java Virtual Machine, that allows us to change running applications on-the-fly, without interruption. We describe our staged implementation plan, where stages correspond to increasing levels of implementation complexity, yet on each stage a reasonably complete set of facilities is provided. The first stage—support for changes to method bodies only— has already been implemented and is to be included in the forthcoming Java 2 Platform release. We discuss multiple policies for dealingwith activemethods of running applications, present our thoughts on instance conversion implementation, and suggest that runtime evolution technology can be used for dynamic fine-grain profiling of applications.

OPUS: Online Patches and Updates for Security, Gautam Altekar, Ilya Bagrak, Paul Burstein, Andrew Schultz, USENIX Security (2005)
We present OPUS, a tool for dynamic software patching capable of applying fixes to a C program at runtime. OPUS’s primary goal is to enable application of security patches to interactive applications that are a frequent target of security exploits. By restricting the type of patches admitted by our system, we are able to significantly reduce any additional burden on the programmer beyond what would normally be required in developing and testing a conventional stop-and-restart patch. We hand-tested 26 real CERT vulnerabilities, of which 22 were dynamically patched with our current OPUS prototype, doing so with negligible runtime overhead and no prior knowledge of the tool’s existence on the patch programmer’s part.

1.4 Code Matching

BMAT -- A Binary Matching Tool for Stale Profile Propagation, Zhen Wang, Ken Pierce, Scott McFarling, Journal of Instruction-Level Parallism 2 (2000)
A major challenge of applying profile-based optimization on large real-world applications is how to capture adequate profile information. A large program, especially a GUI-based application, may be used in a large variety of ways by different users on different machines. Extensive collection of profile data is necessary to fully characterize this type of program behavior. Unfortunately, in a realistic software production environment, many developers and testers need fast access to the latest build, leaving little time for collecting profiles. To address this dilemma, we would like to re-use stale profile information from a prior program build. In this paper we present BMAT, a fast and effective tool that matches two versions of a binary program without knowledge of source code changes. BMAT enables the propagation of profile information from an older, extensively profiled build to a newer build, thus greatly reducing or even eliminating the need for re-profiling. We use two metrics to evaluate the quality of the results using propagated profile information: static branch prediction and the accuracy of code coverage. These metrics measure how well the matching algorithm works for the frequently executed core code and across the whole program, respectively. Experiments on a set of large DLLs from Microsoft Windows 2000 and Internet Explorer show that compared to freshly collected profiles, propagated information using BMAT is typically over 99% as effective in branch prediction and over 98% as accurate in code coverage information.

2. Programming Languages

2.1 General/Theory

Program Slicing, Mark Weiser, ICSE (1981)
Program slicing is a method used by experienced computer programmers for abstracting from programs. Starting from a subset of a program's behavior, slicing reduces that program to a minimal form which still produces that behavior. The reduced program, called a “slice”, is an independent program guaranteed to faithfully represent the original program within the domain of the specified subset of behavior. Finding a slice is in general unsolvable. A dataflow algorithm is presented for approximating slices when the behavior subset is specified as the values of a set of variables at a statement. Experimental evidence is presented that these slices are used by programmers during debugging. Experience with two automatic slicing tools is summarized. New measures of program complexity are suggested based on the organization of a program's slices.

A Survey of Program Slicing Techniques, Frank Tip, Journal of Programming Languages (1995)
A program slice consists of the parts of a program that (potentially) affect the values computed at some point of interest, referred to as a slicing criterion. The task of computing program slices is called program slicing. The original definition of a program slice was presented by Weiser in 1979. Since then, various slightly different notions of program slices have been proposed, as well as a number of methods to compute them. An important distinction is that between a static and a dynamic slice. The former notion is computed without making assumptions regarding a program's input, whereas the latter relies on some specific test case. Procedures, arbitrary control flow, composite datatypes and pointers, and interprocess communication each require a specific solution. We classify static and dynamic slicing methods for each of these features, and compare their accuracy and efficiency. Moreover, the possibilities for combining solutions for different features are investigated.

Programming as an Experience: The Inspiration for Self, Randall B. Smith and David Ungar, ECOOP (1995)
The Self system attempts to integrate intellectual and non-intellectual aspects of programming to create an overall experience. The language semantics, user interface, and implementation each help create this integrated experience. The language semantics embed the programmer in a uniform world of simple objects that can be modified without appealing to definitions of abstractions. In a similar way, the graphical interface puts the user into a uniform world of tangible objects that can be directly manipulated and changed without switching modes. The implementation strives to support the world-of-objects illusion by minimizing perceptible pauses and by providing true source-level semantics without sacrificing performance. As a side benefit, it encourages factoring. Although we see areas that fall short of the vision, on the whole, the language, interface, and implementation conspire so that the Self programmer lives and acts in a consistent and malleable world of objects.

2.2 Typing

Static Typing Where Possible, Dynamic Typing When Needed: The End of the Cold War Between Programming Languages, Erik Meijer and Peter Drayton, RDL (2004)
This paper argues that we should seek the golden middle way between dynamically and statically typed languages.

Semantic Type Qualifiers, Brian Chin, Shane Markstrum, Todd Millstein, PLDI (2005)
We present a new approach for supporting user-defined type refinements, which augment existing types to statically ensure additional invariants of interest to programmers. We provide an expressive language in which users define explicit type rules for new refinements. These rules are automatically incorporated by our framework’s extensible typechecker during static typechecking. Separately, our framework’s soundness checker automatically guarantees, once and for all, that a refinement’s type rules ensure the intended invariant, for all possible type-correct programs. We have formalized our approach and have instantiated it as a framework for adding new type qualifiers to C programs. We have used this framework to define and automatically prove sound a host of type qualifiers of different sorts, including nonnull, nonzero, untainted, and unique, and we have applied our qualifiers to ensure important invariants on open-source C programs.

3. Software Engineering

3.1 General/Theory - 1 paper

Foundations for the Study of Software Architecture, Dewayne Perry, Alexander L. Wolf, ACM SIGSOFT Software Engineering Notes (1992)
The purpose of this paper is to build the foundation for software architecture. We first develop an intuition for software architecture by appealing to several wellestablished architectural disciplines. On the basis of this intuition, we present a model of software architecture that consists of three components: elements, form, and rationale. Elements are either processing, data, or connecting elements. Form is defined in terms of the properties of, and the relationships among, the elements --- that is, the constraints on the elements. The rationale provides the underlying basis for the architecture in terms of the system constraints, which most often derive from the system requirements. We discuss the components of the model in the context of both architectures and architectural styles and present an extended example to illustrate some important architecture and style considerations. We conclude by presenting some of the benefits of our approach to software architecture, summarizing o...

The Evolving Philosophers Problem: Dynamic Change Management, Jeff Kramer and Jeff Magee, IEEE Transactions on Software Engineering (1990)
One of the major challenges in the provision of distributed systems is the accomodation of evolutionary change. This may involve modifications or extensions to the system which were not envisaged at design time. Furthermore, in many application domains there is a requirement that the system accomodate such change dynamically, without stopping or disturbing the operation of those parts of the system unaffected by the change. Since the description of software structure (components and interconnections) provides a clear means for both system comprehension and construction, it seems appropriate that changes should also be specified as structural change, in terms of component creation/deletion and connection/disconnection. These changes are then applied to the operational system itself to produce the modified system. This paper presents a model for dynamic change management which separates structural concerns from component application concerns. This separation of concerns permits the formulation of general structural rules for change at concerns permits the formulation of general structural rules for change at the configuration level without the need to consider application state, and the specification of application component actions without prior knowledge of the actual changes which may be introduced. In addition, the changes can be applied structural changes which may be introduced. In addition, the changes can be applied in such a way as to leave the modified system in a consistent state, and cause no disturbance to the unaffected part of the operational system. The model is applied to an example problem, "evolving philosophers". The principles described in this model have been implemented and tested in the Conic environment for distributed systems.

Software Architecture: a Roadmap, David Garlan, "The Future of Software Engineering", Anthony Finkelstein (Ed.), ACM Press (2000)
Over the past decade software architecture has received increasing attention as an important subfield of software engineering. During that time there has been considerable progress in developing the technological and methodological base for treating architectural design as an engineering discipline. However, much remains to be done to achieve that goal. Moreover, the changing face of technology raises a number of new challenges for software architecture. This paper examines some of the important trends of software architecture in research and practice, and speculates on the important emerging trends, challenges, and aspirations.

Toward a formal theory of extensible software, Shriram Krishnamurthi, Matthias Felleisen, FSE (1998)
As software projects continue to grow in scale and scope, it becomes important to reuse software. An important kind of reuse is extensibility, i.e., the extension of software without accessing existing code to edit or copy it. In this paper, we propose a rigorous, semantics-based definition of software extensibility. Then we illustrate the utility of our definitions by applying them to several programs. The examination shows how programming style affects extensibility and also drives the creation of a variant of an existing design pattern. We consider programs in both object-oriented and functional languages to prove the robustness of our definitions.

3.2 Software Process

A Spiral Model of Software Development and Enhancement, Barry W. Boehm, IEEE Computer (1988)
The spiral model presented in this article is one candidate for improving the software process model situation. The major distinguishing feature of the spiral model is that it creates a risk-driven approach to the software process rather than a primarily document-driven or code-driven process. It incorporates many of the strengths of other models and resolves many of their difficulties.

3.3 Metrics

An Empirical Validation of Software Cost Estimation Models, Chris F. Kemerer, Communications of the ACM (1987)
Practitioners have expressed concern over their inability to accurately estimate costs associated with software development. This concern has become even more pressing as costs associated with development continue to increase. As a result, considerable research attention is now directed at gaining a better understanding of the software-development process as well as constructing and evaluating software cost estimating tools. This paper evaluates four of the most popular algorithmic models used to estimate software costs [SLIM, COCOMO, Function Points, and ESTIMACS). Data on 15 large completed business dataprocessing projects were collected and used to test the accuracy of the models‘ ex post effort estimation. One important result was that Albrecht’s Function Points effort estimation model was validated by the independent data provided in this study. The models not developed in business data-processing environments showed significant need for calibration. As models of the software-development process, all of the models tested failed to sufficiently reflect the underlying factors affecting productivity. Further research will be required to develop understanding in this area.

Design and code inspections to reduce errors in program development, M. E. Fagan, IBM Systems Journal (1976)
Substantial net improvements in programming quality and productivity have been obtained through the use of formal inspections of design ana of code. Improvements are made possible by a systematic and efficient design and code verification process, with well-defined roles for inspection participants. The manner in which inspection data is categorized and made suitable for process analysis is an important factor in attaining the improvements. It is shown that by using inspection results, a mechanism for initial error reduction followed by ever-improving error rates can be achieved.

3.4 Tools

The Inscape Environment, Dewayne E. Perry, ICSE (1989)
The Inscape Environment is an integrated software development enviroment for building large software systems by large groups of developers. It provides tools that are knowledgeable about the process of system construction and evolution and that work in symbiosis with the system builders and evolvers. These tools are integrated around the constructive use of formal module interface specifications. We first discuss the problems that Inscape addresses, outline our research strategies and approaches to solving these problems, and summarize the contributions of the Inscape Environment. We then discuss the major aspects of the Inscape Environment: the specification language, system construction, system evolution, use and reuse, and validation. We illustrate these various components with examples and discussions.

3.5 Software reliability and testing

Countering Network Worms Through Automatic Patch Generation, Stelios Sidiroglou, Angelos D. Keromytis, IEEE Security & Privacy (2005)
We propose the first end-point architecture for automatically repairing software flaws such as buffer overflows that are exploited by zero-day worms. Our approach relies on source code transformations to quickly apply automatically created (and tested) patches to vulnerable segments of the targeted applications, exploiting the fact that a worm must reveal its infection vector to achieve further infection. Preliminary experimental results indicate a success rate of 82%, and a repair time of 3 seconds.

N-version Programming: a Fault Tolerance Approach to Reliability of Software Operation, Liming Chen and Algirdas Avizienis, FTSC (1978)
N-version programming is defined as the independent generation of N >= 2 functionally equivalent programs from the same initial specification. A methodology of N-version programming has been devised and three types of special mechanisms have been identified that are needed to coordinate the execution of an N-version software unit and to compare the correspondent results generated by each version. Two experiments have been conducted to test the feasibility of N-version programming. The results of the experiments are discussed. In addition, constraints are identified that must be met for effective application of N-version programming.

3.6 Programming in the Large

A model of large program development, L. Belady and M. Lehman, IBM Systems Journal (1976)
As a need for a discipline of software engineering has been recognized, the design, implementation, and maintenance of computer software has come into the forefront. The formulation of concepts of programming methodology, exemplified by Dijkstra's structured programming, strikes at the roots of the problem. The realization is that a program, much as a mathematical theorem, should and can be provable. Recognition that a program can be proved correct as it is developed and maintained, and before its results are used, may ultimately change the nature of the programming task and the face of the programming world. Clearly these developments are of fundamental importance. They appear to point to long-term solutions to problems that will be encountered in creating the great amount of program text that the world appears to require. But even though progress in mastering the science of program creation, maintenance, and expansion has also been made, there is still a long way to go.

Hints for computer system design, Butler Lampson, IEEE Software (1984)
Studying the design and implementation of a number of computer has led to some general hints for system design. They are described here and illustrated by many examples, ranging from hardware such as the Alto and the Dorado to application programs such as Bravo and Star.

4. Compilers

4.1 General/Theory

Machine-Independent Optimizations, Compilers: Principles, Techniques, and Tools (2nd Ed), Alfred Aho, Monica Lam, Ravi Sethi, and Jeffrey Ullman (Not Yet Published)
The book provides a thorough introduction to compiler design and covers topics such as context-free grammars, fine state machines, and syntax-directed translation.

4.2 Optimizing compilers - 1 paper

Optimizing AspectJ, Pavel Avgustinov, Aske Simon Christensen, Laurie Hendren, Sascha Kuzins, Jennifer Lhoták, Ondrej Lhoták, Oege de Moor, Damien Sereni, Ganesh Sittampalam, Julian Tibble, PLDI (2004)
AspectJ, an aspect-oriented extension of Java, is becoming increasingly popular. However, not much work has been directed at optimising compilers for AspectJ. Optimising AOP languages provides many new and interesting challenges for compiler writers, and this paper identifies and addresses three such challenges.

Design, Implementation and Evaluation of Adaptive Recompilation with On-Stack Replacement, Stephen J. Fink, Feng Qian, CGO (2003)
Modern virtual machines often maintain multiple compiled versions of a method. An on-stack replacement (OSR) mechanism enables a virtual machine to transfer execution between compiled versions, even while a method runs. Relying on this mechanism, the system can exploit powerful techniques to reduce compile time and code space, dynamically de-optimize code, and invalidate speculative optimizations.

This paper presents a new, simple, mostly compilerindependent mechanism to transfer execution into compiled code. Additionally, we present enhancements to an analytic model for recompilation to exploit OSR for more aggressive optimization. We have implemented these techniques in Jikes RVM and present a comprehensive evaluation, including a study of fully automatic, online, profile-driven deferred compilation.

An Infrastructure for Adaptive Dynamic Optimization, Derek Bruening, Timothy Garnett, and Saman Amarasinghe, CGO (2003)
Dynamic optimization is emerging as a promising approach to overcome many of the obstacles of traditional static compilation. But while there are a number of compiler infrastructures for developing static optimizations, there are very few for developing dynamic optimizations. We present a framework for implementing dynamic analyses and optimizations. We provide an interface for building external modules, or clients, for the DynamoRIO dynamic code modification system. This interface abstracts away many low-level details of the DynamoRIO runtime system while exposing a simple and powerful, yet efficient and lightweight, API. This is achieved by restricting optimization units to linear streams of code and using adaptive levels of detail for representing instructions. The interface is not restricted to optimization and can be used for instrumentation, profiling, dynamic translation, etc.

To demonstrate the usefulness and effectiveness of our framework, we implemented several optimizations. These improve the performance of some applications by as much as 40% relative to native execution. The average speedup relative to base DynamoRIO performance is 12%.

4.3 Debugging Optimized Code

Debugging Optimized Code with Dynamic Deoptimization, Urs Hölzle, Craig Chambers, David Ungar, PLDI (1992)
SELF’s debugging system provides complete source-level debugging (expected behavior) with globally optimized code. It shields the debugger from optimizations performed by the compiler by dynamically deoptimizing code on demand. Deoptimization only affects the procedure activations that are actively being debugged; all other code runs at full speed. Deoptimization requires the compiler to supply debugging information at discrete interrupt points; the compiler can still perform extensive optimizations between interrupt points without affecting debuggability. At the same time, the inability to interrupt between interrupt points is invisible to the user. Our debugging system also handles programming changes during debugging. Again, the system provides expected behavior: it is possible to change a running program and immediately observe the effects of the change. Dynamic deoptimization transforms old compiled code (which may contain inlined copies of the old version of the changed procedure) into new versions reflecting the current source-level state. To the best of our knowledge, SELF is the first practical system providing full expected behavior with globally optimized code.

Generic Techniques for Source-Level Debugging and Dynamic Program Slicing, Frank Tip, TAPSOFT (1995)
Algebraic specifications have been used successfully for the generation of a variety of software development tools, such as typecheckers, interpreters, and program analysis tools. The present paper discusses how two previously developed, language-independent techniques, origin tracking and dynamic dependence tracking can be used to derive powerful language-specific debugging tools from algebraic specifications of interpreters. In particular, we show that—in addition to “standard” debugger features such as single-stepping, state inspection, and breakpoints—a variation of dynamic program slicing can be defined with surprisingly little effort. The main contribution of this paper is to showthat the information required to construct such debugging tools is to a very large extent language-independent and implicitly present in a language’s specification. We assume that specifications are executed by conditional term rewriting. Specifically, an algebraic specification of an interpreter expresses the execution of a program as the rewriting of a term consisting of a function execute applied to the abstract syntax tree (AST) of that program. Rewriting this term produces a sequence of terms that effectively represent the consecutive internal states of the interpreter. Origin tracking is a method for tracing occurrences of the same subterm in a sequence of terms, and is used for the definition of single-stepping and breakpoints. Dependence tracking establishes certain minimal dependence

Tdb: A Source-level Debugger for Dynamically Translated Programs, Naveen Kumar, Bruce R. Childers, and Mary Lou Soffa, AADEBUG (2005)
Debugging techniques have evolved over the years in response to changes in programming languages, implementation techniques, and user needs. A new type of implementation vehicle for software has emerged that, once again, requires new debugging techniques. Software dynamic translation (SDT) has received much attention due to compelling applications of the technology, including software security checking, binary translation, and dynamic optimization. Using SDT, program code changes dynamically, and thus, debugging techniques developed for statically generated code cannot be used to debug these applications. In this paper, we describe a new debug architecture for applications executing with SDT systems. The architecture provides features that create the illusion that the source program is being debugged, while allowing the SDT system to modify the executing code. We incorporated this architecture in a new tool, called tdb, that integrates a SDT system, Strata, with a widely used debugger, gdb. We evaluated tdb in the context of a code security checker. The results show that a dynamically translated program can be debugged at the source level and that the approach does not overly increase

OPTVIEW: A New Approach for Examining Optimized Code, Caroline Tice, Susan L. Graham, PASTE (1998)
The task of mapping between source programs and machine code, once the code has been optimized and transformed by a compiler is often difficult. Yet there are many instances, such as debugging optimized code or attributing performance analysis data to source lines, when it is useful or necessary to understand at the source level what is occurring in the binary. The standard approach has been for tools to attempt to map directly from the optimized binary to the original source. Such mappings are often fragile, and sometimes inaccurate or misleading. We suggest an alternative approach. Rather than mapping directly between the original source and the binary, we create a modified version of the source program, still recognizable, but updated to reflect some of the effects of optimizations, thus facilitating mapping from the binary. We have implemented a tool, Optview, to demonstrate and test these ideas.