Using the ASM framework to implement common Java
bytecode transformation patterns
Eugene Kuleshov, eu@javatx.org
ABSTRACT
Most AOP frameworks targeting the Java platform use a bytecode
weaving approach as it is currently considered the most practical
solution. It allows cross-cutting concerns to be applied to Java
applications when source code is not available and, unlike
VM-level AOP implementations, it is portable and works on
existing JVMs.
Load-time bytecode weaving (LTW), which happens right before
the application code is loaded into the Java VM, significantly
simplifies the development environment, but also raises the bar for
the performance and memory footprint of these
transformations. These requirements apply directly to the toolkit
that is used to perform the transformations. In this paper,
we examine how Java bytecode transformations typical for AOP
implementations can be done efficiently using the ASM¹
bytecode manipulation framework [1].
Transformations used by general-use AOP frameworks and
similar applications can be categorized, and the common patterns
can be reused to implement specific transformations. These
patterns can also be combined to implement more complex
transformations.
General Terms
Languages, Design, Performance, Experimentation,
Standardization
Keywords
Aspect-Oriented Programming, Java, bytecode, weaving, ASM
1. INTRODUCTION
The ASM bytecode framework was designed at France Telecom
R&D by Eric Bruneton, Romain Lenglet and Thierry Coupaye
[2]. After evaluating several existing frameworks, including
BCEL [3], Serp [4] and JOIE [5], they designed a more efficient
approach, providing better performance and a smaller memory footprint.
Today ASM is used in many applications and has become the de facto
standard for bytecode processing.
¹ The ASM name does not mean anything: it is just a reference to
the keyword in C which allows some functions to be
implemented in assembly language.
The main idea of the ASM API [6] is not to use an object
representation of the bytecode. This makes it possible to express
the same transformations using only a few classes, compared to
approximately 80 classes in the Serp API and 270 in the BCEL API.
Those frameworks create many objects during class deserialization,
which costs a lot of time and memory. ASM avoids this overhead
to keep transformations fast and memory usage low. This is
done by using the Visitor design pattern [7], without representing
the visited tree with objects. Visitors can change call chains and
therefore transform the visited code. Using the Adapter design
pattern [7], visitors can be chained in order to implement complex
transformations from smaller building blocks. A similar approach
is also used in the SAX API for XML processing [8].
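The chaining idea can be illustrated without ASM itself. The following is a minimal sketch, assuming a toy instruction stream in which each instruction is a plain string. All types in it (InsnVisitor, RecordingWriter, ForwardingAdapter, DropNops, TraceBeforeReturn) and the Tracer.exit call are hypothetical names invented for this illustration and are not part of the ASM API. It shows two small adapters being composed into one transformation, with no object tree built along the way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a visitor chain. None of these types are part of ASM.
class VisitorChainSketch {

    // Hypothetical method-level visitor: one callback per instruction.
    interface InsnVisitor {
        void visitInsn(String insn);
        void visitEnd();
    }

    // Final consumer in the chain: records the events it receives (the "writer" role).
    static class RecordingWriter implements InsnVisitor {
        final List<String> out = new ArrayList<>();
        @Override public void visitInsn(String insn) { out.add(insn); }
        @Override public void visitEnd() { }
    }

    // Adapter base class: forwards every event unchanged to the next visitor.
    static class ForwardingAdapter implements InsnVisitor {
        protected final InsnVisitor next;
        ForwardingAdapter(InsnVisitor next) { this.next = next; }
        @Override public void visitInsn(String insn) { next.visitInsn(insn); }
        @Override public void visitEnd() { next.visitEnd(); }
    }

    // Building block 1: removes NOP instructions by not forwarding them.
    static class DropNops extends ForwardingAdapter {
        DropNops(InsnVisitor next) { super(next); }
        @Override public void visitInsn(String insn) {
            if (!insn.equals("NOP")) next.visitInsn(insn);
        }
    }

    // Building block 2: inserts an extra call event before every RETURN.
    static class TraceBeforeReturn extends ForwardingAdapter {
        TraceBeforeReturn(InsnVisitor next) { super(next); }
        @Override public void visitInsn(String insn) {
            if (insn.equals("RETURN")) next.visitInsn("INVOKESTATIC Tracer.exit");
            next.visitInsn(insn);
        }
    }

    // Producer (the "reader" role): replays the instructions as events.
    static void accept(List<String> code, InsnVisitor visitor) {
        for (String insn : code) visitor.visitInsn(insn);
        visitor.visitEnd();
    }

    // Chains both adapters in front of the writer and runs the producer.
    static List<String> transform(List<String> code) {
        RecordingWriter writer = new RecordingWriter();
        accept(code, new DropNops(new TraceBeforeReturn(writer)));
        return writer.out;
    }

    public static void main(String[] args) {
        // prints [ICONST_0, POP, INVOKESTATIC Tracer.exit, RETURN]
        System.out.println(transform(Arrays.asList("NOP", "ICONST_0", "POP", "RETURN")));
    }
}
```

Each adapter only knows about the next visitor in the chain, so the two building blocks can be reordered or reused independently.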
ASM hides all the complexity of the serialization and
deserialization of the class bytecode, using the following
techniques:
• Automatic management of the class constant pool, so
the user does not have to manipulate the indexes
of these constants.
• Automatic management of the class structure, including
annotations, fields, methods, method code and other
standard bytecode attributes.
• Use of labels to manage instruction addresses, so it is
easy to insert new code between existing instructions.
• Computation of the maximum stack size and number of
local variables, as well as of StackMapFrames.
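To see why labels make code insertion easy, consider the following toy assembler. It is a sketch under simplifying assumptions and not a description of how ASM implements labels: positions are instruction indexes rather than byte offsets, and the LabelSketch class, its methods and the Tracer.enter call are hypothetical names used only here. Jump targets are recorded symbolically and resolved only when the code is assembled, so inserting an instruction never requires patching a target by hand. In ASM the corresponding MethodVisitor calls are visitLabel and visitJumpInsn.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy assembler with symbolic labels. Not the ASM implementation.
class LabelSketch {

    // A label has no position of its own until it is marked in the code.
    static final class Label { }

    // A jump remembers its target label instead of a numeric position.
    private static final class Jump {
        final String opcode;
        final Label target;
        Jump(String opcode, Label target) { this.opcode = opcode; this.target = target; }
    }

    // Holds String instructions, Jump instructions and Label marks, in order.
    private final List<Object> items = new ArrayList<>();

    void insn(String text) { items.add(text); }
    void jump(String opcode, Label target) { items.add(new Jump(opcode, target)); }
    void mark(Label label) { items.add(label); }

    // Pass 1 assigns an instruction index to every marked label;
    // pass 2 emits the instructions with resolved jump targets.
    List<String> assemble() {
        Map<Label, Integer> positions = new HashMap<>();
        int index = 0;
        for (Object item : items) {
            if (item instanceof Label) positions.put((Label) item, index);
            else index++;
        }
        List<String> out = new ArrayList<>();
        for (Object item : items) {
            if (item instanceof Jump) {
                Jump jump = (Jump) item;
                Integer target = positions.get(jump.target);
                if (target == null) throw new IllegalStateException("label was never marked");
                out.add(jump.opcode + " " + target);
            } else if (item instanceof String) {
                out.add((String) item);
            }
        }
        return out;
    }

    // Builds the same code with or without one inserted instruction.
    static List<String> build(boolean insertTrace) {
        LabelSketch code = new LabelSketch();
        Label end = new Label();
        code.insn("ILOAD 0");
        code.jump("IFEQ", end);
        if (insertTrace) code.insn("INVOKESTATIC Tracer.enter");
        code.insn("ICONST_1");
        code.insn("POP");
        code.mark(end);
        code.insn("RETURN");
        return code.assemble();
    }

    public static void main(String[] args) {
        System.out.println(build(false)); // [ILOAD 0, IFEQ 4, ICONST_1, POP, RETURN]
        System.out.println(build(true));  // [ILOAD 0, IFEQ 5, INVOKESTATIC Tracer.enter, ICONST_1, POP, RETURN]
    }
}
```

The jump target moves from 4 to 5 when the extra instruction is inserted, although the code that emits the jump is unchanged.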
The event-based interaction between event producers and event
consumers is defined by several interfaces: ClassVisitor,
FieldVisitor, MethodVisitor, and AnnotationVisitor. Event
producers, such as ClassReader, fire visit*() calls to those interfaces.
Event consumers, such as writers (ClassWriter,
FieldWriter, MethodWriter, and AnnotationWriter), adapters
(ClassAdapter and MethodAdapter) and classes from the tree
package (ClassNode, MethodNode, etc.), implement those
interfaces.
The following code demonstrates how this looks from the
developer’s point of view:
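// Parse the original class and prepare a writer for the result.
ClassReader cr = new ClassReader(bytecode);
ClassWriter cw = new ClassWriter(cr,
    ClassWriter.COMPUTE_MAXS |
    ClassWriter.COMPUTE_FRAMES);
// FooClassAdapter stands for any user-defined adapter.
FooClassAdapter cv = new FooClassAdapter(cw);
cr.accept(cv, 0);
// The writer now holds the transformed class.
byte[] transformed = cw.toByteArray();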
Here ClassReader reads the bytecode. When the accept() method is
called, ClassReader fires all visiting events corresponding to the
bytecode structure. FooClassAdapter receives those events
and can change the event flow before passing them on to the
ClassWriter. Once ClassWriter has received all the events, it holds
the transformed bytecode. Note that the ClassReader
instance is passed to the ClassWriter; this enables performance
optimizations based on the assumption that the transformations
mostly add new code.
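The event flow does not have to end in a writer: a consumer may simply analyze the events it receives. The sketch below illustrates this with a toy producer and a consumer that counts instructions per method. It is a simplified stand-in and not the ASM API: ToyReader, ClassEvents, MethodEvents and InsnCounter are hypothetical names, and instructions are plain strings. It mirrors one structural aspect of ASM, namely that the class-level visitMethod callback returns the method-level visitor that will receive that method's events, and that returning null skips the method's code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy producer and analysis-only consumer. None of these types are part of ASM.
class ProducerConsumerSketch {

    // Hypothetical method-level events.
    interface MethodEvents {
        void visitInsn(String insn);
        void visitEnd();
    }

    // Hypothetical class-level events; visitMethod may return null to skip a method.
    interface ClassEvents {
        void visit(String className);
        MethodEvents visitMethod(String methodName);
        void visitEnd();
    }

    // Producer: walks a fixed "class" and fires events in structural order.
    static final class ToyReader {
        private final String className;
        private final Map<String, List<String>> methods;

        ToyReader(String className, Map<String, List<String>> methods) {
            this.className = className;
            this.methods = methods;
        }

        void accept(ClassEvents cv) {
            cv.visit(className);
            for (Map.Entry<String, List<String>> method : methods.entrySet()) {
                MethodEvents mv = cv.visitMethod(method.getKey());
                if (mv == null) continue; // consumer is not interested in this method
                for (String insn : method.getValue()) mv.visitInsn(insn);
                mv.visitEnd();
            }
            cv.visitEnd();
        }
    }

    // Consumer: counts instructions per method and ignores constructors.
    static final class InsnCounter implements ClassEvents {
        final Map<String, Integer> counts = new LinkedHashMap<>();

        @Override public void visit(String className) { }

        @Override public MethodEvents visitMethod(final String methodName) {
            if (methodName.equals("<init>")) return null;
            counts.put(methodName, 0);
            return new MethodEvents() {
                @Override public void visitInsn(String insn) {
                    counts.merge(methodName, 1, Integer::sum);
                }
                @Override public void visitEnd() { }
            };
        }

        @Override public void visitEnd() { }
    }

    // Runs the producer over a two-method toy class and returns the counts.
    static Map<String, Integer> count() {
        Map<String, List<String>> methods = new LinkedHashMap<>();
        methods.put("<init>", Arrays.asList("ALOAD 0", "INVOKESPECIAL java/lang/Object.<init>", "RETURN"));
        methods.put("run", Arrays.asList("ICONST_0", "POP", "RETURN"));
        InsnCounter counter = new InsnCounter();
        new ToyReader("Foo", methods).accept(counter);
        return counter.counts;
    }

    public static void main(String[] args) {
        System.out.println(count()); // prints {run=3}
    }
}
```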
The following sections show a number of practical examples
that should help you better understand the ASM
framework.