Fast response time. Although having a bounded
response time is a must, the response time has to be
fast to be usable. A bounded DSA algorithm which is
10 times slower than a conventional one, is not practical.
Memory requests have to be always satisfied.
In non-real time systems applications can receive a null
pointer or are just killed by the OS when the system
runs out of memory. It is obvious that is not possible to
always grant all the memory requested. But the DSA
algorithm has to minimise the chances of exhausting
the memory pool by minimising the fragmentation and
wasted memory.
Although there is large range of real-time systems
with different memory constraints and hardware sup-
port, the study presented in this work in progress is fo-
cussed on embedded systems where memory is a scarce
resource and there is no MMU support.
3 DSA in real-time systems
This section analyses some of the most relevant DSA
available implementations used in real-time and non-
real-time environments. A comparison with other im-
plementations (like QNX, Lynxs, or VxWorks) were not
possible because that code is not public or only released
under non-disclosure agreements.
3.1 Doug Lea implementation
Doug Lea’s allocator [4] is one of the most used allo-
cators, both in standard applications and in real-time
systems (eCos, glibc, etc.). M.S. Johnstone and P.R.
Wilson [2] concluded that it is an excellent allocator in
relation to fragmentation; and widely considered to be
among the best uniprocessor allocators [1].
Doug Lea’s allocator is an implementation of several
strategies combined to provide good performance in s ev-
eral parameters: speed, space-conserving, portability,
block locality and tunable. It uses different strategies
depending on the size of the requested blocks. Also, it
is important to note that free blocks are immediately
coalesced.
This algorithm has some limitations that make it not
suitable in real-time systems. It focusses optimisations
on most used block sizes (from 16 to 512 bytes) using
a fine grain (8 bytes) segregated fit structure; but for
larger blocks, it implements a simple free list with best-
fist search, which can degenerate in lengthy searches
along a simple linked list. One important area where
this allocator do not perform well is with network or disk
drivers that work with medium size blocks. Although it
do not provides the required performance for real-time
applications, this implementation has been widely used
and studied, and it is considered as one of the best al-
locators.
3.2 BGET
This a portable and customisable implementation of the
first-fit and best-fit strategies, the strategy used is se-
lected via conditional compilation. It was written in
1972 and used in a wider range of application from small
embedded systems to large mainframes. When bget is
initialised, it is possible to register several user call back
functions that the allocator can call to compact, grow
and shrink the managed memory pool.
As exp ec ted, when BGET is compiled to use first-fit
strategy it provides a good mean execution time un-
der real workloads. But the worst case execution time
depends on the size of the memory pool, which is not
acceptable for real-time systems.
3.3 RTEMS
RTEMS is a real-time executive which provides a high
performance environment for embedded applications.
RTEMS is a full featured but small and compact RTOS.
The DSA policy used in RTEMS is the first-fit, im-
plemented using a single linked list. It never c oalesc e
blocks, that is, free blocks are just inserter into the free
list. It uses the system facility sbrk() to enlarge the
memory poll.
3.4 Half-Fit
Proposed by T. Ogasawara [5] this algorithm is the only
DSA designed –known by the authors– to fulfil the spe-
cific requirements of real-time systems. It is based on
segregated lists, where the lists are power of 2. It also
suggests the use of advanced bit instructions (“find first
bit set” FFS() that are available on most modern proces-
sors and libraries) to find in constant time the best free
queue to allocate the requested block. The asymptotic
complexity of the malloc() function is O(1).
Deallocated blocks are immediately coalesced, there-
fore in the worst case, the free() function will merge
tree block (current block with physically previous and
next blocks). Although the asymptotic complexity of
the free() function is constant, the free function per-
form more operations and the time host is higher than
the malloc function.
4 TLSF strategy
As described before, there is not a single DSA algorithm
suitable for all application types. As already stated,
real-time applications are quiet different from conven-
tional applications. This sections analyses how real-
time applications requirements determined the design
of TLSF to fullfill most of their requirements.
Real-time applications requirements:
• Bounded response time.
• Fast response time.
• Bounded fragmentation.
• Low fragmentation.
2