Blueprint for an Embedded Systems Programming Language
Paul Soulier (psoulier@hawaii.edu) and Depeng Li (depengli@hawaii.edu)
Dept. of Information and Computer Sciences
University of Hawaii, Manoa
Honolulu, Hawaii 96822
Abstract

Embedded systems have become ubiquitous and are found in numerous application domains such as sensor networks, medical devices, and smart appliances. Software flaws in such systems can range from minor nuisances to critical security failures and malfunctions. Additionally, the computational power found in these devices has seen tremendous growth and will likely continue to advance. With increasingly powerful hardware, the ability to express complex ideas and concepts in code becomes more important. Given the importance of developing safe and secure software for these applications, it is interesting to observe that the vast majority of software for these devices is written in the C programming language, an inherently unsafe language compared to other modern languages. This paper examines the characteristics and requirements that uniquely differentiate embedded systems from other application domains. The result is a blueprint for a modern, high-level programming language specifically designed for embedded systems.

1. Introduction

Embedded systems exist in a multitude of applications, and advances in hardware technology will continue to make them capable of greater degrees of sophistication and intelligent functionality. While often unnoticed and unseen, these systems are responsible for properly controlling medical devices, automobile braking systems, industrial control systems, and numerous other cyber-physical systems that interact with the world in profound ways. The growing fields of sensor networks and the Internet of Things, combined with ubiquitous Internet connectivity, will further expand the use of embedded systems.

Given the significant role embedded systems have, the implications of faulty software are clearly evident. However, software logic errors are not the only manner in which a system can malfunction. The Stuxnet worm [14] is an example of a failure caused by a malicious software attack that ultimately caused an industrial control system to destroy itself. As Internet connectivity becomes increasingly common in embedded systems, they too will be susceptible to software-based security exploits.

Many areas of software development have benefitted from the improvements made to programming languages. Modern languages are more capable of detecting errors at compile time through their type systems, and many of the low-level and error-prone aspects of programming have been abstracted away. This enables the development of more reliable and complex applications. Embedded systems are an exception to this. The vast majority of embedded systems are still developed using the decades-old C programming language, an inherently unsafe language with only the basic features of the imperative paradigm.

Despite its flaws, C has stubbornly remained the de facto standard for embedded system development. The reasons for this are difficult to identify (although a massive existing code base and the lack of a compelling replacement could be factors). Various language-based approaches have been created to address many of the shortcomings of C, all with varying degrees of success as they apply to low-level programming. These solutions tend to focus only on a subset of the issues involved with low-level software development while not considering other critical aspects. What is missing is a cohesive and practical language that effectively incorporates all of these methods and techniques in a manner consistent with the needs of embedded systems.

On language design, Landin [9] remarks in the paper
“The Next 700 Programming Languages” that

    “...we must systematize their design so that a new language is a point chosen from a well-mapped space, rather than a laboriously devised construction.”

To design a compelling language to replace C, we must first identify what the language needs to look like. With this in mind, the contribution of this paper is a description of the features and constructs necessary in a language designed to implement secure and reliable embedded systems software.

This paper is structured as follows. Section 2 describes the characteristics that differentiate embedded systems from other programming disciplines. Section 3 presents the blueprint of a programming language designed to produce secure and reliable embedded software. Section 4 highlights related research that has attempted to address the shortcomings of programming languages for embedded systems, and finally, section 5 concludes.

2. Characteristics of Embedded Systems

Embedded systems have a number of characteristics that differentiate them from other application domains. These particularities make most programming languages ill-suited for this type of software development. High-level languages generally attempt to provide helpful abstractions for tasks that are error-prone or that can be automated by the compiler. While these abstractions can improve development efficiency and reduce errors, they have the unfortunate side effect of obscuring the low-level details that embedded systems must deal with. The need to interact directly with hardware, specify the organization of data within a structure, operate with limited resources, and meet performance requirements all bring a unique set of challenges to developing this type of software. This section examines the aspects of embedded systems that necessitate a domain-specific language.

2.1. Safety and Reliability

As mentioned in the introduction, embedded systems are often found in devices that can have a major impact on the physical world. It is frequently required that these systems operate without error and with no down-time. Furthermore, if a problem is found and a software fix is identified, upgrades can be difficult and sometimes impossible. Consider faulty software in an automobile component: the implications are massive, with potentially thousands of vehicles being recalled. With these strict requirements, the need to successfully develop and release error-free software is very important.

2.2. Data Layout and Representation

In most applications, developers are really just concerned with what data needs to be stored in a structure; where the data goes and how much room it takes is generally unimportant. Embedded systems, on the other hand, care a lot about where data is located in a structure and how much space it consumes.

The ability to specify the size and location of data is necessary when defining language-based structures that must match hardware structures or standardized protocols. Data layout is also an important tool for tuning. Organizing a structure to improve memory locality based on knowledge of the CPU cache architecture or runtime access patterns can have a significant performance impact. The ability to represent and manipulate data in arbitrary ways is a fundamental aspect of writing embedded systems code.

2.3. Hardware Interaction

Embedded systems interact directly with hardware through memory-mapped IO or low-level CPU instructions. In the case of memory-mapped IO, hardware registers appear as normal memory, but may behave in ways that are not entirely consistent with regular memory. Consider, for example, a hardware device that accepts a 32-bit value from the host through a 16-bit memory-mapped register. Listing 1 shows pseudocode that could be used to send this value to the device. The host writes the most significant 16 bits to the register, followed by the least significant 16 bits.

    u16 *reg = device->input_reg;
    *reg = val >> 16;
    *reg = (val & 0xffff);

Listing 1. “Interfacing with Hardware”

The compiler, unaware that the memory location is different from others, may eliminate the first assignment upon seeing that the same memory address is immediately overwritten by another value. Idiosyncrasies such as this are common in embedded systems, where unique hardware properties do not always match the abstract machine of the programming language.
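In C, the usual remedy for the eliminated-store hazard in memory-mapped IO is the volatile qualifier, which tells the compiler that every access to the object is an observable side effect. The following is a minimal sketch and is not taken from the paper: the struct device type, its input_reg field, and the send_u32 function are hypothetical names chosen to mirror Listing 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device layout: a single 16-bit memory-mapped input register.
 * The volatile qualifier marks every access as an observable side effect. */
struct device {
    volatile uint16_t input_reg;
};

/* Send a 32-bit value as two 16-bit writes, most significant half first.
 * Because input_reg is volatile, the compiler must emit both stores in
 * program order and may not drop the first one as a "dead" store. */
void send_u32(struct device *dev, uint32_t val)
{
    assert(dev != NULL);
    dev->input_reg = (uint16_t)(val >> 16);     /* high half */
    dev->input_reg = (uint16_t)(val & 0xffffu); /* low half  */
}
```

On a host machine, an ordinary struct device variable can stand in for the hardware register when exercising this function. Note that volatile constrains only the compiler; on some platforms, memory barriers or uncached mappings may also be needed for the stores to reach the device in order, which is outside the scope of this sketch.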
2.4. Transparent Expression

Transparency is a trait of language expression that is particularly important to embedded systems. One of the strengths of the C programming language is that it is easy for the programmer to conceptualize how source code will translate into machine instructions and data structures. This ability becomes very important when attempting to fit code or data into resource-limited hardware, gain additional performance, or interface with hardware.

2.5. Constrained Environment

Embedded systems are almost always constrained in some fashion. The most common limitations encountered are computational power (memory and processor), time, and energy. Advances in hardware technology have come a long way in easing some of these constraints, but they are still a concern for many systems.

Time is an interesting constraint when considered in the context of cyber-physical systems. An occasional half-second delay in a desktop application probably wouldn’t be noticed. An equivalent delay in a real-time system such as an electronic braking system or avionics fly-by-wire system could have serious consequences. Many embedded systems have real-time deadlines that must always be met.

Energy is another constraint that can have a significant impact on embedded systems. Sensor networks and other devices that rely on battery power have a finite lifetime before they stop working. Power consumption must be managed to maximize operational time. Highly energy-efficient devices also tend to be very limited in memory and processing power.

Constraints are driven by a number of factors. Some, such as time and physical dimensions, are governed by the laws of physics. Other limitations are driven by business factors that require the use of less powerful hardware to save on manufacturing costs. Regardless of why limitations are present, they introduce unique challenges to embedded system development.

2.6. Performance

Some devices perform computationally intensive operations or transmit data at high speeds. In such cases, performance is a critical design goal where the difference of a few percentage points can determine the success or failure of a product. Consequently, the ability to save a few bytes in a data structure or eliminate a few microseconds from a section of code can have a profound impact on a program’s ability to achieve its goals. These small performance gains often come from specific knowledge of a system and the programmer’s ability to generate appropriate code rather than from a compiler’s optimizer. Performance can be a major influence in the overall design of an embedded system.

3. Language Blueprint

Expressiveness can be described as the property of a language that allows a programmer to effectively translate concepts and ideas into code. The more expressive a language is, the easier it is for a programmer to realize a solution to a problem. This trait is domain specific; what constitutes an expressive language in one domain does not make it expressive in another. For example, assembly language is very expressive compared to JavaScript for executing a specific CPU instruction. Conversely, JavaScript is far more capable of describing a web application than is assembly.

This section describes the features and constructs that make a language expressive when considering the characteristics of embedded systems. These features collectively form a blueprint for a language ideally suited for embedded systems.

3.1. Paradigm

The functional language paradigm has many useful properties, particularly referential transparency (code has no side effects and there is no global state that can change). This property, among others, has many compelling benefits. However, the functional paradigm is somewhat at odds with embedded systems. Embedded systems are stateful by nature. They interact with hardware components that are themselves state machines. The advantages of the functional paradigm are unarguably valuable. However, applying it to systems that are defined by state would likely be ineffective. Conventional wisdom would suggest the imperative programming paradigm, which is based on state change, as a natural choice for embedded system programs.

Object-oriented programming, while not strictly necessary, can be useful for embedded systems. Object-orientation has proven to be a valuable method of reasoning about complex systems. Additionally, OO techniques can be effective at eliminating some of the unsafe idioms used in C. For example, C programmers sometimes use typeless “void” pointers, type casts, and unions to achieve unsafe versions of polymorphism.
Inheritance and sub-typing provided by OOP are a type-safe alternative.

The object-oriented features supported by the language must include the basics of the paradigm: data encapsulation/abstraction, dynamic binding of function calls, and inheritance/derivation. Each of these features is relatively transparent to the programmer in terms of overhead and the underlying code that is generated. Dynamic binding can impact runtime performance and memory overhead, but these concerns are generally insignificant. Achieving equivalent functionality using standard imperative techniques will generally incur the same costs. The concern for embedded systems is that the constructs generated by the compiler are hidden from the developer and reduce transparency.

Multiple inheritance can pose significant challenges and is worth mentioning. It is a frequently debated feature, with the primary point of conflict being its utility compared to its drawbacks. When examining the implementation of multiple inheritance in C++ [17], [16], the effects on performance and memory overhead are not trivial. Multiple inheritance affects the organization of code and data structures, which can have a negative impact on both performance and data layout. Due to this, multiple inheritance is not a good choice for embedded systems. There are alternatives (such as “interfaces”) that achieve functionally similar results, but without the same overhead.

3.2. Syntax and Semantics

A somewhat interesting trend of languages designed to replace C is the goal of retaining the same syntax and programming idioms. This is interesting because C and its associated idioms are often unsafe. Consider the C code in Listing 2. Both if statements are syntactically correct but are semantically very different. The logical “AND” operator && and the bitwise “AND” operator &, while visually similar, have different runtime behavior. Accidentally adding or omitting a & character is an easy mistake to make and the compiler has no way to know which is correct.

    int x=foo(), y=bar();

    if (x & y) {
        // do stuff...
    }

    // vs.
    if (x && y) {
        // do stuff...
    }

Listing 2. “Syntax and Semantics”

Pointer arithmetic is another example of syntax that is unsafe and also largely unnecessary. A common C idiom is to use pointer arithmetic as a way to iterate over a subset of an array. It’s also used as an optimization technique to eliminate array references. This idiom is also responsible for numerous bugs. There are plenty of examples of safe syntax in other languages for expressing ranges of arrays, and modern compilers can usually optimize array access more effectively than humans can.

Syntax and semantics must be clear and unambiguous. Although more convenient, preserving the error-prone syntax and idioms of C to avoid learning a new language isn’t justified. In terms of language design, there is no reason for the types of ambiguities shown in Listing 2. Language syntax and semantics must be designed to prevent these types of programming errors.

3.3. Type System

A type in a programming language is a form of specification that defines various characteristics of the constructs within a language. A type system is the mechanism used to enforce that all type specifications are correctly adhered to. The primary role of a type system is to help promote program correctness and reduce bugs. This section describes the basic properties of a type system and addresses some specific types that deserve special consideration.

Static Type System. Static type systems are essential for embedded systems. Dynamic type systems are undesirable as they can leave latent type errors undetected until runtime. These errors are often unrecoverable and result in program failure. Conversely, static type systems attempt to enforce the type rules at compile time. Static type systems allow type correctness to be verified earlier in the development process. While potentially requiring more effort on behalf of the programmer to properly define the type specifications, this results in systems with fewer bugs. Due to the nature of embedded systems, namely the difficulty of updating software and the implications of software failures, it is more important to identify errors early. Consequently, a language for embedded systems should be statically typed.

Type and Memory Safety. A type and memory safe language is critical to minimizing program flaws. Type and memory violations (e.g.: out-of-bounds array