2 Markus Borg
some of the benefits of software – significant changes can often occur at any
time, both during a development project and post-release. Quickly adapting
to shifting customer needs and technology changes is often vital to survival in
a competitive market. In this light, the concept of DevOps has emerged as an
approach to minimize time to market while maintaining quality [2]. While ag-
ile development is particularly suitable for customer-oriented development in
the Internet era, it is also increasingly used in embedded systems development
of a more critical nature [3], with adaptations such as SafeScrum [4]. Moreover,
while agile software development is flexible, we argue that ML development
iterates even faster – and thus necessitates “agility on steroids.”
The highly iterative development of ML models is often conducted by data
scientists. Representing a new type of software professional, they often
do not have the software engineering training of conventional software
developers [5]. This observation is analogous to what has been reported for
developers of scientific computing in the past, e.g., regarding their familiarity
with agile practices [6]. Instead of prioritizing the crafts of software engineer-
ing and computer science, many data scientists focus on mastering the art
of taming data into shapes that are suitable for model training – typically
using domain knowledge to hunt quantitative accuracy targets for a specific
application. The ML development process is experimental in nature and in-
volves iterating between several intertwined activities, e.g., data collection,
data preprocessing, feature engineering, model selection, model evaluation,
and hyperparameter tuning. An unfortunate characteristic of ML develop-
ment is that nothing can be considered in isolation. A foundational ML paper
by Google researchers described this as the CACE principle: “Changing
Anything Changes Everything” [7]. When developing ML models in Software 2.0,
no data science activities are ever independent.
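The CACE effect can be seen in miniature with a toy example. The sketch below is our own illustration, not taken from [7]: it fits a two-feature least-squares model on made-up data, then shifts the values of one feature only and refits. The function name fit_two_feature_ols, the data values, and the drift are all hypothetical, chosen only to keep the arithmetic small.

```python
# Illustrative sketch (not from the cited paper): a two-feature least-squares
# model with correlated inputs. All names and data values are made up.

def fit_two_feature_ols(x1, x2, y):
    """Solve the 2x2 normal equations for y ~ w1*x1 + w2*x2 (no intercept)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    b1 = sum(a * t for a, t in zip(x1, y))
    b2 = sum(b * t for b, t in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("features are collinear; weights are not unique")
    w1 = (s22 * b1 - s12 * b2) / det
    w2 = (s11 * b2 - s12 * b1) / det
    return w1, w2

# Two correlated features and a target that depends on both.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 2, 3, 3]
y = [2 * a + b for a, b in zip(x1, x2)]

w_before = fit_two_feature_ols(x1, x2, y)

# Change the input distribution of x1 only: the second half of the samples
# drifts upward by 1. Neither x2 nor y is touched.
x1_drifted = [a + (1 if i >= 3 else 0) for i, a in enumerate(x1)]
w_after = fit_two_feature_ols(x1_drifted, x2, y)

print("weights before drift:", w_before)
print("weights after drift: ", w_after)
```

Although only the first feature was altered, the refitted weight on the second feature also moves, because the two features are correlated. Real ML pipelines have far more entangled inputs, so the same effect is harder to trace.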
In this keynote address, we will discuss two phenomena that have emerged
to meet the characteristics of ML development. First, notebook interfaces
have emerged to meet data scientists’ need to move swiftly. Unfortunately, the step
from prototyping in notebook interfaces to a mature ML solution is often
considerable – and cumbersome for many data scientists. In Section 1.2, we will
present a solution by Jakobsson and Henriksson that bridges the gap between
the data scientists’ preferred notebook interfaces and standard development in
Integrated Development Environments (IDE). Second, analogous to DevOps
in conventional agile software development, in Section 1.3, we will look at
how MLOps has emerged to close the gap between ML development and ML
operations. We claim that MLOps is more than just an agility concept: it is
required to meet future expectations for trustworthy AI, as illustrated in
the light of the recently proposed Artificial Intelligence Act in the European
Union. We refer to our concept of reinforcing the development and operations
of AI systems, afflicted by the CACE principle, using two metaphors from
construction engineering: buttresses and rebars.