% !TEX encoding = UTF-8
% !TEX spellcheck = en_GB
% !TEX root = ../paper.tex
\chapter{Background}
\label{cha:background}
Before delving into the specific problems that currently exist in program slicing, this chapter introduces the two fields on which this thesis builds: program slicing and exception handling. The latter is described specifically for the Java programming language, but the discussion can be generalized to other popular programming languages that feature a similar exception handling system (e.g., Python, JavaScript, C++).
\section{Program slicing}
This section provides a series of definitions and background information so that later definitions can be grounded in a common foundation. \carlos{expand intro?}
\begin{definition}[Program slicing] \label{def:program-slicing}
\textit{Program slicing} is the process of extracting a slice $S$ given a program $P$ and a slicing criterion $SC$.
\end{definition}
\begin{definition}[Slicing criterion] \label{def:slicing-criterion}
Given a program $P$, composed of statements and containing variables $x_1, x_2, \ldots, x_n \in \textnormal{vars}$, a \textit{slicing criterion} is a tuple $SC = \langle s, v \rangle$ where $s \in P$ is a single statement of the program and $v$ is a set of variables from $P$. The variables in $v$ are not required to appear in $s$.
\end{definition}
\begin{definition}[Slice] \label{def:slice}
Given a program $P$ and a slicing criterion $SC = \langle s, v \rangle$, a \textit{slice} is a subset of the statements of $P$ ($S \subseteq P$) that behaves like the original program $P$ with respect to the values of the variables in $v$ at statement $s$.
\end{definition}
\begin{definition}[Execution history] \label{def:execution-history}
Given a program $P$, composed of a set of statements $\{s_1, s_2, \ldots, s_n\}$, and a set of input values $I$, the \textit{execution history} of $P$ given $I$ is the list $H$ of statements that are executed, in the order in which they are executed.
\end{definition}
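As a minimal illustration of Definitions~\ref{def:slicing-criterion} and~\ref{def:slice}, consider the following sketch. The class \texttt{SliceExample} and its methods are hypothetical and only serve as an example: \texttt{original} plays the role of $P$, the slicing criterion is $\langle \texttt{System.out.println(sum)}, \{\texttt{sum}\} \rangle$, and \texttt{slice} contains one possible slice, obtained by removing the statements that only contribute to \texttt{prod}.
\begin{lstlisting}[language=Java]
final class SliceExample {
    // Original program P: prints the sum and the product of 1..n.
    static void original(int n) {
        int sum = 0;
        int prod = 1;
        for (int i = 1; i <= n; i++) {
            sum += i;
            prod *= i;
        }
        System.out.println(sum);  // slicing criterion: <this statement, {sum}>
        System.out.println(prod);
    }

    // A slice of P w.r.t. the criterion: only the statements
    // that can affect the value of sum at the criterion remain.
    static void slice(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        System.out.println(sum);
    }

    public static void main(String[] args) {
        original(4);
        slice(4);
    }
}
\end{lstlisting}
For any value of \texttt{n}, both methods print the same value of \texttt{sum} at the slicing criterion, even though the slice no longer computes \texttt{prod}.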
Until now, the concept of slicing has been centred around finding the instructions that affect a variable.
That is the original definition but, as time has progressed, variations have been proposed; the one described in Definitions~\ref{def:program-slicing}, \ref{def:slicing-criterion} and~\ref{def:slice} is called \textit{static backward slicing}.
It is also the one that will be used throughout this thesis, though the errors detected and the solutions proposed can be easily generalized to the other variations.
The different variations are described later in this chapter, but there exist two fundamental dimensions along which the slicing problem can be posed \cite{Sil12}:
\begin{itemize}
  \item \textit{Static} or \textit{dynamic}: slicing can be performed statically or dynamically.
  \textit{Static slicing} \cite{Sil12} produces slices that consider all possible executions of the program: the slice will be correct regardless of the input supplied.
  In contrast, \textit{dynamic slicing} \cite{KorL88,AgrH90b} considers a single execution of the program, thus limiting the slice to the statements present in an execution log.
  The slicing criterion is expanded to include a position in the execution history that corresponds to one instance of the selected statement, making it much more specific.
  It may help find \carlos{idk if I need the ``to''} a bug related to non-deterministic behaviour (such as a random or pseudo-random number generator) but, even if the same slicing criterion is selected in the same program, the slice must be recomputed for each set of input values or execution considered. \carlos{Talk about quasi-static as a middle ground?}
  \item \textit{Backward} or \textit{forward}: \textit{backward slicing} \cite{Sil12} looks for the statements that affect the slicing criterion.
  It is among the most commonly used slicing techniques.
  In contrast, \textit{forward slicing} \cite{BerC85} computes the statements that are affected by the slicing criterion.
  There also exists a middle-ground approach called \textit{chopping} \cite{JacR94}, which finds all the statements that affect some variables in the slicing criterion and, at the same time, are affected by some other variables in the slicing criterion.
\end{itemize}
Since the seminal definition of program slicing by Weiser \cite{Wei81}, the most studied variation of slicing has been \textit{static backward slicing}, which has been defined in previous sections of this thesis.
That definition can be split into two sub-types, \textit{strong} and \textit{weak} slices, with different requirements and uses in different fields.
\begin{definition}[Strong static backward slice \cite{Tip95}]
  \label{def:strong-slice}
  Given a program $P$ and a slicing criterion $SC = \langle s,v \rangle$, $S$ is a \textit{strong static backward slice} of $P$ with
  respect to $SC$ if $S$ fulfils the following properties:
  \begin{enumerate}
    \item $S$ is an executable program.
    \item $S \subseteq P$, i.e., $S$ is the result of removing 0 or more statements from $P$.
    \item For any input $I$, the values produced on each execution of $s$
      for each of the variables in $v$ are the same when executing $S$ as
      when executing $P$. \label{enum:exact-output}
  \end{enumerate}
\end{definition}
\sergio{Didn't this definition also require the slice to end with the same error when the execution does not terminate? If so, consider adding something about it.}
\josep{The definitions by (1) Weiser, (2) Binkley and Gallagher and (3) Frank Tip need to be reviewed. My opinion is NO: I don't think the error needs to be reproduced. The definition says that the values of the variables in the SC must be the same, but it says nothing about the error.}
\begin{definition}[Weak static backward slice \cite{RepY89}]
  \label{def:weak-slice}
  \josep{If that is not the right citation, you can use Binkley's: \cite{BinG96}}
  Given a program $P$ and a slicing criterion $SC = \langle s,v \rangle$, $S$ is a \textit{weak static backward slice} of $P$ with respect to $SC$ if $S$ fulfils the following properties:
  \begin{enumerate}
    \item $S$ is an executable program.
    \item $S \subseteq P$, i.e., $S$ is the result of removing 0 or more statements from $P$.
    \item For any input $I$, the sequence of values produced on each execution of $s$
      for each of the variables in $v$ when executing $P$ is a prefix of
      the sequence produced while executing $S$. In other words, the slice
      may continue producing values, but the first values produced always
      match all those produced by the original program.
  \end{enumerate}
\end{definition}
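One possible way of stating the third property of Definitions~\ref{def:strong-slice} and~\ref{def:weak-slice} side by side is sketched below. It assumes a hypothetical function $\mathit{seq}(i, x, A)$, not defined elsewhere in this chapter, that returns the sequence of values taken by variable $x$ at statement $s$ when program $A$ is executed with input $i$; the symbol $\sqsubseteq$ stands for ``is a prefix of''. For every input $i$ of $P$:
\[
\begin{array}{ll}
\textnormal{strong:} & \forall x \in v:\ \mathit{seq}(i, x, S) = \mathit{seq}(i, x, P) \\
\textnormal{weak:}   & \forall x \in v:\ \mathit{seq}(i, x, P) \sqsubseteq \mathit{seq}(i, x, S)
\end{array}
\]
Since every sequence is a prefix of itself, under this reading every strong slice is also a weak slice.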
\sergio{$\forall~i~\in~I, v\in~V~\rightarrow~seq(i,v,P)~Pref~seq(i,v,S)$ where $seq(i,a,A)$ represents the sequence of values obtained for $a$ when running program $A$ with input $i$. $I$ is the set of all possible inputs for $P$. I think it would go along those lines.} \sergio{Existing formalization in the repo: Program Slicing $\rightarrow$ Trabajos $\rightarrow$ Erlang Benchmarks $\rightarrow$ Papers $\rightarrow$ ICSM 2018 $\rightarrow$ Submitted (Section III - A)}
\josep{If it is formalized using seq, you can look at the definition in the POI testing paper (Sergio knows which one).}
Both definitions (\ref{def:strong-slice} and~\ref{def:weak-slice}) are
used throughout the literature (see, e.g., \cite{pending}\carlos{Which citation? Most papers on exception slicing do not indicate or hint whether they use strong or weak.}\sergio{Josep?}\josep{for strong, Weiser can be cited. For weak, Binkley \cite{BinG96}}).
Most publications neither differentiate them nor acknowledge the other variant, because they focus on one variant exclusively.
Consequently, although the definitions come from different authors, the \textit{weak} and \textit{strong} nomenclature employed here originates from a control dependency analysis by Danicic~\cite{DanBHHKL11}, where slices that produce the same output as the original are named \textit{strong}, and those where the output of the original is a prefix of the output of the slice, \textit{weak}.
Different applications of program slicing use the option that fits their needs, though \textit{weak} is used if possible, because the resulting slices contain fewer statements and the algorithms used tend to be simpler.
Of course, if the application requires the slice to behave exactly like the original program, then \textit{strong} slices are the only option.
As an example, debugging uses weak slicing, as it does not matter what the program does after reaching the slicing criterion, which is typically the point where an error has been detected.
In contrast, program specialization requires strong slicing, as it extracts features or computations from a program to create a smaller, standalone unit which behaves in exactly the same way.
Throughout the thesis, we indicate which kind of slice is produced with each problem detected and technique proposed.
\begin{example}[Strong, weak and incorrect slices]
  Consider Table~\ref{tab:slice-weak}, which displays the sequence of values obtained at the slicing criterion for a program and for three different slices computed with respect to the same slicing criterion.
  The first row corresponds to the original program, which computes $3!$.
  Slice A's sequence is identical to the original one and therefore it is a strong slice.
  Slice B's sequence does not stop after producing the same first 3 values as the original: it is a weak slice. An instruction responsible for stopping the loop may have been excluded from the slice.
  Slice C is incorrect, as its sequence differs from that of the original program from the second column onwards. It seems that some dependency has not been accounted for and the value is not being updated.
  \begin{table}
    \centering
    \begin{tabular}{r | r | r | r | r | r }
      % Evaluation Number & \textbf{1} & \textbf{2} & \textbf{3} & \textbf{4} & \textbf{5} \\ \hline
      Original program  & 1 & 2 & 6 & -  & -   \\ \hline
      Slice A           & 1 & 2 & 6 & -  & -   \\ \hline
      Slice B           & 1 & 2 & 6 & 24 & 120 \\ \hline
      Slice C           & 1 & 1 & 1 & 1  & 1   \\
    \end{tabular}
    \caption{Sequence of values obtained for a certain variable of the original program and three different slices A, B and C for a particular input.}
    \label{tab:slice-weak}
  \end{table}
\end{example}
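The comparison carried out in Table~\ref{tab:slice-weak} can be sketched in code. The class \texttt{SliceClassifier} below is hypothetical and only checks one observed (finite) sequence of values for a single input, whereas Definitions~\ref{def:strong-slice} and~\ref{def:weak-slice} quantify over all inputs; it is therefore an illustration of the prefix relation, not a way of proving that a slice is strong or weak.
\begin{lstlisting}[language=Java]
import java.util.List;

final class SliceClassifier {
    enum Kind { STRONG, WEAK, INCORRECT }

    // Compares the values produced at the slicing criterion by the
    // original program with those produced by a candidate slice.
    static Kind classify(List<Integer> original, List<Integer> slice) {
        if (slice.equals(original)) {
            return Kind.STRONG;
        }
        // Weak: the original sequence must be a prefix of the slice's.
        boolean isPrefix = original.size() <= slice.size()
                && slice.subList(0, original.size()).equals(original);
        return isPrefix ? Kind.WEAK : Kind.INCORRECT;
    }

    public static void main(String[] args) {
        List<Integer> original = List.of(1, 2, 6);
        // Slice A
        System.out.println(classify(original, List.of(1, 2, 6)));
        // Slice B
        System.out.println(classify(original, List.of(1, 2, 6, 24, 120)));
        // Slice C
        System.out.println(classify(original, List.of(1, 1, 1, 1, 1)));
    }
}
\end{lstlisting}
With the rows of the table as input, the three calls classify the sequences as \texttt{STRONG}, \texttt{WEAK} and \texttt{INCORRECT}, respectively.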
\carlos{The following paragraph has already been repeated in previous sections, mainly the motivation. Consider its removal and the addition of citations to the previous mention.}
\josep{Even though the original proposal by Weiser~\cite{Wei81} focussed on an imperative language, program slicing is a language--agnostic technique.} Program slicing is a language--agnostic technique, but the original proposal by
Weiser~\cite{Wei81} covered a simple imperative programming language.
Since then, the literature has been expanded by dozens of authors, who have
described and implemented slicing for more complex structures, such as
unstructured control flow~\cite{HorwitzRB88}, global variables~\cite{???},
and exception handling~\cite{AllH03}; and for other programming paradigms, such as
object--oriented languages~\cite{???} or functional languages~\cite{???}.
\carlos{More can be added; the corresponding citations are missing.}\sergio{Cool, they have to be found and added. The bibliography looks short to me for all the papers that exist; I think that when everything is done there should be around 30, if not more.} \josep{Yes. Many of those references can be taken from the latest slicing surveys.}
\subsection{Computing program slices with the system dependence graph}
There exist multiple program representations, data structures and algorithms that can be used to compute a slice, but the most efficient and broadly used data structure is the \textit{system dependence graph} (SDG), introduced by Horwitz et al. \cite{HorRB90}.
It is computed from the program's source code. Once it has been built, a slicing criterion is chosen and mapped onto the graph; the graph is then traversed using a specific algorithm to obtain the slice.
Its efficiency relies on the fact that, for multiple slices performed on the same program, the graph generation process is only performed once.
Performance-wise, building the graph has quadratic complexity ($\mathcal{O}(n^2)$), and its traversal to compute the slice has linear complexity ($\mathcal{O}(n)$); both with respect to the number of statements in the program being sliced.
The SDG is a directed graph, and as such it has a set of nodes, each representing a statement in the program ---barring some auxiliary nodes introduced by some approaches--- and a set of directed edges, which represent the dependencies among nodes.
Those edges represent several kinds of dependencies ---control, data, calls, parameter passing, summary.
To create the SDG, first a \textit{control flow graph} (CFG) is built for each method in the program, and then control and data dependencies are computed based on the CFG.
With that data, a new graph representation is created, called the \textit{program dependence graph} (PDG) \cite{OttO84}.
Each method's PDG is then connected to form the SDG.
For a simple visual example, see Example~\ref{exa:create-sdg} below, which briefly illustrates the intermediate steps in the SDG creation. The whole process is explained in detail in section~\ref{sec:first-def-sdg}.
Once the SDG has been created, a slicing criterion can be mapped onto the graph, and the edges are traversed backwards starting from the node that represents the criterion.
The process is performed twice, the first time ignoring a specific kind of edge, and the second, ignoring another kind.
Once the second pass has finished, all the nodes visited form the slice.
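The two-pass traversal described above can be sketched as follows. This is a minimal sketch and not the implementation used in this thesis: the class \texttt{SdgSlicer}, the integer node identifiers and the names of the edge kinds are assumptions made for the example. It follows the usual formulation of the algorithm by Horwitz et al.~\cite{HorRB90}, in which the first pass ignores parameter-output edges and the second pass ignores call and parameter-input edges.
\begin{lstlisting}[language=Java]
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SdgSlicer {
    // Hypothetical edge kinds, named after the dependencies of the SDG.
    enum EdgeKind { CONTROL, DATA, CALL, PARAM_IN, PARAM_OUT, SUMMARY }

    // An edge source -> target means that target depends on source.
    private static final class Edge {
        final int source;
        final EdgeKind kind;
        Edge(int source, EdgeKind kind) {
            this.source = source;
            this.kind = kind;
        }
    }

    // Incoming edges of each node, indexed by target node.
    private final Map<Integer, List<Edge>> incoming = new HashMap<>();

    void addEdge(int source, int target, EdgeKind kind) {
        incoming.computeIfAbsent(target, k -> new ArrayList<>())
                .add(new Edge(source, kind));
    }

    // Backward reachability from the start nodes, skipping ignored kinds.
    private Set<Integer> traverse(Set<Integer> start, Set<EdgeKind> ignored) {
        Set<Integer> visited = new LinkedHashSet<>(start);
        Deque<Integer> pending = new ArrayDeque<>(start);
        while (!pending.isEmpty()) {
            int node = pending.pop();
            for (Edge edge : incoming.getOrDefault(node, List.of())) {
                if (!ignored.contains(edge.kind) && visited.add(edge.source)) {
                    pending.push(edge.source);
                }
            }
        }
        return visited;
    }

    // Pass 1: do not descend into called methods (summary edges stand
    // in for them). Pass 2: descend, but do not go back up to callers.
    Set<Integer> slice(int criterion) {
        Set<Integer> firstPass =
                traverse(Set.of(criterion), EnumSet.of(EdgeKind.PARAM_OUT));
        return traverse(firstPass,
                EnumSet.of(EdgeKind.CALL, EdgeKind.PARAM_IN));
    }

    // Toy graph. Caller: 10 call, 11 actual-in, 12 actual-out, 13 use.
    // Callee: 20 entry, 21 formal-in, 22 body statement, 23 formal-out.
    static SdgSlicer example() {
        SdgSlicer sdg = new SdgSlicer();
        sdg.addEdge(10, 11, EdgeKind.CONTROL);
        sdg.addEdge(10, 12, EdgeKind.CONTROL);
        sdg.addEdge(12, 13, EdgeKind.DATA);
        sdg.addEdge(11, 12, EdgeKind.SUMMARY);
        sdg.addEdge(10, 20, EdgeKind.CALL);
        sdg.addEdge(11, 21, EdgeKind.PARAM_IN);
        sdg.addEdge(23, 12, EdgeKind.PARAM_OUT);
        sdg.addEdge(20, 21, EdgeKind.CONTROL);
        sdg.addEdge(20, 22, EdgeKind.CONTROL);
        sdg.addEdge(20, 23, EdgeKind.CONTROL);
        sdg.addEdge(21, 22, EdgeKind.DATA);
        sdg.addEdge(22, 23, EdgeKind.DATA);
        return sdg;
    }

    public static void main(String[] args) {
        System.out.println(example().slice(13));
        System.out.println(example().slice(22));
    }
}
\end{lstlisting}
In the toy graph, slicing with respect to node 13 (a use of the returned value in the caller) reaches all eight nodes, whereas slicing with respect to node 22 (a statement inside the called method) yields nodes 10, 11, 20, 21 and 22, leaving out the nodes of the caller that only consume the result (12 and 13).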
\begin{example}[The creation of a system dependence graph]
\label{exa:create-sdg} \sergio{This example gives too many details about the graphs.}
Consider the code provided in Figure~\ref{fig:create-sdg-code}, where a simple Java program containing two methods (\texttt{main} and \texttt{multiply}) is displayed.
\begin{figure}[h]
\begin{lstlisting}
void main() {
    multiply(3, 2);
}
int multiply(int x, int y) {
    int result = 0;
    while (x > 0) {
        result += y;
        x--;
    }
    System.out.println(result);
    return result;
}
\end{lstlisting}
\caption{A simple Java program with two methods.}
\label{fig:create-sdg-code}
\end{figure}
Now turn your attention to Figure~\ref{fig:create-sdg-cfg}\carlos{is this too personal? the second person is used in other places, but not as directly}: a CFG has been created for each method. The CFG has a unique source node (without incoming edges) and a unique sink node (without outgoing edges), named ``Entry'' and ``Exit''. In between, the statements are structured according to all possible executions that could happen.
\begin{figure}[h]
\centering
\includegraphics[width=0.6\linewidth]{img/multiplycfg}
\caption{The control flow graphs for the code in Figure~\ref{fig:create-sdg-code}.}
\label{fig:create-sdg-cfg}
\end{figure}
Next is Figure~\ref{fig:create-sdg-pdg}, which is a reordering of the CFG's nodes according to the dependencies between statements: the PDG. Finally, both PDGs are connected into the SDG.
\begin{figure}
\centering
\includegraphics[width=\linewidth]{img/multiplypdg}
\includegraphics[width=\linewidth]{img/multiplysdg}
\caption{The program dependence graphs (above) and system dependence graph (below) generated from the code in Figure~\ref{fig:create-sdg-code}.}
\label{fig:create-sdg-pdg}
\end{figure}
\end{example}
\subsection{Program slicing metrics}
In the area of program slicing, there exist many slicing techniques and tools that implement them.
This has created the need to compare and classify them by means of a set of metrics.
These metrics are commonly associated with some features of the generated slices, or with the resources used by the slicing tool.
The following list details the most relevant metrics considered when evaluating a program slice:
\begin{description}
  \item[Completeness.] The solution includes all the statements that affect the slicing criterion. This is the most important feature, and almost all techniques and implemented tools set out to generate, at least, complete slices. There exists a trivial way of achieving completeness: including the whole program in the slice.
  \item[Correctness.] The solution excludes all statements that do not affect the slicing criterion. Most solutions are complete, but the degree of correctness is what sets them apart, as more correct solutions produce smaller slices, which execute fewer instructions to compute the same values, decreasing execution time and complexity.
  \item[Features covered.] The features (polymorphism, global variables, arrays, etc.), programming languages or paradigms that a slicing tool is able to cover. There are slicing tools (publicly or commercially available) for most popular programming languages, from C++ to Erlang. Some slicing techniques only cover a subset of the targeted language, and as such are less useful, but they can be a stepping stone in the advancement of the field. There also exist tools that cover multiple languages or that are language-independent \cite{BinGHI14}. A drawback of language-independent tools is that they tend to perform worse in the other metrics.
  \item[Resource consumption.] Speed and memory consumption of the graph generation and the slice computation. As previously stated, slicing is a two-step process: building a graph and traversing it, with the first step being quadratic and the second linear (in time). Proposals that build upon the SDG try to keep the traversal linear, even if that means making the graph bigger or slowing down its construction.
  Though this metric may not seem as important as the others, program slicing is not a simple analysis. On top of that, some applications of program slicing, such as debugging, constantly change the program and the slicing criterion, which makes faster slicing software preferable for them.
  Memory consumption is less relevant, mainly because memory is widely available, but it could become a concern in big systems with millions of lines of code. \carlos{Check this.}
\end{description}
\subsection{Variations and applications of program slicing}
As stated before, there are many uses for program slicing, such as program specialization, software maintenance and code obfuscation, but program slicing is first and foremost a debugging technique.
Program slicing can also be performed with small variations on the algorithm or on the meaning of ``slice'' and ``slicing criterion'', so that it answers a slightly or totally different question.
Each variation of program slicing answers a different question and serves a different purpose:
\begin{description}
  \item[Backward static.] Used to obtain the lines that affect the slicing criterion.
    It is normally applied to a line that contains an incorrect value, to track down
    the source of the bug.
  \item[Forward static \cite{GalL91}.] Used to obtain the lines affected by the slicing criterion.
    It is used to perform software maintenance: when changing a statement, the program is sliced w.r.t. that statement to discover the parts of the program that will be affected by the change.
  \item[Chopping static.] Obtains the statements that affect some variables in the
    slicing criterion and, at the same time, are affected by some other variables in it. \carlos{Add application and verify question.}
  \item[Dynamic.] Can be combined with any of the previous variations, and
    limits the slice to an execution history, only including statements that
    have run in a specific execution. The slice produced is much smaller and
    more useful, but it must be recomputed for each execution. It can be used for debugging when the input values that cause the error are known.
  \item[Quasi--static.] In this slicing variant, some input values are given, and some are left
    unspecified: the result is a slice whose size lies between that of the small dynamic slice and
    that of the general but bigger static slice. It can be especially useful when
    debugging a set of function calls which have a specific static input for
    some parameters, and variable input for others.
  \item[Simultaneous.] Similar to dynamic slicing, but considers multiple
    executions instead of only one. It is another middle ground between static and dynamic slicing, similarly to quasi--static slicing.
    Likewise, it can offer a slightly bigger slice than pure dynamic slicing while keeping the scope focused on the slicing criterion and the set of executions.
\end{description}
There exist many more, which have been detailed in surveys of the field, such as \cite{Sil12}, which analyzes the different dimensions that can be used to classify slicing techniques.
\section{Exception handling in Java}
\label{sec:intro-exception}
Exception handling is common in most modern programming languages. It generally consists of a few additional instructions used to divert the normal execution flow and later return to it. Exceptions are used to react to abnormal program behaviour (controlled or not), and either recover from the error and continue the execution, or stop the program gracefully. In our work we focus on the Java programming language, so in the following we describe the elements that Java uses to represent and handle exceptions:
\begin{description}
  \item[Throwable.] A class that encompasses all the exceptions and errors
    that may be thrown. Its two main subclasses are \texttt{Error}, for internal errors in the Java Virtual Machine, and \texttt{Exception}, for normal errors. The former are generally not caught, as they indicate a critical internal error, such as running out of memory or overflowing the stack. The latter encompasses the rest of the exceptions that occur in Java.
    All exceptions can be classified as either \textit{unchecked}
    (those that extend \texttt{RuntimeException} or \texttt{Error}) or
    \textit{checked} (all others; they may inherit directly from \texttt{Throwable}, but typically they do so from \texttt{Exception}). Unchecked exceptions may be thrown anywhere without warning, whereas
    checked exceptions, if thrown, must be either caught in the same method or declared in the method header.
  \item[throw.] A statement that activates an exception, altering the normal
    control flow of the method. If the statement is inside a \texttt{try}
    block with a \texttt{catch} clause for its type or any supertype, the
    control flow will continue in the first statement of that clause.
    Otherwise, the method is exited and the check is performed again in the caller, until
    either the exception is caught or the last method in the stack
    (the \texttt{main} method) is popped, and the execution of the program ends
    abruptly.
    \carlos{Review stopped here.}
  \item[try.] This statement contains a block of statements and one
    or more \texttt{catch} clauses and/or a \texttt{finally} block.
    All exceptions thrown by the statements contained in the block, or by any methods called from them, will be processed by the list of \texttt{catch} clauses.
  \item[catch.] Contains two elements: a variable declaration (the type must
    be an exception type \sergio{exception or exception type?}) and a block of statements to be executed when an
    exception of the corresponding type (or a subtype) is thrown.
    \textit{catch} clauses are processed sequentially, and if any matches
    the type of the thrown exception, its block is executed, and the rest
    are ignored. Variable declarations may list multiple types
    \texttt{(T1|T2 exc)}, when two unrelated types of exception must be
    caught and the same code executed for both. When there is an inheritance
    relationship, the parent suffices.\footnotemark
  \item[finally.] Contains a block of statements that will always be executed
    if the \textit{try} is entered. It is used to tidy up, for example
    closing I/O streams. The \textit{finally} can be reached in two ways:
    with an exception pending (thrown in \textit{try} and not captured by
    any \textit{catch} or thrown inside a \textit{catch}) or without it
    (when the \textit{try} or \textit{catch} block ends successfully). After
    the last instruction of the block is executed, if there is an exception
    pending, control will be passed to the corresponding \textit{catch} of an enclosing \textit{try} (possibly in a calling method) or
    the program will end. Otherwise, the execution continues in the next
    statement after the \textit{try-catch-finally} block.
\end{description}
\footnotetext{Introduced in Java 7, see \url{https://docs.oracle.com/javase/7/docs/technotes/guides/language/catch-multiple.html} for more details.}
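The control flow induced by the elements described above can be observed with a small program. The class \texttt{ExceptionFlow} and its inputs are hypothetical and only serve as an example: the method \texttt{run} records the order in which the blocks are executed when no exception is thrown, when an exception is caught by the innermost \texttt{catch}, and when an exception is left pending after the \texttt{finally} block and caught by an enclosing \texttt{try}.
\begin{lstlisting}[language=Java]
import java.util.ArrayList;
import java.util.List;

final class ExceptionFlow {
    // Records the order in which the blocks are executed.
    static List<String> run(String input) {
        List<String> trace = new ArrayList<>();
        try {
            try {
                trace.add("try");
                // May throw NumberFormatException (unchecked).
                int value = Integer.parseInt(input);
                // May throw ArithmeticException (unchecked).
                trace.add("result " + (10 / value));
            } catch (NumberFormatException e) {
                trace.add("catch NumberFormatException");
            } finally {
                // Executed with or without an exception pending.
                trace.add("finally");
            }
            trace.add("after try");
        } catch (RuntimeException e) {
            // Reached when the inner try has no matching catch.
            trace.add("outer catch " + e.getClass().getSimpleName());
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run("5"));
        System.out.println(run("abc"));
        System.out.println(run("0"));
    }
}
\end{lstlisting}
With input \texttt{"5"} the trace is \texttt{try}, \texttt{result 2}, \texttt{finally}, \texttt{after try}; with \texttt{"abc"} the inner \texttt{catch} runs before \texttt{finally} and the execution continues normally; with \texttt{"0"} the division throws an \texttt{ArithmeticException} that no inner \texttt{catch} matches, so \texttt{finally} runs with the exception pending and \texttt{after try} is skipped.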
\subsection{Exception handling in other programming languages}
In almost all programming languages, errors can appear (through the fault of
the developer, the user or the system), and must be dealt with. Most of the
popular object--oriented languages feature some kind of error system, normally
very similar to Java's exceptions. In this section, we perform a small
survey of the error-handling techniques used in the most popular programming
languages. The language list has been extracted from a survey performed by the
programming Q\&A website Stack
Overflow\footnote{\url{https://stackoverflow.com}}. The survey contains a
question about the technologies used by professional developers in their work,
and from that list we have extracted those languages with at least $5\%$ usage
in the industry. Table~\ref{tab:popular-languages} shows the list and its
source. Except for Bash, Assembly, VBA, C and Go,\sergio{Do Bash and the others have no exception handling mechanism, or is it just not similar to Java's? It is not clear in this sentence} the rest of the languages shown
feature an exception system similar to the one appearing in Java.
\begin{table}
\begin{minipage}{0.6\linewidth}
\centering
\begin{tabular}{r | r }
\textbf{Language} & $\%$ usage \\ \hline
JavaScript & 69.7 \\ \hline
HTML/CSS & 63.1 \\ \hline
SQL & 56.5 \\ \hline
Python & 39.4 \\ \hline
Java & 39.2 \\ \hline
Bash/Shell/PowerShell & 37.9 \\ \hline
C\# & 31.9 \\ \hline
PHP & 25.8 \\ \hline
TypeScript & 23.5 \\ \hline
C++ & 20.4 \\ \hline
\end{tabular}
\end{minipage}
\begin{minipage}{0.39\linewidth}
\begin{tabular}{r | r }
\textbf{Language} & $\%$ usage \\ \hline
C & 17.3 \\ \hline
Ruby & 8.9 \\ \hline
Go & 8.8 \\ \hline
Swift & 6.8 \\ \hline
Kotlin & 6.6 \\ \hline
R & 5.6 \\ \hline
VBA & 5.5 \\ \hline
Objective-C & 5.2 \\ \hline
Assembly & 5.0 \\ \hline
\end{tabular}
\end{minipage}
% The caption has a weird structure due to the fact that there's a footnote
% inside of it.
\caption[Commonly used programming languages]{The most commonly used
programming languages by professional developers\protect\footnotemark}
\label{tab:popular-languages}
\end{table}
\footnotetext{Data from \url{https://insights.stackoverflow.com/survey/2019/\#technology-\_-programming-scripting-and-markup-languages}}
The exception systems that are similar to Java's are mostly alike:
they feature a \texttt{throw} statement (\texttt{raise} in Python), a try-catch
structure, and most include a \texttt{finally} block that may be appended to \texttt{try} blocks.
The difference resides in the value carried by the exception: in languages
that feature inheritance, the value is an instance of a class descending from a generic error or
exception class, whereas in languages without inheritance\sergio{does this ``it" refer to inheritance? Add some object and remove some ``it", because there are many and they confuse me xD}, the value is arbitrary (e.g.
JavaScript, TypeScript). In object--oriented programming, the filtering is
performed by checking whether the exception is a subtype of the exception being
caught (Java, C++, C\#, PowerShell\footnotemark, etc.); and in languages with
arbitrary exception values, a boolean condition is specified, and the first
catch block that fulfils its condition is activated, following\sergio{in following or following?} a pattern
similar to that of \texttt{switch} statements (e.g. JavaScript). In both cases
there exists a way to indicate that all exceptions should be caught, regardless
of type and content.
\footnotetext{Only since version 2.0, released with Windows 7.}
On the other hand, in \deleted{the other languages } \sergio{``the other languages" is very vague}\added{those languages that do not offer explicit exception handling mechanisms,} \deleted{there exist a variety of systems that emulate or replace exception handling:}\added{this feature is covered by a variety of systems that emulate or replace their behaviour:}
\begin{description} % bash, vba, C and Go exceptions explained
2019-12-03 15:12:13 +01:00
\item[Bash.] The popular Bourne Again SHell features no exception system, apart
2019-11-15 22:34:58 +01:00
from the user's ability to parse the return code from the last statement
executed. Traps can also be used to capture erroneous states and tidy up all
files and environment variables before exiting the program. Traps allow the
programmer to react to a user or system--sent signal, or an exit run from
within the Bash environment. When a trap is activated, its code run, and the
2019-12-03 15:12:13 +01:00
signal does not proceed and stop the program. This does not replace a fully
featured exception system, but \texttt{bash} programs tend to be short, with programmers preferring the efficiency of C or the commodities of
2019-11-15 22:34:58 +01:00
other high--level languages when the task requires it.
2019-12-03 15:12:13 +01:00
\item[VBA.] Visual Basic for Applications is a scripting programming language
2019-11-15 22:34:58 +01:00
based on Visual Basic that is integrated into Microsoft Office to automate
small tasks, such as generating documents from templates, making advanced
computations that are impossible or slower with spreadsheet functions, etc.
The only error--correcting system it has is the directive \texttt{On Error
$x$}, where $x$ can be \texttt{GoTo 0} (lets the error crash the program),
\texttt{Resume Next} (continues the execution as if nothing had happened) or
\texttt{GoTo} followed by a label in the program (the execution jumps to the
label in case of error). The directive can be set and reset multiple times,
therefore creating artificial \texttt{try-catch} blocks, but there is no
possibility of attaching a value to the error, lowering its usefulness.
\item[C.] In C, errors can also be controlled via return values, but some
facilities of its standard library can be used to create a simple exception system.
\texttt{setjmp} and \texttt{longjmp}, declared in \texttt{setjmp.h}, set up and
perform inter--function jumps. The first saves the calling environment (in
essence, the current position in the call stack) in a buffer, and the second
returns to the position where the buffer was saved, discarding the part of
the stack that was built after that point. Then, the execution continues
from the evaluation of \texttt{setjmp}, which returns the second argument
passed to \texttt{longjmp} (or 1 if that argument is 0).
\begin{example}[User-built exception system in C] \ \\
\label{fig:exceptions-c}
\begin{minipage}{0.5\linewidth}
\begin{lstlisting}[language=C]
int main() {
	if (!setjmp(ref)) {
		res = safe_sqrt(x, ref);
	} else {
		// Handle error
		printf(/* ... */);
	}
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[language=C]
double safe_sqrt(double x, jmp_buf ref) {
	if (x < 0)
		longjmp(ref, 1);
	return /* ... */;
}
\end{lstlisting}
\end{minipage}
In the \texttt{main} function, line 2 may be evaluated twice: first when
it is reached normally (\texttt{setjmp} returns 0 and the execution continues in line 3), and again if line 3 of
\texttt{safe\_sqrt} is run, in which case \texttt{setjmp} returns the second argument of \texttt{longjmp},
and the execution therefore enters the \texttt{else} block of the \texttt{main} function.
\end{example}
\item[Go.] The programming language Go is the odd one out in this section, being a
modern programming language without exceptions, though this is an intentional
design decision made by its authors\footnotemark. The argument made was that
exception handling systems introduce abnormal control--flow and complicate
code analysis and clean code generation, as the paths that the code may
follow are not clear. Instead, Go allows functions to return multiple values,
with the second value typically associated with an error type. The error is
checked before the value, and acted upon. Additionally, Go also features a
simple panic system, with the built--in functions \texttt{panic} (throws an
exception with an associated value) and \texttt{recover} (stops the panic
state and retrieves its value), and the \texttt{defer} statement (schedules a
call that runs when the function has ended or when a \texttt{panic} has been
activated). The
\texttt{defer} statement doubles as catch and finally, and multiple
instances can be accumulated. When appropriate, they will run in LIFO\deleted{order}
(Last In--First Out) \added{order}.
\item[Assembly.] Assembly is a representation of machine code, and each computer architecture has its own instruction set, which makes a general analysis impractical. In general, though, no unified exception handling is provided. \carlos{complete with more info on kinds of error handling at the processor level or is this out of scope???}\sergio{If you add a short explanation that is easy to follow, go ahead; if it is going to be very technical, I would stop here. I would say, at a high level and with caveats, that exceptions are handled at the processor level or similar.}
\end{description}
\footnotetext{For more details on Go's design choices, see \url{https://golang.org/doc/faq\#exceptions}. \carlos{Possible transformation to citation???}\sergio{I don't think we will need it. With the state of the art and the intro we will have plenty.}\josep{keep it as a footnote}}
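
The \texttt{setjmp}/\texttt{longjmp} scheme described for C can also be written as a complete program. The following listing is only a minimal sketch under some assumptions: the names \texttt{try\_sqrt}, \texttt{newton\_sqrt} and \texttt{NEGATIVE\_INPUT} are hypothetical and introduced here for illustration, and the square root is approximated with Newton's method so that the program does not depend on the mathematical library. The \texttt{switch} over the value returned by \texttt{setjmp} plays the role of a \texttt{try-catch} block.

\begin{lstlisting}[language=C]
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical error code, chosen for this sketch. */
#define NEGATIVE_INPUT 1

/* Approximates the square root with Newton's method (assumes x >= 0). */
static double newton_sqrt(double x) {
	double r = x > 1 ? x : 1;
	for (int i = 0; i < 60; i++)
		r = 0.5 * (r + x / r);
	return r;
}

/* "Throws" by jumping back to the setjmp call that filled env. */
double safe_sqrt(double x, jmp_buf env) {
	if (x < 0)
		longjmp(env, NEGATIVE_INPUT);
	return newton_sqrt(x);
}

/* Returns 0 and stores the root in *res on success,
 * or a non-zero error code otherwise. */
int try_sqrt(double x, double *res) {
	jmp_buf env;
	switch (setjmp(env)) {
	case 0:              /* direct return: the "try" block */
		*res = safe_sqrt(x, env);
		return 0;
	case NEGATIVE_INPUT: /* reached via longjmp: the "catch" block */
		return NEGATIVE_INPUT;
	default:             /* any other code is reported as unknown */
		return -1;
	}
}

int main(void) {
	double res;
	if (try_sqrt(9.0, &res) == 0)
		printf("sqrt(9) = %.1f\n", res);
	if (try_sqrt(-1.0, &res) == NEGATIVE_INPUT)
		printf("error: negative input\n");
	return 0;
}
\end{lstlisting}

Note that \texttt{setjmp} is used directly as the controlling expression of the \texttt{switch}, since the C standard only guarantees its behaviour in a few such contexts, and that no local variable is modified between the call to \texttt{setjmp} and the jump, which avoids indeterminate values after \texttt{longjmp}.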
% vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap