tfm-report/Secciones/motivation.tex

145 lines
10 KiB
TeX
Raw Permalink Normal View History

2019-11-15 22:34:58 +01:00
% !TEX encoding = UTF-8
% !TEX spellcheck = en_GB
2019-11-15 22:34:58 +01:00
% !TEX root = ../paper.tex
\chapter{Introduction}
\label{cha:introduction}
2019-12-09 03:28:35 +01:00
2019-11-15 22:34:58 +01:00
\section{Motivation}
\label{sec:motivation}
2019-12-09 03:28:35 +01:00
\textit{Program slicing} is a technique for program analysis and transformation whose main objective is to extract from a program the set of statements that affect a specific statement and set of variables, called a \textit{slicing criterion} \cite{Wei81,Tip95}. It answers the question ``Which parts of a program affect a set of variables in a specific statement?'' The program obtained by program slicing is called a \textit{slice}, and it has many uses, such as debugging \cite{DeMPS96}, program specialization \cite{OchSV05}, software maintenance \cite{HajF12}, code obfuscation \cite{MajDT07}, etc. This technique was originally defined \cite{Wei81} for a simple imperative programming language, but now can be used with practically all programming languages and paradigms.
2019-12-03 17:02:42 +01:00
2019-12-09 03:28:35 +01:00
\begin{example}[Program slicing applied a simple Java method]
2019-12-09 21:47:09 +01:00
Consider the code shown on the left side of Figure~\ref{fig:program-slicing-code}, which is a simple method written in Java. If that method is sliced with respect to the slicing criterion $\langle 5, x \rangle$ (which represents variable $x$ in line 5), the slice would be the program on the right. The \texttt{if} and print statements would be excluded from the slice, as they do not affect the value of \texttt{x}. As a test, the execution of line 5 on both programs would yield the same result---assuming both the original program and the slice are executed with the same input value.
2019-11-15 22:34:58 +01:00
\label{exa:program-slicing}
2019-12-09 03:28:35 +01:00
\begin{figure}[h]
2019-11-15 22:34:58 +01:00
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[stepnumber=1]
void f(int x) {
if (x < 0)
System.err.println(x);
x++;
System.out.println(x);
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[stepnumber=1]
void f(int x) {
x++;
System.out.println(x);
}
\end{lstlisting}
\end{minipage}
2019-12-09 21:47:09 +01:00
\caption{A simple Java method (left) and its slice w.r.t. slicing criterion $\langle 5, x \rangle$.}
2019-12-09 03:28:35 +01:00
\label{fig:program-slicing-code}
\end{figure}
2019-11-15 22:34:58 +01:00
\end{example}
2019-12-09 21:47:09 +01:00
As depicted in Example~\ref{exa:program-slicing}, slices are subsets of the original program. In the most general form, the execution of slices produces the same values in the slicing criterion as the original program would. In other words, the slice criterion behaves identically in the slice as in the original. Some uses of program slicing, such as program specialization, require the slices to be executable, which is useful to extract an independent process from a bigger program or software library. Other uses do not, as the slices are used to find the complete set of dependencies of a slicing criterion.
2019-11-15 22:34:58 +01:00
2019-12-09 21:47:09 +01:00
Though it may seem a really powerful technique, many programming languages lack a mature program slicer which covers the whole language. Even commonly widespread languages like Java does not have a complete program slicer that is publicly available, or documented in the literature; which makes it difficult to use program slicing where it may be needed. Nevertheless, there exist commercial program slicers that cover Java, such as CodeSonar\footnote{Created by GrammaTech. For more information, consult their website at \url{https://www.gramatech.com/}}.
2019-12-03 17:02:42 +01:00
2019-12-09 21:47:09 +01:00
Building a program slicer is not a simple task, requiring a considerable amount of analysis to obtain a valid slice. Smaller slices are preferable, but even more difficult to create. In Java specifically there are several scenarios, such as arrays, polymorphism and inheritance, and exception handling that are quite difficult to analyse. This is the reason why a universal solution does not exist for all the problems in the field of program slicing. Conversely, there are many approaches to solve the same slicing problem. Program slicing is used in so many applications---debugging, program comprehension, parallelization, dead code removal---that any improvement to the state of the art improves those processes.
Even though the original proposal by Weiser~\cite{Wei81} focused on an imperative language, program slicing is a language-agnostic technique.
Since then, the literature has been expanded by dozens of authors, that have described and implemented program slicing for more complex structures, such as uncontrolled control flow~\cite{HorwitzRB88}, exception handling~\cite{AllH03}; and for other programming paradigms, such as object--oriented languages~\cite{LarH96}.
2019-12-03 17:02:42 +01:00
2019-12-09 03:28:35 +01:00
Among others, there is an area that has been investigated, but does not have a definitive solution yet: exception handling. Example~\ref{exa:program-slicing2} shows how, even using the latest developments to handle exceptions in program slicing~\cite{AllH03,JiaZSJ06}, the slice produced is not valid.
2019-11-15 22:34:58 +01:00
2019-12-03 15:12:13 +01:00
\begin{example}[Program slicing with exceptions]
2019-12-09 21:47:09 +01:00
Consider Figure~\ref{fig:program-slicing2-code}: the Java program on the left has been sliced (on the right) using Allen et al.'s proposal~\cite{AllH03}; with respect to the slicing criterion $\langle 17, a \rangle$.
2019-11-15 22:34:58 +01:00
\label{exa:program-slicing2}
2019-12-09 03:28:35 +01:00
\begin{figure}[h]
2019-11-15 22:34:58 +01:00
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[stepnumber=1]
2019-12-03 15:12:13 +01:00
void f(int x) throws Exception {
2019-11-15 22:34:58 +01:00
try {
g(x);
2019-12-03 15:12:13 +01:00
} catch (Exception e) {
2019-11-15 22:34:58 +01:00
System.err.println("Error");
}
System.out.println("g() was ok");
2019-12-03 15:12:13 +01:00
g(x + 1);
2019-11-15 22:34:58 +01:00
}
2019-12-03 15:12:13 +01:00
void g(int a) throws Exception {
if (a == 0) {
throw new Exception();
2019-11-15 22:34:58 +01:00
}
2019-12-03 15:12:13 +01:00
System.out.println(a);
2019-11-15 22:34:58 +01:00
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.49\linewidth}
\begin{lstlisting}[stepnumber=1]
2019-12-03 15:12:13 +01:00
void f(int x) throws Exception {
2019-11-15 22:34:58 +01:00
try {
g(x);
}
2019-12-03 15:12:13 +01:00
g(x + 1);
2019-11-15 22:34:58 +01:00
}
2019-12-03 15:12:13 +01:00
void g(int a) throws Exception {
if (a == 0) {
throw new Exception();
}
System.out.println(a);
2019-11-15 22:34:58 +01:00
}
\end{lstlisting}
\end{minipage}
2019-12-09 21:47:09 +01:00
\caption{A simple Java program with exception (left) and its slice w.r.t. $\langle 17, a \rangle$ (right).}
2019-12-09 03:28:35 +01:00
\label{fig:program-slicing2-code}
\end{figure}
As a test of the validity of the slice, we can execute both (with the initial call being \texttt{f(0)}). We can define the \textit{execution history} as the list of instructions executed by a program \cite{KorL88}.
As an example, the execution log of \texttt{g(1)} is \texttt{13, 14, 17}, and the execution log of \texttt{g(0)}, \texttt{13, 14, 15}.
When the program is executed from the call \texttt{f(0)}, the execution history of the original program (left) is: \texttt{1, 2, 3, 13, 14, 15, 4, 5, 8, 10, 13, 14, 17}.
The slicing criterion executes once: \texttt{a} has value 1.
In contrast, the execution history for the slice is \texttt{1, 2, 3, 13, 14, 15}.
Method \texttt{g} throws an exception, which is not caught, and the program ends with an error, stopping abruptly before reaching the slicing criterion.
2019-12-03 17:02:42 +01:00
2019-12-09 03:28:35 +01:00
The problem in this example is that the \texttt{catch} block in line 4 is not included.
2019-12-09 21:47:09 +01:00
This is because---according to the system dependence graph \cite{HorwitzRB88} computed using Allen et al.'s algorithm \cite{AllH03} and shown in Figure~\ref{fig:program-slicing2-graph} below---it does not influence the execution of line 17.
2019-12-09 03:28:35 +01:00
The graph displays the statements of the methods as nodes; and the dependencies between statements as edges. Some nodes have its outline dashed; as they do not correspond to a statement, but are needed by the algorithm.
The node associated with the slicing criterion is marked in bold and the nodes that represent the slice are filled in grey. Note that there are some edges between both methods that are not shown. The only relevant ones (the ones traversed to create the slice) are shown, and the rest are hidden for clarity.
2019-11-18 14:49:48 +01:00
2019-12-09 03:28:35 +01:00
The graph traversal will be explained later, but the basic rule is that edges are traversed backwards starting from the slicing criterion. Any node that is reached is part of the slice, the rest can be disregarded.
2019-11-18 14:49:48 +01:00
2019-12-09 03:28:35 +01:00
\begin{figure}[h]
2019-12-09 21:47:09 +01:00
\centering
2019-12-03 15:12:13 +01:00
\includegraphics[width=\linewidth]{img/motivation-example-pdg}
2019-12-09 03:28:35 +01:00
\caption{The system dependence graph for the method shown in Figure \ref{fig:program-slicing2-code}.}
\label{fig:program-slicing2-graph}
\end{figure}
2019-12-03 15:12:13 +01:00
\end{example}
2019-11-18 14:49:48 +01:00
2019-12-09 03:28:35 +01:00
Example~\ref{exa:program-slicing2} is a contribution of this thesis, because it showcases an important error in the current state of the art.
This example is later generalized (see chapter \ref{cha:solution}), as under some conditions all \texttt{catch} statements are ignored, regardless of if it is needed or not.
The only way a \texttt{catch} block can be included in the slice is if a statement inside it is needed for another reason.
However, Allen et al. \cite{AllH03} did not tackle this problem, as for some examples the \texttt{catch} statement is included or unnecessary.
2019-11-15 22:34:58 +01:00
2019-12-09 21:47:09 +01:00
A real-life, commonly used instance of Example~\ref{exa:program-slicing2} is the writing of any information to a file or a database; or any other instruction that has no data output (excluding side effects) and may throw an exception.
2019-11-15 22:34:58 +01:00
\section{Contributions}
2019-12-09 03:28:35 +01:00
The main contribution of this thesis is a new approach for program slicing with exception handling for Java programs.
Our approach extends the existing techniques proposed by Allen et al. \cite{AllH03}.
It is able to generate valid slices for all cases considered in their work, but it also provides a solution to other cases not contemplated by them. For the sake of completeness and in order to explain the process that leaded us to this solution, we first summarize the fundamentals of program slicing and its terminology; delving deeper in the progress of program slicing techniques related to exception handling.
2019-11-15 22:34:58 +01:00
2019-12-09 03:28:35 +01:00
The rest of this thesis is structured as follows: chapter~\ref{cha:background} summarizes the theoretical background required in program slicing and exception handling, chapter~\ref{cha:incremental} analyzes each structure used in exception handling and explores the already available solution.
Chapter~\ref{cha:solution} provides a list of problems that occur in the state of the art, detailing the scope and importance of each one, and proposes an appropriate solution, chapter~\ref{cha:state-art} provides a bird's eye view of the current state of the art, and finally, chapter~\ref{cha:conclusion} concludes the thesis and explores future avenues of work, such as improvements or optimizations that have not been explored in our solution.
2019-11-15 22:34:58 +01:00
% vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap