review of paper
parent 40bffb73a8
commit ee58837daf
13 changed files with 340 additions and 306 deletions
|
@ -7,27 +7,25 @@
|
|||
|
||||
\section{Program slicing}
|
||||
\textsl{Program slicing} \cite{Wei81,Sil12} is a debugging technique that
|
||||
answers the question: ``which parts of a program \added{do} affect a given statement and
|
||||
variable\added{s}?'' The statement and the variable\added{s} are the basic input to create a slice
|
||||
answers the question: ``which parts of a program affect a given statement and
|
||||
set of variables?'' The statement and the variables are the basic input to create a slice
|
||||
and are called the \textsl{slicing criterion}. The criterion can be more
|
||||
complex, as different slicing techniques may require additional pieces of input.
|
||||
The \textsl{slice} of a program is the list of statements from the original
|
||||
program ---which constitutes a valid program---, whose execution will result in
|
||||
the same values for the variable\added{s} (selected in the slicing criterion)\deleted{ being read
|
||||
by a debugger in the selected statement}.
|
||||
program ---which constitutes a valid program--- whose execution will result in
|
||||
the same values for the variables (selected in the slicing criterion).
|
||||
There exist two fundamental dimensions along which the problem of slicing can be
|
||||
proposed \added{\cite{vocabulary}}:
|
||||
posed \cite{Sil12}:
|
||||
\begin{itemize}
|
||||
\item \textsl{Static} or \textsl{dynamic}: slicing can be performed
|
||||
statically or dynamically.
|
||||
\textsl{Static slicing} \cite{Wei81} \added{produces slices that}\deleted{is a slice which} consider\deleted{s} all
|
||||
possible executions of the program, only taking into account the
|
||||
semantics of the programming language.
|
||||
In contrast, \textsl{dynamic slicing} \cite{KorL88} \added{considers a single execution of the program, thus, limiting} \deleted{limits} the slice to
|
||||
\textsl{Static slicing} \cite{Wei81} produces slices that consider all
|
||||
possible executions of the program: the slice will be correct regardless of the input supplied.
|
||||
In contrast, \textsl{dynamic slicing} \cite{KorL88} considers a single execution of the program, thus limiting the slice to
|
||||
the statements present in an execution log. The slicing criterion is
|
||||
expanded to include a position in the log that corresponds to one
|
||||
instance of the selected statement, making it much more specific. It may
|
||||
help finding a bug related to indeterministic behavior (such as a random
|
||||
help find a bug related to nondeterministic behavior (such as a random
|
||||
or pseudo-random number generator), but must be recomputed for each case
|
||||
being analyzed.
|
||||
\item \textsl{Backward} or \textsl{forward}: \textsl{backward slicing}
|
||||
|
@ -35,8 +33,7 @@ proposed \added{\cite{vocabulary}}:
|
|||
that affect the slicing criterion. In contrast, \textsl{forward slicing}
|
||||
\cite{BerC85} computes the statements that are affected by the slicing
|
||||
criterion. There also exists a mixed approach called \textsl{chopping}
|
||||
\cite{JacR94}, which is used to find all statements that affect \added{some variables in the slicing criterion and at the same time they are affected by some other variables in} \deleted{or are
|
||||
affected by} the slicing criterion.
|
||||
\cite{JacR94}, which is used to find all statements that affect some variables in the slicing criterion and, at the same time, are affected by some other variables in the slicing criterion.
|
||||
\end{itemize}
|
||||
|
||||
Since the definition of program slicing, the most widespread form of slicing has
|
||||
|
@ -45,7 +42,7 @@ affect the value of a variable in a given statement, in all possible executions
|
|||
of the program (i.e., for any input data).
|
||||
\begin{definition}[Strong static backward slice \cite{Wei81,HorwitzRB88}]
|
||||
\label{def:strong-slice}
|
||||
\carlos{We still need to check exactly which citation is the correct one.}
|
||||
\carlos{One of the citations is the correct one.}
|
||||
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
|
||||
$s$ is a statement and $v$ is a set of variables in $P$ (the variables may
|
||||
or may not be used in $s$), $S$ is the \textsl{strong slice} of $P$ with
|
||||
|
@ -61,7 +58,7 @@ of the program (i.e., for any input data).
|
|||
|
||||
\begin{definition}[Weak static backward slice \cite{RepY89}]
|
||||
\label{def:weak-slice}
|
||||
\carlos{Check the citation and write it formally}
|
||||
\carlos{Check citation and improve ``formalization''?}
|
||||
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
|
||||
$s$ is a statement and $v$ is a set of variables in $P$ (the variables may
|
||||
or may not be used in $s$), $S$ is the \textsl{weak slice} of $P$ with
|
||||
|
@ -73,29 +70,28 @@ of the program (i.e., for any input data).
|
|||
for each of the variables in $v$ when executing $P$ is a prefix of
|
||||
those produced while executing $S$ ---which means that the slice
|
||||
may continue producing values, but the first values produced always
|
||||
match up with \added{all those produced with} the original program.
|
||||
match up with all those produced by the original program.
|
||||
\end{enumerate}
|
||||
\end{definition}
|
||||
|
||||
Both definitions (\ref{def:strong-slice} and~\ref{def:weak-slice}) are
|
||||
used throughout the literature \added{(see, e.g., \cite{pending})}, with some cases favoring the first and some the
|
||||
used throughout the literature (see, e.g., \cite{pending}\carlos{Which citation? Most papers on exception slicing do not indicate or hint whether they use strong or weak.}), with some cases favoring the first and some the
|
||||
second. Though the definitions come from the corresponding citations, the naming
|
||||
was first used in a control dependency analysis by Danicic~\cite{DanBHHKL11},
|
||||
where slices \added{that}\deleted{which} produce the same output as the original are named
|
||||
where slices that produce the same output as the original are named
|
||||
\textsl{strong}, and those where the original is a prefix of the slice,
|
||||
\textsl{weak} \carlos{It could be argued that the weak slice is enough
|
||||
for debugging, since if an error shows up in the original, it will also appear in the sliced program}. \josep{Indeed. Add a paragraph right after this explaining that fact, because that way you justify the existence of both. A reader who does not know about slicing is right now wondering what the weak one is for :-)}
|
||||
\textsl{weak}. Weak slicing tends to be preferred ---especially for debugging--- for two reasons: the algorithm can be simpler and avoid dealing with termination, and the slices can be smaller, narrowing the focus of the debugger. For some applications, such as extracting a feature from a program, strong slices are preferred, because the resulting slice is required to behave exactly like the original. In this paper we will indicate which kind of slice is produced with each new technique proposed.
|
||||
|
||||
\begin{example}[Strong, weak and incorrect slices]
|
||||
\carlos{The table is labeled execution logs of... but the execution log is a different thing.}
|
||||
In table~\ref{tab:slice-weak} we can observe examples for the various
|
||||
definitions. Each row shows the values produced by the execution of a
|
||||
program or one of its slices. The first is the original, which computes
|
||||
$3!$. Slice A is one slice, whose execution is identical and therefore is a
|
||||
strong slice. Slice B \added{correctly produced the same values as the original program}\deleted{is correct} but \added{it} continues producing values after the
|
||||
original stops ---a weak slice. It would fit the relaxed definition but not
|
||||
a strong one. Slice C is incorrect, as the values differ from the original.
|
||||
Some data or control dependency has not been included in the slice and the
|
||||
program is behaving in a different way.
|
||||
program or one of its slices.
|
||||
The first is the original, which computes $3!$.
|
||||
Slice A's execution log is identical to the original and therefore it is a strong slice.
|
||||
Slice B is a weak slice: its execution correctly produces the same values as the original program, but it continues producing values after the original stops.
|
||||
Slice C is incorrect, as the values differ from the original.
|
||||
Some data or control dependency has not been included in the slice, so the program produces different results; in this case, the slice computes Fibonacci numbers instead of factorials.
|
||||
\end{example}
|
||||
|
||||
\begin{table}
|
||||
|
@ -112,24 +108,24 @@ para debugging, ya que si un error se presenta en el original, aparecerá tambi
|
|||
\end{table}
|
||||
|
||||
Program slicing is a language--agnostic tool, but the original proposal by
|
||||
Weiser~\cite{Wei81} \added{covered}\deleted{covers} a simple imperative programming language.
|
||||
Since \added{then}, the literature has been expanded by dozens of authors, that have
|
||||
Weiser~\cite{Wei81} covered a simple imperative programming language.
|
||||
Since then, the literature has been expanded by dozens of authors, who have
|
||||
described and implemented slicing for more complex structures, such as
|
||||
uncontrolled control flow~\cite{HorwitzRB88}, global variables~\cite{???},
|
||||
exception handling~\cite{AllH03}; and for other programming paradigms, such as
|
||||
object-oriented languages~\cite{???} or functional languages~\cite{???}.
|
||||
object--oriented languages~\cite{???} or functional languages~\cite{???}.
|
||||
\carlos{More can be added; the corresponding citations are missing.}
|
||||
|
||||
\subsection{The System Dependence Graph (SDG)}
|
||||
|
||||
There exist multiple approaches to compute a slice from a given program and
|
||||
\added{slicing} criterion, but the most efficient and broadly use\added{d} data structure is the System
|
||||
slicing criterion, but the most efficient and broadly used data structure is the System
|
||||
Dependence Graph (SDG), first introduced by Horwitz, Reps and
|
||||
Binkley~\cite{HorwitzRB88}. It is computed from the program's statements, and
|
||||
once built, a slicing criterion is chosen, the graph traversed using a specific
|
||||
algorithm, and the slice obtained. Its efficiency resides in the fact that for
|
||||
multiple slices that share the same program, the graph needs to be built only once.
|
||||
On top of that, building the graph has a complexity of $\mathcal{O}(n^2)$ with
|
||||
On top of that, building the graph has a complexity of $\mathcal{O}(n^2)$ \carlos{uso $\mathcal{O}$ o $O$?} with
|
||||
respect to the number of statements in a program, but the traversal is linear
|
||||
with respect to the number of nodes in the graph (each corresponding to a
|
||||
statement).
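As a rough illustration of why reusing the graph pays off, and assuming only the bounds stated above, consider computing slices for $k$ different slicing criteria on a program with $n$ statements (the symbol $k$ is introduced here only for the sake of the example). Building the graph once and traversing it $k$ times costs
\[ \mathcal{O}(n^2) + k \cdot \mathcal{O}(n) = \mathcal{O}(n^2 + k n), \]
whereas a hypothetical approach that rebuilt the graph for every criterion would cost
\[ k \cdot \left( \mathcal{O}(n^2) + \mathcal{O}(n) \right) = \mathcal{O}(k n^2). \]
Under these assumptions the construction cost is amortized across slices, and it stops being the dominant term once $k$ is of the order of $n$ or larger.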
|
||||
|
@ -141,9 +137,9 @@ dependencies among nodes. Those edges represent various kinds of dependencies
|
|||
---control, data, calls, parameter passing, summary--- which will be defined in
|
||||
section~\ref{sec:first-def-sdg}.
|
||||
|
||||
To create the SDG, first a \textsl{control flow graph} \added{(CFG)} is built for each method
|
||||
To create the SDG, first a \textsl{control flow graph} (CFG) is built for each method
|
||||
in the program, then its control and data dependencies are computed, resulting
|
||||
in the \textsl{program dependence graph} \added{(PDG)}. Finally, all the graphs from every
|
||||
in the \textsl{program dependence graph} (PDG). Finally, all the graphs from every
|
||||
method are joined into the SDG. This process will be explained at greater
|
||||
length in section~\ref{sec:first-def-sdg}.
|
||||
%TODO: marked for removal --- this process is repeated later in ref{sec:first-deg-sdg}
|
||||
|
@ -170,8 +166,8 @@ lengths in section~\ref{sec:first-def-sdg}.
|
|||
%\end{description}
|
||||
An example is provided in figure~\ref{fig:basic-graphs}, where a simple
|
||||
multiplication program is converted to CFG, then PDG and finally SDG. For
|
||||
simplicity, only the CFG and PDG of \texttt{multiply} are shown \josep{actually, the SDG is also there}. Control
|
||||
dependencies are black, data dependencies red\added{,} and summary edges blue.
|
||||
simplicity, the CFG and PDG of \texttt{main} are omitted. Control
|
||||
dependencies are black, data dependencies red, and summary edges blue.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
|
@ -203,14 +199,13 @@ There are four relevant metrics considered when evaluating a slicing algorithm:
|
|||
|
||||
\begin{description}
|
||||
\item[Completeness.] The solution includes all the statements that affect
|
||||
the \added{slicing criterion}\deleted{slice}. This is the most important feature, and almost all
|
||||
the slicing criterion. This is the most important feature, and almost all
|
||||
publications achieve at least completeness. Trivial completeness is
|
||||
easily achievable, as simple as including the whole program in the
|
||||
slice.
|
||||
\item[Correctness.] The solution excludes all statements that \added{do not}\deleted{don't} affect
|
||||
the \added{slicing criterion}\deleted{slice}. Most solutions are complete, but the degree of correctness is
|
||||
what sets them apart, as smaller slices will not execute unnecessary
|
||||
code to compute the values, decreasing the executing time.
|
||||
\item[Correctness.] The solution excludes all statements that do not affect
|
||||
the slicing criterion. Most solutions are complete, but the degree of correctness is
|
||||
what sets them apart, as solutions that are more correct will produce smaller slices, which will execute fewer instructions to compute the same values, decreasing the execution time and complexity.
|
||||
\item[Features covered.] Which features or language a slicing algorithm
|
||||
covers. Different approaches to slicing cover different programming
|
||||
languages and even paradigms. There are slicing techniques (published or
|
||||
|
@ -219,12 +214,11 @@ There are four relevant metrics considered when evaluating a slicing algorithm:
|
|||
language, and as such are less useful for commercial applications, but
|
||||
can be a stepping stone in the betterment of the field.
|
||||
\item[Speed.] Speed of graph generation and slice creation. As previously
|
||||
stated, slicing is a two-step process: build\added{ing} a graph and travers\deleted{e}\added{ing} it.
|
||||
The traversal is linear in most proposals, with small variations. Graph
|
||||
generation tends to be longer and with higher variance, but it is not as
|
||||
relevant, because it is only done once (per program being analyzed). As
|
||||
such, this is the least important metric. Only proposals that deviate
|
||||
from the aforementioned schema show a wider variation in speed.
|
||||
stated, slicing is a two-step process: building a graph and traversing it.
|
||||
The traversal is a linear two--pass analysis of a graph in most proposals, with small variations.
|
||||
Graph generation tends to be a longer process, but it is not as
|
||||
relevant, because it is only done once (per program being analyzed), making this the least important metric.
|
||||
Only proposals that deviate from the aforementioned schema of building a graph and traversing it show a wider variation in speed.
|
||||
\end{description}
|
||||
|
||||
\subsection{Program slicing as a debugging technique}
|
||||
|
@ -236,8 +230,8 @@ variation a different purpose:
|
|||
\item[Backward static.] Used to obtain the lines that affect a statement,
|
||||
normally used on a line which outputs an incorrect value, to narrow down
|
||||
the source of the bug.
|
||||
\item[Forward\deleted{e} static.] Used to obtain the lines affected by a statement,
|
||||
used to identify dead code, to check the effects a line has \added{on}\deleted{in} the rest
|
||||
\item[Forward static.] Used to obtain the lines affected by a statement,
|
||||
used to identify dead code or to check the effects a line has on the rest
|
||||
of the program.
|
||||
\item[Chopping static.] Obtains both the statements affected by and the
|
||||
statements that affect the selected statement.
|
||||
|
@ -254,7 +248,8 @@ variation a different purpose:
|
|||
executions instead of only one. Similarly to quasi--static slicing, it
|
||||
can offer a slightly bigger slice while keeping the scope focused on the
|
||||
source of the bug.
|
||||
\carlos{complete}
|
||||
\item
|
||||
\carlos{maybe add more???}
|
||||
\end{description}
|
||||
|
||||
\section{Exception handling in Java}
|
||||
|
@ -264,12 +259,11 @@ Exception handling is common in most modern programming languages. In Java, it
|
|||
consists of the following elements:
|
||||
\begin{description}
|
||||
\item[Throwable.] A class that encompasses all the exceptions or errors
|
||||
that may be thrown. Child classes are \texttt{Exception} for most errors
|
||||
and \texttt{Error} for internal errors in the Java Virtual Machine.
|
||||
Exceptions can be classified in\added{to} two categories: \textsl{unchecked}
|
||||
(those inheriting from \texttt{RuntimeException} or \texttt{Error}) and
|
||||
\textsl{checked} (the rest). The first may be thrown anywhere, whereas
|
||||
the second, if thrown, must be caught or declared in the method header.
|
||||
that may be thrown. Its child classes are \texttt{Error} for internal errors in the Java Virtual Machine and \texttt{Exception} for normal errors.
|
||||
Exceptions can be classified as \textsl{unchecked}
|
||||
(those that extend \texttt{RuntimeException} or \texttt{Error}) and
|
||||
\textsl{checked} (all others; they may inherit directly from \texttt{Throwable}, but typically do so from \texttt{Exception}). The first kind may be thrown anywhere without warning, whereas
|
||||
the second, if thrown, must be either caught in the same method or declared in the method header.
|
||||
\item[throw.] A statement that activates an exception, altering the normal
|
||||
control-flow of the method. If the statement is inside a \textsl{try}
|
||||
block with a \textsl{catch} clause for its type or any supertype, the
|
||||
|
@ -310,7 +304,7 @@ consists of the following elements:
|
|||
|
||||
In almost all programming languages, errors can appear (either through the
|
||||
developer, the user or the system's fault), and must be dealt with. Most of the
|
||||
popular object\added{-}oriented programs feature some kind of error system, normally
|
||||
popular object--oriented languages feature some kind of error system, normally
|
||||
very similar to Java's exceptions. In this section, we will perform a small
|
||||
survey of the error-handling techniques used in the most popular programming
|
||||
languages. The language list has been extracted from a survey performed by the
|
||||
|
@ -383,17 +377,16 @@ On the other hand, in the other languages there exist a variety of systems that
|
|||
emulate or replace exception handling:
|
||||
|
||||
\begin{description} % bash, vba, C and Go exceptions explained
|
||||
\item[Bash] The popular Bourne Again SHell features no exception system, apart
|
||||
\item[Bash.] The popular Bourne Again SHell features no exception system, apart
|
||||
from the user's ability to parse the return code from the last statement
|
||||
executed. Traps can also be used to capture erroneous states and tidy up all
|
||||
files and environment variables before exiting the program. Traps allow the
|
||||
programmer to react to a user or system--sent signal, or an exit run from
|
||||
within the Bash environment. When a trap is activated, its code runs, and the
|
||||
signal \added{does not}\deleted{doesn't} proceed and stop the program. This \added{does not}\deleted{doesn't} replace a fully
|
||||
featured exception system, but \texttt{bash} programs tend to be small in
|
||||
size, with programmers preferring the efficiency of C or the commodities of
|
||||
signal does not go on to stop the program. This does not replace a fully
|
||||
featured exception system, but \texttt{bash} programs tend to be short, with programmers preferring the efficiency of C or the conveniences of
|
||||
other high--level languages when the task requires it.
|
||||
\item[VBA] Visual Basic for Applications is a scripting programming language
|
||||
\item[VBA.] Visual Basic for Applications is a scripting programming language
|
||||
based on Visual Basic that is integrated into Microsoft Office to automate
|
||||
small tasks, such as generating documents from templates, making advanced
|
||||
computations that are impossible or slower with spreadsheet functions, etc.
|
||||
|
@ -404,8 +397,8 @@ emulate or replace exception handling:
|
|||
error. The directive can be set and reset multiple times, therefore creating
|
||||
artificial \texttt{try-catch} blocks, but there is no possibility of
|
||||
attaching a value to the error, lowering its usefulness.
|
||||
\item[C] In C, errors can also be control\added{led} via return values, but some of the
|
||||
instructions it features can be used to create a simple exception system.
|
||||
\item[C.] In C, errors can also be controlled via return values, but some
|
||||
instructions featured in it can be used to create a simple exception system.
|
||||
\texttt{setjmp} and \texttt{longjmp} are two standard library facilities that set up and
|
||||
perform inter--function jumps. The first saves the calling environment
|
||||
in a buffer, and the second returns to the position where the buffer was
|
||||
|
@ -417,31 +410,31 @@ emulate or replace exception handling:
|
|||
\label{fig:exceptions-c}
|
||||
\begin{minipage}{0.5\linewidth}
|
||||
\begin{lstlisting}[language=C]
|
||||
int main() {
|
||||
if (!setjmp(ref)) {
|
||||
res = safe_sqrt(x, ref);
|
||||
} else {
|
||||
// Handle error
|
||||
printf /* ... */
|
||||
}
|
||||
}
|
||||
int main() {
|
||||
if (!setjmp(ref)) {
|
||||
res = safe_sqrt(x, ref);
|
||||
} else {
|
||||
// Handle error
|
||||
printf(/* ... */);
|
||||
}
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\begin{minipage}{0.49\linewidth}
|
||||
\begin{lstlisting}[language=C]
|
||||
double safe_sqrt(double x, int ref) {
|
||||
if (x < 0)
|
||||
longjmp(ref, 1);
|
||||
return /* ... */;
|
||||
}
|
||||
double safe_sqrt(double x, jmp_buf ref) {
|
||||
if (x < 0)
|
||||
longjmp(ref, 1);
|
||||
return /* ... */;
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
In the \texttt{main} function, line 2 will be executed twice: first when
|
||||
it is normally reached ---returning 0--- and the second when line 3 in
|
||||
it is normally reached ---returning 0 and continuing in line 3--- and the second when line 3 in
|
||||
\texttt{safe\_sqrt} is run, returning the second argument of \texttt{longjmp},
|
||||
and therefore entering the else block in the \texttt{main} function.
|
||||
\end{example}
|
||||
\item[Go] The programming language Go is the odd one out in this section, being a
|
||||
\item[Go.] The programming language Go is the odd one out in this section, being a
|
||||
modern programming language without exceptions, though it is an intentional
|
||||
design decision made by its authors\footnotemark. The argument made was that
|
||||
exception handling systems introduce abnormal control--flow and complicate
|
||||
|
@ -456,8 +449,9 @@ emulate or replace exception handling:
|
|||
\texttt{defer} statement doubles as catch and finally, and multiple
|
||||
instances can be accumulated. When appropriate, they will run in LIFO order
|
||||
(Last In--First Out).
|
||||
\item[Assembly.] Assembly is a representation of machine code, and each computer architecture has its own instruction set, which makes a general analysis impractical. In general, though, no unified exception handling is provided. \carlos{complete with more info on kinds of error handling at the processor level or is this out of scope???}
|
||||
\end{description}
|
||||
|
||||
\footnotetext{\url{https://golang.org/doc/faq\#exceptions}}
|
||||
\footnotetext{For more details on Go's design choices, see \url{https://golang.org/doc/faq\#exceptions}. \carlos{Possible transformation to citation???}}
|
||||
|
||||
% vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap
|
||||
|
|
|
@ -2,6 +2,9 @@
|
|||
% !TEX spellcheck = en_US
|
||||
% !TEX root = ../paper.tex
|
||||
\chapter{Main explanation?}
|
||||
\label{cha:incremental}
|
||||
|
||||
\carlos{Review if we want to call nodes ``Enter'' and ``Exit'' or ``Start'' and ``End'' (I'd prefer the first one).}
|
||||
|
||||
\section{First definition of the SDG}
|
||||
\label{sec:first-def-sdg}
|
||||
|
@ -10,7 +13,7 @@ The system dependence graph (SDG) is a method for program slicing that was first
|
|||
proposed by Horwitz, Reps and Binkley \cite{HorwitzRB88}. It builds upon the
|
||||
existing control flow graph (CFG), defining dependencies between vertices of the
|
||||
CFG, and building a program dependence graph (PDG), which represents them. The
|
||||
system dependence graph (SDG) is then \added{built}\deleted{build} from the assembly of the different
|
||||
system dependence graph (SDG) is then built from the assembly of the different
|
||||
PDGs (each representing a method of the program), linking each method call to
|
||||
its corresponding definition. Because each graph is built from the previous one,
|
||||
new constructs can be added to the CFG, without the need to alter the
|
||||
|
@ -33,134 +36,103 @@ edges, each connected to the first statement that should be executed, according
|
|||
to the result of evaluating the conditional expression in the guard of the
|
||||
predicate.
|
||||
|
||||
\begin{definition}[Control Flow Graph~\cite{???}]
|
||||
A \emph{control flow graph} $G$ of a program $P$ is a tuple $\langle N, E \rangle$, where $N$ is a set of nodes, composed of a method's statements and two special nodes, ``Start'' and ``End''. $E$ is a set of edges of the form $e = \left(n_1, n_2\right)$ a directed edge from $n_1$ to $n_2$
|
||||
|
||||
\josep{you have only stated the syntactic part. I would add that there is an edge from node $n_1$ to node $n_2$ if and only if, in some execution, $n_2$ is executed immediately after $n_1$}
|
||||
\begin{definition}[Control Flow Graph \carlos{add original citation}]
|
||||
A \emph{control flow graph} $G$ of a program $P$ is a directed graph, represented as a tuple $\langle N, E \rangle$, where $N$ is a set of nodes, composed of a method's statements plus two special nodes, ``Start'' and ``End''; and $E$ is a set of edges of the form $e = \left(n_1, n_2\right)$ with $n_1, n_2 \in N$. Most algorithms to generate the SDG require the ``Start'' node to be the only source and ``End'' to be the only sink in the graph. \carlos{Is it necessary to define source and sink in the context of a graph?}
|
||||
|
||||
Edges are created according to the possible execution paths that exist; each statement is connected to any statement that may immediately follow it. Formally, an edge $e = (n_1, n_2)$ exists if and only if there exists an execution of the program where $n_2$ is executed immediately after $n_1$. In general, expressions are not evaluated; so an \texttt{if} instruction has two outgoing edges even if the condition is always true or false, e.g. \texttt{1 == 0}.
|
||||
\end{definition}
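
The definition above can be illustrated with a small sketch. The program and the node names $n_1, \dots, n_4$ below are hypothetical, chosen only for this illustration, and are not taken from the figures of the paper.

\begin{example}[Edges of a small CFG]
	Assume a method whose body is the conditional \texttt{if (x > 0) y = 1; else y = 2;} followed by \texttt{print(y)}. Let $n_1$ be the condition, $n_2$ and $n_3$ the two assignments (in that order), and $n_4$ the call to \texttt{print}. Following the definition, its CFG would be $\langle N, E \rangle$ with
	\[ N = \{\textnormal{Start}, n_1, n_2, n_3, n_4, \textnormal{End}\} \]
	\[ E = \{(\textnormal{Start}, n_1), (n_1, n_2), (n_1, n_3), (n_2, n_4), (n_3, n_4), (n_4, \textnormal{End})\} \]
	The condition $n_1$ has two outgoing edges, one per branch, and neither $(n_2, n_3)$ nor $(n_3, n_2)$ is an edge, because no execution runs one assignment immediately after the other.
\end{example}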
|
||||
|
||||
To build the PDG and then the SDG, some dependencies must be extracted from the CFG, which are defined as follows:
|
||||
To build the PDG and then the SDG, there are two dependencies based directly on the CFG's structure: data and control dependence.
|
||||
|
||||
\begin{definition}[Postdominance]
|
||||
Vertex $b$ \textit{postdominates} vertex \added{$a$}\deleted{$b$} if and only if $a \neq b$ and $b$ is on every path from $a$ to the ``End'' vertex.
|
||||
\begin{definition}[Postdominance \carlos{add original citation?}]
|
||||
Vertex $b$ \textit{postdominates} vertex $a$ if and only if $b$ is on every path from $a$ to the ``End'' vertex.
|
||||
\end{definition}
|
||||
|
||||
\begin{definition}[Control dependency]
|
||||
\begin{definition}[Control dependency \carlos{add original citation}]
|
||||
\label{def:ctrl-dep}
|
||||
Vertex $b$ is \textit{control dependent} on vertex $a$ ($a \ctrldep b$) if and only if $b$ postdominates one but not all of $a$'s successors. It follows that a vertex with only one successor cannot be the source of control dependence.
|
||||
\end{definition}
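
As a sketch of how the definition applies, consider a hypothetical CFG (introduced only for this illustration) and read postdominance reflexively, i.e., assume that every vertex postdominates itself.

\begin{example}[Control dependencies of a conditional]
	Assume the CFG of \texttt{if (x > 0) y = 1; else y = 2; print(y)}, with nodes $n_1$ (the condition), $n_2$ and $n_3$ (the assignments) and $n_4$ (the \texttt{print}), and edges
	\[ \{(\textnormal{Start}, n_1), (n_1, n_2), (n_1, n_3), (n_2, n_4), (n_3, n_4), (n_4, \textnormal{End})\} \]
	The successors of $n_1$ are $n_2$ and $n_3$. Vertex $n_2$ postdominates $n_2$ but not $n_3$, since the path $n_3, n_4, \textnormal{End}$ does not contain $n_2$; therefore $n_1 \ctrldep n_2$. By the symmetric argument, $n_1 \ctrldep n_3$. In contrast, $n_4$ lies on every path from both $n_2$ and $n_3$ to ``End'', so it postdominates all of $n_1$'s successors and is not control dependent on $n_1$.
\end{example}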
|
||||
|
||||
\josep{where did you get this definition from? It looks incorrect to me. For example, if we have $if ~a ~then ~b$, $b$ does not postdominate any successor of $a$}
|
||||
|
||||
\begin{definition}[Data dependency]
|
||||
Vertex $b$ is \textit{data dependent} on vertex $a$ ($a \datadep b$) if and only if $a$ may define a variable $x$, $b$ may use $x$ and there \added{exists} an $x$-definition free path from $a$ to $b$.\footnote{The initial definition of data dependency was further split into in-loop data dependencies and the rest, but the difference is not relevant for computing the slices in the SDG.}
|
||||
\begin{definition}[Data dependency \carlos{add original citation}]
|
||||
Vertex $b$ is \textit{data dependent} on vertex $a$ ($a \datadep b$) if and only if $a$ may define a variable $x$, $b$ may use $x$ and there exists an $x$-definition free path from $a$ to $b$.
|
||||
|
||||
Data dependency was originally defined as flow dependency, and split into loop and non--loop related dependencies, but that distinction is no longer useful to compute program slices.
|
||||
It should be noted that variable definitions and uses can be computed for each statement independently, analyzing the procedures called by it if necessary. The variables used and defined by a procedure call are those used and defined by its body.
|
||||
\end{definition}
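
The role of the definition-free path can be seen in a minimal sketch. The three statements and their names $s_1$, $s_2$, $s_3$ are hypothetical and serve only as an illustration.

\begin{example}[A definition that is overwritten]
	Assume three statements executed in sequence: $s_1$ is \texttt{x = 1}, $s_2$ is \texttt{x = 2} and $s_3$ is \texttt{print(x)}. Both $s_1$ and $s_2$ define $x$, and $s_3$ uses it. The path from $s_2$ to $s_3$ contains no other definition of $x$, so $s_2 \datadep s_3$. However, the only path from $s_1$ to $s_3$ passes through $s_2$, which redefines $x$; there is no $x$-definition free path, and thus $s_3$ is not data dependent on $s_1$.
\end{example}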
|
||||
|
||||
It should be noted that variable definitions and uses can be computed for each
|
||||
statement independently, analyzing the procedures called by it if necessary. In
|
||||
general, any instruction uses all variables that appear in it, save for the
|
||||
left-hand side of assignments. Similarly, no instruction defines variables,
|
||||
except those in the left-hand side of assignments. The variables used and
|
||||
defined by a procedure call are those used and defined by its body.
|
||||
|
||||
With the data and control dependencies, the PDG may be built\deleted{,} by replacing the
|
||||
With the data and control dependencies, the PDG may be built by replacing the
|
||||
edges from the CFG by data and control dependence edges. The first tends to be
|
||||
represented as a thin solid line, and the latter as a thick solid line. In the
|
||||
examples, data dependencies will be thin solid red lines.
|
||||
|
||||
The organization of the vertices of the PDG tends to resemble a tree graph, with
|
||||
the ``Start'' node in the position of the root (at the top), and the ``End''
|
||||
node typically omitted. The control dependence edges structure the tree
|
||||
vertically. In the case that a vertex is control dependent on multiple vertices,
|
||||
it will be placed one level below the lowest source of control dependency. With
|
||||
a programming language this simple, cyclical control dependencies do not appear,
|
||||
but should they do so in further sections, the instructions are sorted top to
|
||||
bottom in the order they appear in the program. Horizontally, the vertices are
|
||||
sorted by their order in the program, left to right, in order to make the graph
|
||||
more readable. Data dependency edges are placed without reordering the nodes of
|
||||
the graph. In the examples given, edges like $a \datadep a$ or $b \ctrldep b$
|
||||
may be omitted, as they are not relevant for later use of the graph. Please
|
||||
note that the location of the vertices is irrelevant for the slicing algorithm,
|
||||
and the aforementioned sorting rules are just for consistency with previous
|
||||
papers on the topic and to ease the visualization of programs.
|
||||
\begin{definition}[Program dependence graph]
|
||||
The \textsl{program dependence graph} (PDG) is a directed graph (and originally a tree) represented by three elements: a set of nodes $N$, a set of control edges $E_c$ and a set of data edges $E_d$.
|
||||
|
||||
The set of nodes corresponds to the set of nodes of the CFG, excluding the ``End'' node.
|
||||
|
||||
Both sets of edges are built as follows. There is a control edge between two nodes $n_1$ and $n_2$ if and only if $n_1 \ctrldep n_2$, and a data edge between $n_1$ and $n_2$ if and only if $n_1 \datadep n_2$. Additionally, if a node $n$ does not have any incoming control edges, it has a ``default'' control edge $e = (\textnormal{Start},n)$; so that ``Start'' is the only source node of the graph.
|
||||
|
||||
Note: the most common graphical representation is a tree--like structure based on the control edges, and nodes sorted left to right according to their position on the original program. Data edges do not affect the structure, so that the graph is easily readable.
|
||||
\end{definition}
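
As a minimal sketch of the definition above, the following Python function assembles a PDG from dependencies that are assumed to have been computed already. The name \texttt{build\_pdg} and the representation (nodes as strings, edges as pairs) are hypothetical. Self-loops are ignored when deciding whether a node needs the ``default'' control edge; this is an assumption of the sketch, made so that every node stays reachable from ``Start''.

```python
def build_pdg(cfg_nodes, control_deps, data_deps):
    """Assemble (N, E_c, E_d) from CFG nodes and precomputed dependencies."""
    # The PDG keeps every CFG node except "End".
    nodes = [n for n in cfg_nodes if n != "End"]
    control_edges = set(control_deps)
    data_edges = set(data_deps)
    # Nodes with an incoming control edge (self-loops are not counted).
    has_incoming = {target for (source, target) in control_edges
                    if source != target}
    for n in nodes:
        if n != "Start" and n not in has_incoming:
            # "Default" control edge, so that "Start" is the only source node.
            control_edges.add(("Start", n))
    return nodes, control_edges, data_edges
```

Applied to a procedure whose only control dependence is from \texttt{while (x > y)} to \texttt{x = x - 1}, the function adds default control edges from ``Start'' to the \texttt{while} node and to \texttt{print(x)}.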
|
||||
|
||||
Finally, the SDG is built from the combination of all the PDGs that compose the
|
||||
program. Each call vertex is connected to the ``Start'' of the corresponding
|
||||
procedure. All edges that connect PDGs are represented with dashed lines.
|
||||
program.
|
||||
|
||||
\begin{figure}
|
||||
\begin{minipage}{0.3\linewidth}
|
||||
\begin{lstlisting}
|
||||
proc main() {
|
||||
a = 10;
|
||||
b = 20;
|
||||
f(a, b);
|
||||
print(a);
|
||||
\begin{definition}[System dependence graph]
|
||||
The \textsl{system dependence graph} (SDG) is a directed graph that represents the control and data dependencies of a whole program. It has three kinds of edges: control, data and function call. The graph is built by combining multiple PDGs, with each ``Start'' node labeled after the function it begins. There exists one function call edge between each node containing one or more calls and the ``Start'' node of each method called. In a programming language where the target of a function call may be ambiguous (e.g. with pointers or polymorphism), there exists one edge leading to every function that may be called.
|
||||
\end{definition}
|
||||
|
||||
\begin{example}[Creation of a SDG from a simple program]
|
||||
Given the program shown below (left), the control flow graphs for both methods are shown on the right: \\
|
||||
\begin{minipage}{0.2\linewidth}
|
||||
\begin{lstlisting}
|
||||
proc main() {
|
||||
a = 10;
|
||||
b = 20;
|
||||
f(a, b);
|
||||
}
|
||||
|
||||
proc f(x, y) {
|
||||
while (x > y) {
|
||||
x = x - 1;
|
||||
}
|
||||
|
||||
proc f(x, y) {
|
||||
while (x > y) {
|
||||
x = x - 1;
|
||||
}
|
||||
print(x);
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\begin{minipage}{0.6\linewidth}
|
||||
\includegraphics[width=0.3\linewidth]{img/cfgsimple}
|
||||
\includegraphics[width=0.65\linewidth]{img/cfgsimple2}
|
||||
\end{minipage}
|
||||
\includegraphics[width=0.5\linewidth]{img/pdgsimple}
|
||||
\includegraphics[width=0.49\linewidth]{img/pdgsimple2}
|
||||
\includegraphics[width=0.6\linewidth]{img/sdgsimple}
|
||||
\includegraphics[width=0.4\linewidth]{img/legendsimple}
|
||||
\caption{A simple program with its CFGs (top right), PDGs (center) and SDG (bottom).}
|
||||
\label{fig:sdg-loop}
|
||||
\end{figure}
|
||||
print(x);
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\begin{minipage}{0.79\linewidth}
|
||||
\includegraphics[width=0.6\linewidth]{img/cfgsimple}
|
||||
\end{minipage}
|
||||
|
||||
\subsubsection{Procedures and data dependencies}
|
||||
Then, control and data dependencies are computed, arranging the nodes in the PDG. Finally, the two graphs are connected with a function call edge to create the SDG:
|
||||
|
||||
The only thing left to explain before introducing more constructs into the
|
||||
language is the passing of parameters. Most programming language\added{s} accept \added{an arbitrary}\deleted{a
|
||||
variable} number of input parameters and one output parameter. In the case of
|
||||
input parameters passed by reference, or constructs such as structs or classes,
|
||||
modifying a field of a parameter may modify the original variable. In order to
|
||||
\added{properly} deal with \deleted{everything related to} parameter passing, including global variables,
|
||||
class fields, etc. there is a small extension to be made to the CFG and PDG \added{\cite{pendinmg}}.
|
||||
\begin{center}
|
||||
\includegraphics[width=0.8\linewidth]{img/sdgsimple}
|
||||
\end{center}
|
||||
\end{example}
|
||||
|
||||
In the CFG, the ``Start'' and ``End'' nodes contain a list of assignments,
|
||||
inputting and outputting respectively the appropriate values, as can be seen in
|
||||
the example \josep{which example? If there is an example, give it an identifier and reference it here}. Consequently, every vertex that contains a procedure or function
|
||||
call packs and unpacks the arguments. For every variable $x$ that is used in a
|
||||
procedure, every call to it must be preceded by $x_{in} = x$, and the
|
||||
procedure's ``Start'' vertex must contain $x = x_{in}$. The opposite happens
|
||||
when a variable must be ``outputted''\carlos{replace}: before the ``End'' node,
|
||||
the value must be packed ($x_{out} = x$), and after each call, the value must be
|
||||
assigned to the corresponding variable ($x = x_{out}$). Parameters may be
|
||||
assigned as $par^i_{in} = expr_i$ (where $i$ is the index of the parameter in
|
||||
the procedure definition, $par^i$ is the name of the parameter and $expr_i$ is
|
||||
the expression in the $i^{th}$ position in the procedure call) in the call
|
||||
vertex, and parameters whose modifications inside the procedure are passed back
|
||||
to the calling procedure must be extracted as $var = par^i_{out}$ (where $var$
|
||||
is the name of the variable ---passed by reference--- in the calling
|
||||
procedure).\carlos{What if object/struct passed by value?} \josep{You have not discussed this. If parameters are passed by value, the $par_{in}$ and the $par_{out}$ are not needed (but they can be left in anyway)} As an addition, in
|
||||
the SDG, an extra edge is added (summary edge), which represents the
|
||||
dependencies that the input variables have on the outputs. This allows the
|
||||
algorithm to know the dependencies without traversing the corresponding
|
||||
function.
|
||||
\subsubsection{Function calls and data dependencies}
|
||||
|
||||
All these additions are added as extra lines\josep{lines?} in the ``Start'', ``End'' and
|
||||
calling vertices. When building the PDG, all additions (variable assignments)
|
||||
are split into their own vertices, and are control dependent on them. Data
|
||||
dependencies no longer flow through the call vertex, but through the appropriate
|
||||
child, which minimizes the size of the slice produced. As an example,
|
||||
\added{Figure}\deleted{figure}~\ref{fig:sdg-loop} shows the three stages of a program, from CFG to SDG.
|
||||
The construction of the CFG is straightforward, save for the packing and
|
||||
unpacking of variables in the start, end and call vertices. In the PDG, the
|
||||
statements are split, and control and data dependencies replace the control flow
|
||||
edges. Finally, both PDGs are linked via call and parameter (input and output)
|
||||
edges, forming the SDG. Summary edges are placed according to the data and
|
||||
control flow of the method call, and the graph is complete.
|
||||
\carlos{Vocabulary: when is appropriate the use of method, function and procedure????}
|
||||
|
||||
In the original definition of the SDG, there was special handling of data dependencies when calling functions, as it was considered that parameters were passed by value, and global variables did not exist. \carlos{Name and cite paper that introduced it} solves this issue by splitting function calls and functions into multiple nodes. This proposal covers every form of parameter passing: by value, by reference, complex variables such as structs or objects, and return values.
|
||||
|
||||
To such end, the following modifications are made to the different graphs:
|
||||
|
||||
\begin{description}
|
||||
\item[CFG.] In each CFG, global variables read or modified and parameters are added to the label of the ``Start'' node in assignments of the form $par = par_{in}$ for each parameter and $x = x_{in}$ for global variables. Similarly, global variables and parameters modified are added to the label of the ``End'' node as $x_{out} = x$. The parameters are only passed back if the value set by the called method can be read by the caller. Finally, in method calls the same values must be packed and unpacked: each statement containing a function call is relabeled to contain input (of the form $par_{in} = \textnormal{exp}$ for parameters or $x_{in} = x$ for global variables) and output (always of the form $x = x_{out}$).
|
||||
\item[PDG.] Each node modified in the CFG is split into multiple nodes: the original label is the main node and each assignment is represented as a new node, which is control--dependent on the main one. Visually, input is placed on the left and output on the right; with parameters sorted accordingly.
|
||||
\item[SDG.] Three kinds of edges are introduced: parameter input (param--in), parameter output (param--out) and summary edges. Parameter input edges are placed between each method call's input node and the corresponding method definition input node. Parameter output edges are placed between each method definition's output node and the corresponding method call output node. Summary edges are placed between the input and output nodes of a method call, according to the dependencies inside the method definition: if there is a path from an input node to an output node in the definition, the output depends on the input, and a summary edge is placed between the corresponding two nodes in every call to that method.
|
||||
|
||||
Note: parameter input and output edges are separated because the traversal algorithm traverses them only sometimes (the output edges are excluded in the first pass and the input edges in the second).
|
||||
\end{description}
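
The placement of summary edges described above can be sketched as a reachability check inside the method definition. The following Python code is a minimal illustration, not the algorithm of any cited work: the names \texttt{reachable\_from} and \texttt{summary\_edges} are hypothetical, edges are pairs of node labels, and the sketch assumes a single method whose body contains no further calls (nested and recursive calls need a more elaborate, iterative computation).

```python
from collections import defaultdict


def reachable_from(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    successors = defaultdict(set)
    for source, target in edges:
        successors[source].add(target)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in successors[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def summary_edges(body_edges, formal_ins, formal_outs,
                  actual_in_of, actual_out_of):
    """Summary edges for one call, given the dependence edges of the callee.

    `actual_in_of` and `actual_out_of` map each input or output node of the
    definition to the matching node of the call (hypothetical representation).
    """
    edges = set()
    for formal_in in formal_ins:
        reached = reachable_from(formal_in, body_edges)
        for formal_out in formal_outs:
            if formal_out in reached:
                edges.add((actual_in_of[formal_in], actual_out_of[formal_out]))
    return edges
```

For a method \texttt{f(x, y)} whose body is \texttt{while (x > y) \{ x = x - 1; \}} followed by \texttt{print(x)}, and which passes \texttt{x} back, both inputs reach the output: \texttt{x} through data dependence, and \texttt{y} through the loop condition that controls the assignment. A call \texttt{f(a, b)} therefore receives two summary edges, both ending in \texttt{a = x\_out}.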
|
||||
|
||||
\begin{example}[Variable packing and unpacking]
|
||||
Let $f(x, y)$ be a function with two integer parameters, and $f(a + b, c)$ a call to it, with parameters passed by reference if possible. The label of the method call node in the CFG would be ``\texttt{x\_in = a + b, y\_in = c, f(a + b, c), c = y\_out}''; method $f$ would have \texttt{x = x\_in, y = y\_in} in the ``Start'' node and \texttt{y\_out = y} in the ``End'' node. The relevant section of the SDG would be:
|
||||
\begin{center}
|
||||
\includegraphics[width=0.5\linewidth]{img/parameter-passing}
|
||||
\end{center}
|
||||
\end{example}
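
The relabeling of the call node in the example above can be reproduced with a short Python sketch. The function name \texttt{pack\_call} is hypothetical, and the sketch makes one simplifying assumption that is labeled in the code: a value is passed back whenever the actual argument is a bare variable name, without checking whether the method modifies the parameter.

```python
import re


def pack_call(name, formals, actuals):
    """Return the list of labels of a call node after packing and unpacking."""
    # Input: one assignment par_in = exp per parameter.
    inputs = [f"{par}_in = {exp}" for par, exp in zip(formals, actuals)]
    call = f"{name}({', '.join(actuals)})"
    # Output: assumption of this sketch, only bare variables can receive a
    # value back (an expression such as "a + b" cannot be assigned to).
    outputs = [f"{exp} = {par}_out" for par, exp in zip(formals, actuals)
               if re.fullmatch(r"[A-Za-z_]\w*", exp)]
    return inputs + [call] + outputs
```

With \texttt{pack\_call("f", ["x", "y"], ["a + b", "c"])} the result is the four labels of the example: \texttt{x\_in = a + b}, \texttt{y\_in = c}, \texttt{f(a + b, c)} and \texttt{c = y\_out}.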
|
||||
|
||||
\section{Unconditional control flow}
|
||||
|
||||
|
|
|
@ -7,14 +7,14 @@
|
|||
\section{Motivation}
|
||||
\label{sec:motivation}
|
||||
|
||||
Program slicing~\cite{Wei81} is a debugging technique \deleted{which}\added{that}, given a line of
|
||||
code and a \added{set of} variable\added{s} of a program, simplifies such program so that the only parts
|
||||
left of it are those that affect \added{or are affected by} the value\added{s} of the selected variable\added{s}.
|
||||
Program slicing~\cite{Wei81} is a debugging technique that, given a line of
|
||||
code and a set of variables of a program, simplifies such program so that the only parts
|
||||
left of it are those that affect or are affected by the values of the selected variables.
|
||||
|
||||
\begin{example}[Program slicing in a simple method]
|
||||
If the following program is sliced on \added{(line 5, variable \texttt{x})} \deleted{line 5 (variable \texttt{x})}, the
|
||||
result would be the program of\josep{at?} the right, with the \texttt{if} block
|
||||
skipped, as it \added{does not}\deleted{doesn't} affect the value of \texttt{x}.
|
||||
If the left program is sliced on (line 5, variable \texttt{x}), the
|
||||
result would be the program on the right, with the \texttt{if} block
|
||||
removed, as it does not affect the value of \texttt{x}.
|
||||
\label{exa:program-slicing}
|
||||
\begin{center}
|
||||
\begin{minipage}{0.49\linewidth}
|
||||
|
@ -40,49 +40,49 @@ void f(int x) {
|
|||
\end{center}
|
||||
\end{example}
|
||||
|
||||
Slices are \deleted{an} executable program\added{s} whose execution \deleted{will} produce\added{s} the same values
|
||||
for the specified line and variable as the original program, and \added{they} are used to
|
||||
facilitate debugging of large and complex programs, where the data flow may not
|
||||
Slices are executable programs whose execution produces the same values
|
||||
for the specified line and variable as the original program, and they are used to
|
||||
facilitate debugging of large and complex programs, where the control and data flow may not
|
||||
be easily understandable.
|
||||
|
||||
Though it may seem a really powerful technique, the whole Java language is not
|
||||
completely covered by it, and that makes it difficult to apply in practical
|
||||
settings. An area that has been investigated, yet \added{does not}\deleted{doesn't} have a definitive
|
||||
settings. An area that has been investigated, yet does not have a definitive
|
||||
solution yet is exception handling. Example~\ref{exa:program-slicing2}
|
||||
demonstrates how, even using the latest developments in program
|
||||
slicing~\cite{Allen03}, the sliced version \added{does not}\deleted{doesn't} include the catch block, and
|
||||
therefore \added{does not}\deleted{doesn't} produce a correct slice.
|
||||
slicing~\cite{AllH03}, the sliced version does not include the catch block, and
|
||||
therefore does not produce a correct slice.
|
||||
|
||||
\begin{example}[Program slicing with examples]
|
||||
If the following program is sliced \josep{Here you might get away with not saying which algorithm you use (Horwitz's, with its citation), but in the paper you will not. State it now; there is no need to explain it yet, but that way you are precise.} \added{with respect to}\deleted{in} \added{(}line 17, variable \texttt{x}\added{)}, the
|
||||
\begin{example}[Program slicing with exceptions]
|
||||
If the following program is sliced using Allen and Horwitz's proposal~\cite{AllH03} with respect to (line 17, variable \texttt{a}), the
|
||||
slice is incomplete, as it lacks the \texttt{catch} block from lines 4-6.
|
||||
\label{exa:program-slicing2}
|
||||
\begin{center}
|
||||
\begin{minipage}{0.49\linewidth}
|
||||
\begin{lstlisting}[stepnumber=1]
|
||||
void f(int x) {
|
||||
void f(int x) throws Exception {
|
||||
try {
|
||||
g(x);
|
||||
} catch (RuntimeException e) {
|
||||
} catch (Exception e) {
|
||||
System.err.println("Error");
|
||||
}
|
||||
|
||||
System.out.println("g() was ok");
|
||||
|
||||
g(x);
|
||||
g(x + 1);
|
||||
}
|
||||
|
||||
void g(int x) {
|
||||
if (x < 0) {
|
||||
throw new RuntimeException();
|
||||
void g(int a) throws Exception {
|
||||
if (a == 0) {
|
||||
throw new Exception();
|
||||
}
|
||||
System.out.println(x);
|
||||
System.out.println(a);
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\begin{minipage}{0.49\linewidth}
|
||||
\begin{lstlisting}[stepnumber=1]
|
||||
void f(int x) {
|
||||
void f(int x) throws Exception {
|
||||
try {
|
||||
g(x);
|
||||
}
|
||||
|
@ -91,64 +91,40 @@ void f(int x) {
|
|||
|
||||
|
||||
|
||||
g(x);
|
||||
g(x + 1);
|
||||
}
|
||||
|
||||
void g(int x) {
|
||||
if (x < 0) {
|
||||
throw new RuntimeException();
|
||||
}
|
||||
System.out.println(x);
|
||||
void g(int a) throws Exception {
|
||||
if (a == 0) {
|
||||
throw new Exception();
|
||||
}
|
||||
System.out.println(a);
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\end{center}
|
||||
When the program is executed as \texttt{f(0)}, the execution log is \texttt{1, 2, 3, 13, 14, 15, 4, 5, 8, 10, 13, 14, 17}. In the only execution of line \texttt{17}, variable \texttt{a} has the value 1. In the slice produced, however, the execution log is \texttt{1, 2, 3, 13, 14, 15}: the exception thrown in \texttt{g()} is not caught in \texttt{f()}, so \texttt{f()} returns with an exception and line \texttt{17} never executes.
|
||||
|
||||
The problem in this example is that the \texttt{catch} block in line \texttt{4} is not included, because ---according to the dependency graph shown below--- it does not influence the execution of line \texttt{17}. Two kinds of dependencies among statements are considered: data dependence (a variable is read that may have gotten its value from a given statement) and control dependence (the instruction controls whether another executes).
|
||||
In the graph, the slicing criterion is marked in bold, the nodes that represent the slice are filled in grey, and dependencies are displayed as edges, with control dependencies in black and data dependencies in red. Nodes with a dashed outline represent elements that are not statements of the program.
|
||||
|
||||
\begin{center}
|
||||
\includegraphics[width=\linewidth]{img/motivation-example-pdg}
|
||||
\end{center}
|
||||
\end{example}
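
The omission of the \texttt{catch} block can be checked mechanically with a plain backward traversal of the dependence graph. The Python sketch below is only an illustration: the node names follow the graph of the example (\texttt{l17} is the slicing criterion, \texttt{l4} the \texttt{catch}), the control and data edges are copied from it, and the call and parameter input edges between \texttt{f()} and \texttt{g()} are an assumption added here so that the traversal can cross procedures. It is a simple reachability check, not the two-pass traversal used on full SDGs.

```python
from collections import defaultdict

CONTROL = [
    ("enter_g", "a_in"), ("enter_g", "l14"), ("l14", "l15"), ("l15", "gee"),
    ("l14", "l17"), ("l15", "l17"), ("l14", "gne"),
    ("enter_f", "x_in"), ("enter_f", "l10"), ("enter_f", "try"),
    ("try", "l3"), ("l3", "nr3"), ("l3", "l4"), ("nr3", "l8"), ("l4", "l5"),
    ("l10", "nr10"), ("l10", "fee"), ("l3", "l3_in"), ("l10", "l10_in"),
]
DATA = [("a_in", "l14"), ("a_in", "l17"), ("x_in", "l3_in"), ("x_in", "l10_in")]
# Assumed interprocedural edges: call edges and parameter input edges.
INTERPROCEDURAL = [
    ("l3", "enter_g"), ("l10", "enter_g"), ("l3_in", "a_in"), ("l10_in", "a_in"),
]


def backward_slice(criterion, edges):
    """Return every node from which `criterion` is reachable."""
    predecessors = defaultdict(set)
    for source, target in edges:
        predecessors[target].add(source)
    seen, stack = {criterion}, [criterion]
    while stack:
        node = stack.pop()
        for prev in predecessors[node]:
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return seen


slice_nodes = backward_slice("l17", CONTROL + DATA + INTERPROCEDURAL)
```

Under these assumptions, \texttt{slice\_nodes} contains both calls to \texttt{g()} (\texttt{l3} and \texttt{l10}) and the \texttt{throw} statement (\texttt{l15}), but neither \texttt{l4} nor \texttt{l5}: no dependence path leads from the \texttt{catch} block to the slicing criterion.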
|
||||
|
||||
Example~\ref{exa:program-slicing2} showcases an important error in the current slicing procedure for programs that handle errors with exceptions: the \texttt{catch} block is disregarded. The only way a \texttt{catch} block can be included in the slice is if a statement inside it is needed for another reason. However, Allen and Horwitz did not encounter this problem in their paper~\cite{AllH03}, as the values output by method calls are extracted after the \texttt{normal return} and each \texttt{catch}, and in a typical method call with output, the \texttt{catch} is included by default when the output value is used. This detail makes the error much less frequent, as most \texttt{try-catch} structures are run to obtain a value.
|
||||
|
||||
\josep{Explain the example better. Be generous with details: the execution, the difference, even the dependence graph if needed... The motivating example is the most important part of a paper. :-) It determines whether readers will keep reading or not.}
|
||||
|
||||
\added{If we consider the initial call {\tt f(-1)}, then the execution history of the initial program is:
|
||||
{\tt 1,2,3,13,14,15,4,5,6,7,8,9,10,13,14,15} (line 17 is never executed and {\tt f} returns with an exception).
|
||||
In contrast, the execution of the slice is:
|
||||
{\tt 1,2,3,13,14,15} (line 17 is never executed and {\tt f} returns with an exception).}
|
||||
|
||||
\josep{If I have not made a mistake above, this is a bad example, because the slicing criterion is not executed in either of the two programs (so they are equivalent with respect to that point, given the information you provide)}
|
||||
|
||||
|
||||
|
||||
|
||||
\josep{The following sounds odd to me (the English). In fact, there is not a single occurrence of it on Google. Please rewrite it. It is hard to understand.}As big a problem as this one is, it \added{does not}\deleted{doesn't} occur in all cases, because of how
|
||||
\texttt{catch} blocks are generally treated when slicing. Generally, two kinds
|
||||
of dependencies among statements are analyzed: control (on the result of this
|
||||
line may depend whether another one executes or not) and data (on the result of
|
||||
this line, the inputs for another one may change).
|
||||
|
||||
The problem described \added{does not}\deleted{doesn't} occur when \deleted{there
|
||||
exist outgoing data dependencies inside the \texttt{try} block}\deleted{the inside the \texttt{try} block there
|
||||
exist outgoing data dependencies}, but it does when there \added{are not}\deleted{aren't}, creating
|
||||
problems for structures with side effects such as a write action to a file or
|
||||
database, or a network request whose result \added{is not}\deleted{isn't} used outside the \texttt{try}.
|
||||
As most slicing tools ignore side effects and focus exclusively on the code and
|
||||
some \texttt{catch} blocks are erroneously removed, which leads to incomplete
|
||||
slices, which end with an error that is normally caught.
|
||||
A notable case where a method that may throw an exception is run and no value is recovered (at least from the point of view of program slicing) is when writing to the filesystem or making connections to servers, such as a database or a webservice to store information. In this case, if no confirmation is outputted signaling whether the storage of information was correct, the \texttt{catch} block will be omitted, and the slicer software will produce an incorrect result.
|
||||
|
||||
\section{Contributions}
|
||||
|
||||
The main contribution of this paper is a complete \added{technique}\deleted{solution} for program slicing
|
||||
in the presence of exception handling constructs for Java\added{. This technique extends the previous technique by Horwitz et al. \cite{pending}. It considers all cases considered in that work, but it also provides a solution to cases not considered by them.}
|
||||
The main contribution of this paper is a complete technique for slicing programs in the presence of exception handling constructs for Java. This technique extends the previous technique by Allen and Horwitz~\cite{AllH03}. It handles all the cases considered in that work, and also provides a solution to cases they did not consider.
|
||||
|
||||
\added{For the sake of completeness and in order to understand the process that led us to this solution, firstly,} we
|
||||
will \added{briefly} present a history of program slicing, specifically those changes that
|
||||
have affected exception handling. Furthermore, we provide a summary of the
|
||||
For the sake of completeness and in order to understand the process that led us to this solution, we will present a brief history of program slicing, specifically those changes that have affected exception handling. Furthermore, we provide a summary of the
|
||||
different contributions each author has made to the field.
|
||||
|
||||
The rest of the paper is structured as follows: chapter~\ref{cha:background}
|
||||
summarizes the theoretical background required, \josep{and chapter 3?} chapter~\ref{cha:state-art}
|
||||
provides a bird's eye view of the current state of the art,
|
||||
chapter~\ref{cha:solution} provides a step by step description of the problems
|
||||
found with the state of the art and the solutions proposed, and
|
||||
chapter~\ref{cha:conclusion} summarizes the paper and provides avenues of future
|
||||
work.
|
||||
The rest of the paper is structured as follows: chapter~\ref{cha:background} summarizes the theoretical background required in program slicing and exception handling, chapter~\ref{cha:incremental} will analyze each structure used in exception handling, explore the already available solution and propose a new technique that subsumes all of the existing solutions and provides correct slices for each case.
|
||||
Chapter~\ref{cha:state-art} provides a bird's eye view of the current state of the art, chapter~\ref{cha:solution} provides a summarized description of the new algorithm with all the changes proposed in chapter~\ref{cha:incremental}, and finally, chapter~\ref{cha:conclusion} summarizes the paper and explores future avenues of work.
|
||||
|
||||
% vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap
|
||||
|
|
|
@ -1,6 +1,13 @@
|
|||
digraph g {
|
||||
Start [shape=box];
|
||||
End [shape=box];
|
||||
f [label=<x_in = a<br/>y_in = b<br/>f (a, b)<br/>b = x_out>]
|
||||
Start -> "a = 10" -> "b = 20" -> f -> "print(a)" -> End;
|
||||
subgraph a {
|
||||
Start [shape=box];
|
||||
End [shape=box];
|
||||
f [label=<f (a, b)>]
|
||||
Start -> "a = 10" -> "b = 20" -> f -> End;
|
||||
}
|
||||
subgraph b {
|
||||
s [shape=box,label=<Start>];
|
||||
End2 [shape=box,label=<End>];
|
||||
s -> "while (x > y)" -> "x = x - 1" -> "while (x > y)" -> "print(x)" -> End2;
|
||||
}
|
||||
}
|
||||
|
|
Binary file not shown.
58
img/motivation-example-pdg.dot
Normal file
|
@ -0,0 +1,58 @@
|
|||
digraph g {
|
||||
// nodes g()
|
||||
subgraph cluster_g {
|
||||
enter_g [label=<entry<br/>g()>,shape=rect,style=filled];
|
||||
a_in [label="a = a_in",style="dashed,filled"];
|
||||
l14 [label="if (a == 0)",style=filled]
|
||||
l15 [label="throw new Exception()",style=filled];
|
||||
l17 [label="System.out.println(a)",style="filled,bold"];
|
||||
gee [label="error exit",style="dashed"];
|
||||
gne [label="normal exit",style="dashed"];
|
||||
|
||||
}
|
||||
// nodes f()
|
||||
subgraph cluster_f {
|
||||
enter_f [label=<entry<br/>f()>,shape=rect,style=filled];
|
||||
fee [label="error exit",style="dashed"]
|
||||
x_in [label="x = x_in",style="dashed,filled"];
|
||||
l3 [label="g(x)",style=filled];
|
||||
l3_in [label="a_in = x",style="dashed,filled"];
|
||||
nr3 [label="normal return",style="dashed"];
|
||||
nr10 [label="normal return",style="dashed"];
|
||||
l4 [label="catch (Exception e)"];
|
||||
l5 [label="System.err.println(\"Error\")"];
|
||||
l8 [label="System.out.println(\"g() was ok\")"];
|
||||
l10 [label="g(x + 1)",style=filled];
|
||||
l10_in [label="a_in = x + 1",style="dashed,filled"];
|
||||
try [style=filled];
|
||||
//{rank=same; l3_in nr3}
|
||||
//{rank=same; l10_in nr10 fee}
|
||||
//{rank=same; x_in try}
|
||||
}
|
||||
// control g()
|
||||
enter_g -> a_in;
|
||||
enter_g -> l14 -> l15 -> gee;
|
||||
{l14 l15} -> l17;
|
||||
l14 -> gne;
|
||||
// control f()
|
||||
enter_f -> {x_in l10};
|
||||
enter_f -> try -> l3 -> {nr3; l4};
|
||||
nr3 -> l8;
|
||||
l4 -> l5;
|
||||
l10 -> {nr10; fee};
|
||||
l3 -> l3_in;
|
||||
l10 -> l10_in;
|
||||
{ // data
|
||||
edge [color=red,constraint=false];
|
||||
a_in -> l14 [constraint=true];
|
||||
a_in -> l17;
|
||||
x_in -> {l3_in l10_in};
|
||||
}
|
||||
{ // order
|
||||
edge [style=invis];
|
||||
//a_in -> gne -> gee;
|
||||
//x_in -> try;
|
||||
//l3_in -> nr3 -> l4;
|
||||
//l10_in -> nr10 -> fee;
|
||||
}
|
||||
}
|
BIN
img/motivation-example-pdg.pdf
Normal file
Binary file not shown.
26
img/parameter-passing.dot
Normal file
|
@ -0,0 +1,26 @@
|
|||
digraph G {
|
||||
// p [label=<x_in = a + b<br/>y_in = c<br/>f()<br/>c = y_out>,shape=rect];
|
||||
f_call [label="f()"]
|
||||
x_in [label="x_in = a + b"]
|
||||
y_in [label="y_in = c"]
|
||||
y_out [label="c = y_out"]
|
||||
f_call -> {x_in y_in y_out};
|
||||
f_start [label="enter f"];
|
||||
fx_in [label="x = x_in"];
|
||||
fy_in [label="y = y_in"];
|
||||
fy_out [label="y_out = y"];
|
||||
f_start -> {fx_in fy_in fy_out};
|
||||
f_call -> f_start [style=bold];
|
||||
y_in -> f_start [style=invis];
|
||||
x_in -> fx_in [style=dashed];
|
||||
y_in -> fy_in [style=dashed];
|
||||
fy_out -> y_out [constraint=false,style=dashed];
|
||||
invis [height=0.001,width=0.001,style=invis];
|
||||
invis2 [height=0.001,width=0.001,style=invis];
|
||||
{rank=same; x_in y_in y_out invis};
|
||||
{rank=same; fx_in fy_in invis2 fy_out};
|
||||
{edge [style=invis];
|
||||
x_in -> y_in -> invis -> y_out;
|
||||
fx_in -> fy_in -> invis2 -> fy_out;
|
||||
}
|
||||
}
|
BIN
img/parameter-passing.pdf
Normal file
Binary file not shown.
|
@ -1,50 +1,41 @@
|
|||
digraph g {
|
||||
subgraph {
|
||||
l1; l2; l3; l4; l5;
|
||||
"x_in = a"; "y_in = b"; "a = x_out";
|
||||
subgraph cluster_a {
|
||||
Start [shape=box,label="Start main()"];
|
||||
l2 [label="a = 10"];
|
||||
l3 [label="b = 20"];
|
||||
l4 [label="f(a, b)"];
|
||||
// Rank
|
||||
{ rank = same; l2; l3; l4; }
|
||||
{ rank = min; Start; }
|
||||
// Control
|
||||
{ edge [style = bold];
|
||||
Start -> { l2 l3 l4 };
|
||||
}
|
||||
// Data
|
||||
{ edge [color = red];
|
||||
{l2 l3} -> l4;
|
||||
}
|
||||
// Order
|
||||
{ edge [style = invis];
|
||||
l2 -> l3 -> l4;
|
||||
}
|
||||
}
|
||||
subgraph {
|
||||
l8; l9; l10; l12;
|
||||
"x = x_in"; "y = y_in"; "x_out = x";
|
||||
|
||||
subgraph cluster_b {
|
||||
StartF [shape=box,label="Start f()"];
|
||||
l8 [label="while (x > y)"];
|
||||
l9 [label="x = x - 1"];
|
||||
l11 [label="print(x)"];
|
||||
{rank=max; l9}
|
||||
{rank=same; l8 l11}
|
||||
{rank=min; StartF}
|
||||
StartF -> {l8 l11}
|
||||
l8 -> l9;
|
||||
{ edge [color = red, constraint = false];
|
||||
StartF -> {l8 l9 l11}
|
||||
l9 -> {l8 l9 l11}
|
||||
}
|
||||
}
|
||||
l1 [label="main()"];
|
||||
l2 [label="a = 10"];
|
||||
l3 [label="b = 20"];
|
||||
l4 [label="f(a, b)"];
|
||||
l5 [label="print(a)"];
|
||||
l8 [label="f()"];
|
||||
l9 [label="while (x > y)"];
|
||||
l10 [label="x = x + 1"];
|
||||
l12 [label="print(x)"];
|
||||
// Rank
|
||||
{ rank = same; l9; l12; }
|
||||
// s0 -> s2 [style=invis];
|
||||
// Control
|
||||
{
|
||||
edge [style = bold];
|
||||
l1 -> {l2 l3 l4 l5};
|
||||
l4 -> {"x_in = a" "y_in = b" "a = x_out"};
|
||||
l8 -> {"x = x_in" "y = y_in" l9 l12 "x_out = x"};
|
||||
l9 -> l10;
|
||||
}
|
||||
// Data
|
||||
{
|
||||
edge [color = red];
|
||||
edge [constraint = false];
|
||||
l2 -> "x_in = a";
|
||||
l3 -> "y_in = b";
|
||||
"a = x_out" -> l5;
|
||||
{"x = x_in" l10} -> {l9 l10 l12 "x_out = x"};
|
||||
"y = y_in" -> l9;
|
||||
}
|
||||
{
|
||||
edge [style=dashed];
|
||||
edge [constraint=false];
|
||||
"x_in = a" -> "x = x_in";
|
||||
"y_in = b" -> "y = y_in";
|
||||
l4 -> l8 [constraint=true];
|
||||
"x_out = x" -> "a = x_out";
|
||||
}
|
||||
{edge [color=blue,constraint=false]; {"x_in = a" "y_in = b"} -> "a = x_out"}
|
||||
{edge [style=invis]; "y_in = b" -> l8; "y = y_in" -> l9; }
|
||||
|
||||
l4 -> StartF [style=bold,constraint=false];
|
||||
}
|
||||
|
|
Binary file not shown.
|
@ -1,10 +1,10 @@
|
|||
% !TEX encoding = UTF-8
|
||||
% !TEX spellcheck = en_US
|
||||
% !TEX root = paper.tex
|
||||
% !TEX encoding = UTF-8
|
||||
% !TEX spellcheck = en_US
|
||||
% !TEX root = paper.tex
|
||||
\lstset{
|
||||
% Numbering
|
||||
numbers=left,
|
||||
stepnumber=2,
|
||||
stepnumber=1,
|
||||
numberstyle=\tiny,
|
||||
numbersep=5pt,
|
||||
% Style
|
||||
|
|
10
paper.tex
|
@ -91,6 +91,16 @@
|
|||
\include{Secciones/state_of_the_art}
|
||||
\include{Secciones/solution}
|
||||
|
||||
\chapter{TODO}
|
||||
\begin{enumerate}
|
||||
\item Find out whether the additional code pulled in by unconditional jumps can be reduced with some kind of edge. (fewer breaks)
|
||||
|
||||
Solution: see
|
||||
\item Find out whether edge 1 is indispensable (look for a counterexample).
|
||||
\item Alternative solution to avoid having to choose between 1 and 2. Suggestion: only include the catch through control dependence if both edges (1, 2) are active.
|
||||
\item Edge 3: the one that goes
|
||||
\end{enumerate}
|
||||
|
||||
\bibliographystyle{plain}
|
||||
\bibliography{../../../../../../Biblio/biblio.bib}
|
||||
|
||||
|
|