review of paper

This commit is contained in:
Carlos Galindo 2019-12-03 14:12:13 +00:00
parent 40bffb73a8
commit ee58837daf
13 changed files with 340 additions and 306 deletions

@ -7,27 +7,25 @@
\section{Program slicing} \section{Program slicing}
\textsl{Program slicing} \cite{Wei81,Sil12} is a debugging technique that \textsl{Program slicing} \cite{Wei81,Sil12} is a debugging technique that
answers the question: ``which parts of a program \added{do} affect a given statement and answers the question: ``which parts of a program affect a given statement and
variable\added{s}?'' The statement and the variable\added{s} are the basic input to create a slice set of variables?'' The statement and the variables are the basic input to create a slice
and are called the \textsl{slicing criterion}. The criterion can be more and are called the \textsl{slicing criterion}. The criterion can be more
complex, as different slicing techniques may require additional pieces of input. complex, as different slicing techniques may require additional pieces of input.
The \textsl{slice} of a program is the list of statements from the original The \textsl{slice} of a program is the list of statements from the original
program ---which constitutes a valid program---, whose execution will result in program ---which constitutes a valid program--- whose execution will result in
the same values for the variable\added{s} (selected in the slicing criterion)\deleted{ being read the same values for the variables (selected in the slicing criterion).
by a debugger in the selected statement}.
There exist two fundamental dimensions along which the problem of slicing can be There exist two fundamental dimensions along which the problem of slicing can be
proposed \added{\cite{vocabulary}}: proposed \cite{Sil12}:
\begin{itemize} \begin{itemize}
\item \textsl{Static} or \textsl{dynamic}: slicing can be performed \item \textsl{Static} or \textsl{dynamic}: slicing can be performed
statically or dynamically. statically or dynamically.
\textsl{Static slicing} \cite{Wei81} \added{produces slices that}\deleted{is a slice which} consider\deleted{s} all \textsl{Static slicing} \cite{Wei81} produces slices which consider all
possible executions of the program, only taking into account the possible executions of the program: the slice will be correct regardless of the input supplied.
semantics of the programming language. In contrast, \textsl{dynamic slicing} \cite{KorL88} considers a single execution of the program, thus limiting the slice to
In contrast, \textsl{dynamic slicing} \cite{KorL88} \added{considers a single execution of the program, thus limiting} \deleted{limits} the slice to
the statements present in an execution log. The slicing criterion is the statements present in an execution log. The slicing criterion is
expanded to include a position in the log that corresponds to one expanded to include a position in the log that corresponds to one
instance of the selected statement, making it much more specific. It may instance of the selected statement, making it much more specific. It may
help finding a bug related to nondeterministic behavior (such as a random help find a bug related to nondeterministic behavior (such as a random
or pseudo-random number generator), but must be recomputed for each case or pseudo-random number generator), but must be recomputed for each case
being analyzed. being analyzed.
\item \textsl{Backward} or \textsl{forward}: \textsl{backward slicing} \item \textsl{Backward} or \textsl{forward}: \textsl{backward slicing}
@ -35,8 +33,7 @@ proposed \added{\cite{vocabulary}}:
that affect the slicing criterion. In contrast, \textsl{forward slicing} that affect the slicing criterion. In contrast, \textsl{forward slicing}
\cite{BerC85} computes the statements that are affected by the slicing \cite{BerC85} computes the statements that are affected by the slicing
criterion. There also exists a mixed approach called \textsl{chopping} criterion. There also exists a mixed approach called \textsl{chopping}
\cite{JacR94}, which is used to find all statements that affect \added{some variables in the slicing criterion and, at the same time, are affected by some other variables in} \deleted{or are \cite{JacR94}, which is used to find all statements that affect some variables in the slicing criterion and, at the same time, are affected by some other variables in the slicing criterion.
affected by} the slicing criterion.
\end{itemize} \end{itemize}
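The difference between backward and forward slicing can be sketched as two reachability queries over the same dependence relation. The following is a minimal sketch, assuming a small intraprocedural dependence graph stored as an adjacency matrix; the names \texttt{DepGraph}, \texttt{dep\_graph\_init}, \texttt{dep\_graph\_add} and \texttt{slice} are hypothetical and do not come from any of the cited works. It also ignores method calls, which require a more elaborate traversal.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 16

/* Hypothetical representation: dep[a][b] is true when statement b
   depends on statement a, that is, when a affects b. */
typedef struct {
    int n;
    bool dep[MAX_NODES][MAX_NODES];
} DepGraph;

void dep_graph_init(DepGraph *g, int n) {
    assert(n >= 0 && n <= MAX_NODES);
    g->n = n;
    memset(g->dep, 0, sizeof g->dep);
}

void dep_graph_add(DepGraph *g, int from, int to) {
    assert(from >= 0 && from < g->n && to >= 0 && to < g->n);
    g->dep[from][to] = true;
}

/* Marks in in_slice every statement connected to the criterion.
   forward == false: follow edges backwards (statements that affect it).
   forward == true:  follow edges forwards (statements affected by it). */
void slice(const DepGraph *g, int criterion, bool forward,
           bool in_slice[MAX_NODES]) {
    int stack[MAX_NODES]; /* each node is pushed at most once */
    int top = 0;

    assert(criterion >= 0 && criterion < g->n);
    memset(in_slice, 0, MAX_NODES * sizeof(bool));
    in_slice[criterion] = true;
    stack[top++] = criterion;

    while (top > 0) {
        int cur = stack[--top];
        for (int other = 0; other < g->n; other++) {
            bool edge = forward ? g->dep[cur][other] : g->dep[other][cur];
            if (edge && !in_slice[other]) {
                in_slice[other] = true;
                stack[top++] = other;
            }
        }
    }
}
```

Under these assumptions, a chop between two statements could be approximated by intersecting the forward slice of one with the backward slice of the other.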
Since the definition of program slicing, the most widespread form of slicing has Since the definition of program slicing, the most widespread form of slicing has
@ -45,7 +42,7 @@ affect the value of a variable in a given statement, in all possible executions
of the program (i.e., for any input data). of the program (i.e., for any input data).
\begin{definition}[Strong static backward slice \cite{Wei81,HorwitzRB88}] \begin{definition}[Strong static backward slice \cite{Wei81,HorwitzRB88}]
\label{def:strong-slice} \label{def:strong-slice}
\carlos{We still need to check exactly which citation is the correct one.} \carlos{One of the citations is the correct one.}
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
$s$ is a statement and $v$ is a set of variables in $P$ (the variables may $s$ is a statement and $v$ is a set of variables in $P$ (the variables may
or may not be used in $s$), $S$ is the \textsl{strong slice} of $P$ with or may not be used in $s$), $S$ is the \textsl{strong slice} of $P$ with
@ -61,7 +58,7 @@ of the program (i.e., for any input data).
\begin{definition}[Weak static backward slice \cite{RepY89}] \begin{definition}[Weak static backward slice \cite{RepY89}]
\label{def:weak-slice} \label{def:weak-slice}
\carlos{Check the citation and write it formally} \carlos{Check citation and improve ``formalization''?}
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
$s$ is a statement and $v$ is a set of variables in $P$ (the variables may $s$ is a statement and $v$ is a set of variables in $P$ (the variables may
or may not be used in $s$), $S$ is the \textsl{weak slice} of $P$ with or may not be used in $s$), $S$ is the \textsl{weak slice} of $P$ with
@ -73,29 +70,28 @@ of the program (i.e., for any input data).
for each of the variables in $v$ when executing $P$ is a prefix of for each of the variables in $v$ when executing $P$ is a prefix of
those produced while executing $S$ ---which means that the slice those produced while executing $S$ ---which means that the slice
may continue producing values, but the first values produced always may continue producing values, but the first values produced always
match up with \added{all those produced with} the original program. match up with all those produced by the original program.
\end{enumerate} \end{enumerate}
\end{definition} \end{definition}
Both definitions (\ref{def:strong-slice} and~\ref{def:weak-slice}) are Both definitions (\ref{def:strong-slice} and~\ref{def:weak-slice}) are
used throughout the literature \added{(see, e.g., \cite{pending})}, with some cases favoring the first and some the used throughout the literature (see, e.g., \cite{pending}\carlos{Which citation? Most papers on exception slicing do not indicate or hint whether they use strong or weak.}), with some cases favoring the first and some the
second. Though the definitions come from the corresponding citations, the naming second. Though the definitions come from the corresponding citations, the naming
was first used in a control dependency analysis by Danicic~\cite{DanBHHKL11}, was first used in a control dependency analysis by Danicic~\cite{DanBHHKL11},
where slices \added{that}\deleted{which} produce the same output as the original are named where slices that produce the same output as the original are named
\textsl{strong}, and those where the original is a prefix of the slice, \textsl{strong}, and those where the original is a prefix of the slice,
\textsl{weak} \carlos{It could be argued that the weak slice is enough \textsl{weak}. Weak slicing tends to be preferred ---especially for debugging--- for two reasons: the algorithm can be simpler and avoid dealing with termination, and the slices can be smaller, narrowing the focus of the debugger. For some applications, strong slices are preferred, such as extracting a feature from a program, where there is a requirement that the resulting slice behave exactly like the original. In this paper we will indicate which kind of slice is produced with each new technique proposed.
for debugging, since if an error shows up in the original, it will also appear in the sliced program}. \josep{Indeed. Add a paragraph right after this explaining that fact, because that way you justify the existence of both. A reader who does not know about slicing is wondering right now what the weak one is for :-)}
\begin{example}[Strong, weak and incorrect slices] \begin{example}[Strong, weak and incorrect slices]
\carlos{The table is labeled execution logs of... but the execution log is a different thing.}
In table~\ref{tab:slice-weak} we can observe examples for the various In table~\ref{tab:slice-weak} we can observe examples for the various
definitions. Each row shows the values produced by the execution of a definitions. Each row shows the values produced by the execution of a
program or one of its slices. The first is the original, which computes program or one of its slices.
$3!$. Slice A is one slice, whose execution is identical and therefore is a The first is the original, which computes $3!$.
strong slice. Slice B \added{correctly produced the same values as the original program}\deleted{is correct} but \added{it} continues producing values after the Slice A's execution log is identical to the original and therefore it is a strong slice.
original stops ---a weak slice. It would fit the relaxed definition but not Slice B is a weak slice: its execution correctly produces the same values as the original program, but it continues producing values after the original stops.
a strong one. Slice C is incorrect, as the values differ from the original. Slice C is incorrect, as the values differ from the original.
Some data or control dependency has not been included in the slice and the Some data or control dependency has not been included in the slice and the program produces different results; in this case, the slice computes Fibonacci numbers instead of factorials.
program are behaving in a different way.
\end{example} \end{example}
\begin{table} \begin{table}
@ -112,24 +108,24 @@ para debugging, ya que si un error se presenta en el original, aparecerá tambi
\end{table} \end{table}
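The distinction between strong, weak and incorrect slices can be sketched as a comparison between the sequences of values observed for the slicing criterion. This is only an illustration under simplifying assumptions: the sequences are finite and already collected, whereas a real weak slice may never terminate. The names \texttt{SliceKind} and \texttt{classify\_slice} are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SLICE_STRONG, SLICE_WEAK, SLICE_INCORRECT } SliceKind;

/* Compares the values produced by the original program with those
   produced by a candidate slice:
   - identical sequences: strong slice;
   - the original is a proper prefix of the slice: weak slice;
   - anything else: incorrect slice. */
SliceKind classify_slice(const int *orig, size_t orig_len,
                         const int *sliced, size_t sliced_len) {
    assert(orig != NULL || orig_len == 0);
    assert(sliced != NULL || sliced_len == 0);

    if (sliced_len < orig_len)
        return SLICE_INCORRECT; /* the slice stops too early */
    for (size_t i = 0; i < orig_len; i++) {
        if (orig[i] != sliced[i])
            return SLICE_INCORRECT; /* a value differs */
    }
    return sliced_len == orig_len ? SLICE_STRONG : SLICE_WEAK;
}
```

With illustrative sequences (not taken from the table), an original producing 1, 1, 2, 6 and a slice producing 1, 1, 2, 6, 24 would be classified as weak, while a slice producing 1, 1, 2, 3 would be classified as incorrect.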
Program slicing is a language--agnostic tool, but the original proposal by Program slicing is a language--agnostic tool, but the original proposal by
Weiser~\cite{Wei81} \added{covered}\deleted{covers} a simple imperative programming language. Weiser~\cite{Wei81} covered a simple imperative programming language.
Since \added{then}, the literature has been expanded by dozens of authors who have Since then, the literature has been expanded by dozens of authors who have
described and implemented slicing for more complex structures, such as described and implemented slicing for more complex structures, such as
uncontrolled control flow~\cite{HorwitzRB88}, global variables~\cite{???}, uncontrolled control flow~\cite{HorwitzRB88}, global variables~\cite{???},
exception handling~\cite{AllH03}; and for other programming paradigms, such as exception handling~\cite{AllH03}; and for other programming paradigms, such as
object-oriented languages~\cite{???} or functional languages~\cite{???}. object--oriented languages~\cite{???} or functional languages~\cite{???}.
\carlos{More can be added; the corresponding citations are missing.} \carlos{More can be added; the corresponding citations are missing.}
\subsection{The System Dependence Graph (SDG)} \subsection{The System Dependence Graph (SDG)}
There exist multiple approaches to compute a slice from a given program and There exist multiple approaches to compute a slice from a given program and
\added{slicing} criterion, but the most efficient and broadly use\added{d} data structure is the System slicing criterion, but the most efficient and broadly used data structure is the System
Dependence Graph (SDG), first introduced by Horwitz, Reps and Dependence Graph (SDG), first introduced by Horwitz, Reps and
Binkley~\cite{HorwitzRB88}. It is computed from the program's statements, and Binkley~\cite{HorwitzRB88}. It is computed from the program's statements, and
once built, a slicing criterion is chosen, the graph traversed using a specific once built, a slicing criterion is chosen, the graph traversed using a specific
algorithm, and the slice obtained. Its efficiency resides in the fact that for algorithm, and the slice obtained. Its efficiency resides in the fact that for
multiple slices that share the same program, the graph must only be built once. multiple slices that share the same program, the graph must only be built once.
On top of that, building the graph has a complexity of $\mathcal{O}(n^2)$ with On top of that, building the graph has a complexity of $\mathcal{O}(n^2)$ \carlos{should I use $\mathcal{O}$ or $O$?} with
respect to the number of statements in a program, but the traversal is linear respect to the number of statements in a program, but the traversal is linear
with respect to the number of nodes in the graph (each corresponding to a with respect to the number of nodes in the graph (each corresponding to a
statement). statement).
@ -141,9 +137,9 @@ dependencies among nodes. Those edges represent various kinds of dependencies
---control, data, calls, parameter passing, summary--- which will be defined in ---control, data, calls, parameter passing, summary--- which will be defined in
section~\ref{sec:first-def-sdg}. section~\ref{sec:first-def-sdg}.
To create the SDG, first a \textsl{control flow graph} \added{(CFG)} is built for each method To create the SDG, first a \textsl{control flow graph} (CFG) is built for each method
in the program, then its control and data dependencies are computed, resulting in the program, then its control and data dependencies are computed, resulting
in the \textsl{program dependence graph} \added{(PDG)}. Finally, all the graphs from every in the \textsl{program dependence graph} (PDG). Finally, all the graphs from every
method are joined into the SDG. This process will be explained at greater method are joined into the SDG. This process will be explained at greater
length in section~\ref{sec:first-def-sdg}. length in section~\ref{sec:first-def-sdg}.
%TODO: marked for removal --- this process is repeated later in ref{sec:first-deg-sdg} %TODO: marked for removal --- this process is repeated later in ref{sec:first-deg-sdg}
@ -170,8 +166,8 @@ lengths in section~\ref{sec:first-def-sdg}.
%\end{description} %\end{description}
An example is provided in figure~\ref{fig:basic-graphs}, where a simple An example is provided in figure~\ref{fig:basic-graphs}, where a simple
multiplication program is converted to CFG, then PDG and finally SDG. For multiplication program is converted to CFG, then PDG and finally SDG. For
simplicity, only the CFG and PDG of \texttt{multiply} are shown \josep{actually, the SDG is shown as well}. Control simplicity, only the CFG and PDG of \texttt{main} are omitted. Control
dependencies are black, data dependencies red\added{,} and summary edges blue. dependencies are black, data dependencies red, and summary edges blue.
\begin{figure} \begin{figure}
\centering \centering
@ -203,14 +199,13 @@ There are four relevant metrics considered when evaluating a slicing algorithm:
\begin{description} \begin{description}
\item[Completeness.] The solution includes all the statements that affect \item[Completeness.] The solution includes all the statements that affect
the \added{slicing criterion}\deleted{slice}. This is the most important feature, and almost all the slicing criterion. This is the most important feature, and almost all
publications achieve at least completeness. Trivial completeness is publications achieve at least completeness. Trivial completeness is
easily achievable, as simple as including the whole program in the easily achievable, as simple as including the whole program in the
slice. slice.
\item[Correctness.] The solution excludes all statements that \added{do not}\deleted{don't} affect \item[Correctness.] The solution excludes all statements that do not affect
the \added{slicing criterion}\deleted{slice}. Most solutions are complete, but the degree of correctness is the slicing criterion. Most solutions are complete, but the degree of correctness is
what sets them apart, as smaller slices will not execute unnecessary what sets them apart, as solutions that are more correct will produce smaller slices, which will execute fewer instructions to compute the same values, decreasing the execution time and complexity.
code to compute the values, decreasing the execution time.
\item[Features covered.] Which features or language a slicing algorithm \item[Features covered.] Which features or language a slicing algorithm
covers. Different approaches to slicing cover different programming covers. Different approaches to slicing cover different programming
languages and even paradigms. There are slicing techniques (published or languages and even paradigms. There are slicing techniques (published or
@ -219,12 +214,11 @@ There are four relevant metrics considered when evaluating a slicing algorithm:
language, and as such are less useful for commercial applications, but language, and as such are less useful for commercial applications, but
can be a stepping stone in the betterment of the field. can be a stepping stone in the betterment of the field.
\item[Speed.] Speed of graph generation and slice creation. As previously \item[Speed.] Speed of graph generation and slice creation. As previously
stated, slicing is a two-step process: build\added{ing} a graph and travers\deleted{e}\added{ing} it. stated, slicing is a two-step process: building a graph and traversing it.
The traversal is linear in most proposals, with small variations. Graph The traversal is a linear two--pass analysis of a graph in most proposals, with small variations.
generation tends to be longer and with higher variance, but it is not as Graph generation tends to be a longer process, but it is not as
relevant, because it is only done once (per program being analyzed). As relevant, because it is only done once (per program being analyzed), making this the least important metric.
such, this is the least important metric. Only proposals that deviate Only proposals that deviate from the aforementioned schema of building a graph and traversing it show a wider variation in speed.
from the aforementioned schema show a wider variation in speed.
\end{description} \end{description}
\subsection{Program slicing as a debugging technique} \subsection{Program slicing as a debugging technique}
@ -236,8 +230,8 @@ variation a different purpose:
\item[Backward static.] Used to obtain the lines that affect a statement, \item[Backward static.] Used to obtain the lines that affect a statement,
normally used on a line which outputs an incorrect value, to narrow down normally used on a line which outputs an incorrect value, to narrow down
the source of the bug. the source of the bug.
\item[Forward\deleted{e} static.] Used to obtain the lines affected by a statement, \item[Forward static.] Used to obtain the lines affected by a statement,
used to identify dead code, to check the effects a line has \added{on}\deleted{in} the rest used to identify dead code, to check the effects a line has on the rest
of the program. of the program.
\item[Chopping static.] Obtains both the statements affected by and the \item[Chopping static.] Obtains both the statements affected by and the
statements that affect the selected statement. statements that affect the selected statement.
@ -254,7 +248,8 @@ variation a different purpose:
executions instead of only one. Similarly to quasi--static slicing, it executions instead of only one. Similarly to quasi--static slicing, it
can offer a slightly bigger slice while keeping the scope focused on the can offer a slightly bigger slice while keeping the scope focused on the
source of the bug. source of the bug.
\carlos{complete} \item
\carlos{maybe add more???}
\end{description} \end{description}
\section{Exception handling in Java} \section{Exception handling in Java}
@ -264,12 +259,11 @@ Exception handling is common in most modern programming languages. In Java, it
consists of the following elements: consists of the following elements:
\begin{description} \begin{description}
\item[Throwable.] A class that encompasses all the exceptions or errors \item[Throwable.] A class that encompasses all the exceptions or errors
that may be thrown. Child classes are \texttt{Exception} for most errors that may be thrown. Its child classes are \texttt{Error} for internal errors in the Java Virtual Machine and \texttt{Exception} for normal errors.
and \texttt{Error} for internal errors in the Java Virtual Machine. Exceptions can be classified as \textsl{unchecked}
Exceptions can be classified in\added{to} two categories: \textsl{unchecked} (those that extend \texttt{RuntimeException} or \texttt{Error}) and
(those inheriting from \texttt{RuntimeException} or \texttt{Error}) and \textsl{checked} (all others; they may inherit directly from \texttt{Throwable}, but typically do so from \texttt{Exception}). The first kind may be thrown anywhere without warning, whereas
\textsl{checked} (the rest). The first may be thrown anywhere, whereas the second, if thrown, must be either caught in the same method or declared in the method header.
the second, if thrown, must be caught or declared in the method header.
\item[throw.] A statement that raises an exception, altering the normal \item[throw.] A statement that raises an exception, altering the normal
control-flow of the method. If the statement is inside a \textsl{try} control-flow of the method. If the statement is inside a \textsl{try}
block with a \textsl{catch} clause for its type or any supertype, the block with a \textsl{catch} clause for its type or any supertype, the
@ -310,7 +304,7 @@ consists of the following elements:
In almost all programming languages, errors can appear (either through the In almost all programming languages, errors can appear (either through the
developer, the user or the system's fault), and must be dealt with. Most of the developer, the user or the system's fault), and must be dealt with. Most of the
popular object\added{-}oriented languages feature some kind of error system, normally popular object--oriented languages feature some kind of error system, normally
very similar to Java's exceptions. In this section, we will perform a small very similar to Java's exceptions. In this section, we will perform a small
survey of the error-handling techniques used in the most popular programming survey of the error-handling techniques used in the most popular programming
languages. The language list has been extracted from a survey performed by the languages. The language list has been extracted from a survey performed by the
@ -383,17 +377,16 @@ On the other hand, in the other languages there exist a variety of systems that
emulate or replace exception handling: emulate or replace exception handling:
\begin{description} % bash, vba, C and Go exceptions explained \begin{description} % bash, vba, C and Go exceptions explained
\item[Bash] The popular Bourne Again SHell features no exception system, apart \item[Bash.] The popular Bourne Again SHell features no exception system, apart
from the user's ability to parse the return code from the last statement from the user's ability to parse the return code from the last statement
executed. Traps can also be used to capture erroneous states and tidy up all executed. Traps can also be used to capture erroneous states and tidy up all
files and environment variables before exiting the program. Traps allow the files and environment variables before exiting the program. Traps allow the
programmer to react to a user or system--sent signal, or an exit run from programmer to react to a user or system--sent signal, or an exit run from
within the Bash environment. When a trap is activated, its code runs, and the within the Bash environment. When a trap is activated, its code runs, and the
signal \added{does not}\deleted{doesn't} proceed and stop the program. This \added{does not}\deleted{doesn't} replace a fully signal does not proceed and stop the program. This does not replace a fully
featured exception system, but \texttt{bash} programs tend to be small in featured exception system, but \texttt{bash} programs tend to be short, with programmers preferring the efficiency of C or the conveniences of
size, with programmers preferring the efficiency of C or the conveniences of
other high--level languages when the task requires it. other high--level languages when the task requires it.
\item[VBA] Visual Basic for Applications is a scripting programming language \item[VBA.] Visual Basic for Applications is a scripting programming language
based on Visual Basic that is integrated into Microsoft Office to automate based on Visual Basic that is integrated into Microsoft Office to automate
small tasks, such as generating documents from templates, making advanced small tasks, such as generating documents from templates, making advanced
computations that are impossible or slower with spreadsheet functions, etc. computations that are impossible or slower with spreadsheet functions, etc.
@ -404,8 +397,8 @@ emulate or replace exception handling:
error. The directive can be set and reset multiple times, therefore creating error. The directive can be set and reset multiple times, therefore creating
artificial \texttt{try-catch} blocks, but there is no possibility of artificial \texttt{try-catch} blocks, but there is no possibility of
attaching a value to the error, lowering its usefulness. attaching a value to the error, lowering its usefulness.
\item[C] In C, errors can also be control\added{led} via return values, but some of the \item[C.] In C, errors can also be controlled via return values, but some
instructions it features can be used to create a simple exception system. instructions featured in it can be used to create a simple exception system.
\texttt{setjmp} and \texttt{longjmp} are two library functions which set up and \texttt{setjmp} and \texttt{longjmp} are two library functions which set up and
perform inter--function jumps. The first saves the calling environment perform inter--function jumps. The first saves the calling environment
in a buffer, and the second returns to the position where the buffer was in a buffer, and the second returns to the position where the buffer was
@ -417,31 +410,31 @@ emulate or replace exception handling:
\label{fig:exceptions-c} \label{fig:exceptions-c}
\begin{minipage}{0.5\linewidth} \begin{minipage}{0.5\linewidth}
\begin{lstlisting}[language=C] \begin{lstlisting}[language=C]
int main() { int main() {
if (!setjmp(ref)) { if (!setjmp(ref)) {
res = safe_sqrt(x, ref); res = safe_sqrt(x, ref);
} else { } else {
// Handle error // Handle error
printf /* ... */ printf /* ... */
} }
} }
\end{lstlisting} \end{lstlisting}
\end{minipage} \end{minipage}
\begin{minipage}{0.49\linewidth} \begin{minipage}{0.49\linewidth}
\begin{lstlisting}[language=C] \begin{lstlisting}[language=C]
double safe_sqrt(double x, jmp_buf ref) { double safe_sqrt(double x, jmp_buf ref) {
if (x < 0) if (x < 0)
longjmp(ref, 1); longjmp(ref, 1);
return /* ... */; return /* ... */;
} }
\end{lstlisting} \end{lstlisting}
\end{minipage} \end{minipage}
In the \texttt{main} function, line 2 will be executed twice: first when In the \texttt{main} function, line 2 will be executed twice: first when
it is normally reached ---returning 0--- and the second when line 3 in it is normally reached ---returning 0 and continuing in line 3--- and the second when line 3 in
\texttt{safe\_sqrt} is run, returning the second argument of \texttt{longjmp}, \texttt{safe\_sqrt} is run, returning the second argument of \texttt{longjmp},
and therefore entering the else block in the \texttt{main} method. and therefore entering the else block in the \texttt{main} method.
\end{example} \end{example}
\item[Go] The programming language Go is the odd one out in this section, being a \item[Go.] The programming language Go is the odd one out in this section, being a
modern programming language without exceptions, though it is an intentional modern programming language without exceptions, though it is an intentional
design decision made by its authors\footnotemark. The argument made was that design decision made by its authors\footnotemark. The argument made was that
exception handling systems introduce abnormal control--flow and complicate exception handling systems introduce abnormal control--flow and complicate
@ -456,8 +449,9 @@ emulate or replace exception handling:
\texttt{defer} statement doubles as catch and finally, and multiple \texttt{defer} statement doubles as catch and finally, and multiple
instances can be accumulated. When appropriate, they will run in LIFO order instances can be accumulated. When appropriate, they will run in LIFO order
(Last In--First Out). (Last In--First Out).
\item[Assembly.] Assembly is a representation of machine code, and each computer architecture has its own instruction set, which makes an analysis impossible. In general, though, no unified exception handling is provided. \carlos{complete with more info on kinds of error handling at the processor level or is this out of scope???}
\end{description} \end{description}
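The C technique described above can be sketched as a compilable unit. This is a minimal sketch and not the code from the figure: the wrapper \texttt{try\_sqrt} is a hypothetical helper, and the square root is approximated with Newton's method (an assumption made here only to avoid depending on the math library).

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* "Throws" by jumping back to the setjmp call that filled ref. */
double safe_sqrt(double x, jmp_buf ref) {
    double r, prev;

    if (x < 0)
        longjmp(ref, 1); /* setjmp will now return 1 */
    if (x == 0)
        return 0;

    /* Newton's method, starting from a guess that is >= sqrt(x);
       the sequence decreases until it stops changing. */
    r = x > 1 ? x : 1;
    do {
        prev = r;
        r = 0.5 * (r + x / r);
    } while (r < prev);
    return r;
}

/* Hypothetical wrapper playing the role of a try-catch block:
   returns 0 and stores the result in *out on success, or returns 1
   if safe_sqrt signalled an error. */
int try_sqrt(double x, double *out) {
    jmp_buf ref;

    assert(out != NULL);
    if (setjmp(ref) == 0) {
        /* Normal path: setjmp returned 0 when first reached. */
        *out = safe_sqrt(x, ref);
        return 0;
    } else {
        /* Error path: reached through longjmp. */
        return 1;
    }
}
```

Passing the buffer as a parameter mirrors the figure. No local variable is modified between \texttt{setjmp} and \texttt{longjmp}, which sidesteps the rule that such variables must be \texttt{volatile} to have a defined value after the jump.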
\footnotetext{\url{https://golang.org/doc/faq\#exceptions}} \footnotetext{For more details on Go's design choices, see \url{https://golang.org/doc/faq\#exceptions}. \carlos{Possible transformation to citation???}}
% vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap % vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap

@ -2,6 +2,9 @@
% !TEX spellcheck = en_US % !TEX spellcheck = en_US
% !TEX root = ../paper.tex % !TEX root = ../paper.tex
\chapter{Main explanation?} \chapter{Main explanation?}
\label{cha:incremental}
\carlos{Review if we want to call nodes ``Enter'' and ``Exit'' or ``Start'' and ``End'' (I'd prefer the first one).}
\section{First definition of the SDG} \section{First definition of the SDG}
\label{sec:first-def-sdg} \label{sec:first-def-sdg}
@ -10,7 +13,7 @@ The system dependence graph (SDG) is a method for program slicing that was first
proposed by Horwitz, Reps and Binkley \cite{HorwitzRB88}. It builds upon the proposed by Horwitz, Reps and Binkley \cite{HorwitzRB88}. It builds upon the
existing control flow graph (CFG), defining dependencies between vertices of the existing control flow graph (CFG), defining dependencies between vertices of the
CFG, and building a program dependence graph (PDG), which represents them. The CFG, and building a program dependence graph (PDG), which represents them. The
system dependence graph (SDG) is then \added{built}\deleted{build} from the assembly of the different system dependence graph (SDG) is then built from the assembly of the different
PDGs (each representing a method of the program), linking each method call to PDGs (each representing a method of the program), linking each method call to
its corresponding definition. Because each graph is built from the previous one, its corresponding definition. Because each graph is built from the previous one,
new constructs can be added to the CFG, without the need to alter the new constructs can be added to the CFG, without the need to alter the
@ -33,134 +36,103 @@ edges, each connected to the first statement that should be executed, according
to the result of evaluating the conditional expression in the guard of the to the result of evaluating the conditional expression in the guard of the
predicate. predicate.
\begin{definition}[Control Flow Graph~\cite{???}] \begin{definition}[Control Flow Graph \carlos{add original citation}]
A \emph{control flow graph} $G$ of a program $P$ is a tuple $\langle N, E \rangle$, where $N$ is a set of nodes, composed of a method's statements and two special nodes, ``Start'' and ``End''. $E$ is a set of edges of the form $e = \left(n_1, n_2\right)$ a directed edge from $n_1$ to $n_2$ A \emph{control flow graph} $G$ of a program $P$ is a directed graph, represented as a tuple $\langle N, E \rangle$, where $N$ is a set of nodes, composed of a method's statements plus two special nodes, ``Start'' and ``End''; and $E$ is a set of edges of the form $e = \left(n_1, n_2\right) | n_1, n_2 \in N$. Most algorithms to generate the SDG mandate the ``Start'' node to be the only source and ``End'' to be the only sink in the graph. \carlos{Is it necessary to define source and sink in the context of a graph?}.
\josep{You have only given the syntactic part. I would add that there is an edge from node $n_1$ to node $n_2$ if and only if, in some execution, $n_2$ is executed immediately after $n_1$} Edges are created according to the possible execution paths that exist; each statement is connected to any statement that may immediately follow it. Formally, an edge $e = (n_1, n_2)$ exists if and only if there exists an execution of the program where $n_2$ is executed immediately after $n_1$. In general, expressions are not evaluated; so an \texttt{if} instruction has two outgoing edges even if the condition is always true or false, e.g. \texttt{1 == 0}.
\end{definition} \end{definition}
To build the PDG and then the SDG, some dependencies must be extracted from the CFG, which are defined as follows: To build the PDG and then the SDG, there are two dependencies based directly on the CFG's structure: data and control dependence.
\begin{definition}[Postdominance] \begin{definition}[Postdominance \carlos{add original citation?}]
Vertex $b$ \textit{postdominates} vertex \added{$a$}\deleted{$b$} if and only if $a \neq b$ and $b$ is on every path from $a$ to the ``End'' vertex. Vertex $b$ \textit{postdominates} vertex $a$ if and only if $b$ is on every path from $a$ to the ``End'' vertex.
\end{definition} \end{definition}
\begin{definition}[Control dependency] \begin{definition}[Control dependency \carlos{add original citation}]
\label{def:ctrl-dep} \label{def:ctrl-dep}
Vertex $b$ is \textit{control dependent} on vertex $a$ ($a \ctrldep b$) if and only if $b$ postdominates one but not all of $a$'s successors. It follows that a vertex with only one successor cannot be the source of control dependence. Vertex $b$ is \textit{control dependent} on vertex $a$ ($a \ctrldep b$) if and only if $b$ postdominates one but not all of $a$'s successors. It follows that a vertex with only one successor cannot be the source of control dependence.
\end{definition} \end{definition}
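The two definitions above can be checked mechanically. The following is a minimal sketch, not part of the original proposal, that computes postdominators with a standard iterative fixed point and then applies the control dependency definition literally, reading ``one but not all'' as ``at least one but not all''. The CFG is the one of procedure \texttt{f} from the running example; the function names and the dictionary encoding of the graph are assumptions made for illustration.

```python
# CFG of procedure f from the running example, as successor lists.
CFG = {
    "Start": ["while (x > y)"],
    "while (x > y)": ["x = x - 1", "print(x)"],
    "x = x - 1": ["while (x > y)"],
    "print(x)": ["End"],
    "End": [],
}


def postdominators(succ, end):
    """Map each node to the set of nodes that lie on every path from it to `end`.

    Assumes every node other than `end` has at least one successor.
    """
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}  # start from the largest candidate sets
    pdom[end] = {end}
    changed = True
    while changed:
        changed = False
        for n in nodes - {end}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependences(succ, end):
    """Return pairs (a, b) such that b is control dependent on a."""
    pdom = postdominators(succ, end)
    deps = set()
    for a, successors in succ.items():
        if len(successors) < 2:
            continue  # a single successor cannot be a source of control dependence
        for b in succ:
            hits = [b in pdom[s] for s in successors]
            if any(hits) and not all(hits):
                deps.add((a, b))
    return deps


CONTROL = control_dependences(CFG, "End")
```

Under these assumptions, the only control dependencies found are from the loop condition to the loop body and from the loop condition to itself; the latter is the kind of self-edge that is usually omitted when drawing the graph.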
\josep{Where did you get this definition? I think it is incorrect. For example, given $if ~a ~then ~b$, $b$ does not postdominate any successor of $a$} \begin{definition}[Data dependency \carlos{add original citation}]
Vertex $b$ is \textit{data dependent} on vertex $a$ ($a \datadep b$) if and only if $a$ may define a variable $x$, $b$ may use $x$ and there exists a \carlos{could it be ``an''??} $x$-definition free path from $a$ to $b$.
\begin{definition}[Data dependency] Data dependency was originally defined as flow dependency, and split into loop and non--loop related dependencies, but that distinction is no longer useful to compute program slices.
Vertex $b$ is \textit{data dependent} on vertex $a$ ($a \datadep b$) if and only if $a$ may define a variable $x$, $b$ may use $x$ and there \added{exists} an $x$-definition free path from $a$ to $b$.\footnote{The initial definition of data dependency was further split into in-loop data dependencies and the rest, but the difference is not relevant for computing the slices in the SDG.} It should be noted that variable definitions and uses can be computed for each statement independently, analyzing the procedures called by it if necessary. The variables used and defined by a procedure call are those used and defined by its body.
\end{definition} \end{definition}
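As an illustration of the data dependency definition, the sketch below searches, for each definition of a variable, the nodes reachable through definition-free paths that use that variable. It is a simplified sketch under stated assumptions: the sets of defined and used variables are written by hand for procedure \texttt{f} of the running example, the ``Start'' node is assumed to define the parameters, and the function name is hypothetical.

```python
# CFG of procedure f from the running example, as successor lists.
CFG = {
    "Start": ["while (x > y)"],
    "while (x > y)": ["x = x - 1", "print(x)"],
    "x = x - 1": ["while (x > y)"],
    "print(x)": ["End"],
    "End": [],
}
# ASSUMPTION: "Start" is treated as the definition of the parameters x and y.
DEFS = {"Start": {"x", "y"}, "x = x - 1": {"x"}}
USES = {"while (x > y)": {"x", "y"}, "x = x - 1": {"x"}, "print(x)": {"x"}}


def data_dependences(succ, defs, uses):
    """Return pairs (a, b) such that b is data dependent on a."""
    deps = set()
    for a in succ:
        for var in defs.get(a, ()):
            stack, seen = list(succ[a]), set()
            while stack:
                n = stack.pop()
                if n in seen:
                    continue
                seen.add(n)
                if var in uses.get(n, ()):
                    deps.add((a, n))
                # A redefinition of var ends every definition-free path through n.
                if var not in defs.get(n, ()):
                    stack.extend(succ[n])
    return deps


DATA = data_dependences(CFG, DEFS, USES)
```

The result contains the self-dependence of \texttt{x = x - 1}, which arises because the statement is reached again through the loop before \texttt{x} is redefined.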
It should be noted that variable definitions and uses can be computed for each With the data and control dependencies, the PDG may be built by replacing the
statement independently, analyzing the procedures called by it if necessary. In
general, any instruction uses all variables that appear in it, save for the
left-hand side of assignments. Similarly, no instruction defines variables,
except those in the left-hand side of assignments. The variables used and
defined by a procedure call are those used and defined by its body.
With the data and control dependencies, the PDG may be built\deleted{,} by replacing the
edges from the CFG by data and control dependence edges. The first tends to be edges from the CFG by data and control dependence edges. The first tends to be
represented as a thin solid line, and the latter as a thick solid line. In the represented as a thin solid line, and the latter as a thick solid line. In the
examples, data dependencies will be thin solid red lines. examples, data dependencies will be thin solid red lines.
The organization of the vertices of the PDG tends to resemble a tree graph, with \begin{definition}[Program dependence graph]
the ``Start'' node in the position of the root (at the top), and the ``End'' The \textsl{program dependence graph} (PDG) is a directed graph (and originally a tree) represented by three elements: a set of nodes $N$, a set of control edges $E_c$ and a set of data edges $E_d$.
node typically omitted. The control dependence edges structure the tree
vertically. In the case that a vertex is control dependent on multiple vertices, The set of nodes corresponds to the set of nodes of the CFG, excluding the ``End'' node.
it will be placed one level below the lowest source of control dependency. With
a programming language this simple, cyclical control dependencies do not appear, Both sets of edges are built as follows. There is a control edge between two nodes $n_1$ and $n_2$ if and only if $n_1 \ctrldep n_2$, and a data edge between $n_1$ and $n_2$ if and only if $n_1 \datadep n_2$. Additionally, if a node $n$ does not have any incoming control edges, it has a ``default'' control edge $e = (\textnormal{Start},n)$; so that ``Start'' is the only source node of the graph.
but should they do so in further sections, the instructions are sorted top to
bottom in the order they appear in the program. Horizontally, the vertices are Note: the most common graphical representation is a tree--like structure based on the control edges, and nodes sorted left to right according to their position on the original program. Data edges do not affect the structure, so that the graph is easily readable.
sorted by their order in the program, left to right, in order to make the graph \end{definition}
more readable. Data dependency edges are placed without reordering the nodes of
the graph. In the examples given, edges like $a \datadep a$ or $b \ctrldep b$
may be omitted, as they are not relevant for later use of the graph. Please be
noted that the location of the vertices is irrelevant for the slicing algorithm,
and the aforementioned sorting rules are just for consistency with previous
papers on the topic and to ease the visualization of programs.
Finally, the SDG is built from the combination of all the PDGs that compose the Finally, the SDG is built from the combination of all the PDGs that compose the
program. Each call vertex is connected to the ``Start'' of the corresponding program.
procedure. All edges that connect PDGs are represented with dashed lines.
\begin{figure} \begin{definition}[System dependence graph]
\begin{minipage}{0.3\linewidth} The \textsl{system dependence graph} (SDG) is a directed graph that represents the control and data dependencies of a whole program. It has three kinds of edges: control, data and function call. The graph is built combining multiple PDGs, with the ``Start'' nodes labeled after the function they begin. There exists one function call edge between each node containing one or more calls and the ``Start'' node of each method called. In a programming language where the function call is ambiguous (e.g. with pointers or polymorphism), there exists one edge leading to every possible function called.
\begin{lstlisting} \end{definition}
proc main() {
a = 10; \begin{example}[Creation of a SDG from a simple program]
b = 20; Given the program shown below (left), the control flow graphs for both methods are shown on the right: \\
f(a, b); \begin{minipage}{0.2\linewidth}
print(a); \begin{lstlisting}
proc main() {
a = 10;
b = 20;
f(a, b);
}
proc f(x, y) {
while (x > y) {
x = x - 1;
} }
print(x);
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.79\linewidth}
\includegraphics[width=0.6\linewidth]{img/cfgsimple}
\end{minipage}
proc f(x, y) { Then, control and data dependencies are computed, arranging the nodes in the PDG. Finally, the two graphs are connected with function call edges to create the SDG:
while (x > y) {
x = x - 1;
}
print(x);
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.6\linewidth}
\includegraphics[width=0.3\linewidth]{img/cfgsimple}
\includegraphics[width=0.65\linewidth]{img/cfgsimple2}
\end{minipage}
\includegraphics[width=0.5\linewidth]{img/pdgsimple}
\includegraphics[width=0.49\linewidth]{img/pdgsimple2}
\includegraphics[width=0.6\linewidth]{img/sdgsimple}
\includegraphics[width=0.4\linewidth]{img/legendsimple}
\caption{A simple program with its CFGs (top right), PDGs (center) and SDG (bottom).}
\label{fig:sdg-loop}
\end{figure}
\subsubsection{Procedures and data dependencies} \begin{center}
\includegraphics[width=0.8\linewidth]{img/sdgsimple}
\end{center}
\end{example}
The only thing left to explain before introducing more constructs into the \subsubsection{Function calls and data dependencies}
language is the passing of parameters. Most programming language\added{s} accept \added{an arbitrary}\deleted{a
variable} number of input parameters and one output parameter. In the case of
input parameters passed by reference, or constructs such as structs or classes,
modifying a field of a parameter may modify the original variable. In order to
\added{properly} deal with \deleted{everything related to} parameter passing, including global variables,
class fields, etc. there is a small extension to be made to the CFG and PDG \added{\cite{pendinmg}}.
In the CFG, the ``Start'' and ``End'' nodes contain a list of assignments, \carlos{Vocabulary: when is appropriate the use of method, function and procedure????}
inputting and outputting respectively the appropriate values, as can be seen in
the example \josep{which example? If there is an example, give it an identifier and reference it here}. Consequently, every vertex that contains a procedure or function
call packs and unpacks the arguments. For every variable $x$ that is used in a
procedure, every call to it must be preceded by $x_{in} = x$, and the
procedure's ``Start'' vertex must contain $x = x_{in}$. The opposite happens
when a variable must be ``outputted''\carlos{replace}: before the ``End'' node,
the value must be packed ($x_{out} = x$), and after each call, the value must be
assigned to the corresponding variable ($x = x_{out}$). Parameters may be
assigned as $par^i_{in} = expr_i$ (where $i$ is the index of the parameter in
the procedure definition, $par^i$ is the name of the parameter and $expr_i$ is
the expression in the $i^{th}$ position in the procedure call) in the call
vertex, and parameters whose modifications inside the procedure are passed back
to the calling procedure must be extracted as $var = par^i_{out}$ (where $var$
is the name of the variable ---passed by reference--- in the calling
procedure).\carlos{What if object/struct passed by value?} \josep{You have not discussed this. If it is passed by value, the $par_{in}$ and $par_{out}$ are not needed (but they can be left in anyway)} As an addition, in
the SDG, an extra edge is added (summary edge), which represents the
dependencies that the input variables have on the outputs. This allows the
algorithm to know the dependencies without traversing the corresponding
function.
All these additions are added as extra lines\josep{lines?} in the ``Start'', ``End'' and In the original definition of the SDG, there was special handling of data dependencies when calling functions, as it was considered that parameters were passed by value, and global variables did not exist. \carlos{Name and cite paper that introduced it} solves this issue by splitting function calls and function into multiple nodes. This proposal solved everything related to parameter passing: by value, by reference, complex variables such as structs or objects and return values.
calling vertices. When building the PDG, all additions (variable assignments)
are split into their own vertices, and are control dependent on them. Data To such end, the following modifications are made to the different graphs:
dependencies no longer flow through the call vertex, but through the appropriate
child, which minimizes the size of the slice produced. As an example, \begin{description}
\added{Figure}\deleted{figure}~\ref{fig:sdg-loop} shows the three stages of a program, from CFG to SDG. \item[CFG.] In each CFG, global variables read or modified and parameters are added to the label of the ``Start'' node in assignments of the form $par = par_{in}$ for each parameter and $x = x_{in}$ for global variables. Similarly, global variables and parameters modified are added to the label of the ``End'' node as $x_{out} = x$. The parameters are only passed back if the value set by the called method can be read by the caller. Finally, in method calls the same values must be packed and unpacked: each statement containing a function call is relabeled to contain input (of the form $par_{in} = \textnormal{exp}$ for parameters or $x_{in} = x$ for global variables) and output (always of the form $x = x_{out}$).
The construction of the CFG is straight-forward, save for the packing and \item[PDG.] Each node modified in the CFG is split into multiple nodes: the original label is the main node and each assignment is represented as a new node, which is control--dependent on the main one. Visually, input is placed on the left and output on the right; with parameters sorted accordingly.
unpacking of variables in the start, end and call vertices. In the PDG, the \item[SDG.] Three kinds of edges are introduced: parameter input (param--in), parameter output (param--out) and summary edges. Parameter input edges are placed between each method call's input node and the corresponding method definition input node. Parameter output edges are placed between each method definition's output node and the corresponding method call output node. Summary edges are placed between the input and output nodes of a method call, according to the dependencies inside the method definition: if there is a path from an input node to an output node, that shows a dependence, and a summary edge is placed between the corresponding two nodes in every call to that method.
statements are split, control and data dependencies replace the control flow
edges. Finally, both PDGs are linked via call and parameter (input and output) Note: parameter input and output edges are separated because the traversal algorithm traverses them only sometimes (the output edges are excluded in the first pass and the input edges in the second).
edges, forming the SDG. Summary edges are placed according to the data and \end{description}
control flow of the method call, and the graph is complete.
\begin{example}[Variable packing and unpacking]
Consider a function $f(x, y)$ with two integer parameters, and a call $f(a + b, c)$, with parameters passed by reference if possible. The label of the method call node in the CFG would be ``\texttt{x\_in = a + b, y\_in = c, f(a + b, c), c = y\_out}''; method $f$ would have \texttt{x = x\_in, y = y\_in} in the ``Start'' node and \texttt{y\_out = y} in the ``End'' node. The relevant section of the SDG would be:
\begin{center}
\includegraphics[width=0.5\linewidth]{img/parameter-passing}
\end{center}
\end{example}
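The labels of the example above can be generated mechanically. The following is a minimal sketch under stated assumptions: arguments and parameters are given as strings, an argument can receive a value back only when it is a plain variable name, and the helper names \texttt{pack\_call} and \texttt{pack\_definition} are hypothetical.

```python
def pack_call(params, args):
    """Build the input and output assignments attached to a call node."""
    inputs = [f"{p}_in = {a}" for p, a in zip(params, args)]
    # ASSUMPTION: only arguments that are plain variables can be written back.
    outputs = [f"{a} = {p}_out" for p, a in zip(params, args) if a.isidentifier()]
    return inputs, outputs


def pack_definition(params, passed_back):
    """Build the assignments attached to the ``Start'' and ``End'' nodes."""
    start = [f"{p} = {p}_in" for p in params]
    end = [f"{p}_out = {p}" for p in params if p in passed_back]
    return start, end


CALL_IN, CALL_OUT = pack_call(["x", "y"], ["a + b", "c"])
START, END = pack_definition(["x", "y"], passed_back={"y"})
```

For the call $f(a + b, c)$ this yields the two input assignments and the single output assignment \texttt{c = y\_out}, because \texttt{a + b} is an expression and cannot receive a value back.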
\section{Unconditional control flow} \section{Unconditional control flow}


@ -7,14 +7,14 @@
\section{Motivation} \section{Motivation}
\label{sec:motivation} \label{sec:motivation}
Program slicing~\cite{Wei81} is a debugging technique \deleted{which}\added{that}, given a line of Program slicing~\cite{Wei81} is a debugging technique that, given a line of
code and a \added{set of} variable\added{s} of a program, simplifies such program so that the only parts code and a set of variables of a program, simplifies such program so that the only parts
left of it are those that affect \added{or are affected by} the value\added{s} of the selected variable\added{s}. left of it are those that affect or are affected by the values of the selected variables.
\begin{example}[Program slicing in a simple method] \begin{example}[Program slicing in a simple method]
If the following program is sliced on \added{(line 5, variable \texttt{x})} \deleted{line 5 (variable \texttt{x})}, the If the left program is sliced on (line 5, variable \texttt{x}), the
result would be the program of\josep{at?} the right, with the \texttt{if} block result would be the program on the right, with the \texttt{if} block
skipped, as it \added{does not}\deleted{doesn't} affect the value of \texttt{x}. removed, as it does not affect the value of \texttt{x}.
\label{exa:program-slicing} \label{exa:program-slicing}
\begin{center} \begin{center}
\begin{minipage}{0.49\linewidth} \begin{minipage}{0.49\linewidth}
@ -40,49 +40,49 @@ void f(int x) {
\end{center} \end{center}
\end{example} \end{example}
Slices are \deleted{an} executable program\added{s} whose execution \deleted{will} produce\added{s} the same values Slices are executable programs whose execution produces the same values
for the specified line and variable as the original program, and \added{they} are used to for the specified line and variable as the original program, and they are used to
facilitate debugging of large and complex programs, where the data flow may not facilitate debugging of large and complex programs, where the control and data flow may not
be easily understandable. be easily understandable.
Though it may seem a really powerful technique, the whole Java language is not Though it may seem a really powerful technique, the whole Java language is not
completely covered by it, and that makes it difficult to apply in practical completely covered by it, and that makes it difficult to apply in practical
settings. An area that has been investigated, yet \added{does not}\deleted{doesn't} have a definitive settings. An area that has been investigated, yet does not have a definitive
solution, is exception handling. Example~\ref{exa:program-slicing2} solution, is exception handling. Example~\ref{exa:program-slicing2}
demonstrates how, even using the latest developments in program demonstrates how, even using the latest developments in program
slicing~\cite{Allen03}, the sliced version \added{does not}\deleted{doesn't} include the catch block, and slicing~\cite{AllH03}, the sliced version does not include the catch block, and
therefore \added{does not}\deleted{doesn't} produce a correct slice. therefore does not produce a correct slice.
\begin{example}[Program slicing with examples] \begin{example}[Program slicing with exceptions]
If the following program is sliced \josep{Here you might get away with not saying which algorithm you use (Horwitz's, with its citation), but in the paper you will not. Put it in now; you do not need to explain it yet, but that way you are precise.} \added{with respect to}\deleted{in} \added{(}line 17, variable \texttt{x}\added{)}, the If the following program is sliced using Allen and Horwitz's proposal~\cite{AllH03} with respect to (line 17, variable \texttt{a}), the
slice is incomplete, as it lacks the \texttt{catch} block from lines 4-6. slice is incomplete, as it lacks the \texttt{catch} block from lines 4-6.
\label{exa:program-slicing2} \label{exa:program-slicing2}
\begin{center} \begin{center}
\begin{minipage}{0.49\linewidth} \begin{minipage}{0.49\linewidth}
\begin{lstlisting}[stepnumber=1] \begin{lstlisting}[stepnumber=1]
void f(int x) { void f(int x) throws Exception {
try { try {
g(x); g(x);
} catch (RuntimeException e) { } catch (Exception e) {
System.err.println("Error"); System.err.println("Error");
} }
System.out.println("g() was ok"); System.out.println("g() was ok");
g(x); g(x + 1);
} }
void g(int x) { void g(int a) throws Exception {
if (x < 0) { if (a == 0) {
throw new RuntimeException(); throw new Exception();
} }
System.out.println(x); System.out.println(a);
} }
\end{lstlisting} \end{lstlisting}
\end{minipage} \end{minipage}
\begin{minipage}{0.49\linewidth} \begin{minipage}{0.49\linewidth}
\begin{lstlisting}[stepnumber=1] \begin{lstlisting}[stepnumber=1]
void f(int x) { void f(int x) throws Exception {
try { try {
g(x); g(x);
} }
@ -91,64 +91,40 @@ void f(int x) {
g(x); g(x + 1);
} }
void g(int x) { void g(int a) throws Exception {
if (x < 0) { if (a == 0) {
throw new RuntimeException(); throw new Exception();
} }
System.out.println(x); System.out.println(a);
} }
\end{lstlisting} \end{lstlisting}
\end{minipage} \end{minipage}
\end{center} \end{center}
When the program is executed as \texttt{f(0)}, the execution log would be: \texttt{1, 2, 3, 13, 14, 15, 4, 5, 8, 10, 13, 14, 17}. In the only execution of line \texttt{17}, variable \texttt{a} has value 1 in that line. However, in the slice produced, the execution log is \texttt{1, 2, 3, 13, 14, 15}. The exception thrown in \texttt{g()} is not caught in \texttt{f()}, so it returns with an exception and line \texttt{17} never executes.
The problem in this example is that the \texttt{catch} block in line \texttt{4} is not included, because ---according to the dependency graph shown below--- it does not influence the execution of line \texttt{17}. Two kinds of dependencies among statements are considered: data dependence (a variable is read that may have gotten its value from a given statement) and control dependence (the instruction controls whether another executes).
In the graph, the slicing criterion is marked in bold, the nodes that represent the slice are filled in grey, and dependencies are displayed as edges, with control dependencies in black and data dependencies in red. Nodes with a dashed outline represent elements that are not statements of the program.
\begin{center}
\includegraphics[width=\linewidth]{img/motivation-example-pdg}
\end{center}
\end{example} \end{example}
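The omission of the \texttt{catch} block can be reproduced with a plain backward traversal of the dependency graph of the example. The sketch below is a simplification, not the two-pass SDG traversal: it follows every edge backwards from the slicing criterion. The intraprocedural edges are transcribed from the graph of the example; the call and parameter input edges between \texttt{f()} and \texttt{g()} are an assumption added for illustration, and the node and function names are hypothetical.

```python
# Control and data edges of the example graph (source -> targets).
# Node names follow the line numbers of the program, e.g. l17 is line 17.
EDGES = {
    "enter_g": ["a_in", "l14"],
    "l14": ["l15", "l17", "gne"],
    "l15": ["gee", "l17"],
    "a_in": ["l14", "l17"],
    "enter_f": ["x_in", "l10", "try"],
    "try": ["l3"],
    "l3": ["nr3", "l4", "l3_in"],
    "nr3": ["l8"],
    "l4": ["l5"],
    "l10": ["nr10", "fee", "l10_in"],
    "x_in": ["l3_in", "l10_in"],
}
# ASSUMPTION: call edges and parameter input edges linking both calls to g().
INTERPROCEDURAL = {
    "l3": ["enter_g"],
    "l10": ["enter_g"],
    "l3_in": ["a_in"],
    "l10_in": ["a_in"],
}


def backward_slice(criterion, *edge_maps):
    """Return every node from which `criterion` is reachable."""
    preds = {}
    for edges in edge_maps:
        for src, targets in edges.items():
            for dst in targets:
                preds.setdefault(dst, set()).add(src)
    visited, worklist = set(), [criterion]
    while worklist:
        node = worklist.pop()
        if node not in visited:
            visited.add(node)
            worklist.extend(preds.get(node, ()))
    return visited


SLICE = backward_slice("l17", EDGES, INTERPROCEDURAL)
```

Under these assumptions, neither the \texttt{catch} node (\texttt{l4}) nor its body (\texttt{l5}) is reached: nothing in the slice depends on them, which is precisely why the slice terminates with an uncaught exception.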
Example~\ref{exa:program-slicing2} showcases an important error in the current slicing procedure for programs that handle errors with exceptions; because the \texttt{catch} block is disregarded. The only way a \texttt{catch} block can be included in the slice is if a statement inside it is needed for another reason. However, Allen and Horwitz did not encounter this problem in their paper~\cite{AllH03}, as the values outputted by method calls are extracted after the \texttt{normal return} and each \texttt{catch}, and in a typical method call with output, the \texttt{catch} is included by default when the outputted value is used. This detail makes the error much smaller, as most \texttt{try-catch} structures are run to obtain a value.
\josep{Explain the example better. Be generous with details: the execution, the difference, even the dependency graph if needed... The motivating example is the most important part of a paper. :-) It determines whether they will keep reading you or not.} A notable case where a method that may throw an exception is run and no value is recovered (at least from the point of view of program slicing) is when writing to the filesystem or making connections to servers, such as a database or a webservice to store information. In this case, if no confirmation is outputted signaling whether the storage of information was correct, the \texttt{catch} block will be omitted, and the slicer software will produce an incorrect result.
\added{If we consider the initial call {\tt f(-1)}, then the execution history of the initial program is:
{\tt 1,2,3,13,14,15,4,5,6,7,8,9,10,13,14,15} (line 17 is never executed and {\tt f} returns with an exception).
In contrast, the execution of the slice is:
{\tt 1,2,3,13,14,15} (line 17 is never executed and {\tt f} returns with an exception).}
\josep{If I am not mistaken about the above, this is a bad example, because the slicing criterion is not executed in either of the two programs (so they are equivalent with respect to that point, given the information you provide)}
\josep{The following sounds odd to me (the English). In fact, there is not a single occurrence of it on Google. Please rewrite it. It is hard to understand.}As big a problem as this one is, it \added{does not}\deleted{doesn't} occur in all cases, because of how
\texttt{catch} blocks are generally treated when slicing. Generally, two kinds
of dependencies among statements are analyzed: control (on the result of this
line may depend whether another one executes or not) and data (on the result of
this line, the inputs for another one may change).
The problem described \added{does not}\deleted{doesn't} occur when \deleted{there
exist outgoing data dependencies inside the \texttt{try} block}\deleted{the inside the \texttt{try} block there
exist outgoing data dependencies}, but it does when there \added{are not}\deleted{aren't}, creating
problems for structures with side effects such as a write action to a file or
database, or a network request whose result \added{is not}\deleted{isn't} used outside the \texttt{try}.
As most slicing tools ignore side effects and focus exclusively on the code and
some \texttt{catch} blocks are erroneously removed, which leads to incomplete
slices, which end with an error that is normally caught.
\section{Contributions} \section{Contributions}
The main contribution of this paper is a complete \added{technique}\deleted{solution} for program slicing The main contribution of this paper is a complete technique for slicing programs in the presence of exception handling constructs for Java. This technique extends the previous technique by Allen et al. \cite{AllH03}. It considers all cases considered in that work, but it also provides a solution to cases not considered by them.
in the presence of exception handling constructs for Java\added{. This technique extends the previous technique by Horwitz et al. \cite{pending}. It considers all cases considered in that work, but it also provides a solution to cases not considered by them.}
\added{For the sake of completeness and in order to understand the process that led us to this solution, firstly,} we For the sake of completeness and in order to understand the process that led us to this solution, we will present a brief history of program slicing, specifically those changes that have affected exception handling. Furthermore, we provide a summary of the
will \added{briefly} present a history of program slicing, specifically those changes that
have affected exception handling. Furthermore, we provide a summary of the
different contributions each author has made to the field. different contributions each author has made to the field.
The rest of the paper is structured as follows: chapter~\ref{cha:background} The rest of the paper is structured as follows: chapter~\ref{cha:background} summarizes the theoretical background required in program slicing and exception handling, chapter~\ref{cha:incremental} will analyze each structure used in exception handling, explore the already available solution and propose a new technique that subsumes all of the existing solutions and provides correct slices for each case.
summarizes the theoretical background required, \josep{and chapter 3?} chapter~\ref{cha:state-art} Chapter~\ref{cha:state-art} provides a bird's eye view of the current state of the art, chapter~\ref{cha:solution} provides a summarized description of the new algorithm with all the changes proposed in chapter~\ref{cha:incremental}, and finally, chapter~\ref{cha:conclusion} summarizes the paper and explores future avenues of work.
provides a bird's eye view of the current state of the art,
chapter~\ref{cha:solution} provides a step by step description of the problems
found with the state of the art and the solutions proposed, and
chapter~\ref{cha:conclusion} summarizes the paper and provides avenues of future
work.
% vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap % vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap


@ -1,6 +1,13 @@
digraph g { digraph g {
Start [shape=box]; subgraph a {
End [shape=box]; Start [shape=box];
f [label=<x_in = a<br/>y_in = b<br/>f (a, b)<br/>b = x_out>] End [shape=box];
Start -> "a = 10" -> "b = 20" -> f -> "print(a)" -> End; f [label=<f (a, b)>]
Start -> "a = 10" -> "b = 20" -> f -> End;
}
subgraph b {
s [shape=box,label=<Start>];
End2 [shape=box,label=<End>];
s -> "while (x > y)" -> "x = x - 1" -> "while (x > y)" -> "print(x)" -> End2;
}
} }

Binary file not shown.


@ -0,0 +1,58 @@
digraph g {
// nodes g()
subgraph cluster_g {
enter_g [label=<entry<br/>g()>,shape=rect,style=filled];
a_in [label="a = a_in",style="dashed,filled"];
l14 [label="if (a == 0)",style=filled]
l15 [label="throw new Exception()",style=filled];
l17 [label="System.out.println(a)",style="filled,bold"];
gee [label="error exit",style="dashed"];
gne [label="normal exit",style="dashed"];
}
// nodes f()
subgraph cluster_f {
enter_f [label=<entry<br/>f()>,shape=rect,style=filled];
fee [label="error exit",style="dashed"]
x_in [label="x = x_in",style="dashed,filled"];
l3 [label="g(x)",style=filled];
l3_in [label="a_in = x",style="dashed,filled"];
nr3 [label="normal return",style="dashed"];
nr10 [label="normal return",style="dashed"];
l4 [label="catch (Exception e)"];
l5 [label="System.err.println(\"Error\")"];
l8 [label="System.out.println(\"g() was ok\")"];
l10 [label="g(x + 1)",style=filled];
l10_in [label="a_in = x + 1",style="dashed,filled"];
try [style=filled];
//{rank=same; l3_in nr3}
//{rank=same; l10_in nr10 fee}
//{rank=same; x_in try}
}
// control g()
enter_g -> a_in;
enter_g -> l14 -> l15 -> gee;
{l14 l15} -> l17;
l14 -> gne;
// control f()
enter_f -> {x_in l10};
enter_f -> try -> l3 -> {nr3; l4};
nr3 -> l8;
l4 -> l5;
l10 -> {nr10; fee};
l3 -> l3_in;
l10 -> l10_in;
{ // data
edge [color=red,constraint=false];
a_in -> l14 [constraint=true];
a_in -> l17;
x_in -> {l3_in l10_in};
}
{ // order
edge [style=invis];
//a_in -> gne -> gee;
//x_in -> try;
//l3_in -> nr3 -> l4;
//l10_in -> nr10 -> fee;
}
}

Binary file not shown.

26
img/parameter-passing.dot Normal file

@ -0,0 +1,26 @@
digraph G {
// p [label=<x_in = a + b<br/>y_in = c<br/>f()<br/>c = y_out>,shape=rect];
f_call [label="f()"]
x_in [label="x_in = a + b"]
y_in [label="y_in = c"]
y_out [label="c = y_out"]
f_call -> {x_in y_in y_out};
f_start [label="enter f"];
fx_in [label="x = x_in"];
fy_in [label="y = y_in"];
fy_out [label="y_out = y"];
f_start -> {fx_in fy_in fy_out};
f_call -> f_start [style=bold];
y_in -> f_start [style=invis];
x_in -> fx_in [style=dashed];
y_in -> fy_in [style=dashed];
fy_out -> y_out [constraint=false,style=dashed];
invis [height=0.001,width=0.001,style=invis];
invis2 [height=0.001,width=0.001,style=invis];
{rank=same; x_in y_in y_out invis};
{rank=same; fx_in fy_in invis2 fy_out};
{edge [style=invis];
x_in -> y_in -> invis -> y_out;
fx_in -> fy_in -> invis2 -> fy_out;
}
}

img/parameter-passing.pdf Normal file

Binary file not shown.


@ -1,50 +1,41 @@
digraph g { digraph g {
subgraph { subgraph cluster_a {
l1; l2; l3; l4; l5; Start [shape=box,label="Start main()"];
"x_in = a"; "y_in = b"; "a = x_out"; l2 [label="a = 10"];
l3 [label="b = 20"];
l4 [label="f(a, b)"];
// Rank
{ rank = same; l2; l3; l4; }
{ rank = min; Start; }
// Control
{ edge [style = bold];
Start -> { l2 l3 l4 };
}
// Data
{ edge [color = red];
{l2 l3} -> l4;
}
// Order
{ edge [style = invis];
l2 -> l3 -> l4;
}
} }
subgraph {
l8; l9; l10; l12; subgraph cluster_b {
"x = x_in"; "y = y_in"; "x_out = x"; StartF [shape=box,label="Start f()"];
l8 [label="while (x > y)"];
l9 [label="x = x + 1"];
l11 [label="print(x)"];
{rank=max; l9}
{rank=same; l8 l11}
{rank=min; StartF}
StartF -> {l8 l11}
l8 -> l9;
{ edge [color = red, constraint = false];
StartF -> {l8 l9 l11}
l9 -> {l8 l9 l11}
}
} }
l1 [label="main()"];
l2 [label="a = 10"]; l4 -> StartF [style=bold,constraint=false];
l3 [label="b = 20"];
l4 [label="f(a, b)"];
l5 [label="print(a)"];
l8 [label="f()"];
l9 [label="while (x > y)"];
l10 [label="x = x + 1"];
l12 [label="print(x)"];
// Rank
{ rank = same; l9; l12; }
// s0 -> s2 [style=invis];
// Control
{
edge [style = bold];
l1 -> {l2 l3 l4 l5};
l4 -> {"x_in = a" "y_in = b" "a = x_out"};
l8 -> {"x = x_in" "y = y_in" l9 l12 "x_out = x"};
l9 -> l10;
}
// Data
{
edge [color = red];
edge [constraint = false];
l2 -> "x_in = a";
l3 -> "y_in = b";
"a = x_out" -> l5;
{"x = x_in" l10} -> {l9 l10 l12 "x_out = x"};
"y = y_in" -> l9;
}
{
edge [style=dashed];
edge [constraint=false];
"x_in = a" -> "x = x_in";
"y_in = b" -> "y = y_in";
l4 -> l8 [constraint=true];
"x_out = x" -> "a = x_out";
}
{edge [color=blue,constraint=false]; {"x_in = a" "y_in = b"} -> "a = x_out"}
{edge [style=invis]; "y_in = b" -> l8; "y = y_in" -> l9; }
} }

Binary file not shown.


@ -4,7 +4,7 @@
\lstset{ \lstset{
% Numbering % Numbering
numbers=left, numbers=left,
stepnumber=2, stepnumber=1,
numberstyle=\tiny, numberstyle=\tiny,
numbersep=5pt, numbersep=5pt,
% Style % Style


@ -91,6 +91,16 @@
\include{Secciones/state_of_the_art} \include{Secciones/state_of_the_art}
\include{Secciones/solution} \include{Secciones/solution}
\chapter{TODO}
\begin{enumerate}
\item Find out whether the additional code pulled in by unconditional jumps can be reduced with some kind of arc (fewer breaks).
Solution: see
\item Find out whether arc 1 is essential (look for a counterexample).
\item Alternative solution to avoid having to choose between arcs 1 and 2. Suggestion: only include the catch via control dependence if both arcs (1, 2) are active.
\item Arc 3: the one that goes
\end{enumerate}
\bibliographystyle{plain} \bibliographystyle{plain}
\bibliography{../../../../../../Biblio/biblio.bib} \bibliography{../../../../../../Biblio/biblio.bib}