added section 3.2 and part of 3.3; changed language to en-GB

Carlos Galindo 2019-12-04 15:47:37 +00:00
parent 59722faa7b
commit 80acb5243e
16 changed files with 174 additions and 235 deletions

@ -1,11 +1,16 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = ../paper.tex % !TEX root = ../paper.tex
\chapter{Background} \chapter{Background}
\label{cha:background} \label{cha:background}
\section{Program slicing} \section{Program slicing}
\carlos{cite Weiser only when discussing the origin of the field} \\
\carlos{for the rest, use surveys (Tip95, Sil12)} \\
\carlos{move the paragraph to the intro; place formal definitions of program slicing here, citing \cite{AgrH90b}}
\textsl{Program slicing} \cite{Wei81,Sil12}\sergio{is there a reason why \cite{Sil12} is not in the intro? The only citation there is \cite{Wei81}. I propose removing \cite{Sil12} for consistency}\josep{rather, 13 should be in the intro as well} is a debugging technique that \textsl{Program slicing} \cite{Wei81,Sil12}\sergio{is there a reason why \cite{Sil12} is not in the intro? The only citation there is \cite{Wei81}. I propose removing \cite{Sil12} for consistency}\josep{rather, 13 should be in the intro as well} is a debugging technique that
answers the question: ``which parts of a program \josep{do?} answers the question: ``which parts of a program \josep{do?}
affect a given statement and affect a given statement and
@ -25,7 +30,7 @@ proposed \cite{Sil12}:
statically or dynamically. statically or dynamically.
\textsl{Static slicing} \cite{Wei81} produces slices which\josep{that} consider all \textsl{Static slicing} \cite{Wei81} produces slices which\josep{that} consider all
possible executions of the program: the slice will be correct regardless of the input supplied. possible executions of the program: the slice will be correct regardless of the input supplied.
In contrast, \textsl{dynamic slicing} \cite{KorL88} considers a single execution of the program, thus, limiting the slice to In contrast, \textsl{dynamic slicing} \cite{KorL88,} considers a single execution of the program, thus, limiting the slice to
the statements present in an execution log. The slicing criterion is the statements present in an execution log. The slicing criterion is
expanded to include a position in the log\josep{execution history} that corresponds to one expanded to include a position in the log\josep{execution history} that corresponds to one
instance of the selected statement, making it much more specific. It may instance of the selected statement, making it much more specific. It may
@ -44,11 +49,10 @@ Since the definition of program slicing\sergio{Since Weiser defined program slic
been \textsl{static backward slicing}, which obtains the list of statements that been \textsl{static backward slicing}, which obtains the list of statements that
affect the value of a variable in a given statement, in all possible executions affect the value of a variable in a given statement, in all possible executions
of the program (i.e., for any input data). of the program (i.e., for any input data).
\begin{definition}[Strong static backward slice \cite{Wei81,HorwitzRB88}] \begin{definition}[Strong static backward slice \cite{Wei81}]
\label{def:strong-slice} \label{def:strong-slice}
\carlos{One of the citations is the correct one.}\josep{Weiser's}
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
$s$ is a statement and $v$ is a set\sergio{aren't sets denoted with capital letters?} of variables in $P$ (the variables may $s$ is a statement and $v$ is a set\sergio{aren't sets denoted with capital letters?} \carlos{no} of variables in $P$ (the variables may
or may not be used in $s$), $S$ is the \textsl{strong slice} of $P$ with or may not be used in $s$), $S$ is the \textsl{strong slice} of $P$ with
respect to $C$ if $S$ has\sergio{fulfils?} the following properties: respect to $C$ if $S$ has\sergio{fulfils?} the following properties:
\begin{enumerate} \begin{enumerate}

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = ../paper.tex % !TEX root = ../paper.tex
\chapter{Main explanation?} \chapter{Main explanation?}
\label{cha:incremental} \label{cha:incremental}
@ -38,6 +38,7 @@ to the result of evaluating the conditional expression in the guard of the
predicate. predicate.
\begin{definition}[Control Flow Graph \carlos{add original citation}] \begin{definition}[Control Flow Graph \carlos{add original citation}]
\label{def:cfg}
A \emph{control flow graph} $G$ of a program $P$ is a directed graph, represented as a tuple $\langle N, E \rangle$, where $N$ is a set of nodes, composed of a method's statements plus two special nodes, ``Start'' and ``End''; and $E$ is a set of edges of the form $e = \left(n_1, n_2\right) | n_1, n_2 \in N$. Most algorithms to generate the SDG mandate the ``Start'' node to be the only source and ``End'' to be the only sink in the graph. \carlos{Is it necessary to define source and sink in the context of a graph?}. A \emph{control flow graph} $G$ of a program $P$ is a directed graph, represented as a tuple $\langle N, E \rangle$, where $N$ is a set of nodes, composed of a method's statements plus two special nodes, ``Start'' and ``End''; and $E$ is a set of edges of the form $e = \left(n_1, n_2\right) | n_1, n_2 \in N$. Most algorithms to generate the SDG mandate the ``Start'' node to be the only source and ``End'' to be the only sink in the graph. \carlos{Is it necessary to define source and sink in the context of a graph?}.
Edges are created according to the possible execution paths that exist; each statement is connected to any statement that may immediately follow it. Formally, an edge $e = (n_1, n_2)$ exists if and only if there exists an execution of the program where $n_2$ is executed immediately after $n_1$. In general, expressions are not evaluated; so an \texttt{if} instruction has two outgoing edges even if the condition is always true or false, e.g. \texttt{1 == 0}. Edges are created according to the possible execution paths that exist; each statement is connected to any statement that may immediately follow it. Formally, an edge $e = (n_1, n_2)$ exists if and only if there exists an execution of the program where $n_2$ is executed immediately after $n_1$. In general, expressions are not evaluated; so an \texttt{if} instruction has two outgoing edges even if the condition is always true or false, e.g. \texttt{1 == 0}.
@ -46,6 +47,7 @@ predicate.
To build the PDG and then the SDG, there are two dependencies based directly on the CFG's structure: data and control dependence. To build the PDG and then the SDG, there are two dependencies based directly on the CFG's structure: data and control dependence.
\begin{definition}[Postdominance \carlos{add original citation?}] \begin{definition}[Postdominance \carlos{add original citation?}]
\label{def:postdominance}
Vertex $b$ \textit{postdominates} vertex $a$ if and only if $b$ is on every path from $a$ to the ``End'' vertex. Vertex $b$ \textit{postdominates} vertex $a$ if and only if $b$ is on every path from $a$ to the ``End'' vertex.
\end{definition} \end{definition}
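As a minimal sketch of the postdominance definition above, the following Java class computes postdominator sets with the usual iterative data--flow scheme. The class name, the encoding of nodes as integers and the diamond--shaped CFG are assumptions made for illustration only, and the sketch assumes that every node can reach the ``End'' node.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class Postdominators {
    /**
     * For every node n, computes the set of nodes that postdominate n.
     * succ.get(n) lists the CFG successors of n; end is the unique sink.
     */
    static List<BitSet> compute(List<List<Integer>> succ, int end) {
        int size = succ.size();
        List<BitSet> pdom = new ArrayList<>();
        for (int n = 0; n < size; n++) {
            BitSet set = new BitSet(size);
            if (n == end) {
                set.set(end);        // only "End" postdominates "End"
            } else {
                set.set(0, size);    // optimistic start: every node
            }
            pdom.add(set);
        }
        boolean changed = true;
        while (changed) {            // iterate until a fixed point is reached
            changed = false;
            for (int n = 0; n < size; n++) {
                if (n == end) continue;
                // pdom(n) = {n} united with the intersection of pdom(s), s in succ(n)
                BitSet next = new BitSet(size);
                next.set(0, size);
                for (int s : succ.get(n)) next.and(pdom.get(s));
                next.set(n);
                if (!next.equals(pdom.get(n))) {
                    pdom.set(n, next);
                    changed = true;
                }
            }
        }
        return pdom;
    }

    /** True if b is on every path from a to the "End" node. */
    static boolean postdominates(List<BitSet> pdom, int b, int a) {
        return pdom.get(a).get(b);
    }

    /** Hypothetical CFG: 0 Start, 1 if, 2 then-branch, 3 else-branch, 4 join, 5 End. */
    static List<List<Integer>> diamond() {
        return List.of(List.of(1), List.of(2, 3), List.of(4),
                       List.of(4), List.of(5), List.<Integer>of());
    }

    public static void main(String[] args) {
        List<BitSet> pdom = compute(diamond(), 5);
        System.out.println(postdominates(pdom, 4, 1)); // true: the join follows both branches
        System.out.println(postdominates(pdom, 2, 1)); // false: the else-branch avoids node 2
    }
}
```

Under this reading of the definition every vertex postdominates itself, since it lies on every path that starts from it.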
@ -55,6 +57,7 @@ To build the PDG and then the SDG, there are two dependencies based directly on
\end{definition} \end{definition}
\begin{definition}[Data dependency \carlos{add original citation}] \begin{definition}[Data dependency \carlos{add original citation}]
\label{def:data-dep}
Vertex $b$ is \textit{data dependent} on vertex $a$ ($a \datadep b$) if and only if $a$ may define a variable $x$, $b$ may use $x$ and there exists a \carlos{could it be ``an''??} $x$-definition free path from $a$ to $b$. Vertex $b$ is \textit{data dependent} on vertex $a$ ($a \datadep b$) if and only if $a$ may define a variable $x$, $b$ may use $x$ and there exists a \carlos{could it be ``an''??} $x$-definition free path from $a$ to $b$.
Data dependency was originally defined as flow dependency, and split into loop and non--loop related dependencies, but that distinction is no longer useful to compute program slices. Data dependency was originally defined as flow dependency, and split into loop and non--loop related dependencies, but that distinction is no longer useful to compute program slices.
@ -67,6 +70,7 @@ represented as a thin solid line, and the latter as a thick solid line. In the
examples, data dependencies will be thin solid red lines. examples, data dependencies will be thin solid red lines.
\begin{definition}[Program dependence graph] \begin{definition}[Program dependence graph]
\label{def:pdg}
The \textsl{program dependence graph} (PDG) is a directed graph (and originally a tree) represented by three elements: a set of nodes $N$, a set of control edges $E_c$ and a set of data edges $E_d$. The \textsl{program dependence graph} (PDG) is a directed graph (and originally a tree) represented by three elements: a set of nodes $N$, a set of control edges $E_c$ and a set of data edges $E_d$.
The set of nodes corresponds to the set of nodes of the CFG, excluding the ``End'' node. The set of nodes corresponds to the set of nodes of the CFG, excluding the ``End'' node.
@ -80,6 +84,7 @@ Finally, the SDG is built from the combination of all the PDGs that compose the
program. program.
\begin{definition}[System dependence graph] \begin{definition}[System dependence graph]
\label{def:sdg}
The \textsl{system dependence graph} (SDG) is a directed graph that represents the control and data dependencies of a whole program. It has three kinds of edges: control, data and function call. The graph is built combining multiple PDGs, with the ``Start'' nodes labeled after the function they begin. There exists one function call edge between each node containing one or more calls and each of the ``Start'' node of the method called. In a programming language where the function call is ambiguous (e.g. with pointers or polymorphism), there exists one edge leading to every possible function called. The \textsl{system dependence graph} (SDG) is a directed graph that represents the control and data dependencies of a whole program. It has three kinds of edges: control, data and function call. The graph is built combining multiple PDGs, with the ``Start'' nodes labeled after the function they begin. There exists one function call edge between each node containing one or more calls and each of the ``Start'' node of the method called. In a programming language where the function call is ambiguous (e.g. with pointers or polymorphism), there exists one edge leading to every possible function called.
\end{definition} \end{definition}
@ -138,41 +143,30 @@ To such end, the following modifications are made to the different graphs:
\section{Unconditional control flow} \section{Unconditional control flow}
Even though the initial definition of the SDG was useful to compute slices, the Even though the initial definition of the SDG was useful to compute slices, the
language covered was not enough for the typical language of the 1980's, which language covered was not enough for the typical language of the 1980s, which
included (in one form or another) unconditional control flow. Therefore, one of included (in one form or another) unconditional control flow. Therefore, one of
the first additions contributed to the algorithm to build system dependence the first additions contributed to the algorithm to build system dependence
graphs was the inclusion of unconditional jumps, such as ``break'', graphs was the inclusion of unconditional jumps, such as ``break'',
``continue'', ``goto'' and ``return'' statements (or any other equivalent). A ``continue'', ``goto'' and ``return'' statements (or any other equivalent). A
naive representation would be to treat them the same as any other statement, but naive representation would be to treat them the same as any other statement, but
with the outgoing edge landing in the corresponding instruction (outside the with the outgoing edge landing in the corresponding instruction (outside the
loop, at the loop condition, at the method's end, etc.). An alternative loop, at the loop condition, at the method's end, etc.).
approach is to represent the instruction as an edge, not a vertex, connecting An alternative approach is to represent the instruction as an edge, not a vertex, connecting the previous statement with the next to be executed. Both of these approaches fail to generate a control dependence from the unconditional jump, as the definition of control dependence (see definition~\ref{def:ctrl-dep}) requires a vertex to have more than one successor in order to be a source of control dependence.
the previous statement with the next to be executed. Both of these approaches From here, there stem two approaches: the first would be to
fail to generate a control dependence from the unconditional jump, as the
definition of control dependence (see Definition~\ref{def:ctrl-dep}) requires a
vertex to have more than one successor for it to be possible to be a source of
control dependence. From here, there stem two approaches: the first would be to
redefine control dependency, in order to reflect the real effect of these redefine control dependency, in order to reflect the real effect of these
instructions ---as some authors~\cite{DanBHHKL11} have tried to do--- and the instructions ---as some authors~\cite{DanBHHKL11} have tried to do--- and the
second would be to alter the creation of the SDG to ``create'' those second would be to alter the creation of the SDG to ``create'' those
dependencies, which is the most widely--used solution \added{\cite{pending1,pending2}}. dependencies, which is the most widely--used solution \cite{BalH93}.
The most popular approach was proposed by Ball and Horwitz\cite{BalH93}, and The most popular approach was proposed by Ball and Horwitz~\cite{BalH93}, classifying instructions into three separate categories:
represents unconditional jumps as a \textsl{pseudo--predicate}. The true edge
would lead to the next instruction to be executed, and the false edge would be \begin{description}
non-executable or \textit{dummy} edges, connected to the instruction that would \item[Statement.] Any instruction that is not a conditional or unconditional jump. It has one outgoing edge in the CFG, to the next instruction that follows it in the program.
be executed were the unconditional jump a \textit{nop}\josep{this sentence is unclear}. The consequence of this \item[Predicate.] Any conditional jump instruction, such as \texttt{while}, \texttt{until}, \texttt{do-while}, \texttt{if}, etc. It has two outgoing edges, labelled \textit{true} and \textit{false}, leading to the corresponding instructions.
solution is that every instruction placed after the unconditional jump is \item[Pseudo--predicate.] Any unconditional jump (e.g. \texttt{break}, \texttt{goto}, \texttt{continue}, \texttt{return}). It is treated like a predicate, except that the outgoing edge labelled \textit{false} is marked as non--executable: no execution of the program can traverse it, so it would not exist under the definition of the CFG (see definition~\ref{def:cfg}). Originally, both edges had a specific rationale: the \textit{true} edge leads to the jump's destination and the \textit{false} one to the instruction that would be executed if the unconditional jump were removed or converted into a \texttt{no op} (a blank operation that does not change the program's state). This rationale holds for unconditional jumps, but not for every pseudo--predicate, as other instructions have since been placed in this category as a means of ``artificially'' \carlos{bad word choice} generating control dependencies.
control dependent on the jump, as can be seen in Figure~\ref{fig:break-graphs}. \end{description}
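The three categories can be sketched in Java as follows. This is an illustrative assumption rather than the construction used by Ball and Horwitz or by any existing slicer: the class and method names are hypothetical, nodes are numbered by hand, and the encoded program is a small loop with a \texttt{break} chosen only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class PseudoPredicateCfg {
    /** A CFG edge; non-executable edges exist only to induce control dependence. */
    static final class Edge {
        final int from;
        final int to;
        final boolean executable;

        Edge(int from, int to, boolean executable) {
            this.from = from;
            this.to = to;
            this.executable = executable;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    /** Statement: one outgoing edge to the instruction that follows it. */
    void statement(int node, int next) {
        edges.add(new Edge(node, next, true));
    }

    /** Predicate: two executable edges, for the true and false outcomes. */
    void predicate(int node, int ifTrue, int ifFalse) {
        edges.add(new Edge(node, ifTrue, true));
        edges.add(new Edge(node, ifFalse, true));
    }

    /**
     * Pseudo-predicate: an executable edge to the jump's destination and a
     * non-executable edge to the instruction that would run if the jump were removed.
     */
    void pseudoPredicate(int node, int destination, int fallThrough) {
        edges.add(new Edge(node, destination, true));
        edges.add(new Edge(node, fallThrough, false));
    }

    /** Successors of a node, optionally following non-executable edges too. */
    List<Integer> successors(int node, boolean includeNonExecutable) {
        List<Integer> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.from == node && (e.executable || includeNonExecutable)) {
                result.add(e.to);
            }
        }
        return result;
    }

    /** Hypothetical program: 1: while (x) { 2: if (y) 3: break; 4: a++; } 5: print(a); 6: End */
    static PseudoPredicateCfg loopWithBreak() {
        PseudoPredicateCfg cfg = new PseudoPredicateCfg();
        cfg.predicate(1, 2, 5);
        cfg.predicate(2, 3, 4);
        cfg.pseudoPredicate(3, 5, 4);
        cfg.statement(4, 1);
        cfg.statement(5, 6);
        return cfg;
    }

    public static void main(String[] args) {
        PseudoPredicateCfg cfg = loopWithBreak();
        System.out.println(cfg.successors(3, false)); // [5]
        System.out.println(cfg.successors(3, true));  // [5, 4]
    }
}
```

With the non--executable edge taken into account, the \texttt{break} on node 3 has two successors, so it can act as a source of control dependence.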
In the example \josep{by ``the example'' do you mean the figure? It is important to distinguish between figures and examples. This text is crying out for an example environment with its own identifier. That example should show and explain the figure, and from here you should cite the example, not the figure.}, when slicing with respect to variable $a$ on line 5, every
statement would be included, save for ``print(a)''. Line 4 is not strictly As a consequence of this classification, every instruction after an unconditional jump $j$ is control--dependent (either directly or indirectly) on $j$ and the structure containing it (a conditional statement or a loop), as can be seen in the following example.
necessary in this example ---in the context of weak slicing---, but is included
nonetheless. In the original paper\josep{which one? cite it in parentheses if it is a reminder}, the transformation is proved to be
complete, but not correct, as for some examples, the slice includes more
unconditional jumps than would be strictly necessary, even for weak slicing. \josep{I would include one of those examples here}
Ball and Horwitz theorize that a more correct approach would be possible, if it
were \added{not}\deleted{n't} for the limitation of slices to be a subset of statements of the
program, in the same order as in the original.
\begin{figure} \begin{figure}
\centering \centering
@ -196,39 +190,73 @@ static void f() {
\label{fig:break-graphs} \label{fig:break-graphs}
\end{figure} \end{figure}
\begin{example}[Control dependencies generated by unconditional instructions]
\label{exa:unconditional}
Figure~\ref{fig:break-graphs} showcases a small program with a \texttt{break} statement, its CFG and PDG with a slice in gray. The slicing criterion (line 5, variable $a$) is control dependent on both the unconditional jump and its surrounding conditional instruction (both on line 4); even though it is not necessary to include it (in the context of weak slicing).
Note: the ``Start'' node $S$ is also categorized as a pseudo--predicate, with the \textit{false} edge connected to the ``End'' node, therefore generating a dependence from $S$ to all the nodes inside the method. This removes the need to handle $S$ as a special case when converting a CFG to a PDG, but weakens the interpretation of non--executable edges as leading to the ``instruction that would be executed if the node were absent or a no--op''.
\end{example}
The original paper~\cite{BalH93} does prove its completeness, but disproves its correctness by providing a counter--example similar to example~\ref{exa:nested-unconditional}. This proof affects both weak and strong slicing, so improvements can be made on this proposal. The authors postulate that a more correct approach would be achievable if the slice's restriction of being a subset of instructions were lifted.
\begin{example}[Nested unconditional jumps]
\label{exa:nested-unconditional}
In the case of nested unconditional jumps where both jump to the same destination, only one of them (the outermost one) is needed. Figure~\ref{fig:nested-unconditional} showcases the problem, with the minimal slice \carlos{have not defined this yet} in gray, and the algorithmically computed slice in light blue. Specifically, lines 3 and 5 are included unnecessarily.
\begin{figure}
\begin{minipage}{0.15\linewidth}
\begin{lstlisting}
while (X) {
if (Y) {
if (Z) {
A;
break;
}
B;
break;
}
C;
}
D;
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.84\linewidth}
\includegraphics[width=0.4\linewidth]{img/nested-unconditional-cfg}
\includegraphics[width=0.59\linewidth]{img/nested-unconditional-pdg}
\end{minipage}
\caption{A program with nested unconditional control flow (left), its CFG (center) and PDG (right).}
\label{fig:nested-unconditional}
\end{figure}
\end{example}
\carlos{Add proposals to fix both problems showcased.}
\section{Exceptions} \section{Exceptions}
Exception handling was first tackled in the context of Java program slicing by Sinha et al. \cite{SinH98}, with later contributions by Allen and Horwitz~\cite{AllH03}. There exist contributions for other programming languages, which will be explored later (chapter~\ref{cha:state-art}) and other small contributions. The following section will explain the treatment of the different elements of exception handling in Java program slicing.
As seen in section~\ref{sec:intro-exception}, exception handling in Java adds As seen in section~\ref{sec:intro-exception}, exception handling in Java adds
two constructs: the \texttt{throw} and the \texttt{try-catch} statements. The two constructs: \texttt{throw} and \texttt{try-catch}. Structurally, the
first one resembles an unconditional control flow statement, with an unknown (on first one resembles an unconditional control flow statement carrying a value ---like \texttt{return} statements--- but its destination is not fixed, as it depends on the dynamic typing of the value.
compile time) destination. The exception will be caught by a \texttt{catch} of If there is a compatible \texttt{catch} block, execution will continue inside it, otherwise the method exits with the corresponding value as the error.
the corresponding type or a supertype ---if it exists. Otherwise, it will crash The same process is repeated in the method that called the current one, until either the call stack is emptied or the exception is successfully caught.
the corresponding thread (or in single-threaded programs, stop the Java Virtual If the exception is not caught at all, the program exits with an error ---except in multi--threaded programs, in which case the corresponding thread is terminated.
Machine). The second stops the exceptional control flow conditionally, based on The \texttt{try-catch} statement can be compared to a \texttt{switch} which compares types (with \texttt{instanceof}) instead of constants (with \texttt{==} and \texttt{Object\#equals(Object)}). Both structures require special handling to place the proper dependencies, so that slices are complete and as correct as can be.
the dynamic typing of the exception thrown. Both introduce challenges that must
be solved.
\subsection{\texttt{throw} statement} \subsection{\texttt{throw} statement}
The \texttt{throw} statement represents two elements at the same time: an The \texttt{throw} statement compounds two elements in one instruction: an
unconditional jump and an erroneous exit from its method. The first one has unconditional jump with a value attached and a switch to an ``exception mode'', in which the statements' execution order is disregarded. The first one has been extensively covered and solved, as it is equivalent to the \texttt{return} instruction, but the second one requires a small addition to the CFG: there must be an alternative control flow, where the path of the exception is shown. For now, without including \texttt{try-catch} structures, any exception thrown will exit its method with an error, so a new ``Error exit'' node is needed. The preexisting ``End'' node is renamed ``Normal exit'', but now the CFG has two distinct sink nodes, which is forbidden in most slicing algorithms. To solve that problem, a general ``End'' node is created, with both the normal and error exits connected to it, making it the only sink in the graph.
been extensively covered and solved, but the second \added{one} requires a small addition
to the CFG: instead of having a single ``End'' node, it will be split in two
---normal and error exit---, though the ``End'' cannot be removed, as a restriction
of most slicing algorithms is that the CFG have only one sink node. Therefore all
nodes that connected to the ``End'' will now lead to ``Normal exit'', all throw
statements' true outgoing edges will connect to the ``Error exit'', and both exit
nodes will converge on the ``End'' node.
\texttt{throw} statements in Java take a single value, a subtype of \texttt{Throwable}, In order to properly accommodate a method's output variables (global variables or parameters passed by reference that have been modified), variable unpacking is
and that value is used to stop the propagation of the exception; which can be handled
as a returned value. This treatment of \texttt{throw} statements only modifies the
structure of the CFG, without altering any other algorithm, nor the basic definitions
for control and data dependencies, making it very easy to incorporate to any existing
slicing software solution that follows the general model described.
\begin{example}[CFG of an uncaught \texttt{throw} statement] \ \\ This treatment of \texttt{throw} statements only modifies the structure of the CFG, without altering the other graphs, the traversal algorithm, or the basic definitions for control and data dependencies. That fact makes it easy to incorporate into any existing program slicer that follows the general model described. Example~\ref{exa:throw} showcases the new exit nodes and the treatment of the \texttt{throw} as if it were an unconditional jump whose destination is the ``Error exit''.
\begin{minipage}{0.69\linewidth}
\begin{example}[CFG of an uncaught \texttt{throw} statement]
Consider the simple Java method on the left of figure~\ref{fig:throw}, which computes a square root if the number is not negative, and otherwise throws a \texttt{RuntimeError}. The CFG in the centre illustrates the treatment of \texttt{throw}, ``normal exit'' and ``error exit'' as pseudo--predicates, and the PDG on the right describes the
\label{exa:throw}
\begin{figure}[h]
\begin{minipage}{0.3\linewidth}
\begin{lstlisting} \begin{lstlisting}
double f(int x) { double f(int x) {
if (x < 0) if (x < 0)
@ -236,12 +264,13 @@ slicing software solution that follows the general model described.
return Math.sqrt(x); return Math.sqrt(x);
} }
\end{lstlisting} \end{lstlisting}
By analyzing the CFG, we can see that both exits are control dependent on the \texttt{throw}
statement; data dependencies present no special case in this example.
\end{minipage} \end{minipage}
\begin{minipage}{0.3\linewidth} \begin{minipage}{0.69\linewidth}
\includegraphics[width=\linewidth]{img/throw-example-cfg} \includegraphics[width=\linewidth]{img/throw-example-cfg}
\end{minipage} \end{minipage}
\caption{A simple program with a \texttt{throw} statement (left), its CFG (centre) and its PDG (right).}
\label{fig:throw}
\end{figure}
\end{example} \end{example}
\subsection{\texttt{try-catch} statement} \subsection{\texttt{try-catch} statement}

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = ../paper.tex % !TEX root = ../paper.tex
\chapter{Introduction} \chapter{Introduction}
@ -7,6 +7,8 @@
\section{Motivation} \section{Motivation}
\label{sec:motivation} \label{sec:motivation}
\carlos{Introduce program slicing rather than define it.}
Program slicing~\cite{Wei81} is a debugging technique that, given a line of Program slicing~\cite{Wei81} is a debugging technique that, given a line of
code and a set of variables of a program, simplifies such program so that the only parts code and a set of variables of a program, simplifies such program so that the only parts
left of it are those that affect or are affected by the values of the selected variables. \josep{aqui, antes del ejemplo, habria que decir de manera informal que es un slice y que es un SC} left of it are those that affect or are affected by the values of the selected variables. \josep{aqui, antes del ejemplo, habria que decir de manera informal que es un slice y que es un SC}
@ -48,7 +50,9 @@ void f(int x) {
\end{center} \end{center}
\end{example} \end{example}
Slices are executable programs whose execution produces the same values \sergio{Careful! This is tricky ground, since the weak slice comes later.}\josep{you can avoid the problem by starting the sentence with: ``In its more general form, slices are..."} \carlos{Detail the different uses and avoid linking debugging with executability.}
Slices are executable programs whose execution produces the same values \sergio{Careful! This is tricky ground, since the weak slice comes later.}\josep{you can avoid the problem by starting the sentence with: ``In its more general form, slices are..."} \carlos{Alternative: a program that behaves the same (``same values'' or ``prefix list'' is defined later).}
for the specified line and variable as the original program, and they are used to for the specified line and variable as the original program, and they are used to
facilitate debugging of large and complex programs, where the control and data flow may not facilitate debugging of large and complex programs, where the control and data flow may not
be easily understandable. \josep{actually, executable slices are not usually used in debugging, but rather in program specialization...} be easily understandable. \josep{actually, executable slices are not usually used in debugging, but rather in program specialization...}
@ -135,13 +139,17 @@ void g(int a) throws Exception {
\end{center} \end{center}
\end{example} \end{example}
Example~\ref{exa:program-slicing2} \josep{is a contribution of this work because it} showcases an important error in the current slicing procedure for programs that handle errors with exceptions\josep{\deleted{; because}\added{where}} the \texttt{catch} block is disregarded. The only way a \texttt{catch} block can be included in the slice is if a statement inside it is needed for another reason. However, Allen and Horwitz did not encounter\josep{tackle? account for?} this problem in their paper~\cite{AllH03}, as the values outputted by method calls are extracted after the \texttt{normal return} and each \texttt{catch}, and in a typical method call with output, the \texttt{catch} is included by default when the outputted value is used. This detail makes the error much smaller, as most \texttt{try-catch} structures are run to obtain a value. \sergio{Add the \textit{out} node so that what you have explained here is easier to understand, showing that the \textit{out} node exists but that the SC does not need it.} \carlos{move all images and code snippets to separate figures} \\
\carlos{indicate the connection between graphs} \\
\carlos{move the graph and its explanation to after the background; the reason and the solution are presented in section X}
Example~\ref{exa:program-slicing2} \josep{is a contribution of this work because it} showcases an important error in the current slicing procedure for programs that handle errors with exceptions\josep{\deleted{; because}\added{where}} the \texttt{catch} block is disregarded. The only way a \texttt{catch} block can be included in the slice is if a statement inside it is needed for another reason. However, Allen and Horwitz~\cite{AllH03} did not encounter\josep{tackle? account for?} this problem in their paper, as the values outputted by method calls are extracted after the \texttt{normal return} and each \texttt{catch}, and in a typical method call with output, the \texttt{catch} is included by default when the outputted value is used. This detail makes the error much smaller, as most \texttt{try-catch} structures are run to obtain a value. \sergio{Add the \textit{out} node so that what you have explained here is easier to understand, showing that the \textit{out} node exists but that the SC does not need it.}
\added{There is also another }\deleted{A} notable case where a method that may throw an exception is run and no value is recovered (at least from the point of view of program slicing)\added{. It occurs}\deleted{is} when writing to the filesystem or making connections to servers, such as a database or a web service to store information. In this case, if no confirmation is output signalling whether the storage of information was correct, the \texttt{catch} block \deleted{will be}\added{is} omitted, and the \josep{program} slicer \josep{\deleted{software}} \deleted{will} produce\added{s} an incorrect result.
\section{Contributions}
The main contribution of this paper\carlos{thesis}\sergio{paper?research?work?}\josep{work or research} is a
\added{new approach for program slicing with exception handling for Java programs.} \deleted{complete technique for program slicing programs in the presence of exception handling constructs for Java}. \added{Our approach}\deleted{This technique} extends the previous technique \added{proposed} by Allen et al. \cite{AllH03}. It \added{is able to properly slice}\deleted{considers} all cases considered in \deleted{that}\added{their} work, but it also provides a solution to \sergio{some other} cases not \deleted{considered}\added{contemplated}\josep{considered} by them.
For the sake of completeness and in order to understand the process that led us to this solution, we \josep{\deleted{will present}\josep{first summarize the fundamentals or background}} a brief history\sergio{background?} of program slicing \added{terminology}, specifically those changes that have affected exception handling.\sergio{delving deeper in the progress of program slicing techniques related to exception handling.?} Furthermore, we provide a summary of the

View file

@ -1,113 +0,0 @@
% !TEX encoding = UTF-8
% !TEX spellcheck = en_US
% !TEX root = ../paper.tex
\chapter{Proposed solution}
\label{cha:solution}
This solution is an extension of Allen's \cite{AllH03}, with some modifications to solve the problem found \josep{The problem found has not been made clear. It has been diluted in the overwhelming tangle of cases. You must formulate the problem and make it crystal clear, explain why the other approaches do not solve it, and give a concrete example.}. Before starting, we need to split all instructions into three categories:
\begin{description}
\item[statement] non-branching instruction, e.g. an assignment or method call.
\item[predicate] conditional branch, e.g. if statements and loops.
\item[pseudo-predicate] unconditional jump, e.g. break, continue, return, goto and throw instructions.
\end{description}
Pseudo-predicates have previously been used to model unconditional jumps with a counter-intuitive reasoning: the statement that would be executed next if the pseudo-predicate were not there is considered control dependent on it. Going back to the definition of control dependency, one could argue that the real control dependency is on the conditional branch that led to the \josep{???}
\begin{figure}
\centering
\begin{lstlisting}
if (a) {
return a;
}
print(a);
\end{lstlisting}
\begin{lstlisting}
if (a) {
}
print(a);
\end{lstlisting}
\caption{Example of the control dependencies induced by pseudo-predicates \josep{this figure is not referenced from anywhere}}
\end{figure}
This is the process used to build the Program Dependence Graph. \josep{Everything that follows is too verbose. There are no concrete definitions. It is all very informal, and there is no ongoing example that shows how the phases evolve step by step. Let us meet to discuss this section before rewriting it...}
\begin{description}
\item[Step 1 (static analysis):] Identify for each instruction the variables read and defined. Each method is annotated with the global variables that it accesses or modifies.
\item[Step 2 (build CFGs):] Build a CFG for each method of the program. The start of all methods is a vertex labeled \textsl{enter}, which also contains the assignments for parameters and global variables used (\texttt{var = var\_in}). The \textsl{enter} node is connected to the first instruction of the method. In a similar fashion, all methods end in an \textsl{exit} vertex with the corresponding output variables. There exists one \textsl{normal exit} to which the last instruction and all return instructions are connected. If the method can throw any exceptions, there exists one \textsl{error exit} for each type of exception that may be thrown. The normal and erroneous exits are connected to the \textsl{exit} node.
Every normal statement is connected to the subsequent one by an unlabeled edge. Predicates have two outgoing edges, labeled \textsl{true} and \textsl{false}. Pseudo-predicates also have two outgoing edges. The \textsl{true} edge is connected to the destination of the jump (\textsl{normal exit} in the case of return, the beginning or end of the loop in the case of continue and break, etc.). The \textsl{false} edge is a non-executable edge, marked with a dashed line, and it is connected to the next instruction that would be executed if the pseudo-predicate were a \textsl{nop}.
Nodes that represent a call to a method $M$ include the transfer of parameters and variables that may be read or written to, then execute the call, and finally the extraction of modified variables. Call nodes are an exception to the previous paragraph, as they can have an unlimited number of outgoing edges. Each outgoing edge lands on a pseudo-predicate which indicates if the execution was correct or an exception was raised. The executable edge of each pseudo-predicate will lead to the next instruction to be executed, whereas the non-executable one will lead to the end of the try-catch block. All call nodes can lead to a \textsl{normal return} node, which is linked to the next instruction, and one error node for each type of exception that may be thrown. The erroneous returns are labeled \textsl{catch ExType}, and lead to the first instruction in the corresponding catch block\footnotemark. Any exception that may not be caught will lead to the erroneous exit node of the method that contains the call. See the example for more details.
\footnotetext{A problem presents itself here, as some exceptions may be able to trigger different catch blocks, due to the sequential nature of catches and polymorphism in Java. A way to fix this is to make catch blocks behave as a switch.} %TODO
\item[Step 3 (compute dependences):] For each node in the CFG, compute the control and data dependencies. Non-executable edges are only included when computing control dependencies.\\
\carlos{put inside definition}
A node $a$ is \textsl{control dependent} on node $b$ iff $a$ post-dominates one but not all of $b$'s successors.\\
A node $a$ is \textsl{data dependent} on node $b$ iff $b$ defines or may define a variable $x$, $a$ uses or may use $x$, and there is an $x$-definition-free path in the CFG from $b$ to $a$.\\
\item[Step 4 (convert each CFG into a PDG):] each node of the CFG is one node of the PDG, with two exceptions. The first are the \textsl{enter}, \textsl{exit} and method call nodes, where the variable input and output assignments are split and placed as control-dependent on their original node. The second is the \textsl{exit} node, which is to be removed (the control dependencies from \textsl{exit} to the variable outputs are transferred to the \textsl{enter} node). Then all the dependencies computed in the previous step are drawn.
\item[Step 5 (connect PDGs to form a SDG):] each method call to $M$ must be connected to the \textsl{enter} node in $M$'s PDG, as a control dependence. Each variable input from the method call is connected to a variable input of the method definition via a data dependence. Each variable output from the method definition is connected to the variable output of the method call via a data dependence. Each method exit is connected \carlos{complete}.
\end{description}
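As an illustration of Step 3, the following Python sketch computes control dependences directly from the definition given above (a node $a$ is control dependent on $b$ iff $a$ post-dominates one but not all of the successors of $b$). It is only a sketch under stated assumptions: the function names, the dictionary encoding of the CFG and the node labels are hypothetical, and the CFG is a hand-built version of the \texttt{if}/\texttt{return} example, with the non-executable edge of the pseudo-predicate \texttt{return a} and an \textsl{enter}--\textsl{exit} edge added by hand.

```python
def post_dominators(succ, exit_node):
    """Iteratively compute, for every node, the set of nodes that post-dominate it."""
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            # A node is post-dominated by itself and by whatever
            # post-dominates all of its successors.
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependences(succ, exit_node):
    """Return pairs (a, b) meaning 'a is control dependent on b'."""
    pdom = post_dominators(succ, exit_node)
    deps = set()
    for b, successors in succ.items():
        for a in succ:
            hits = sum(1 for s in successors if a in pdom[s])
            if 0 < hits < len(successors):
                deps.add((a, b))
    return deps


# Hypothetical CFG for: if (a) { return a; } print(a);
# The second successor of "return a" is the non-executable edge.
cfg = {
    "enter": ["if (a)", "exit"],
    "if (a)": ["return a", "print(a)"],
    "return a": ["exit", "print(a)"],
    "print(a)": ["exit"],
    "exit": [],
}

for dependent, controller in sorted(control_dependences(cfg, "exit")):
    print(dependent, "is control dependent on", controller)
```

If the non-executable edge is removed from the encoding, \texttt{print(a)} remains control dependent only on the predicate, which is the difference that the pseudo-predicate treatment is meant to capture.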
\begin{itemize}
\item An extra type of control dependency represented by an ``exception edge''. It will represent the need to include a \textsl{catch} clause when an exception can be thrown. It is represented with a dotted line (dashed line is for data dependency). These edges have a special characteristic: when one is traversed, only ``exception edges'' may be traversed from the new nodes included in the slice. If the same node is reached by another kind of edge, the restriction is lifted. The behavior is documented in Algorithm~\ref{alg:2pass}, where the changes from the original algorithm are \underline{underlined}.
\item Add an extra ``exception edge'' from each ``exit with exception of type \texttt{T}'' node to every corresponding ``\texttt{throw e}'' such that the type of \texttt{e} is or inherits from \texttt{T}.
\item Add an extra ``exception edge'' from each catch statement to every statement that can throw that error.
\item The exception edges will only be placed when the method or the try-catch statement are loop-carrier\footnote{Loop-carrier, when referring to a statement, is the property that in a CFG for the complete program, the node representing the statement is part of a loop, meaning that it could be executed again once it is executed.}.
\end{itemize}
\begin{algorithm} % generate slice
\caption{Two-pass algorithm to obtain a backward static slice with exceptions}
\label{alg:2pass}
\begin{algorithmic}[1]
\REQUIRE SDG $\mathcal{G}$ representing program P. $\mathcal{G} = \{\mathcal{S}, \mathcal{E}\}$, where $\mathcal{S}$ is a set of states (some are statements) connected by a set of edges $\mathcal{E}$. Each edge is a triplet composed of the type of edge (control, data or \underline{exception} dependency, summary, param-in, param-out), the source and the destination of the edge.
\REQUIRE A slicing criterion, composed of a statement $s \in \mathcal{S}$ and a variable $v$.
\ENSURE $\mathcal{S}' \subseteq \mathcal{S}$, representing the slice of P according to the criterion provided.
\medskip
\COMMENT{First pass (do not traverse output parameter edges).}
\STATE{$\mathcal{S}' \Leftarrow \emptyset$ (slice), $\mathcal{Q}\Leftarrow\{s\}$ (queue), $\mathcal{S}\Leftarrow \mathcal{S} - \{s\}$ (not visited), $\mathcal{R}\Leftarrow \emptyset$ (only visited via exception edge)}
\WHILE{$\mathcal{Q} \neq \emptyset$}
\STATE{$a \in \mathcal{Q}$} \COMMENT{Select an element from $\mathcal{Q}$}
\STATE{$\mathcal{Q} \Leftarrow \mathcal{Q} - \{a\}$}
\STATE{$\mathcal{S}' \Leftarrow \mathcal{S}' + \{a\}$}
\FORALL{$\mathcal{A}$ in $\{\{type, origin, a\} \in \mathcal{E}\}$}
\IF{$type \neq$ param-out \AND ($origin \notin \mathcal{S}'$ \OR ($origin \in \mathcal{R}$ \AND $a \notin \mathcal{R}$))} \label{line:param-out}
\IF{\underline{$a \in \mathcal{R}$}}
\IF{\underline{$type =$ exception}}
\STATE{\underline{$\mathcal{Q} \Leftarrow \mathcal{Q} + \{origin\}$}}
\STATE{\underline{$\mathcal{R} \Leftarrow \mathcal{R} + \{origin\}$}}
\ENDIF
\ELSE
\STATE{$\mathcal{Q} \Leftarrow \mathcal{Q} + \{origin\}$}
\ENDIF
\ENDIF
\ENDFOR
\ENDWHILE
\\
\medskip
\COMMENT{Second pass (very similar, do not traverse input parameter edges).}
\STATE $\mathcal{Q} \Leftarrow \mathcal{S}'$
\WHILE{$\mathcal{Q} \neq \emptyset$}
\STATE{$a \in \mathcal{Q}$} \COMMENT{Select an element from $\mathcal{Q}$}
\STATE{$\mathcal{Q} \Leftarrow \mathcal{Q} - \{a\}$}
\STATE{$\mathcal{S}' \Leftarrow \mathcal{S}' + \{a\}$}
\FORALL{$\mathcal{A}$ in $\{\{type, origin, a\} \in \mathcal{E}\}$}
\IF{$type \neq$ param-in \AND ($origin \notin \mathcal{S}'$ \OR ($origin \in \mathcal{R}$ \AND $a \notin \mathcal{R}$))}
\IF{\underline{$a \in \mathcal{R}$}}
\IF{\underline{$type =$ exception}}
\STATE{\underline{$\mathcal{Q} \Leftarrow \mathcal{Q} + \{origin\}$}}
\STATE{\underline{$\mathcal{R} \Leftarrow \mathcal{R} + \{origin\}$}}
\ENDIF
\ELSE
\STATE{$\mathcal{Q} \Leftarrow \mathcal{Q} + \{origin\}$}
\ENDIF
\ENDIF
\ENDFOR
\ENDWHILE
\end{algorithmic}
\end{algorithm}
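The traversal rule for exception edges can also be sketched in executable form. The Python code below is one possible reading of the restriction described in the text (a node reached only through an exception edge propagates only through exception edges, and the restriction is lifted if the node is later reached through any other edge); it is not a line-by-line transcription of Algorithm~\ref{alg:2pass}. The function names, the edge encoding as (type, origin, destination) triplets and the node labels are assumptions made for the example.

```python
from collections import deque


def one_pass(edges, start, restricted, ignored_type):
    """Backward traversal from `start`, skipping edges of type `ignored_type`."""
    in_slice = set(start)
    queue = deque(start)
    while queue:
        a = queue.popleft()
        for kind, origin, target in edges:
            if target != a or kind == ignored_type:
                continue
            if a in restricted and kind != "exception":
                continue  # restricted nodes only follow exception edges
            if kind == "exception":
                if origin not in in_slice:
                    in_slice.add(origin)
                    restricted.add(origin)
                    queue.append(origin)
            elif origin not in in_slice or origin in restricted:
                # Reached through a regular edge: lift the restriction
                # and (re)process the node without it.
                restricted.discard(origin)
                in_slice.add(origin)
                queue.append(origin)
    return in_slice


def backward_slice(edges, criterion):
    """Two passes: the first ignores param-out edges, the second param-in edges."""
    restricted = set()
    first = one_pass(edges, [criterion], restricted, "param-out")
    return one_pass(edges, list(first), restricted, "param-in")


# Hypothetical dependence edges: (type, origin, destination).
EDGES = [
    ("data", "x = readValue()", "print(x)"),
    ("control", "try", "x = readValue()"),
    ("exception", "catch (IOException e)", "x = readValue()"),
    ("data", 'msg = "failed"', "catch (IOException e)"),
]

print(sorted(backward_slice(EDGES, "print(x)")))
```

In this small graph the \texttt{catch} node enters the slice through the exception edge, but the statement it depends on by data does not, because the \texttt{catch} node was reached only through an exception edge.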
% vim: set noexpandtab:ts=2:sw=2:wrap

View file

@ -1,7 +1,7 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = ../paper.tex % !TEX root = ../paper.tex
\chapter{State of the art} \chapter{Related work}
\label{cha:state-art} \label{cha:state-art}
Slicing was proposed \cite{Wei81} and improved until the proposal of the current system (the SDG) \carlos{(citation)}. Specifically in the context of exceptions, multiple approaches have been attempted, with varying degrees of success. There exist commercial solutions for various programming languages: \carlos{name them and link}.

View file

@ -1,22 +1,29 @@
digraph g { digraph g {
"f()" [shape=box, rank=min]; "f()" [shape=box, rank=min, style = filled];
// Rank adjustment // Rank adjustment
{ node [style=filled]
{ rank = same; "int a = 1"; "while (a > 0)"; } { rank = same; "int a = 1"; "while (a > 0)"; }
"if (a > 10)"; break;
}
{ rank = same; "print(a)"; "a++"; } { rank = same; "print(a)"; "a++"; }
{ rank = max; "a++"; "print(a)"; } { rank = max; "a++"; "print(a)"; }
"a++" [style="filled,bold"];
// Control flow // Control flow
"f()" -> "int a = 1" [style=bold]; "f()" -> "while (a > 0)";
"f()" -> "while (a > 0)" [style=bold]; "f()" -> "int a = 1";
"while (a > 0)" -> "if (a > 10)" [style=bold]; "while (a > 0)" -> "if (a > 10)";
"if (a > 10)" -> "break" [style=bold]; "if (a > 10)" -> "break";
"break" -> "print(a)" [style=bold]; "break" -> "print(a)";
"break" -> "a++" [style=bold]; "break" -> "a++";
"break" -> "while (a > 0)" [style=bold]; "break" -> "while (a > 0)";
// Data flow // Data flow
"int a = 1" -> "while (a > 0)" [color=red]; { edge [color = red];
"int a = 1" -> "if (a > 10)" [color=red]; "int a = 1" -> "while (a > 0)";
"int a = 1" -> "print(a)" [color=red]; "int a = 1" -> "if (a > 10)";
"a++" -> "a++" -> "while (a > 0)" [color=red]; "int a = 1" -> "print(a)";
"a++" -> "if (a > 10)" [color=red]; "a++" -> "a++";
"a++" -> "print(a)" [color=red]; "a++" -> "while (a > 0)";
"a++" -> "if (a > 10)";
"a++" -> "print(a)" [constraint = true];
}
} }

Binary file not shown.

View file

@ -1,8 +1,21 @@
digraph g { digraph g {
Start [shape=box]; Start [shape=box,label=<Start<br/>x = x_in>];
End [shape=box]; End [shape=box];
Start -> End [style=dashed]; Start -> End [style=dashed];
Start -> "if (x < 0)" -> "throw" -> "Error exit" -> End; Start -> "if (x < 0)" -> "throw" -> "Error exit" -> End;
"throw" -> "return Math.sqrt(x)" [style=dashed]; "throw" -> "return Math.sqrt(x)" [style=dashed];
"if (x < 0)" -> "return Math.sqrt(x)" -> "Normal exit" -> End; "if (x < 0)" -> "return Math.sqrt(x)" -> "Normal exit" -> End;
// pdg
f [label="f()",shape=rect];
x_in [label = "x = x_in", style = dashed];
if [label = "if (x < 0)"];
t [label = "throw"];
ret [label = "return Math.sqrt(x)"];
ee [label = "error exit", style = dashed];
ne [label = "normal exit", style = dashed];
f -> x_in;
f -> if -> t -> {ret ee ne};
{ edge [color = red, constraint = false];
x_in -> {if ret};
}
} }

Binary file not shown.

View file

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = paper.tex % !TEX root = paper.tex
\chapter{Main explanation?} \chapter{Main explanation?}

View file

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = paper.tex % !TEX root = paper.tex
\chapter{Introduction} \chapter{Introduction}

View file

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = paper.tex % !TEX root = paper.tex
\lstset{ \lstset{
% Numbering % Numbering

View file

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = paper.tex % !TEX root = paper.tex
\chapter{Introduction} \chapter{Introduction}

View file

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = paper.tex % !TEX root = paper.tex
\documentclass[a4paper,twoside]{report} \documentclass[a4paper,twoside]{report}
@ -88,18 +88,9 @@
\include{Secciones/motivation} \include{Secciones/motivation}
\include{Secciones/background} \include{Secciones/background}
\include{Secciones/incremental_slicing} \include{Secciones/incremental_slicing}
\include{Secciones/problem_solution}
\include{Secciones/state_of_the_art} \include{Secciones/state_of_the_art}
\include{Secciones/solution} \include{Secciones/conclusion}
\chapter{TODO}
\begin{enumerate}
\item Find out whether the additional code included because of unconditional jumps can be reduced with some kind of edge. (fewer breaks)
Solution: see
\item Find out whether edge 1 is essential (look for a counterexample).
\item Alternative solution to avoid having to choose between 1 and 2. Suggestion: only include the catch via control dependence if both edges (1, 2) are active.
\item Edge 3: the one that goes
\end{enumerate}
\bibliographystyle{plain} \bibliographystyle{plain}
\bibliography{../../../../../../Biblio/biblio.bib} \bibliography{../../../../../../Biblio/biblio.bib}

View file

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = paper.tex % !TEX root = paper.tex
\chapter{Proposed solution} \chapter{Proposed solution}

View file

@ -1,5 +1,5 @@
% !TEX encoding = UTF-8 % !TEX encoding = UTF-8
% !TEX spellcheck = en_US % !TEX spellcheck = en_GB
% !TEX root = paper.tex % !TEX root = paper.tex
\chapter{State of the art} \chapter{State of the art}