Josep Silva 2019-11-18 23:06:07 +00:00
parent b6e27e83d5
commit 6808123db3
3 changed files with 35 additions and 34 deletions


@ -121,16 +121,16 @@ procedure. All edges that connect PDGs are represented with dashed lines.
\subsubsection{Procedures and data dependencies}
The only thing left to explain before introducing more constructs into the
language is the passing of parameters. Most programming language accept a
variable number of input parameters and one output parameter. In the case of
language is the passing of parameters. Most programming language\added{s} accept \added{an arbitrary}\deleted{a
variable} number of input parameters and one output parameter. In the case of
input parameters passed by reference, or constructs such as structs or classes,
modifying a field of a parameter may modify the original variable. In order to
deal with everything related to parameter passing, including global variables,
class fields, etc. there is a small extension to be made to the CFG and PDG.
\added{properly} deal with \deleted{everything related to} parameter passing, including global variables,
class fields, etc., there is a small extension to be made to the CFG and PDG \added{\cite{pendinmg}}.
In the CFG, the ``Start'' and ``End'' nodes contain a list of assignments,
inputting and outputting respectively the appropriate values, as can be seen in
the example. Consequently, every vertex that contains a procedure or function
the example \josep{which example? if there is an example, give it an identifier and reference it here}. Consequently, every vertex that contains a procedure or function
call packs and unpacks the arguments. For every variable $x$ that is used in a
procedure, every call to it must be preceded by $x_{in} = x$, and the
procedure's ``Start'' vertex must contain $x = x_{in}$. The opposite happens
@ -143,18 +143,18 @@ the expression in the $i^{th}$ position in the procedure call) in the call
vertex, and parameters whose modifications inside the procedure are passed back
to the calling procedure must be extracted as $var = par^i_{out}$ (where $var$
is the name of the variable ---passed by reference--- in the calling
procedure).\carlos{What if object/struct passed by value?} As an addition, in
procedure).\carlos{What if object/struct passed by value?} \josep{You have not discussed this. If it is passed by value, the $par_{in}$ and $par_{out}$ are not needed (but they can be left in anyway)} As an addition, in
the SDG, an extra edge is added (summary edge), which represents the
dependencies that the input variables have on the outputs. This allows the
algorithm to know the dependencies without traversing the corresponding
function.
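As a sketch of the packing and unpacking described above, consider a hypothetical
procedure \texttt{f(p, q)} that uses a global variable \texttt{g} and modifies its
second parameter, which is passed by reference. The names \texttt{f}, \texttt{g},
\texttt{a} and \texttt{b}, the flattened spelling \texttt{par1\_in} for
$par^1_{in}$, and the name \texttt{g\_out} for the outgoing global are assumptions
made for illustration only.
\begin{lstlisting}
// Call vertex for f(a, b): pack inputs, call, unpack outputs.
par1_in = a;      // expression in position 1 of the call
par2_in = b;      // expression in position 2 of the call
g_in = g;         // global variable used inside f
f(a, b);
b = par2_out;     // b was passed by reference
g = g_out;        // global variable modified inside f

// Inside f(p, q):
//   Start:  p = par1_in;  q = par2_in;  g = g_in;
//   End:    par2_out = q; g_out = g;
\end{lstlisting}
Under these assumptions, if the final value of \texttt{q} depends on \texttt{p},
a summary edge from \texttt{par1\_in} to \texttt{par2\_out} at the call vertex
would record that dependence without entering \texttt{f}.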
All these additions are added as extra lines in the ``Start'', ``End'' and
All these additions are added as extra lines\josep{lines?} in the ``Start'', ``End'' and
calling vertices. When building the PDG, all additions (variable assignments)
are split into their own vertices, and are control dependent on them. Data
dependencies no longer flow through the call vertex, but through the appropriate
child, which minimizes the size of the slice produced. As an example,
figure~\ref{fig:sdg-loop} shows the three stages of a program, from CFG to SDG.
\added{Figure}\deleted{figure}~\ref{fig:sdg-loop} shows the three stages of a program, from CFG to SDG.
The construction of the CFG is straightforward, save for the packing and
unpacking of variables in the start, end and call vertices. In the PDG, the
statements are split, control and data dependencies replace the control flow
@ -182,23 +182,23 @@ control dependence. From here, there stem two approaches: the first would be to
redefine control dependency, in order to reflect the real effect of these
instructions ---as some authors~\cite{DanBHHKL11} have tried to do--- and the
second would be to alter the creation of the SDG to ``create'' those
dependencies, which is the most widely--used solution.
dependencies, which is the most widely--used solution \added{\cite{pending1,pending2}}.
The most popular approach was proposed by Ball and Horwitz~\cite{BalH93}, and
represents unconditional jumps as a \textsl{pseudo--predicate}. The true edge
would lead to the next instruction to be executed, and the false edge would be
a non-executable or \textit{dummy} edge, connected to the instruction that would
be executed were the unconditional jump a \textit{nop}. The consequence of this
be executed were the unconditional jump a \textit{nop}\josep{this sentence is not understandable}. The consequence of this
solution is that every instruction placed after the unconditional jump is
control dependent on the jump, as can be seen in Figure~\ref{fig:break-graphs}.
In the example, when slicing with respect to variable $a$ on line 5, every
In the example \josep{by "the example" do you mean the figure? It is important to distinguish between figures and examples. What this text is crying out for is an example environment with its own identifier. That example should show and explain the figure, and from here you should cite the example, not the figure.}, when slicing with respect to variable $a$ on line 5, every
statement would be included, save for ``print(a)''. Line 4 is not strictly
necessary in this example ---in the context of weak slicing---, but is included
nonetheless. In the original paper, the transformation is proved to be
nonetheless. In the original paper\josep{which one? cite it in parentheses if it is a reminder}, the transformation is proved to be
complete, but not correct, as for some examples, the slice includes more
unconditional jumps that would be strictly necessary, even for weak slicing.
unconditional jumps than would be strictly necessary, even for weak slicing. \josep{I would include one of those examples here}
Ball and Horwitz theorize that a more correct approach would be possible, if it
weren't for the limitation of slices to be a subset of statements of the
were \added{not}\deleted{n't} for the limitation of slices to be a subset of statements of the
program, in the same order as in the original.
\begin{figure}
@ -239,7 +239,7 @@ be solved.
The \texttt{throw} statement represents two elements at the same time: an
unconditional jump and an erroneous exit from its method. The first one has
been extensively covered and solved, but the second requires a small addition
been extensively covered and solved, but the second \added{one} requires a small addition
to the CFG: instead of having a single ``End'' node, it will be split in two
---normal and error exit---, though the ``End'' cannot be removed, as a restriction
of most slicing algorithms is that the CFG have only one sink node. Therefore all
@ -273,7 +273,7 @@ slicing software solution that follows the general model described.
\subsection{\texttt{try-catch} statement}
The \texttt{try-catch-finally} statement is the only way to stop an exception once it's thrown,
The \texttt{try-catch-finally} statement is the only way to stop an exception once \added{it is}\deleted{it's} thrown,
filtering by type, or otherwise letting it propagate further up the call stack. On top of that,
\texttt{finally} helps guarantee consistency, executing in any case (even when an exception is
left uncaught, the program returns or an exception occurs in a \texttt{catch} block). The main
@ -305,7 +305,7 @@ construct. Inside the \texttt{try} there can be four distinct sources of excepti
types that inherit from \texttt{RuntimeException}, but those may only be explicitly thrown.
Their inclusion in program slicing and therefore in the method's CFG generates extra
dependencies that make the slices produced bigger.
\item[Erorrs.] May be generated at any point in the execution of the program, but they normally
\item[\added{Errors}\deleted{Erorrs}.] May be generated at any point in the execution of the program, but they normally
signal a situation from which it may be impossible to recover, such as an internal JVM error.
In general, most programs do not consider these to be ``catch-able''.
\end{description}
@ -342,7 +342,7 @@ structures generate different control dependencies by default.
edge connected to the first statement inside the \texttt{catch} block, and the false edge
to the next \texttt{catch} block, until the last one. The last one will be a pseudo--predicate
connected to the first statement after the \texttt{try} if it is a catch--all type or to the
``Error exit'' if it isn't.
``Error exit'' if it \added{is not}\deleted{isn't}.
\end{description}
\begin{example}[Catches.]\ \\
@ -357,7 +357,7 @@ structures generate different control dependencies by default.
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.49\linewidth}
\carlos{missing figures with 4 alternatives: if-else (with catch--all and without) and switch (same two)}
\carlos{missing figures with 4 alternatives: if-else (with catch--all and without) and switch (same two)}\josep{Definitely!!!}
% \includegraphics[0.5\linewidth]{img/catch1}
% \includegraphics[0.5\linewidth]{img/catch2}
% \includegraphics[0.5\linewidth]{img/catch3}


@ -4,7 +4,7 @@
\chapter{Proposed solution}
\label{cha:solution}
This solution is an extension of Allen's\cite{AllH03}, with some modifications to solve the problem found. Before starting, we need to split all instructions in three categories:
This solution is an extension of Allen's \cite{AllH03}, with some modifications to solve the problem found \josep{the problem found has not been made clear. It has been diluted among the overwhelming tangle of cases. You must state, crystal clear, what the problem is and why the other approaches do not solve it, and give a concrete example.}. Before starting, we need to split all instructions into three categories:
\begin{description}
\item[statement] non-branching instruction, e.g. an assignment or method call.
@ -12,7 +12,7 @@ This solution is an extension of Allen's\cite{AllH03}, with some modifications t
\item[pseudo-predicate] unconditional jump, e.g. break, continue, return, goto and throw instructions.
\end{description}
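
The following fragment sketches how the three categories apply in practice. The
code itself is hypothetical, and the labels assume that loop and conditional
headers fall under the predicate category.
\begin{lstlisting}
int x = 0;           // statement
while (x < 10) {     // predicate
  if (x == 5)        // predicate
    break;           // pseudo-predicate
  x++;               // statement
}
print(x);            // statement (method call)
\end{lstlisting}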
Pseudo-predicates have been previously use to model unconditional jumps with a counter-intuitive reasoning: the next statement that would be executed were the pseudo-predicate not there would be executed, therefore it is control dependent on it. Going back to the definition of control dependency, one could argue that the real control dependency is on the conditional branch that lead to the
Pseudo-predicates have previously been used to model unconditional jumps with counter-intuitive reasoning: the statement that would be executed next if the pseudo-predicate were not there is considered control dependent on it. Going back to the definition of control dependency, one could argue that the real control dependency is on the conditional branch that led to the \josep{???}
\begin{figure}
\centering
@ -28,7 +28,7 @@ if (a) {
}
print(a);
\end{lstlisting}
\caption{Example of pseudo-predicates control dependencies}
\caption{Example of pseudo-predicate control dependencies \josep{this figure is not referenced from anywhere}}
\end{figure}
This is the process used to build the Program Dependence Graph.


@ -4,31 +4,32 @@
\chapter{State of the art}
\label{cha:state-art}
Slicing was proposed\cite{Wei81} and improved until the proposal of the current system (the SDG) \carlos{(citation)}. Specifically in the context of exceptions, multiple approaches have been attempted, with varying degrees of success. There exist commercial solutions for various programming languages: \carlos{name them and link}.
In the realm of academia, there exists no definite solution. One of the most relevant initial proposal\cite{AllH03}, although not the first one\cite{SinH98,SinHR99} to target Java specifically.
Slicing was proposed \cite{Wei81} and improved until the proposal of the current system (the SDG) \carlos{(citation)}. Specifically in the context of exceptions, multiple approaches have been attempted, with varying degrees of success. There exist commercial solutions for various programming languages: \carlos{name them and link}.
In the realm of academia, there exists no definite solution. One of the most relevant initial proposal\added{s} is that of \cite{AllH03}, although it is not the first one \cite{SinH98,SinHR99} to target Java specifically.
It uses the existing proposals for \textsl{return}, \textsl{goto} and other unconditional jumps to model the behavior of \textsl{throw} statements. Control flow inside \textsl{try-catch-finally} statements is simulated, both for explicit \textsl{throw} and those nested inside a method call. The base algorithm is presented, and then the proposal is detailed as changes. Unchecked exceptions are considered but regarded as ``worthless'' to include, due to the increase in size of the slices, which reduces their effectiveness as a debugging tool. This is due to the number of unchecked exceptions embedded in normal Java instructions, such as \texttt{NullException} in any instance field or method, \texttt{IndexOutOfBoundsException} in array accesses and countless others. On top of that, handling \textsl{unchecked} exceptions opens the problem of calling an API to which there is no analyzable source code, either because the module was compiled before-hand or because it is part of a distributed system. The first should not be an obstacle, as class files can be easily decompiled. The only information that may be lost is variable names and comments, which don't affect a slice's precision, only its readability.
It uses the existing proposals for \textsl{return}, \textsl{goto} and other unconditional jumps to model the behavior of \textsl{throw} statements. Control flow inside \textsl{try-catch-finally} statements is simulated, both for explicit \textsl{throw} and those nested inside a method call. The base algorithm is presented, and then the proposal is detailed as changes. Unchecked exceptions are considered but regarded as ``worthless'' to include, due to the increase in size of the slices, which reduces their effectiveness as a debugging tool. This is due to the number of unchecked exceptions embedded in normal Java instructions, such as \texttt{NullPointerException} in any instance field access or method call, \texttt{IndexOutOfBoundsException} in array accesses and countless others. On top of that, handling \textsl{unchecked} exceptions opens the problem of calling an API for which there is no analyzable source code, either because the module was compiled beforehand or because it is part of a distributed system. The first should not be an obstacle, as class files can be easily decompiled. The only information that may be lost is variable names and comments, which \added{do not}\deleted{don't} affect a slice's precision, only its readability.
Chang and Jo\cite{JoC04} present an alternative to the CFG by computing exception-induced control flow separately from the traditional control flow computation, but go no further into the ramifications it entails for the PDG and the SDG.
Chang and Jo \cite{JoC04} present an alternative to the CFG by computing exception-induced control flow separately from the traditional control flow computation, but go no further into the ramifications it entails for the PDG and the SDG.
Jiang et al.\cite{JiaZSJ06} describes a solution specific for the exception system in C++, which differs from Java's implementation of exceptions. They reuse the idea of non-executable edges in \textsl{throw} nodes, and introduce handling \textsl{catch} nodes as a switch, each trying to catch the exception before deferring onto the next \textsl{catch} or propagating it to the calling method. Their proposal is center around the IECFG (Improved Exception Control-Flow Graph), which propagates control dependencies onto the PDG and then the SDG. Finally, in their SDG, each normal and exceptional return and their data output are connected to all \textsl{catch} statements where the data may have arrived, which is fine for the example they propose, but could be inefficient if the method has many different call nodes.
Jiang et al. \cite{JiaZSJ06} describe\deleted{s} a solution specific for the exception system in C++, which differs from Java's implementation of exceptions. They reuse the idea of non-executable edges in \textsl{throw} nodes, and introduce handling \textsl{catch} nodes as a switch, each trying to catch the exception before deferring onto the next \textsl{catch} or propagating it to the calling method. Their proposal is center\added{ed} around the IECFG (Improved Exception Control-Flow Graph), which propagates control dependencies onto the PDG and then the SDG. Finally, in their SDG, each normal and exceptional return and their data output are connected to all \textsl{catch} statements where the data may have arrived, which is fine for the example they propose, but could be inefficient if the method has many different call nodes.
Others\cite{PraMB11} have worked specifically on the C++ exception framework. \carlos{remove or expand}.
Others \cite{PraMB11} have worked specifically on the C++ exception framework. \carlos{remove or expand}.
Finally, Hao\cite{JieS11} introduced a Object-Oriented System Dependence Graph with exception handling (EOSDG), which represented a generic object-oriented language, with exception handling capabilities. Its broadness allows for the EOSDG to fit into both Java and C++. It uses concepts from Jiang\cite{JiaZSJ06}, such as cascading \textsl{catch} statements, while adding explicit support for virtual calls, polymorphism and inheritance.
Finally, Hao \cite{JieS11} introduced an Object-Oriented System Dependence Graph with exception handling (EOSDG), which represented a generic object-oriented language with exception handling capabilities. Its broadness allows for the EOSDG to fit into both Java and C++. It uses concepts from Jiang \cite{JiaZSJ06}, such as cascading \textsl{catch} statements, while adding explicit support for virtual calls, polymorphism and inheritance.
% TODO INCOMPLETE
\hrulefill
\marginnote{Alternative explanation of \cite{AllH03}, with counter example. Maybe should move the counter example backwards.}
In her paper, Horwitz suggests treating exceptions in the following way:
In her\josep{their?} paper \added{\cite{pending}}, Horwitz \josep{et al.?} suggests treating exceptions in the following way:
\begin{itemize}
\item Statements are divided into statements, predicates (loops and conditional blocks) and pseudo-predicates (return and throw statements). Statements only have one successor in the CFG, predicates have two (one when the condition is true and another when false), pseudo-predicates have two, but the one labeled ``false'' is non-executable. The non-executable edge connects to the statement that would be executed if the unconditional jump was replaced by a ``nop''.
\item \textsl{try-catch-finally} blocks are treated differently, but it has fewer dependencies than needed. Each catch block is control-dependent on any statement that may throw the corresponding exception. The
\item \textsl{try-catch-finally} blocks are treated differently, but the treatment yields fewer dependencies than needed. Each catch block is control-dependent on any statement that may throw the corresponding exception. The \josep{???}
\end{itemize}
\begin{lstlisting}[title=Example]
\josep{Create an example environment}
\begin{lstlisting}[title=Example]
void main() {
int x = 0;
while (true) {
@ -62,9 +63,9 @@ static class ExceptionB extends Exception {}
static class ExceptionC extends Exception {}
\end{lstlisting}
In this example we can explore all the errors found with the current state of the art.
In this example we can explore all the errors found with the current state of the art. \josep{It would be much clearer if we had a graph with the proposed solution for each problem.}
The first problem found is the lack of \texttt{catch} statements in the slice, as no edge is drawn from the catch. Some of the catch blocks will be included via data dependencies, but some may not be reached, though they are still necessary if the slice includes anything after a caught exception.
Therefore, an extra control dependency must be introduced, in order to always include a ``catch'' statement in the slice if the ``throw'' statement is in the slice. In the example, only the catch statement from line 20 will be included, and if ExceptionC or ExceptionB were thrown, they would not be caught. That would not be a problem if the function $f$ was not executed again, but it is, making the slice incorrect.
Therefore, an extra control dependency must be introduced, in order to always include a ``catch'' statement in the slice if the ``throw'' statement is in the slice. In the example, only the catch statement from line 20 will be included \josep{with respect to which criterion? you have not defined the example. The reader does not know how to interpret this figure}, and if ExceptionC or ExceptionB were thrown, they would not be caught. That would not be a problem if the function $f$ was not executed again, but it is, making the slice incorrect.
% vim: set noexpandtab:ts=2:sw=2:wrap