diff --git a/Secciones/background.tex b/Secciones/background.tex index fb283f0..df90fd0 100644 --- a/Secciones/background.tex +++ b/Secciones/background.tex @@ -11,13 +11,13 @@ \carlos{el resto, utilizar surveys (Tip95, Sil12)} \\ \carlos{mover párrafo a la intro, aquí poner definiciones formales de program slicing, citar a \cite{AgrH90b}} -\textsl{Program slicing} \cite{Wei81,Sil12}\sergio{hay alguna razon para que \cite{Sil12} no este en la intro?, la unica cita alli es\cite{Wei81}. Propongo eliminar \cite{Sil12} por homogeneidad}\josep{mas bien, tendria que estar 13 tambi\'en en la intro} is a debugging technique that +\textit{Program slicing} \cite{Wei81,Sil12}\sergio{hay alguna razon para que \cite{Sil12} no este en la intro?, la unica cita alli es\cite{Wei81}. Propongo eliminar \cite{Sil12} por homogeneidad}\josep{mas bien, tendria que estar 13 tambi\'en en la intro} is a debugging technique that answers the question: ``which parts of a program \josep{do?} affect a given statement and set of variables?'' The statement and the variables are the basic input to create a slice -and are called the \textsl{slicing criterion}. The criterion can be more +and are called the \textit{slicing criterion}. The criterion can be more complex, as different slicing techniques may require additional pieces of input. -The \textsl{slice} of a program is the list of statements from the original +The \textit{slice} of a program is the list of statements from the original program ---which constitutes a valid program--- whose execution will result in the same values for the variables \josep{frase enrrevesada. yo la. cambiaria. De todas formas, para que sea correcta le sobran los parentesis }(selected in the slicing criterion). 
There exist two fundamental dimensions along which the problem of slicing can be @@ -26,34 +26,34 @@ proposed \cite{Sil12}: \sergio{Mi propuesta es mover el concepto naive de aqui a la intro para que entiendan algo del ejemplo y aqui hacer referencia a la definicion anterior o introducir las dimensiones de slicing directamente con un pequenyo preambulo. Una fuerte razon para definirlo alli es que usamos todo el rato la palabra slice y de repente, despues de usarla un rato, la definimos.} \begin{itemize} - \item \textsl{Static} or \textsl{dynamic}: slicing can be performed + \item \textit{Static} or \textit{dynamic}: slicing can be performed statically or dynamically. - \textsl{Static slicing} \cite{Wei81} produces slices which\josep{that} consider all + \textit{Static slicing} \cite{Wei81} produces slices which\josep{that} consider all possible executions of the program: the slice will be correct regardless of the input supplied. - In contrast, \textsl{dynamic slicing} \cite{KorL88,AgrH90b} considers a single execution of the program, thus, limiting the slice to + In contrast, \textit{dynamic slicing} \cite{KorL88,AgrH90b} considers a single execution of the program, thus, limiting the slice to the statements present in an execution log. The slicing criterion is expanded to include a position in the log\josep{execution history} that corresponds to one instance of the selected statement, making it much more specific. It may help \josep{to}find a bug related to indeterministic behavior (such as a random or pseudo-random number generator), but \sergio{, despite selecting the same slicing criterion, the slice }must be recomputed for each case\sergio{different input value/execution considered?} being analyzed. - \item \textsl{Backward} or \textsl{forward}: \textsl{backward slicing} + \item \textit{Backward} or \textit{forward}: \textit{backward slicing} \cite{Wei81} is generally more used \sergio{habra que decir lo que es antes de decir que se usa mas no? 
Cambiar el orden y reescribir esta frase. Decimos que es y luego que es el que generalmente se estudia o algo de eso}, because it looks at the statements - that affect the slicing criterion. In contrast, \textsl{forward slicing} + that affect the slicing criterion. In contrast, \textit{forward slicing} \cite{BerC85} computes the statements that are affected by the slicing - criterion. There also exists a mixed approach called \textsl{chopping} + criterion. There also exists a mixed approach called \textit{chopping} \cite{JacR94}, which is used to find all statements that affect some variables in the slicing criterion and at the same time they are affected by some other variables in the slicing criterion. \end{itemize} Since the definition of program slicing\sergio{Since Weiser defined program slicing in 1981}, the most \deleted{extended form}\added{studied configuration?} of slicing has -been \textsl{static backward slicing}, which obtains the list of statements that +been \textit{static backward slicing}, which obtains the list of statements that affect the value of a variable in a given statement, in all possible executions of the program (i.e., for any input data). \begin{definition}[Strong static backward slice \cite{Wei81}] \label{def:strong-slice} Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where $s$ is a statement and $v$ is a set\sergio{los set no se representan con letras mayusculas?} \carlos{no} of variables in $P$ (the variables may - or may not be used in $s$), $S$ is the \textsl{strong slice} of $P$ with + or may not be used in $s$), $S$ is the \textit{strong slice} of $P$ with respect to $C$ if $S$ has\sergio{fulfils?} the following properties: \begin{enumerate} \item $S$ is an executable program. @@ -72,7 +72,7 @@ of the program (i.e., for any input data). 
\josep{If that is not the right citation, you can use Binkley's: \url{https://cgi.csc.liv.ac.uk/~coopes/comp319/2016/papers/ProgramSlicing-Binkley+Gallagher.pdf}} Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where $s$ is a statement and $v$ is a set of variables in $P$ (the variables may - or may not be used in $s$), $S$ is the \textsl{weak slice} of $P$ with + or may not be used in $s$), $S$ is the \textit{weak slice} of $P$ with respect to $C$ if $S$ has\sergio{fulfils?} the following properties: \begin{enumerate} \item $S$ is an executable program. @@ -93,8 +93,8 @@ used throughout the literature (see, e.g., \cite{pending}\carlos{Which citation? second. Though the definitions come from the corresponding citations, the naming was first used in a control dependency analysis by Danicic~\cite{DanBHHKL11}, where slices that produce the same output as the original are named -\textsl{strong}, and those where the original is a prefix of the slice, -\textsl{weak}. Weak slicing tends to be preferred ---especially for debugging--- for two reasons: the algorithm can be simpler and avoid dealing with termination \josep{Termination is not considered in either weak or strong slices. Rather, say that what matters in debugging is that the error occurs. In general, it does not matter how many times it occurs or whether it keeps occurring afterwards.}, and the slices can be smaller, narrowing the focus of the debugger. For some applications, \deleted{strong slices are preferred,} such as extracting a \josep{component or a specialized program}feature from a program, where there is a requirement that the resulting slice behave\josep{s} exactly like\josep{as} the original\added{, strong slices are preferred\josep{This is too far away by now. I would split the sentence in two.}}. In this paper we will \josep{Along the thesis, we indicate} indicate which kind of slice is produced with each new technique proposed. \sergio{Do we ever generate strong slices? 
Damn, we are good xD} +\textit{strong}, and those where the original is a prefix of the slice, +\textit{weak}. Weak slicing tends to be preferred ---especially for debugging--- for two reasons: the algorithm can be simpler and avoid dealing with termination \josep{Termination is not considered in either weak or strong slices. Rather, say that what matters in debugging is that the error occurs. In general, it does not matter how many times it occurs or whether it keeps occurring afterwards.}, and the slices can be smaller, narrowing the focus of the debugger. For some applications, \deleted{strong slices are preferred,} such as extracting a \josep{component or a specialized program}feature from a program, where there is a requirement that the resulting slice behave\josep{s} exactly like\josep{as} the original\added{, strong slices are preferred\josep{This is too far away by now. I would split the sentence in two.}}. In this paper we will \josep{Along the thesis, we indicate} indicate which kind of slice is produced with each new technique proposed. \sergio{Do we ever generate strong slices? Damn, we are good xD} \begin{example}[Strong, weak and incorrect slices] \carlos{The table is labeled execution logs of... but the execution log is a different thing.} @@ -152,9 +152,9 @@ dependencies among nodes. Those edges represent various\sergio{several} kinds of ---control, data, calls, parameter passing, summary--- which\josep{that are defined} will be defined\sergio{further explained?} in section~\ref{sec:first-def-sdg}. 
\carlos{add how a graph is sliced.} -To create the SDG, first \josep{yo dejaria el a (como estaba)}\deleted{a}\added{the corresponding} \textsl{control flow graph} (CFG) is built for each method +To create the SDG, first \josep{yo dejaria el a (como estaba)}\deleted{a}\added{the corresponding} \textit{control flow graph} (CFG) is built for each method in the program, then\added{,} its \added{associated }control and data dependencies are computed, resulting -in \added{a new graph representation known as }the \textsl{program dependence graph} (PDG)\sergio{cita??}\josep{si, a Ottenstein and Ottenstein}\carlos{TENSTEIN, K. J., AND O’ITENSTEIN, L. M. The program dependence graph in a software development environment}. Finally, all the graphs from every +in \added{a new graph representation known as }the \textit{program dependence graph} (PDG)\sergio{cita??}\josep{si, a Ottenstein and Ottenstein}\carlos{TENSTEIN, K. J., AND O’ITENSTEIN, L. M. The program dependence graph in a software development environment}. Finally, all the graphs from every method are joined \carlos{NO by the appearance of a new kind of inter-procedural arcs, the argument-in argument-out arcs that link function definitions with function calls, obtaining}\deleted{into} the \added{final} SDG. This process will be explained at greater lengths in section~\ref{sec:first-def-sdg}. @@ -178,7 +178,7 @@ lengths in section~\ref{sec:first-def-sdg}. %each method's PDG. When a call is made, the input arguments are passed %to subnodes of the call, and the result is obtained in another subnode. %There is an edge from the call to the beginning of the corresponding - %method, and an extra type of edge exists: \textsl{summary edges}, which + %method, and an extra type of edge exists: \textit{summary edges}, which %summarize the data dependencies between input and output variables. 
%\end{description} An example \added{of how an initial CFG is augmented and enhanced with all mentioned dependencies obtaining the corresponding PDG and the final SDG} is provided in figure~\ref{fig:basic-graphs}, where a \added{the process is illustrated for a} simple @@ -290,42 +290,42 @@ consists of the following elements:} \begin{description} \item[Throwable.] A class that encompasses all the exceptions or errors that may be thrown. Its child classes are \texttt{Error} for internal errors in the Java Virtual Machine and \texttt{Exception} for normal errors. - Exceptions can be classified as \textsl{unchecked} + Exceptions can be classified as \textit{unchecked} (those that extend \texttt{RuntimeException}\sergio{this overflows the line because of the texttt} or \texttt{Error}) and - \textsl{checked} (all others, may inherit from \texttt{Throwable}, but typically they do so from \texttt{Exception}). The first kind may be thrown anywhere without warning, whereas + \textit{checked} (all others, may inherit from \texttt{Throwable}, but typically they do so from \texttt{Exception}). The first kind may be thrown anywhere without warning, whereas the second, if thrown, must be either caught in the same method or declared in the method header. \item[throw.] A statement that activates an exception, altering the normal - control-flow of the method. If the statement is inside a \textsl{try} - block with a \textsl{catch} clause for its type or any supertype, the + control-flow of the method. If the statement is inside a \textit{try} + block with a \textit{catch} clause for its type or any supertype, the control flow will continue in the first statement of such clause. Otherwise, the method is exited and the check performed again, until either the exception is caught or the last method in the stack - (\textsl{main}) is popped, and the execution of the program ends + (\textit{main}) is popped, and the execution of the program ends abruptly. \item[try.] 
This statement is followed by a block of statements and by one - or more \textsl{catch} clauses. All exceptions thrown in the statements + or more \textit{catch} clauses. All exceptions thrown in the statements contained or any methods called will be processed by the list of - catches. Optionally, after the \textsl{catch} clauses a \textsl{finally} + catches. Optionally, after the \textit{catch} clauses a \textit{finally} block may appear. \item[catch.] Contains two elements: a variable declaration (the type must be an exception \sergio{exception or exception type?}) and a block of statements to be executed when an exception of the corresponding type (or a subtype) is thrown. - \textsl{catch} clauses are processed sequentially, and if any matches + \textit{catch} clauses are processed sequentially, and if any matches the type of the thrown exception, its block is executed, and the rest are ignored. Variable declarations may be of multiple types \texttt{(T1|T2 exc)}, when two unrelated types of exception must be caught and the same code executed for both. When there is an inheritance relationship, the parent suffices.\footnotemark \item[finally.] Contains a block of statements that will always be executed - if the \textsl{try} is entered. It is used to tidy up, for example - closing I/O streams. The \textsl{finally} can be reached in two ways: - with an exception pending (thrown in \textsl{try} and not captured by - any \textsl{catch} or thrown inside a \textsl{catch}) or without it - (when the \textsl{try} or \textsl{catch} block ends successfully). After + if the \textit{try} is entered. It is used to tidy up, for example + closing I/O streams. The \textit{finally} can be reached in two ways: + with an exception pending (thrown in \textit{try} and not captured by + any \textit{catch} or thrown inside a \textit{catch}) or without it + (when the \textit{try} or \textit{catch} block ends successfully). 
After the last instruction of the block is executed, if there is an exception - pending, control will be passed to the corresponding \textsl{catch} or + pending, control will be passed to the corresponding \textit{catch} or the program will end. Otherwise, the execution continues in the next - statement after the \textsl{try-catch-finally} block. + statement after the \textit{try-catch-finally} block. \end{description} \sergio{Me han molao las explicaciones, se entiende muy bien como funciona Java, parece que sea hasta facil de usar :D} diff --git a/Secciones/incremental_slicing.tex b/Secciones/incremental_slicing.tex index 96bbceb..3c389ed 100644 --- a/Secciones/incremental_slicing.tex +++ b/Secciones/incremental_slicing.tex @@ -76,7 +76,7 @@ examples, \added{data and control dependencies are represented by thin solid red \begin{definition}[Program dependence graph] \label{def:pdg} - \josep{Given a program $P$,} The \textsl{program dependence graph} (PDG) \josep{associated with $P$} is a directed graph (and originally a tree\sergio{???}\josep{sobran las aclaraciones historicas en una definicion}) represented by \josep{a triple $\langle N, E_c, E_d \rangle$ where $N$ is...} three elements: a set of nodes $N$, a set of control edges $E_c$ and a set of data edges $E_d$. \sergio{$PDG = \langle N, E_c, E_d \rangle$} + \josep{Given a program $P$,} The \textit{program dependence graph} (PDG) \josep{associated with $P$} is a directed graph (and originally a tree\sergio{???}\josep{sobran las aclaraciones historicas en una definicion}) represented by \josep{a triple $\langle N, E_c, E_d \rangle$ where $N$ is...} three elements: a set of nodes $N$, a set of control edges $E_c$ and a set of data edges $E_d$. \sergio{$PDG = \langle N, E_c, E_d \rangle$} Method $M$, CFG $C = \langle N, E \rangle$, the PDG is $P = \langle N', E_c, E_d \rangle$, where % $$E_c = \{ (a, b) | a, b \in N' \wedge a \ctrldep b\}$$ @@ -105,7 +105,7 @@ program. 
Given a program $P$ composed of a set of $n+1$ methods $M = \{m_0, \dots, m_n\}$ and their associated PDGs (each method $m_i$ has a PDG $G_{PDG}^i = \langle N^i, E_c^i, E_d^i \rangle$), the \textit{system dependence graph} (SDG) of $P$ is a graph $G = \langle N', E'_c, E'_d, E_{fc}, E_s \rangle$ where $N' = \bigcup_{i=0}^n N^i$, $ $, $ $, $ $, and $ $. \josep{Arreglar esta definicion como la del PDG. Ahora mismo es totalmente informal. Deberia definirse encima del PDG. Es decir, una SDG es la conexion adecuada de varios PDGs, uno por método. Y solo definir lo nuevo: call arcs, parameter-in arcs, parameter-out arcs y summary arcs.} - The \textsl{system dependence graph} (SDG) is a directed graph that represents the control and data dependencies of a whole program. It has three kinds of edges: control, data and function call. The graph is built combining multiple PDGs, with the ``Start'' nodes labeled after the function they begin. There exists one function call edge between each node containing one or more calls and each of the ``Start'' node\josep{s} of the method called. In a programming language where the function call is ambiguous (e.g. with pointers or polymorphism), there exists one edge leading to every possible function called.\sergio{Esta definicion ha quedado muy informal no? Donde han quedado los $E_c,~E_d,~E_{fc},$ Nodes del PDG...?} + The \textit{system dependence graph} (SDG) is a directed graph that represents the control and data dependencies of a whole program. It has three kinds of edges: control, data and function call. The graph is built combining multiple PDGs, with the ``Start'' nodes labeled after the function they begin. There exists one function call edge between each node containing one or more calls and each of the ``Start'' node\josep{s} of the method called. In a programming language where the function call is ambiguous (e.g. 
with pointers or polymorphism), there exists one edge leading to every possible function called.\sergio{Esta definicion ha quedado muy informal no? Donde han quedado los $E_c,~E_d,~E_{fc},$ Nodes del PDG...?} \end{definition} \begin{example}[Creation of a SDG from a simple program] diff --git a/Secciones/motivation.tex b/Secciones/motivation.tex index d860499..2d70e98 100644 --- a/Secciones/motivation.tex +++ b/Secciones/motivation.tex @@ -16,7 +16,7 @@ left of it are those that affect or are affected by the values of the selected v \sergio{Se me hace corta esta definicion y me faltan algunas utilidades del program slicing, por que se usa? Realmente no se usa solo en depuracion. Tiene mas usos, esto ademas da referencias a poner si queremos.} \sergio{Carpeta SAC 2017 (paper-poster 3 paginas): ``Program slicing is a technique for program analysis and transformation whose main objective is to extract from a program those statements (the slice) that -influence or are influenced by the values of one or more variables at some point of interest, often called slicing criterion [13, 12, 1, 9]. This technique has been adapted to practically all programming languages, and it has many applications such as debugging [3], program specialization [8], software maintenance [5], code obfuscation [7], etc.".} +influence or are influenced by the values of one or more variables at some point of interest, often called slicing criterion [13, 12, 1, 9]. This technique has been adapted to practically all programming languages, and it has many applications such as debugging [3], program specialization [8], software maintenance [5], code obfuscation [7], etc.''.} \sergio{Cogeria algo de aqui para hacer una definicion mas completa, ademas ya usamos terminologia de slicing como \textit{slice} y \textit{slicing criterion}.} \josep{De acuerdo con Sergio. Un par de cosas más: Entra muy a saco la introducción con una definición. 
:-) Por otra parte, tal y como está definido (para el lector profano), parece que un slice es todo lo que afecta O es afectado por el slicing criterion. Es decir, como si el "O" formara parte de la definición. Yo hablaría aquí solo de backward slicing, y dejaría forward para luego (igual que has dejado dynamic para luego).} diff --git a/Secciones/problem_solution.tex b/Secciones/problem_solution.tex index 1bfc7d4..b0a91af 100644 --- a/Secciones/problem_solution.tex +++ b/Secciones/problem_solution.tex @@ -366,24 +366,24 @@ Our solution makes slices complete again, but makes them much less correct. As a % \begin{description} % \item[Step 1 (static analysis):] Identify for each instruction the variables read and defined. Each method is annotated with the global variables that they access or modify. -% \item[Step 2 (build CFGs):] Build a CFG for each method of the program. The start of all methods is a vertex labeled \textsl{enter}, which also contains the assignments for parameters and global variables used (\texttt{var = var\_in}). The \textsl{enter} node is connected to the first instruction of the method. In a similar fashion, all methods end in an \textsl{exit} vertex with the corresponding output variables. There exists one \textsl{normal exit} to which the last instruction and all return instructions are connected. If the method can throw any exceptions, there exists one \textsl{error exit} for each type of exception that may be thrown. The normal and erroneous exits are connected to the \textsl{exit} node. +% \item[Step 2 (build CFGs):] Build a CFG for each method of the program. The start of all methods is a vertex labeled \textit{enter}, which also contains the assignments for parameters and global variables used (\texttt{var = var\_in}). The \textit{enter} node is connected to the first instruction of the method. In a similar fashion, all methods end in an \textit{exit} vertex with the corresponding output variables. 
There exists one \textit{normal exit} to which the last instruction and all return instructions are connected. If the method can throw any exceptions, there exists one \textit{error exit} for each type of exception that may be thrown. The normal and erroneous exits are connected to the \textit{exit} node. -% Every normal statement is connected to the subsequent one by an unlabeled edge. Predicates have two outgoing edges, labeled \textsl{true} and \textsl{false}. Pseudo-predicates also have two outgoing edges. The \textsl{true} edge is connected to the destination of the jump (\textsl{normal exit} in the case of return, the begin or end of the loop in the case of continue and break, etc.). The \textsl{false} edge is a non-executable edge, marked with a dashed line, and it is connected to the next instruction that would be executed if the pseudo-predicate was a \textsl{nop}. +% Every normal statement is connected to the subsequent one by an unlabeled edge. Predicates have two outgoing edges, labeled \textit{true} and \textit{false}. Pseudo-predicates also have two outgoing edges. The \textit{true} edge is connected to the destination of the jump (\textit{normal exit} in the case of return, the begin or end of the loop in the case of continue and break, etc.). The \textit{false} edge is a non-executable edge, marked with a dashed line, and it is connected to the next instruction that would be executed if the pseudo-predicate was a \textit{nop}. -% Nodes that represent a call to a method $M$ include the transfer of parameters and variables that may be read or written to, then execute the call, and finally the extraction of modified variables. Call nodes are an exception to the previous paragraph, as they can have an unlimited amount of outgoing edges. Each outgoing edge lands on a pseudo-predicate which indicates if the execution was correct or an exception was raised. 
The executable edge of each pseudo-predicate will lead to the next instruction to be executed, whereas the non-executable one will lead to the end of the try-catch block. All call nodes can lead to a \textsl{normal return} node, which is linked to the next instruction, and one error node for each type of exception that may be thrown. The erroneous returns are labeled \textsl{catch ExType}, and lead to the first instruction in the corresponding catch block\footnotemark. Any exception that may not be caught will lead to the erroneous exit node of the method it's in. See the example for more details. +% Nodes that represent a call to a method $M$ include the transfer of parameters and variables that may be read or written to, then execute the call, and finally the extraction of modified variables. Call nodes are an exception to the previous paragraph, as they can have an unlimited amount of outgoing edges. Each outgoing edge lands on a pseudo-predicate which indicates if the execution was correct or an exception was raised. The executable edge of each pseudo-predicate will lead to the next instruction to be executed, whereas the non-executable one will lead to the end of the try-catch block. All call nodes can lead to a \textit{normal return} node, which is linked to the next instruction, and one error node for each type of exception that may be thrown. The erroneous returns are labeled \textit{catch ExType}, and lead to the first instruction in the corresponding catch block\footnotemark. Any exception that may not be caught will lead to the erroneous exit node of the method it's in. See the example for more details. % \footnotetext{A problem presents itself here, as some exceptions may be able to trigger different catch blocks, due to the sequential nature of catches and polymorphism in Java. A way to fix this is to make catch blocks behave as a switch.}. %TODO % \item[Step 3 (compute dependences):] For each node in the CFG, compute the control and data dependencies. 
Non-executable edges are only included when computing control dependencies.\\ % \carlos{put inside definition} -% A node $a$ is \textsl{control dependent} on node $b$ iff $a$ post-dominates one but not all of $b$'s successors.\\ -% A node $a$ is \textsl{data dependent} on node $b$ iff $b$ defines or may define a variable $x$, $a$ uses or may use $x$, and there is an $x$-definition-free path in the CFG from $b$ to $a$.\\ -% \item[Step 4 (convert each CFG into a PDG):] each node of the CFG is one node of the PDG, with two exceptions. The first are the \textsl{enter}, \textsl{exit} and method call nodes, where the variable input and output assignments are split and placed as control-dependent on their original node. The second is the \textsl{exit} node, which is to be removed (the control-dependencies from \textsl{exit} to the variable outputs is transferred to the \textsl{enter} node). Then all the dependencies computed in the previous step are drawn. -% \item[Step 5 (connect PDGs to form a SDG):] each method call to $M$ must be connected to the \textsl{enter} node in $M$'s PDG, as a control dependence. Each variable input from the method call is connected to a variable input of the method definition via a data dependence. Each variable output from the method definition is connected to the variable output of the method call via a data dependence. Each method exit is connected \carlos{complete}. +% A node $a$ is \textit{control dependent} on node $b$ iff $a$ post-dominates one but not all of $b$'s successors.\\ +% A node $a$ is \textit{data dependent} on node $b$ iff $b$ defines or may define a variable $x$, $a$ uses or may use $x$, and there is an $x$-definition-free path in the CFG from $b$ to $a$.\\ +% \item[Step 4 (convert each CFG into a PDG):] each node of the CFG is one node of the PDG, with two exceptions. 
The first are the \textit{enter}, \textit{exit} and method call nodes, where the variable input and output assignments are split and placed as control-dependent on their original node. The second is the \textit{exit} node, which is to be removed (the control-dependencies from \textit{exit} to the variable outputs is transferred to the \textit{enter} node). Then all the dependencies computed in the previous step are drawn. +% \item[Step 5 (connect PDGs to form a SDG):] each method call to $M$ must be connected to the \textit{enter} node in $M$'s PDG, as a control dependence. Each variable input from the method call is connected to a variable input of the method definition via a data dependence. Each variable output from the method definition is connected to the variable output of the method call via a data dependence. Each method exit is connected \carlos{complete}. % \end{description} % \begin{itemize} -% \item An extra type of control dependency represented by an ``exception edge''. It will represent the need to include a \textsl{catch} clause when an exception can be thrown. It is represented with a dotted line (dashed line is for data dependency). These edges have a special characteristic: when one is traversed, only ``exception edges'' may be traversed from the new nodes included in the slice. If the same node is reached by another kind of edge, the restriction is lifted. The behavior is documented in algorithm \ref{alg:2pass}, where changes from the original algorithm are \underline{underlined}. +% \item An extra type of control dependency represented by an ``exception edge''. It will represent the need to include a \textit{catch} clause when an exception can be thrown. It is represented with a dotted line (dashed line is for data dependency). These edges have a special characteristic: when one is traversed, only ``exception edges'' may be traversed from the new nodes included in the slice. 
If the same node is reached by another kind of edge, the restriction is lifted. The behavior is documented in algorithm \ref{alg:2pass}, where changes from the original algorithm are \underline{underlined}. % \item Add an extra ``exception edge'' from each ``exit with exception of type T'' node, where the type of the exception is \texttt{T} to all the corresponding ``\texttt{throw e}'', such that \texttt{e} is or inherits from \texttt{T}. % \item Add an extra ``exception edge'' from each catch statement to every statement that can throw that error. % \item The exception edges will only be placed when the method or the try-catch statement are loop-carrier\footnote{Loop-carrier, when referring to a statement, is the property that in a CFG for the complete program, the node representing the statement is part of a loop, meaning that it could be executed again once it is executed.}. diff --git a/Secciones/state_of_the_art.tex b/Secciones/state_of_the_art.tex index ad0ad24..d3ab8ad 100644 --- a/Secciones/state_of_the_art.tex +++ b/Secciones/state_of_the_art.tex @@ -7,15 +7,15 @@ Slicing was proposed \cite{Wei81} and improved until the proposal of the current system (the SDG) \carlos{(citation)}. Specifically in the context of exceptions, multiple approaches have been attempted, with varying degrees of success. There exist commercial solutions for various programming languages: \carlos{name them and link}. In the realm of academia, there exists no definitive solution. One of the most relevant initial proposal\added{s} is \cite{AllH03}, although it is not the first one \cite{SinH98,SinHR99} to target Java specifically. -It uses the existing proposals for \textsl{return}, \textsl{goto} and other unconditional jumps to model the behavior of \textsl{throw} statements. Control flow inside \textsl{try-catch-finally} statements is simulated, both for explicit \textsl{throw} and those nested inside a method call. The base algorithm is presented, and then the proposal is detailed as changes. 
Unchecked exceptions are considered but regarded as ``worthless'' to include, due to the increase in size of the slices, which reduces their effectiveness as a debugging tool. The increase stems from the number of unchecked exceptions embedded in normal Java instructions, such as \texttt{NullPointerException} in any instance field or method access, \texttt{IndexOutOfBoundsException} in array accesses and countless others. On top of that, handling \textsl{unchecked} exceptions opens the problem of calling an API for which there is no analyzable source code, either because the module was compiled beforehand or because it is part of a distributed system. The former should not be an obstacle, as class files can be easily decompiled. The only information that may be lost is variable names and comments, which \added{do not}\deleted{don't} affect a slice's precision, only its readability.
+It uses the existing proposals for \textit{return}, \textit{goto} and other unconditional jumps to model the behavior of \textit{throw} statements. Control flow inside \textit{try-catch-finally} statements is simulated, both for explicit \textit{throw} and those nested inside a method call. The base algorithm is presented, and then the proposal is detailed as a set of changes to it. Unchecked exceptions are considered but regarded as ``worthless'' to include, due to the increase in size of the slices, which reduces their effectiveness as a debugging tool. The increase stems from the number of unchecked exceptions embedded in normal Java instructions, such as \texttt{NullPointerException} in any instance field or method access, \texttt{IndexOutOfBoundsException} in array accesses and countless others. On top of that, handling \textit{unchecked} exceptions opens the problem of calling an API for which there is no analyzable source code, either because the module was compiled beforehand or because it is part of a distributed system. The former should not be an obstacle, as class files can be easily decompiled.
The only information that may be lost is variable names and comments, which \added{do not}\deleted{don't} affect a slice's precision, only its readability.
Chang and Jo \cite{JoC04} present an alternative to the CFG by computing exception-induced control flow separately from the traditional control flow computation, but go no further into the ramifications it entails for the PDG and the SDG.
-Jiang et al. \cite{JiaZSJ06} describe\deleted{s} a solution specific to the exception system in C++, which differs from Java's implementation of exceptions. They reuse the idea of non-executable edges in \textsl{throw} nodes, and introduce the handling of \textsl{catch} nodes as a switch, each trying to catch the exception before deferring to the next \textsl{catch} or propagating it to the calling method. Their proposal is center\added{ed} around the IECFG (Improved Exception Control-Flow Graph), which propagates control dependencies onto the PDG and then the SDG. Finally, in their SDG, each normal and exceptional return and their data output are connected to all \textsl{catch} statements where the data may arrive, which is fine for the example they propose, but could be inefficient if the method has many different call nodes.
+Jiang et al. \cite{JiaZSJ06} describe\deleted{s} a solution specific to the exception system in C++, which differs from Java's implementation of exceptions. They reuse the idea of non-executable edges in \textit{throw} nodes, and introduce the handling of \textit{catch} nodes as a switch, each trying to catch the exception before deferring to the next \textit{catch} or propagating it to the calling method. Their proposal is center\added{ed} around the IECFG (Improved Exception Control-Flow Graph), which propagates control dependencies onto the PDG and then the SDG.
Finally, in their SDG, each normal and exceptional return and their data output are connected to all \textit{catch} statements where the data may arrive, which is fine for the example they propose, but could be inefficient if the method has many different call nodes.
Others \cite{PraMB11} have worked specifically on the C++ exception framework. \carlos{remove or expand}.
-Finally, Hao \cite{JieS11} introduced an Object-Oriented System Dependence Graph with exception handling (EOSDG), which represents a generic object-oriented language with exception handling capabilities. Its generality allows the EOSDG to fit both Java and C++. It uses concepts from Jiang \cite{JiaZSJ06}, such as cascading \textsl{catch} statements, while adding explicit support for virtual calls, polymorphism and inheritance.
+Finally, Hao \cite{JieS11} introduced an Object-Oriented System Dependence Graph with exception handling (EOSDG), which represents a generic object-oriented language with exception handling capabilities. Its generality allows the EOSDG to fit both Java and C++. It uses concepts from Jiang \cite{JiaZSJ06}, such as cascading \textit{catch} statements, while adding explicit support for virtual calls, polymorphism and inheritance.
% TODO INCOMPLETE
@@ -25,7 +25,7 @@ Finally, Hao \cite{JieS11} introduced a Object-Oriented System Dependence Graph
In her\josep{their?} paper \added{\cite{pending}}, Horwitz \josep{et al.?} suggests treating exceptions in the following way:
\begin{itemize}
\item Statements are divided into plain statements, predicates (loops and conditional blocks) and pseudo-predicates (return and throw statements). Plain statements have only one successor in the CFG, predicates have two (one when the condition is true and another when false), and pseudo-predicates also have two, but the one labeled ``false'' is non-executable. The non-executable edge connects to the statement that would be executed if the unconditional jump were replaced by a ``nop''.
- \item \textsl{try-catch-finally} blocks are treated differently, but they end up with fewer dependencies than needed. Each catch block is control-dependent on any statement that may throw the corresponding exception. The \josep{???}
+ \item \textit{try-catch-finally} blocks are treated differently, but they end up with fewer dependencies than needed. Each catch block is control-dependent on any statement that may throw the corresponding exception. The \josep{???}
\end{itemize}
\josep{Create an example environment}
diff --git a/paper.pdf b/paper.pdf
index 3312b68..a07d7c9 100644
Binary files a/paper.pdf and b/paper.pdf differ