diff --git a/Makefile b/Makefile index a5f5e16..23beb5b 100644 --- a/Makefile +++ b/Makefile @@ -4,6 +4,8 @@ images: $(MAKE) -C img paper.pdf: paper.tex images + pdflatex -synctex=1 -interaction=nonstopmode paper.tex + bibtex paper.aux pdflatex -synctex=1 -interaction=nonstopmode paper.tex pdflatex -synctex=1 -interaction=nonstopmode paper.tex diff --git a/bibliography.tex b/bibliography.tex index 253f4d9..a7e43c1 100644 --- a/bibliography.tex +++ b/bibliography.tex @@ -1,39 +1,17 @@ +% !TeX encoding = UTF-8 +% !TeX spellcheck = en_US +% !TeX root = paper.tex \begin{thebibliography}{99} -\bibitem{weiser79} - Mark D. Weiser. - \textsl{Program Slices: Formal, Psychological, and Practical Investigations of an Automatic Program Abstraction Method.} - 1979. - -\bibitem{sinha98} - Saurabh Sinha, Mary Jean Harrold. - \textsl{Analysis of Programs with Exception-Handling Constructs.} - 1998. - -\bibitem{sinha99} - Saurabh Sinha, Mary Jean Harrold, Gregg Rothermel. - \textsl{System-Dependence-Graph-Based Slicing of Programs With Arbitrary Interprocedural Control Flow.} - 1999. - \bibitem{sinha00} Saurabh Sinha, Mary Jean Harrold. \textsl{Analysis and Testing of Programs with Exception-Handling Constructs.} 2000. -\bibitem{allen03} - Mathew Allen, Susan Horwitz. - \textsl{Slicing Java Programs That Throw and Catch Exceptions.} - 2003. - \bibitem{jo04} Jang-Wu Jo, Byeong-Mo Chang. \textsl{Constructing Control Flow Graph for Java by Decoupling Exception Flow from Normal Flow.} 2004. -\bibitem{jiang06} - Shujuan Jiang, Shengwu Zhou, Yuqin Shi, Yuanpeng Jiang. - \textsl{Improving the Preciseness of Dependence Analysis using Exception Analysis.} - 2006. - \bibitem{jiang07} Shujuan Jiang, Yuanpeng Jiang. \textsl{An Analysis Approach for Testing Exception Handling Programs.} @@ -59,20 +37,10 @@ \textsl{Static Analysis for Java Exception Propagation Structure.} 2010. -\bibitem{prabhu11} - Prakash Prabhu, Naoto Maeda, Gogul Balakrishnan, Franjo Ivančić, Aarti Gupta. 
- \textsl{Interprocedural Exception Analysis for C++.} - 2011. - -\bibitem{jie11} - Hao Jie, Jiang Shu-juan. - \textsl{An Approach of Slicing for Object-oriented Language with Exception Handling.} - 2011. - \bibitem{chang15} Byeong-Mo Chang, Kwanghoon Choi. \textsl{A review on exception analysis.} 2015. %\bibitem{citekey} -\end{thebibliography} \ No newline at end of file +\end{thebibliography} diff --git a/incremental_slicing.tex b/incremental_slicing.tex index 3ef7fc7..d594b87 100644 --- a/incremental_slicing.tex +++ b/incremental_slicing.tex @@ -1,8 +1,11 @@ +% !TeX encoding = UTF-8 +% !TeX spellcheck = en_US +% !TeX root = paper.tex \chapter{Main explanation?} \section{First definition of the SDG} -The system dependence graph (SDG) is a method for program slicing that was first proposed by Horwitz, Reps and Blinkey \cite{horwitz90}. It builds upon the existing control flow graph (CFG), defining dependencies between vertices of the CFG, and building a program dependence graph (PDG), which represents them. The system dependence graph (SDG) is then build from the assembly of the different PDGs (each representing a method of the program), linking each method call to its corresponding definition. Because each graph is built from the previous one, new constructs can be added with to the CFG, without the need to alter the algorithm that converts CFG to PDG and then to SDG. The only modification possible is the redefinition of a dependency or the addition of new kinds of dependence. +The system dependence graph (SDG) is a method for program slicing that was first proposed by Horwitz, Reps and Binkley \cite{HorwitzRB88}. It builds upon the existing control flow graph (CFG), defining dependencies between vertices of the CFG, and building a program dependence graph (PDG), which represents them. 
The system dependence graph (SDG) is then built from the assembly of the different PDGs (each representing a method of the program), linking each method call to its corresponding definition. Because each graph is built from the previous one, new constructs can be added to the CFG, without the need to alter the algorithm that converts CFG to PDG and then to SDG. The only modification possible is the redefinition of a dependency or the addition of new kinds of dependence. The language covered by the initial proposal was a simple one, featuring procedures with modifiable parameters and basic instructions, including calls to procedures, variable assignments, arithmetic and logic operators and conditional instructions (branches and loops): the basic features of an imperative programming language. The control flow graph was as simple as the programs themselves, with each graph representing one procedure. The instructions of the program are represented as vertices of the graph and are split into two categories: statements, which have no effect on the control flow (assignments, procedure calls) and predicates, whose execution may lead to one of multiple ---though traditionally two--- paths (conditional instructions). Statements are connected sequentially to the next instruction. Predicates have two outgoing edges, each connected to the first statement that should be executed, according to the result of evaluating the conditional expression in the guard of the predicate. @@ -83,7 +86,7 @@ An alternative approach is to represent the instruction as an edge, not a vertex Both of these approaches fail to generate a control dependence from the unconditional jump, as the definition of control dependence (see Definition~\ref{def:ctrl-dep}) requires a vertex to have more than one successor in order to be a source of control dependence. A possible ---but difficult--- solution would be to redefine control dependence, as some\todo{citation-needed} have done. 
-The most popular solution was proposed by Ball and Horwitz\cite{ball??}, and represents unconditional jumps as a predicate. +The most popular solution was proposed by Ball and Horwitz\cite{BalH93}, and represents unconditional jumps as a predicate. The true edge would lead to the next instruction to be executed, and the false edge would be a non-executable or \textit{dummy} edge, connected to the instruction that would be executed were the unconditional jump a \textit{nop}. The consequence of this solution is that every instruction placed after the unconditional jump is control dependent on the jump, as can be seen in Figure~\ref{fig:break-graphs}. In the example, when slicing with respect to variable $a$ on line 5, every statement would be included, save for ``print(a)''. @@ -115,7 +118,7 @@ static void f() { \section{Exceptions} -As seen in section~\ref{sec:intro-exception}, exception handling adds two constructs: the \texttt{throw} and the \texttt{try-catch} statements. The first one resembles an unconditional control flow statement, with an unknown (on compile time) destination. The exception will be caught by a \texttt{catch} of the corresponding type or a supertype ---if it exists. , but polymorphism and inheritance make the analysis difficult. +As seen in section~\ref{sec:intro-exception}, exception handling in Java adds two constructs: the \texttt{throw} and the \texttt{try-catch} statements. The first one resembles an unconditional control flow statement, with a destination that is unknown at compile time. The exception will be caught by a \texttt{catch} of the corresponding type or a supertype ---if it exists. Otherwise, it will terminate the corresponding thread (or in single-threaded programs, stop the Java Virtual Machine). 
Therefore, as the \subsection{\texttt{throw} statement} diff --git a/introduction.tex b/introduction.tex index a1b5fcd..5e51f80 100644 --- a/introduction.tex +++ b/introduction.tex @@ -1,3 +1,6 @@ +% !TeX encoding = UTF-8 +% !TeX spellcheck = en_US +% !TeX root = paper.tex \chapter{Introduction} \section{Program slicing} @@ -9,7 +12,7 @@ There exist two dimensions along which the problem of slicing can be proposed: \end{itemize} The default choice tends to be a \textsl{static backward slice}, which obtains the list of statements that affect the value of a variable in a given statement in all possible executions of the program. -The \textsl{slice} of a program is a list of statements from the original program which constitutes a valid program, whose execution will result in the same values for the variable being read by a debugger in the selected statement\cite{weiser79}. +The \textsl{slice} of a program is a list of statements from the original program which constitutes a valid program, whose execution will result in the same values for the variable being read by a debugger in the selected statement\cite{Wei81}. Some definitions of slicing\todo{Citation needed} allow for the slice to continue producing values after the program has stopped, making the slices simpler to produce and smaller in size at the cost of different endings\footnotemark. We will name the exact slice ---one that produces exactly the same values--- a \textit{strong} slice, and the permissive one, a \textit{weak} slice. See table \ref{tab:slice-permissive} for an example; with each row showing the values logged at the slicing criterion from the execution of 4 different programs. The first is the original, which computes $3!$. Slice A is one slice, whose execution is identical and therefore is a strong slice. Slice B is correct but continues producing values after the original stops ---a weak slice. It would fit the relaxed definition but not a strict one. 
Slice C is incorrect, as the values differ from the original. Some data or control dependency has not been included in the slice and the program is behaving in a different way. \footnotetext{POSSIBLE ADDITION: It could be argued that permissive or weak slicing is enough for most uses of slicing, as if we suppose that the bug is present before the end of the program, then the bug must show up in the slice as well, regardless of whether the sliced program continues producing extra values or not.} @@ -27,7 +30,7 @@ Some definitions of slicing\todo{Citation needed} allow for the slice to continu \caption{Execution logs of different slices and their original program.} \end{table} -The most efficient and broadly used tool for slicing is the system dependence graph (SDG), first introduced by Horwitz, Reps and Blinkey\cite{horwitz90}. It represents the statements of a program as vertices, and their dependencies as directed edges. Method calls are connected to method definitions, and so are the corresponding input and output parameters. SDGs show two different kinds of dependencies: \textsl{data} and \textsl{control}. The first one connects nodes that write to variables to the nodes that use (or \textsl{may} use) the value, and it is represented as a dashed\todo{check} line. The latter represents which nodes have control over the execution of others (conditional jumps and loops, mainly), and its representation is a solid line. In order to obtain a slice of a program, its SDG must be built from the source code. Then a two pass search ($\mathcal{O}(n)$ each) is performed to obtain the slice. The SDG can be reused to obtain a different slice of the same program (with a different criterion or kind\footnotemark of slice). The efficiency derives from the linear cost of the search on the SDG, so most modifications\todo{citation needed} modify the complexity of the SDG's construction, but try to keep the slice process linear. 
+The most efficient and broadly used tool for slicing is the system dependence graph (SDG), first introduced by Horwitz, Reps and Binkley\cite{HorwitzRB88}. It represents the statements of a program as vertices, and their dependencies as directed edges. Method calls are connected to method definitions, and so are the corresponding input and output parameters. SDGs show two different kinds of dependencies: \textsl{data} and \textsl{control}. The first one connects nodes that write to variables to the nodes that use (or \textsl{may} use) the value, and it is represented as a dashed\todo{check} line. The latter represents which nodes have control over the execution of others (conditional jumps and loops, mainly), and its representation is a solid line. In order to obtain a slice of a program, its SDG must be built from the source code. Then a two-pass search ($\mathcal{O}(n)$ each) is performed to obtain the slice. The SDG can be reused to obtain a different slice of the same program (with a different criterion or kind\footnotemark of slice). The efficiency derives from the linear cost of the search on the SDG, so most modifications\todo{citation needed} modify the complexity of the SDG's construction, but try to keep the slicing process linear. \footnotetext{TODO: change this word to the proper one.} @@ -51,7 +54,7 @@ An example is provided in figure \ref{fig:basic-graphs}, where a simple multipli \label{fig:basic-graphs} \end{figure} -The original proposal by Weiser\cite{weiser79} covers the simplest of an imperative programming language. The various iterations\todo{cite} until reaching the SDG\todo{cite} have added other elements, such as return statements\todo{cite}, global variables\todo{cite}, object oriented features\todo{cite} and finally exception handling\cite{horwitz03}. +The original proposal by Weiser\cite{Wei81} covers only the simplest features of an imperative programming language. 
The various iterations\todo{cite} until reaching the SDG\todo{cite} have added other elements, such as return statements\todo{cite}, global variables\todo{cite}, object-oriented features\todo{cite} and finally exception handling\cite{AllH03}. \subsection{Metrics} diff --git a/listings-config.tex b/listings-config.tex index a66ebc5..4aafb1f 100644 --- a/listings-config.tex +++ b/listings-config.tex @@ -1,3 +1,6 @@ +% !TeX encoding = UTF-8 +% !TeX spellcheck = en_US +% !TeX root = paper.tex \lstset{ % Numbering numbers=left, diff --git a/paper.tex b/paper.tex index 312aefb..550ab32 100644 --- a/paper.tex +++ b/paper.tex @@ -1,3 +1,6 @@ +% !TeX encoding = UTF-8 +% !TeX spellcheck = en_US +% !TeX root = paper.tex \documentclass[a4paper,twoside]{report} \usepackage[spanish,english]{babel} @@ -50,9 +53,7 @@ \include{state_of_the_art} \include{solution} -\input{bibliography} - -%\bibliography{mybib} -%\bibliographystyle{plain} +\bibliographystyle{plain} +\bibliography{../../../../../../Biblio/biblio.bib} \end{document} diff --git a/solution.tex b/solution.tex index 1cfa95d..abd606d 100644 --- a/solution.tex +++ b/solution.tex @@ -1,6 +1,9 @@ +% !TeX encoding = UTF-8 +% !TeX spellcheck = en_US +% !TeX root = paper.tex \chapter{Proposed solution} -This solution is an extension of Allen's\cite{allen03}, with some modifications to solve the problem found. Before starting, we need to split all instructions in three categories: +This solution is an extension of Allen's\cite{AllH03}, with some modifications to solve the problem found. Before starting, we need to split all instructions into three categories: \begin{description} \item[statement] non-branching instruction, e.g. an assignment or method call. 
diff --git a/state_of_the_art.tex b/state_of_the_art.tex index e45ab5d..7ddc8e9 100644 --- a/state_of_the_art.tex +++ b/state_of_the_art.tex @@ -1,22 +1,25 @@ +% !TeX encoding = UTF-8 +% !TeX spellcheck = en_US +% !TeX root = paper.tex \chapter{State of the art} -Slicing was proposed\cite{weiser79} and improved until the proposal of the current system (the SDG) \todo{(citation)}. Specifically in the context of exceptions, multiple approaches have been attempted, with varying degrees of success. There exist commercial solutions for various programming languages: \todo{name them and link}. -In the realm of academia, there exists no definite solution. One of the most relevant initial proposal\cite{allen03}, although not the first one\cite{sinha98,sinha99} to target Java specifically. +Slicing was proposed\cite{Wei81} and improved until the proposal of the current system (the SDG) \todo{(citation)}. Specifically in the context of exceptions, multiple approaches have been attempted, with varying degrees of success. There exist commercial solutions for various programming languages: \todo{name them and link}. +In the realm of academia, there exists no definitive solution. One of the most relevant initial proposals is that of Allen and Horwitz\cite{AllH03}, although it was not the first one\cite{SinH98,SinHR99} to target Java specifically. It uses the existing proposals for \textsl{return}, \textsl{goto} and other unconditional jumps to model the behavior of \textsl{throw} statements. Control flow inside \textsl{try-catch-finally} statements is simulated, both for explicit \textsl{throw} and those nested inside a method call. The base algorithm is presented, and then the proposal is detailed as changes. Unchecked exceptions are considered but regarded as ``worthless'' to include, due to the increase in size of the slices, which reduces their effectiveness as a debugging tool. 
This is due to the number of unchecked exceptions embedded in normal Java instructions, such as \texttt{NullPointerException} in any instance field or method access, \texttt{IndexOutOfBoundsException} in array accesses and countless others. On top of that, handling \textsl{unchecked} exceptions opens the problem of calling an API for which there is no analyzable source code, either because the module was compiled beforehand or because it is part of a distributed system. The first should not be an obstacle, as class files can be easily decompiled. The only information that may be lost is variable names and comments, which don't affect a slice's precision, only its readability. -Chang and Jo\cite{chang04} present an alternative to the CFG by computing exception-induced control flow separately from the traditional control flow computation, but go no further into the ramifications it entails for the PDG and the SDG. +Jo and Chang\cite{JoC04} present an alternative to the CFG by computing exception-induced control flow separately from the traditional control flow computation, but go no further into the ramifications it entails for the PDG and the SDG. -Jiang et al.\cite{jiang06} describes a solution specific for the exception system in C++, which differs from Java's implementation of exceptions. They reuse the idea of non-executable edges in \textsl{throw} nodes, and introduce handling \textsl{catch} nodes as a switch, each trying to catch the exception before deferring onto the next \textsl{catch} or propagating it to the calling method. Their proposal is center around the IECFG (Improved Exception Control-Flow Graph), which propagates control dependencies onto the PDG and then the SDG. Finally, in their SDG, each normal and exceptional return and their data output are connected to all \textsl{catch} statements where the data may have arrived, which is fine for the example they propose, but could be inefficient if the method has many different call nodes. 
+Jiang et al.\cite{JiaZSJ06} describe a solution specific to the exception system in C++, which differs from Java's implementation of exceptions. They reuse the idea of non-executable edges in \textsl{throw} nodes, and introduce handling \textsl{catch} nodes as a switch, each trying to catch the exception before deferring onto the next \textsl{catch} or propagating it to the calling method. Their proposal is centered around the IECFG (Improved Exception Control-Flow Graph), which propagates control dependencies onto the PDG and then the SDG. Finally, in their SDG, each normal and exceptional return and their data output are connected to all \textsl{catch} statements where the data may have arrived, which is fine for the example they propose, but could be inefficient if the method has many different call nodes. -Others\cite{prabhu11} have worked specifically on the C++ exception framework. \todo{remove or expand}. +Others\cite{PraMB11} have worked specifically on the C++ exception framework. \todo{remove or expand}. -Finally, Hao\cite{hao11} introduced a Object-Oriented System Dependence Graph with exception handling (EOSDG), which represented a generic object-oriented language, with exception handling capabilities. Its broadness allows for the EOSDG to fit into both Java and C++. It uses concepts from Jiang\cite{jiang06}, such as cascading \textsl{catch} statements, while adding explicit support for virtual calls, polymorphism and inheritance. +Finally, Hao\cite{JieS11} introduced an Object-Oriented System Dependence Graph with exception handling (EOSDG), which represented a generic object-oriented language, with exception handling capabilities. Its breadth allows for the EOSDG to fit into both Java and C++. It uses concepts from Jiang\cite{JiaZSJ06}, such as cascading \textsl{catch} statements, while adding explicit support for virtual calls, polymorphism and inheritance. 
% TODO INCOMPLETE \hrulefill -\marginnote{Alternative explanation of \cite{allen03}, with counter example. Maybe should move the counter example backwards.} +\marginnote{Alternative explanation of \cite{AllH03}, with counterexample. Maybe should move the counterexample backwards.} In her paper, Horwitz suggests treating exceptions in the following way: \begin{itemize}