respond to feedback of chapters 1 & 2
This commit is contained in:
parent
62e4799563
commit
2b008ded78
10 changed files with 262 additions and 301 deletions
|
@ -5,59 +5,55 @@
|
|||
\chapter{Background}
|
||||
\label{cha:background}
|
||||
|
||||
Before delving into the specific problems that currently exist in program slicing, this chapter surveys the two fields relevant to this thesis: program slicing and exception handling. The latter is presented for the Java programming language, but the discussion can be generalized to other popular programming languages that feature a similar exception handling system (e.g., Python, JavaScript, C++).
|
||||
|
||||
\section{Program slicing}
|
||||
|
||||
\carlos{cite Weiser only when discussing the origins of the field} \\
|
||||
\carlos{for the rest, use surveys (Tip95, Sil12)} \\
|
||||
\carlos{move this paragraph to the intro; put the formal definitions of program slicing here, citing \cite{AgrH90b}}
|
||||
This section provides the definitions and background information on which later definitions are built. \carlos{expand the intro?}
|
||||
|
||||
\textit{Program slicing} \cite{Wei81,Sil12}\sergio{is there a reason why \cite{Sil12} is not in the intro? The only citation there is \cite{Wei81}. I propose removing \cite{Sil12} for consistency}\josep{rather, 13 should also be in the intro} is a debugging technique that
|
||||
answers the question: ``which parts of a program \josep{do?}
|
||||
affect a given statement and
|
||||
set of variables?'' The statement and the variables are the basic input to create a slice
|
||||
and are called the \textit{slicing criterion}. The criterion can be more
|
||||
complex, as different slicing techniques may require additional pieces of input.
|
||||
The \textit{slice} of a program is the list of statements from the original
|
||||
program ---which constitutes a valid program--- whose execution will result in
|
||||
the same values for the variables \josep{convoluted sentence; I would change it. In any case, for it to be correct the parentheses should be dropped }(selected in the slicing criterion).
|
||||
There exist two fundamental dimensions along which the problem of slicing can be
|
||||
proposed \cite{Sil12}:
|
||||
\begin{definition}[Program slicing] \label{def:program-slicing}
|
||||
\textit{Program slicing} is the process of extracting a slice $S$ given a program $P$ and a slicing criterion $SC$.
|
||||
\end{definition}
|
||||
|
||||
\sergio{My proposal is to move the naive concept from here to the intro, so that readers can follow the example, and then either refer back to that definition here or introduce the slicing dimensions directly after a short preamble. A strong reason for defining it there is that we use the word slice all the time and only define it after having used it for a while.}
|
||||
\begin{definition}[Slicing criterion] \label{def:slicing-criterion}
|
||||
Given a program $P$, composed of statements and containing variables $x_1, x_2, \ldots, x_n \in \textnormal{vars}$, a \textit{slicing criterion} is a tuple $SC = \langle s, v \rangle$ where $s \in P$ is a single statement of the program and $v$ is a set of variables from $P$. The variables in $v$ need not appear in $s$.
|
||||
\end{definition}
|
||||
|
||||
\begin{definition}[Slice] \label{def:slice}
|
||||
Given a program $P$ and a slicing criterion $SC = \langle s, v \rangle$, a \textit{slice} is a subset of the statements of $P$ ($S \subseteq P$) that behaves like the original program $P$ with respect to the values of the variables in $v$ at statement $s$.
|
||||
\end{definition}
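
The three definitions above can be illustrated with a minimal Java sketch. The program, the class \texttt{SliceSketch} and its method names are hypothetical and only serve as an illustration; the list \texttt{observed} is instrumentation added to record the values of the criterion variable, and is not part of the program being sliced. The slicing criterion is $SC = \langle s, \{\texttt{sum}\} \rangle$, where $s$ is the statement marked in the comments.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: the program, the class and the method names are hypothetical.
final class SliceSketch {
    // Original program P. `observed` records the value of `sum`
    // every time the criterion statement s is reached.
    static List<Integer> original(int n) {
        List<Integer> observed = new ArrayList<>();
        int sum = 0;
        int prod = 1;
        for (int i = 1; i <= n; i++) {
            sum += i;
            prod *= i;
            observed.add(sum);          // statement s of SC = <s, {sum}>
        }
        System.out.println(prod);
        return observed;
    }

    // Candidate slice S: the statements that only concern `prod` are removed.
    static List<Integer> slice(int n) {
        List<Integer> observed = new ArrayList<>();
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
            observed.add(sum);          // statement s of SC = <s, {sum}>
        }
        return observed;
    }

    public static void main(String[] args) {
        // Both calls observe the same values of `sum` at s.
        System.out.println(original(4));
        System.out.println(slice(4));
    }
}
```

Comparing the two methods on a handful of inputs can only refute a candidate slice; the definition requires the same behaviour for every input.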
|
||||
|
||||
\begin{definition}[Execution history] \label{def:execution-history}
|
||||
Given a program $P$, composed of a set of statements $S = \{s_1, s_2, \ldots, s_n\}$, and a set of input values $I$, the \textit{execution history} of $P$ given $I$ is the list $H$ of statements that are executed, in the order in which they are executed.
|
||||
\end{definition}
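
As a rough illustration of an execution history, the following Java sketch labels the statements of a small loop as $s_1$ to $s_5$ and appends each label to a list when the statement runs. The labels, the class \texttt{ExecutionHistorySketch} and the program itself are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: statement labels s1..s5 are assigned by hand.
final class ExecutionHistorySketch {
    // Runs the program for input x and returns its execution history H.
    static List<String> run(int x) {
        List<String> history = new ArrayList<>();
        history.add("s1"); int result = 0;
        history.add("s2");                      // first evaluation of the loop condition
        while (x > 0) {
            history.add("s3"); result += 2;
            history.add("s4"); x--;
            history.add("s2");                  // loop condition evaluated again
        }
        history.add("s5"); System.out.println(result);
        return history;
    }

    public static void main(String[] args) {
        System.out.println(run(2));
    }
}
```

A statement such as $s_3$ may appear several times in the history, once per execution, so a position in the list identifies one specific instance of the statement.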
|
||||
|
||||
Until now, the concept of slicing has been centred around finding the instructions that affect a variable.
|
||||
That is the original definition, but as time has progressed, variations have been proposed; the one described in Definitions~\ref{def:program-slicing}, \ref{def:slicing-criterion} and~\ref{def:slice} is called \textit{static backward slicing}.
|
||||
It is also the one that will be used throughout this thesis, though the errors detected and solutions proposed can be easily generalized to others.
|
||||
The different variations are described later in this chapter, but there exist two fundamental dimensions along which the slicing problem can be proposed \cite{Sil12}:
|
||||
|
||||
\begin{itemize}
|
||||
\item \textit{Static} or \textit{dynamic}: slicing can be performed
|
||||
statically or dynamically.
|
||||
\textit{Static slicing} \cite{Wei81} produces slices which\josep{that} consider all
|
||||
possible executions of the program: the slice will be correct regardless of the input supplied.
|
||||
In contrast, \textit{dynamic slicing} \cite{KorL88,AgrH90b} considers a single execution of the program, thus, limiting the slice to
|
||||
the statements present in an execution log. The slicing criterion is
|
||||
expanded to include a position in the log\josep{execution history} that corresponds to one
|
||||
instance of the selected statement, making it much more specific. It may
|
||||
help \josep{to}find a bug related to indeterministic behavior (such as a random
|
||||
or pseudo-random number generator), but \sergio{, despite selecting the same slicing criterion, the slice }must be recomputed for each case\sergio{different input value/execution considered?}
|
||||
being analyzed.
|
||||
\item \textit{Backward} or \textit{forward}: \textit{backward slicing}
|
||||
\cite{Wei81} is generally more used \sergio{habra que decir lo que es antes de decir que se usa mas no? Cambiar el orden y reescribir esta frase. Decimos que es y luego que es el que generalmente se estudia o algo de eso}, because it looks at the statements
|
||||
that affect the slicing criterion. In contrast, \textit{forward slicing}
|
||||
\cite{BerC85} computes the statements that are affected by the slicing
|
||||
criterion. There also exists a mixed approach called \textit{chopping}
|
||||
\cite{JacR94}, which is used to find all statements that affect some variables in the slicing criterion and at the same time they are affected by some other variables in the slicing criterion.
|
||||
\item \textit{Static} or \textit{dynamic}: slicing can be performed statically or dynamically.
|
||||
\textit{Static slicing} \cite{Sil12} produces slices that consider all possible executions of the program: the slice will be correct regardless of the input supplied.
|
||||
In contrast, \textit{dynamic slicing} \cite{KorL88,AgrH90b} considers a single execution of the program, thus limiting the slice to the statements present in the execution history of that run.
|
||||
The slicing criterion is expanded to include a position in the execution history that corresponds to one instance of the selected statement, making it much more specific.
|
||||
It may help find a bug related to non-deterministic behaviour (such as a random or pseudo-random number generator) but, even for the same slicing criterion in the same program, the slice must be recomputed for each set of input values or execution considered. \carlos{Talk about quasi-static as a middle ground?}
|
||||
\item \textit{Backward} or \textit{forward}: \textit{backward slicing} \cite{Sil12} looks for the statements that affect the slicing criterion.
|
||||
It is among the most commonly used slicing techniques.
|
||||
In contrast, \textit{forward slicing} \cite{BerC85} computes the statements that are affected by the slicing criterion.
|
||||
There also exists a middle-ground approach called \textit{chopping} \cite{JacR94}, which finds all the statements that affect some variables in the slicing criterion and, at the same time, are affected by some other variables in the slicing criterion.
|
||||
\end{itemize}
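
The difference between backward slicing, forward slicing and chopping can be sketched as reachability over a dependence graph. The Java code below is a simplified illustration under several assumptions: statements are reduced to single-letter labels, the map \texttt{EXAMPLE} is a hand-written graph in which an edge $a \rightarrow b$ means that $a$ directly affects $b$, and the chop is approximated as the intersection of a forward slice from a source statement with a backward slice from a sink statement. None of these names come from the cited works.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: a hand-written dependence graph, not a real analysis.
final class DirectionSketch {
    // a: x = read(); b: y = x + 1; c: z = 5; d: w = y + z; e: print(w); f: print(x)
    static final Map<String, Set<String>> EXAMPLE = Map.of(
            "a", Set.of("b", "f"),
            "b", Set.of("d"),
            "c", Set.of("d"),
            "d", Set.of("e"));

    // All nodes reachable from `start` by following `edges` (start included).
    static Set<String> reach(Map<String, Set<String>> edges, String start) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String node = work.pop();
            if (seen.add(node)) {
                for (String next : edges.getOrDefault(node, Set.of())) {
                    work.push(next);
                }
            }
        }
        return seen;
    }

    // Reverses every edge of the graph.
    static Map<String, Set<String>> invert(Map<String, Set<String>> edges) {
        Map<String, Set<String>> inverted = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : edges.entrySet()) {
            for (String target : entry.getValue()) {
                inverted.computeIfAbsent(target, k -> new HashSet<>()).add(entry.getKey());
            }
        }
        return inverted;
    }

    // Statements affected by the criterion.
    static Set<String> forward(Map<String, Set<String>> affects, String criterion) {
        return reach(affects, criterion);
    }

    // Statements that affect the criterion.
    static Set<String> backward(Map<String, Set<String>> affects, String criterion) {
        return reach(invert(affects), criterion);
    }

    // Statements affected by `source` that also affect `sink`.
    static Set<String> chop(Map<String, Set<String>> affects, String source, String sink) {
        Set<String> result = new HashSet<>(forward(affects, source));
        result.retainAll(backward(affects, sink));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(backward(EXAMPLE, "e"));
        System.out.println(forward(EXAMPLE, "a"));
        System.out.println(chop(EXAMPLE, "a", "e"));
    }
}
```

In this graph the backward slice of \texttt{e} leaves out \texttt{f}, the forward slice of \texttt{a} leaves out \texttt{c}, and the chop between \texttt{a} and \texttt{e} leaves out both.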
|
||||
|
||||
Since the definition of program slicing\sergio{Since Weiser defined program slicing in 1981}, the most \deleted{extended form}\added{studied configuration?} of slicing has
|
||||
been \textit{static backward slicing}, which obtains the list of statements that
|
||||
affect the value of a variable in a given statement, in all possible executions
|
||||
of the program (i.e., for any input data).
|
||||
\begin{definition}[Strong static backward slice \cite{Wei81}]
|
||||
Since the seminal definition of program slicing by Weiser~\cite{Wei81}, the most studied variation of slicing has been \textit{static backward slicing}, as defined earlier in this section.
|
||||
That definition can be split into two sub-types, \textit{strong} and \textit{weak} slices, which impose different requirements and are used in different fields.
|
||||
|
||||
\begin{definition}[Strong static backward slice \cite{Tip95}]
|
||||
\label{def:strong-slice}
|
||||
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
|
||||
$s$ is a statement and $v$ is a set\sergio{los set no se representan con letras mayusculas?} \carlos{no} of variables in $P$ (the variables may
|
||||
or may not be used in $s$), $S$ is the \textit{strong slice} of $P$ with
|
||||
respect to $C$ if $S$ has\sergio{fulfils?} the following properties:
|
||||
Given a program $P$ and a slicing criterion $SC = \langle s,v \rangle$, $S$ is a \textit{strong static backward slice} of $P$ with
|
||||
respect to $SC$ if $S$ fulfils the following properties:
|
||||
\begin{enumerate}
|
||||
\item $S$ is an executable program.
|
||||
\item $S \subseteq P$, or $S$ is the result of removing code\sergio{code o 0 or more statements?} from $P$.
|
||||
\item $S \subseteq P$, that is, $S$ is the result of removing zero or more statements from $P$.
|
||||
\item For any input $I$, the values produced on each execution of $s$
|
||||
for each of the variables in $v$ are the same when executing $S$ as
|
||||
when executing $P$. \label{enum:exact-output}
|
||||
|
@ -68,15 +64,11 @@ of the program (i.e., for any input data).
|
|||
|
||||
\begin{definition}[Weak static backward slice \cite{RepY89}]
|
||||
\label{def:weak-slice}
|
||||
\carlos{Check citation and improve ``formalization''?}
|
||||
\josep{Si esa cita no es, entonces puedes usar la de Binkley: \url{https://cgi.csc.liv.ac.uk/~coopes/comp319/2016/papers/ProgramSlicing-Binkley+Gallagher.pdf}}
|
||||
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
|
||||
$s$ is a statement and $v$ is a set of variables in $P$ (the variables may
|
||||
or may not be used in $s$), $S$ is the \textit{weak slice} of $P$ with
|
||||
respect to $C$ if $S$ has\sergio{fulfils?} the following properties:
|
||||
\josep{If that is not the right citation, you can use Binkley's: \cite{BinG96}}
|
||||
Given a program $P$ and a slicing criterion $SC = \langle s,v \rangle$, $S$ is a \textit{weak static backward slice} of $P$ with respect to $SC$ if $S$ fulfils the following properties:
|
||||
\begin{enumerate}
|
||||
\item $S$ is an executable program.
|
||||
\item $S \subseteq P$, or $S$ is the result of removing code from $P$. \sergio{idem}
|
||||
\item $S \subseteq P$, that is, $S$ is the result of removing zero or more statements from $P$.
|
||||
\item For any input $I$, the values produced on each execution of $s$
|
||||
for each of the variables in $v$ when executing $P$ form a prefix of
|
||||
those produced while executing $S$ ---which means that the slice
|
||||
|
@ -89,108 +81,83 @@ of the program (i.e., for any input data).
|
|||
\josep{If this is formalized using seq, you can look at the definition in the POI testing paper (Sergio knows which one).}
|
||||
|
||||
Both definitions (\ref{def:strong-slice} and~\ref{def:weak-slice}) are
|
||||
used throughout the literature (see, e.g., \cite{pending}\carlos{Which citation? Most papers on exception slicing do not indicate or hint whether they use strong or weak.}\sergio{Josep?}\josep{para Strong se puede poner a Weiser. Para Weak se puede poner a Binkley \url{https://cgi.csc.liv.ac.uk/~coopes/comp319/2016/papers/ProgramSlicing-Binkley+Gallagher.pdf}}), \josep{este final de frase lo quitaria:}with some cases \deleted{favoring}\added{favouring} the first and some the
|
||||
second. Though the definitions come from the corresponding citations, the naming
|
||||
was first used in a control dependency analysis by Danicic~\cite{DanBHHKL11},
|
||||
where slices that produce the same output as the original are named
|
||||
\textit{strong}, and those where the original is a prefix of the slice,
|
||||
\textit{weak}. Weak slicing tends to be preferred ---specially for debugging--- for two reasons: the algorithm can be simpler and avoid dealing with termination \josep{termination no esta contemplada ni en weak ni en strong. Mas bien di que en debugging lo que importa es que el error se produzca. En general da igual cuantas veces se produzca o que se siga produciendo despues.}, and the slices can be smaller, narrowing the focus of the debugger. For some applications, \deleted{strong slices are preferred,} such as extracting a \josep{component or a specialized program}feature from a program, where there is a requirement that the resulting slice behave\josep{s} exactly like\josep{as} the original\added{, strong slices are preferred\josep{esto queda muy lejos ya. Yo partiria la frase en dos}}. In this paper we will \josep{Along the thesis, we indicate} indicate which kind of slice is produced with each new technique proposed. \sergio{Generamos alguna vez strong? Joder que cracks somos xD}
|
||||
used throughout the literature (see, e.g., \cite{pending}\carlos{Which citation? Most papers on exception slicing do not indicate or hint whether they use strong or weak.}\sergio{Josep?}\josep{for strong, Weiser can be cited; for weak, Binkley \cite{BinG96}}).
|
||||
Most publications neither differentiate between them nor acknowledge the other variant, because they focus on one variant exclusively.
|
||||
Although the definitions come from different authors, the \textit{weak} and \textit{strong} nomenclature employed here originates from a control dependency analysis by Danicic~\cite{DanBHHKL11}, where slices that produce the same output as the original are named \textit{strong}, and those where the output of the original is a prefix of the output of the slice are named \textit{weak}.
|
||||
|
||||
Different applications of program slicing use the variant that fits their needs, though \textit{weak} slices are preferred whenever possible, because they contain fewer statements and the algorithms that compute them tend to be simpler.
|
||||
Of course, if the application of program slices requires the slice to behave exactly like the original program, then \textit{strong} slices are the only option.
|
||||
As an example, debugging uses weak slicing, as it does not matter what the program does after reaching the slicing criterion, which is typically the point where an error has been detected.
|
||||
In contrast, program specialization requires strong slicing, as it extracts features or computations from a program to create a smaller, standalone unit that behaves in exactly the same way.
|
||||
|
||||
Throughout the thesis, we indicate which kind of slice is produced by each problem detected and each technique proposed.
|
||||
|
||||
\begin{example}[Strong, weak and incorrect slices]
|
||||
\carlos{The table is labeled execution logs of... but the execution log is a different thing.}
|
||||
In table~\ref{tab:slice-weak} we can observe examples for the various
|
||||
definitions. Each row shows the values \sergio{for a specific variable $v$ in the slicing criterion,} produced by \deleted{the}\added{a particular} execution of \deleted{a}\sergio{the original}
|
||||
program or one of its slices.
|
||||
The first \added{row stands for}\deleted{is} the original \added{program}, which computes $3!$.
|
||||
Slice A's \deleted{execution log}\added{generated sequence of values} is identical to the original and therefore it is a strong slice.
|
||||
Slice B is a weak slice: its execution correctly produces the same \added{sequence of }values as the original program, but it continues producing values after the original stops.
|
||||
Slice C is incorrect, as the \added{generated sequence of} values differ\added{s} from the \added{sequence generated by the }original \added{program}.
|
||||
\sergio{Taking a closer look, one could think that }Some data or control dependency has not been included in the slice \josep{lo que sigue quitarlo. Lia...}and the program produce\josep{s} different results, in this case the slice computes Fibonacci numbers instead of factorials.\sergio{Esto no parece muy relevante, plantearse quitarlo para no liar con Fibonacci.}
|
||||
\end{example}
|
||||
Consider table~\ref{tab:slice-weak}, which displays the sequence of values produced at the same slicing criterion by a program and by three different slices of it.
|
||||
|
||||
\begin{table}
|
||||
The first row stands for the original program, which computes $3!$.
|
||||
|
||||
Slice A's sequence of values is identical to the original and therefore it is a strong slice.
|
||||
|
||||
Slice B's sequence of values does not stop after producing the same first 3 values as the original: it is a weak slice. An instruction responsible for stopping the loop may have been excluded from the slice.
|
||||
|
||||
Slice C is incorrect, as its sequence of values differs from that of the original program from the second column onwards. It seems that some dependency has not been accounted for and the value is not updating.
|
||||
|
||||
\begin{table}
|
||||
\centering
|
||||
\label{tab:slice-weak}
|
||||
\begin{tabular}{r | r | r | r | r | r }
|
||||
\deleted{Iteration}\added{Evaluation Number} & \textbf{1} & \textbf{2} & \textbf{3} & \textbf{4} & \textbf{5} \\ \hline
|
||||
Original & 1 & 2 & 6 & - & - \\ \hline
|
||||
% Evaluation Number & \textbf{1} & \textbf{2} & \textbf{3} & \textbf{4} & \textbf{5} \\ \hline
|
||||
Original program & 1 & 2 & 6 & - & - \\ \hline
|
||||
Slice A & 1 & 2 & 6 & - & - \\ \hline
|
||||
Slice B & 1 & 2 & 6 & 24 & 120 \\ \hline
|
||||
Slice C & 1 & 1 & 3 & 5 & 8 \\
|
||||
Slice C & 1 & 1 & 1 & 1 & 1 \\
|
||||
\end{tabular}
|
||||
\caption{\deleted{Execution logs of different slices and their original program.}\added{Sequence of values obtained for a certain variable of the original program and three different slices A, B and C for a particular input.}}
|
||||
\end{table}
|
||||
\caption{Sequence of values obtained for a certain variable of the original program and three different slices A, B and C for a particular input.}
|
||||
\end{table}
|
||||
\end{example}
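
The classification used in the example can be sketched in Java as a comparison of value sequences: equal sequences suggest a strong slice, an original sequence that is a proper prefix of the slice's sequence suggests a weak slice, and anything else is incorrect. The class \texttt{SliceKindSketch} and its method are hypothetical names, and the check only covers one execution, whereas both definitions quantify over every input; a single run can therefore rule a slice out, but cannot prove it strong or weak.

```java
import java.util.List;

// Illustrative only: classifies one pair of observed value sequences.
final class SliceKindSketch {
    enum Kind { STRONG, WEAK, INCORRECT }

    static Kind classify(List<Integer> original, List<Integer> slice) {
        if (slice.equals(original)) {
            return Kind.STRONG;                 // identical sequences
        }
        // Weak: the original sequence is a prefix of the slice's sequence.
        boolean originalIsPrefix = slice.size() >= original.size()
                && slice.subList(0, original.size()).equals(original);
        return originalIsPrefix ? Kind.WEAK : Kind.INCORRECT;
    }

    public static void main(String[] args) {
        List<Integer> original = List.of(1, 2, 6);
        System.out.println(classify(original, List.of(1, 2, 6)));            // Slice A
        System.out.println(classify(original, List.of(1, 2, 6, 24, 120)));   // Slice B
        System.out.println(classify(original, List.of(1, 1, 1, 1, 1)));      // Slice C
    }
}
```

The three calls in \texttt{main} use the rows of table~\ref{tab:slice-weak} as input.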
|
||||
|
||||
\carlos{The following paragraph has already been repeated in previous sections, mainly the motivation. Consider its removal and the addition of citations to the previous mention.}
|
||||
|
||||
\josep{Even though the original proposal by Weiser~\cite{Wei81} focussed on an imperative language, program slicing is a language--agnostic technique.} Program slicing is a language--agnostic tool\sergio{program slicing es tool o technique?}, but the original proposal by
|
||||
\josep{Even though the original proposal by Weiser~\cite{Wei81} focused on an imperative language, program slicing is a language-agnostic technique.} Program slicing is a language-agnostic technique, but the original proposal by
|
||||
Weiser~\cite{Wei81} covered a simple imperative programming language.
|
||||
Since then, the literature has been expanded by dozens of authors who have
|
||||
described and implemented slicing for more complex structures, such as
|
||||
unstructured control flow~\cite{HorwitzRB88}, global variables~\cite{???},
|
||||
exception handling~\cite{AllH03}; and for other programming paradigms, such as
|
||||
object--oriented languages~\cite{???} or functional languages~\cite{???}.
|
||||
\carlos{Se pueden poner más, faltan las citas correspondientes.}\sergio{Guay, hay que buscarlas y ponerlas, la biblio la veo corta para todos los papers que hay, yo creo que cuando este todo deberia haber sobre 30 casi, si no mas.} \josep{Si. Muchas de esas referencias puedes sacarlas de los ultimos surveys de slicing. }
|
||||
\carlos{More could be added; the corresponding citations are missing.}\sergio{Great, we need to find and add them. The bibliography looks short given how many papers exist; I think that once everything is in there should be around 30, if not more.} \josep{Yes. You can take many of those references from the latest slicing surveys.}
|
||||
|
||||
\subsection{The System Dependence Graph (SDG)}
|
||||
\subsection{Computing program slices with the system dependence graph}
|
||||
|
||||
There exist multiple approaches to compute a slice\sergio{esto me suena raro, yo diria program representations o data structures that allow the use of program slicing techniques o algo asi, debatirlo}\carlos{DENIED} from a given program and
|
||||
slicing criterion, but the most efficient and broadly used \josep{technique is based on a data structure called} data structure is the System
|
||||
Dependence Graph (SDG), first introduced by Horwitz, Reps\josep{,} and
|
||||
Blinkey \sergio{in 1988}\sergio{Todos los autores o los citamos con et al.? lo digo por seguir la misma regla durante todo el document}~\cite{HorwitzRB88}. It is computed from the program's statements\sergio{source code}, and
|
||||
once built, a slicing criterion is chosen \josep{and mapped on the graph, then} , the graph \added{is} traversed using a specific
|
||||
algorithm, and the slice \added{is} obtained. Its efficiency resides\josep{relies? on} in the fact that\added{,} for
|
||||
multiple slices \deleted{that share}\added{calculated for} the same program, the graph \deleted{must only be built}\added{generation process is only performed} once.
|
||||
On top of that, building the graph has a complexity of $\mathcal{O}(n^2)$ \carlos{uso $\mathcal{O}$ o $O$?}\sergio{Josep?\josep{$\mathcal{O}$}} with
|
||||
respect to the number of statements in \deleted{a}\added{the} program, but the traversal is linear
|
||||
with respect to the number of nodes in the graph (each corresponding to a
|
||||
statement) \sergio{footnote?}.
|
||||
There exist multiple program representations, data structures and algorithms that can be used to compute a slice, but the most efficient and broadly used data structure is the \textit{system dependence graph} (SDG), introduced by Horwitz et al.~\cite{HorRB90}.
|
||||
It is computed from the program's source code. Once it has been built, a slicing criterion is chosen and mapped onto the graph, the graph is traversed using a specific algorithm, and the slice is obtained.
|
||||
Its efficiency stems from the fact that the graph is generated only once, no matter how many slices are computed for the same program.
|
||||
Performance-wise, building the graph has quadratic complexity ($\mathcal{O}(n^2)$), and its traversal to compute the slice has linear complexity ($\mathcal{O}(n)$); both with respect to the number of statements in the program being sliced.
|
||||
|
||||
The SDG is a directed graph, and as such it has vertices or nodes, each
|
||||
representing an \deleted{instruction}\added{statement} in the program ---barring some auxiliary nodes
|
||||
introduced by some approaches--- and directed edges, which represent the
|
||||
dependencies among nodes. Those edges represent various\sergio{several} kinds of dependencies
|
||||
---control, data, calls, parameter passing, summary--- which\josep{that are defined} will be defined\sergio{further explained?} in
|
||||
section~\ref{sec:first-def-sdg}. \carlos{add how a graph is sliced.}
|
||||
The SDG is a directed graph, and as such it has a set of nodes, each representing a statement in the program ---barring some auxiliary nodes introduced by some approaches--- and a set of directed edges, which represent the dependencies among nodes.
|
||||
Those edges represent several kinds of dependencies: control, data, call, parameter passing and summary.
|
||||
|
||||
To create the SDG, first \josep{yo dejaria el a (como estaba)}\deleted{a}\added{the corresponding} \textit{control flow graph} (CFG) is built for each method
|
||||
in the program, then\added{,} its \added{associated }control and data dependencies are computed, resulting
|
||||
in \added{a new graph representation known as }the \textit{program dependence graph} (PDG)\sergio{cita??}\josep{si, a Ottenstein and Ottenstein}\carlos{TENSTEIN, K. J., AND O’ITENSTEIN, L. M. The program dependence graph in a software development environment}. Finally, all the graphs from every
|
||||
method are joined \carlos{NO by the appearance of a new kind of inter-procedural arcs, the argument-in argument-out arcs that link function definitions with function calls, obtaining}\deleted{into} the \added{final} SDG. This process will be explained at greater
|
||||
lengths in section~\ref{sec:first-def-sdg}.
|
||||
To create the SDG, a \textit{control flow graph} (CFG) is first built for each method in the program, and control and data dependencies are then computed from it.
|
||||
With that data, a new graph representation is created, called the \textit{program dependence graph} (PDG) \cite{OttO84}.
|
||||
Each method's PDG is then connected to form the SDG.
|
||||
For a simple visual example, see Example~\ref{exa:create-sdg} below, which briefly illustrates the intermediate steps in the SDG creation. The whole process is explained in detail in section~\ref{sec:first-def-sdg}.
|
||||
|
||||
\carlos{the graph traversal still needs to be mentioned.}
|
||||
%TODO: marked for removal --- this process is repeated later in ref{sec:first-deg-sdg}
|
||||
%\begin{description}
|
||||
%\item[CFG] The control flow graph is the representation of the control
|
||||
%dependencies in a method of a program. Every statement has an edge from
|
||||
%itself to every statement that can immediately follow. This means that
|
||||
%most will only have one outgoing edge, and conditional jumps and loops
|
||||
%will have two. The graph starts in a ``Begin'' or ``Start'' node, and
|
||||
%ends in an ``End'' node, to which the last statement and all return
|
||||
%statements are connected. It is created directly from the source code,
|
||||
%without any need for data dependency analysis.
|
||||
%\item[PDG] The program dependence graph is the result of restructuring and
|
||||
%adding data dependencies to a CFG. All statements are placed below and
|
||||
%connected to a ``Begin'' node, except those which are inside a loop or
|
||||
%conditional block. Then data dependencies are added (red or dashed
|
||||
%edges), adding an edge between two nodes if there is a data dependency.
|
||||
%\item[SDG] Finally, the system dependence graph is the interconnection of
|
||||
%each method's PDG. When a call is made, the input arguments are passed
|
||||
%to subnodes of the call, and the result is obtained in another subnode.
|
||||
%There is an edge from the call to the beginning of the corresponding
|
||||
%method, and an extra type of edge exists: \textit{summary edges}, which
|
||||
%summarize the data dependencies between input and output variables.
|
||||
%\end{description}
|
||||
An example \added{of how an initial CFG is augmented and enhanced with all mentioned dependencies obtaining the corresponding PDG and the final SDG} is provided in figure~\ref{fig:basic-graphs}, where a \added{the process is illustrated for a} simple
|
||||
multiplication program\josep{pon el codigo del programa. asi pueden entender de que va esto los que no sepan de slicing. Sin el programa lo tienen mas complicado... Acabo de ver que ya esta el codigo. Entonces referencialo, presentalo: Consider the multiplication program in Figure X. The standard CFG and PDG generated for this code are... bla bla bla }\deleted{ is converted to CFG, then PDG and finally SDG}. For
|
||||
simplicity, \josep{quita el only}only the CFG and PDG of \texttt{main} are omitted\sergio{no entiendo esto de main. Donde esta main?}. Control
|
||||
dependencies are \added{represented with }black \added{arcs}, data dependencies \added{with} red \added{arcs}, and summary edges \added{are depicted with }blue \added{arcs}.
|
||||
Once the SDG has been created, the slicing criterion is mapped onto the graph and the edges are traversed backwards, starting from the node that represents the criterion's statement.
|
||||
The traversal is performed in two passes: the first ignores one specific kind of edge, and the second ignores another.
|
||||
Once the second pass has finished, all the nodes visited form the slice.
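
The two-pass traversal can be sketched in Java as follows. This is a simplified illustration, not the implementation used in this thesis: the choice of which edge kinds each pass skips follows the classic two-phase scheme of Horwitz et al. (the first pass skips parameter-output edges; the second pass skips call and parameter-input edges) and is an assumption with respect to the text above. The class \texttt{TwoPassSliceSketch}, the edge kinds and the hand-written graph \texttt{EXAMPLE} are likewise hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: a tiny hand-built SDG with one call from main to f.
final class TwoPassSliceSketch {
    enum Kind { CONTROL, DATA, CALL, PARAM_IN, PARAM_OUT, SUMMARY }

    // Edge from -> to: node `to` depends on node `from`.
    static final class Edge {
        final String from;
        final String to;
        final Kind kind;
        Edge(String from, String to, Kind kind) {
            this.from = from; this.to = to; this.kind = kind;
        }
    }

    static Edge e(String from, String to, Kind kind) { return new Edge(from, to, kind); }

    static final List<Edge> EXAMPLE = List.of(
            e("main", "x=1", Kind.CONTROL), e("main", "y=2", Kind.CONTROL),
            e("main", "call", Kind.CONTROL), e("main", "print", Kind.CONTROL),
            e("call", "a_in", Kind.CONTROL), e("call", "a_out", Kind.CONTROL),
            e("x=1", "a_in", Kind.DATA), e("a_out", "print", Kind.DATA),
            e("call", "f", Kind.CALL), e("a_in", "f_in", Kind.PARAM_IN),
            e("f_out", "a_out", Kind.PARAM_OUT), e("a_in", "a_out", Kind.SUMMARY),
            e("f", "f_in", Kind.CONTROL), e("f", "body", Kind.CONTROL),
            e("f", "f_out", Kind.CONTROL), e("f", "log", Kind.CONTROL),
            e("f_in", "body", Kind.DATA), e("body", "f_out", Kind.DATA));

    // One backward pass from `start`, skipping every edge whose kind is ignored.
    static Set<String> pass(List<Edge> edges, Set<String> start, Set<Kind> ignored) {
        Set<String> visited = new HashSet<>(start);
        Deque<String> work = new ArrayDeque<>(start);
        while (!work.isEmpty()) {
            String node = work.pop();
            for (Edge edge : edges) {
                if (edge.to.equals(node) && !ignored.contains(edge.kind)
                        && visited.add(edge.from)) {
                    work.push(edge.from);
                }
            }
        }
        return visited;
    }

    // The second pass starts from every node reached by the first one.
    static Set<String> slice(List<Edge> edges, String criterion) {
        Set<String> first = pass(edges, Set.of(criterion), EnumSet.of(Kind.PARAM_OUT));
        return pass(edges, first, EnumSet.of(Kind.CALL, Kind.PARAM_IN));
    }

    public static void main(String[] args) {
        System.out.println(slice(EXAMPLE, "print"));
    }
}
```

In the sample graph, slicing from \texttt{print} reaches the body of \texttt{f} through the parameter-output edge in the second pass, while the unrelated nodes \texttt{y=2} and \texttt{log} are never visited.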
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\begin{minipage}{0.4\linewidth}
|
||||
\begin{example}[The creation of a system dependence graph]
|
||||
\label{exa:create-sdg} \sergio{This example gives too much detail about the graphs.}
|
||||
Consider the code provided in Figure~\ref{fig:create-sdg-code}, where a simple Java program containing two methods (\texttt{main} and \texttt{multiply}) is displayed.
|
||||
|
||||
\begin{figure}[h]
|
||||
\begin{lstlisting}
|
||||
int multiply(int x, int y) {
|
||||
void main() {
|
||||
multiply(3, 2);
|
||||
}
|
||||
|
||||
int multiply(int x, int y) {
|
||||
int result = 0;
|
||||
while (x > 0) {
|
||||
result += y;
|
||||
|
@ -198,115 +165,105 @@ dependencies are \added{represented with }black \added{arcs}, data dependencies
|
|||
}
|
||||
System.out.println(result);
|
||||
return result;
|
||||
}
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\begin{minipage}{0.59\linewidth}
|
||||
\includegraphics[width=\linewidth]{img/multiplycfg}
|
||||
\end{minipage}
|
||||
\caption{A simple Java program with two methods.}
|
||||
\label{fig:create-sdg-code}
|
||||
\end{figure}
|
||||
|
||||
Now turn your attention to Figure~\ref{fig:create-sdg-cfg}\carlos{is this too personal? the second person is used in other places, but not as directly}: a CFG has been created for each method. Each CFG has a unique source node (without incoming edges) and a unique sink node (without outgoing edges), named ``Entry'' and ``Exit'' respectively. In between, the statements are connected according to every order in which they could be executed.
|
||||
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
\includegraphics[width=0.6\linewidth]{img/multiplycfg}
|
||||
\caption{The control flow graphs for the code in Figure~\ref{fig:create-sdg-code}.}
|
||||
\label{fig:create-sdg-cfg}
|
||||
\end{figure}
|
||||
|
||||
Next is Figure~\ref{fig:create-sdg-pdg}, which is a reordering of the CFG's nodes according to the dependencies between statements: the PDG. Finally, both PDGs are connected into the SDG.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\linewidth]{img/multiplypdg}
|
||||
\includegraphics[width=\linewidth]{img/multiplysdg}
|
||||
\caption{A simple multiplication program, its CFG, PDG and SDG}
|
||||
\label{fig:basic-graphs}
|
||||
\end{figure}
|
||||
\sergio{nose si vale la pena poner la Figure~\ref{fig:basic-graphs} aqui, no hemos contado aun como se genera, sino que se genera y se supone que se cuenta mas adelante, tal vez sea mas util hacer referencia forward solo y no poner esta figura aqui, sino mas adelante. Plantearselo}
|
||||
\caption{The program dependence graphs (above) and system dependence graph (below) generated from the code in Figure~\ref{fig:create-sdg-code}.}
|
||||
\label{fig:create-sdg-pdg}
|
||||
\end{figure}
|
||||
\end{example}
|
||||
|
||||
\subsection{Metrics}
|
||||
\sergio{Metrics o slicing indicators/features? }
|
||||
\subsection{Program slicing metrics}
|
||||
|
||||
In the area of program slicing, there exist many slicing techniques and tools implementing them.
|
||||
This fact has created the need to classify them by defining a set of metrics.
|
||||
These metrics are commonly associated with some features of the generated slices, or with the resources used by the slicing tool.
|
||||
The following list details the most relevant metrics considered when evaluating a program slice:
|
||||
|
||||
\josep{The main four metrics used to assess a program slicing algorithm are:}There are four relevant metrics considered when evaluating a slicing algorithm:
|
||||
\sergio{This feels very terse to me; I would pad it out a bit, as Tama says.}
|
||||
|
||||
\sergio{PROPOSAL:}
|
||||
|
||||
\sergio{In the area of program slicing, there are many different slicing techniques and tools implementing them. This fact has created the necessity to classify them by defining a set of different metrics. These metrics are commonly associated to some features of the generated slices. In the following, we list the most relevant metrics considered when evaluating a program slice:}
|
||||
|
||||
\begin{description}
|
||||
\item[Completeness.] The solution includes all the statements that affect
|
||||
the slicing criterion. This is the most important feature, and almost all
|
||||
publications\josep{techniques and implemented tools} achieve at least completeness. Trivial completeness is
|
||||
easily achievable, as simple as including the whole program in the
|
||||
slice.
|
||||
\item[Correctness.] The solution excludes all statements that do not affect
|
||||
the slicing criterion. Most solutions are complete, but the degree of correctness is
|
||||
what sets them apart, as solutions that are more correct will produce smaller slices, which will execute fewer instructions to compute the same values, decreasing the executing time and complexity.
|
||||
\item[Features covered.] Which features \josep{(polymorphism, global variables, arrays, etc.)} or language\josep{s/paradigms} a slicing algorithm
|
||||
covers. Different approaches to slicing cover different programming
|
||||
languages and even paradigms. There are slicing techniques (published or
|
||||
commercially available) for most popular programming languages, from C++
|
||||
to Erlang. Some slicing techniques only cover a subset of the targeted
|
||||
language, and as such are less useful for commercial applications, but
|
||||
can be a stepping stone in the betterment of the field.\sergio{There are also techniques that work for every language; ORBS would fall in that category, right Josep?}\josep{yes, some techniques are paradigm-independent, ORBS among them. In exchange they pay a price, which is usually a loss of precision. I would not dwell on that topic, but it would be good to cite ORBS and similar techniques when saying ``even paradigms''}
|
||||
\item[Speed.] Speed of graph generation and slice creation. As previously
|
||||
stated, slicing is a two-step process: building a graph and traversing it \sergio{this sentence makes slicing sound like freehand drawing... give it some weight by talking about translating the code into a graph representation with a complex data structure, and so on...}.
|
||||
The traversal is a linear two--pass analysis of a graph in most proposals, with small variations.
|
||||
Graph generation tends to be a longer process, but it is not as
|
||||
relevant, because it is only done once (per program being analyzed), making this the least important metric. \sergio{You could add that, although the generation process is not usually given much weight as a metric, it exists because that is where the most expensive analysis of the program has to be done... pad it out! As it stands, generating the graph seems to have no merit at all :(}
|
||||
Only proposals that deviate from the aforementioned schema of building a graph and traversing it show a wider variation in speed.
|
||||
\item[Completeness.] The solution includes all the statements that affect the slicing criterion. This is the most important feature, and almost all techniques and implemented tools set out to generate at least complete slices. Completeness can be achieved trivially by including the whole program in the slice.
|
||||
\item[Correctness.] The solution excludes all statements that do not affect the slicing criterion. Most solutions are complete, but the degree of correctness is what sets them apart, as solutions that are more correct will produce smaller slices, which will execute fewer instructions to compute the same values, decreasing the execution time and complexity.
|
||||
\item[Features covered.] Which features (polymorphism, global variables, arrays, etc.), programming languages or paradigms a slicing tool is able to cover. There are slicing tools (published or commercially available) for most popular programming languages, from C++ to Erlang. Some slicing techniques only cover a subset of the targeted language, and as such are less useful, but can be a stepping stone in the betterment of the field. There also exist tools that cover multiple languages or that are language-independent \cite{BinGHI14}. A drawback of language-independent tools is that they tend to perform worse on the other metrics.
|
||||
\item[Resource consumption.] Speed and memory consumption of graph generation and slice creation. As previously stated, slicing is a two-step process: building a graph and traversing it, with the first process being quadratic and the second linear (in time). Proposals that build upon the SDG try to keep traversal linear, even if that means making the graph bigger or slowing down its building process.
|
||||
|
||||
Though this metric may not seem as important as the others, program slicing is not a simple analysis. On top of that, some applications of program slicing, such as debugging, constantly change the program and the slicing criterion, which makes faster slicing software preferable for them.
|
||||
|
||||
Memory consumption is less relevant, mainly due to its availability, but could become a concern in big systems with millions of lines of code. \carlos{Check this.}
|
||||
\end{description}
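
The first two metrics can be restated as set inclusions. The following is only a sketch under assumed notation that is not used elsewhere in this chapter: $P$ denotes the set of statements of the program, $A_c \subseteq P$ the set of statements that actually affect the slicing criterion $c$, and $S_c \subseteq P$ the slice computed by a given technique.
\[
  \mbox{complete:}\quad A_c \subseteq S_c
  \qquad\qquad
  \mbox{correct:}\quad S_c \subseteq A_c
\]
Under this reading, the trivial slice $S_c = P$ is always complete, and a slice that is both complete and correct satisfies $S_c = A_c$. The degree of correctness of a complete slice can then be read as how small the set of superfluous statements $S_c \setminus A_c$ is.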
|
||||
|
||||
\subsection{Program slicing as a debugging technique}
|
||||
|
||||
\sergio{Sorry to insist, but this again feels very short to me :/. I will retouch it a bit}
|
||||
|
||||
\added{As stated before, there are many uses for program slicing: program specialization, software maintenance, code obfuscation... but there is no doubt that p}\deleted{P}rogram slicing is first and foremost a debugging technique\added{.} \deleted{, having e}\added{E}ach\deleted{variation}\added{configuration of different dimensions serves} a different purpose:
|
||||
\subsection{Variations and applications of program slicing}
|
||||
|
||||
As stated before, there are many uses for program slicing: program specialization, software maintenance, code obfuscation... but there is no doubt that program slicing is first and foremost a debugging technique.
|
||||
Program slicing can also be performed with small variations on the algorithm or on the meaning of ``slice'' and ``slicing criterion'', so that it answers a slightly or totally different question.
|
||||
Each variation of program slicing answers a different question and serves a different purpose:
|
||||
|
||||
\begin{description}
|
||||
\item[Backward static.] Used to obtain the lines that affect a statement,
|
||||
normally used on a line which outputs an incorrect value, to narrow\sergio{track?} down
|
||||
\item[Backward static.] Used to obtain the lines that affect the slicing criterion,
|
||||
normally used on a line which contains an incorrect value, to track down
|
||||
the source of the bug.
|
||||
\item[Forward static.] Used to obtain the lines affected by a statement,
|
||||
used to identify dead code, to check the effects a line has on the rest
|
||||
of the program.\josep{the main application of forward slicing is software maintenance: predicting which parts of the program will be affected by a change.}
|
||||
\carlos{https://ieeexplore.ieee.org/document/83912}
|
||||
\item[Forward static \cite{GalL91}.] Used to obtain the lines affected by the slicing criterion. Its main application is software maintenance: when changing a statement, slice the program w.r.t. that statement to discover the parts of the program that will be affected by the change.
|
||||
\item[Chopping static.] Obtains both the statements affected by and the
|
||||
statements that affect the selected statement.
|
||||
statements that affect the selected statement. \carlos{Add application and verify question.}
|
||||
\item[Dynamic.] Can be combined with any of the previous variations, and
|
||||
limits the slice to an execution log\josep{history}, only including statements that
|
||||
limits the slice to an execution history, only including statements that
|
||||
have run in a specific execution. The slice produced is much smaller and
|
||||
useful.
|
||||
\item[Quasi--static.] \added{In this slicing method s}\deleted{S}ome input values are given, and some are left
|
||||
unspecified: the result is a slice between the small dynamic slice and
|
||||
useful, but must be recomputed each time. It can be used for debugging when the input values that cause the error are known.
|
||||
\item[Quasi--static.] In this slicing variant, some input values are given, and some are left
|
||||
unspecified: the result is a slice sized between the small dynamic slice and
|
||||
the general but bigger static slice. It can be especially useful when
|
||||
debugging a set of function calls which have a specific static input for
|
||||
some parameters, and variable input for others.
|
||||
\item[Simultaneous.] Similar to dynamic slicing, but considers multiple
|
||||
executions instead of only one. Similarly to quasi--static slicing, it
|
||||
can offer a slightly bigger slice while keeping the scope focused on the
|
||||
source of the bug.
|
||||
\item
|
||||
\carlos{maybe add more???}
|
||||
\sergio{they seem enough to me; you could add a short sentence of 2 or 3 lines saying that there are more, and mention in general terms some use of the remaining ones, but I find them sufficient.}
|
||||
\josep{enough. Add a paragraph saying that there are other dimensions that give rise to other techniques, and that [16] contains an analysis of the different dimensions that can be used to classify slicing techniques}
|
||||
executions instead of only one. It is another middle ground between static and dynamic slicing, similar to quasi--static slicing.
|
||||
Likewise, it can offer a slightly bigger slice than pure dynamic slicing while keeping the scope focused on the slicing criterion and the set of executions.
|
||||
\end{description}
|
||||
|
||||
Many more variations exist. They are detailed in surveys of the field, such as \cite{Sil12}, which analyzes the different dimensions that can be used to classify slicing techniques.
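
The difference between the backward and forward variants listed above can be illustrated with a plain reachability computation over a dependence graph. The following is a minimal sketch under simplifying assumptions, not the algorithm of any cited work: the class \texttt{SliceSketch} and its methods are hypothetical names, statements are identified by their line number, and the interprocedural details of the SDG traversal are ignored.

\begin{lstlisting}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a dependence graph over statement numbers.
public class SliceSketch {
    // successors: edges in the "affects" direction;
    // predecessors: the same edges, reversed.
    private final Map<Integer, List<Integer>> successors = new HashMap<>();
    private final Map<Integer, List<Integer>> predecessors = new HashMap<>();

    // Records that statement `from` affects statement `to`.
    public void addDependence(int from, int to) {
        successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        predecessors.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    // Statements that affect the criterion (edges traversed backwards).
    public Set<Integer> backwardSlice(int criterion) {
        return reach(criterion, predecessors);
    }

    // Statements affected by the criterion (edges traversed forwards).
    public Set<Integer> forwardSlice(int criterion) {
        return reach(criterion, successors);
    }

    // Worklist reachability; the start node is part of the result.
    private static Set<Integer> reach(int start,
                                      Map<Integer, List<Integer>> edges) {
        Set<Integer> visited = new TreeSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        visited.add(start);
        pending.push(start);
        while (!pending.isEmpty()) {
            int node = pending.pop();
            List<Integer> next =
                edges.getOrDefault(node, Collections.<Integer>emptyList());
            for (int n : next) {
                if (visited.add(n)) {
                    pending.push(n);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        SliceSketch g = new SliceSketch();
        g.addDependence(1, 2);
        g.addDependence(1, 3);
        g.addDependence(2, 4);
        g.addDependence(3, 5);
        System.out.println(g.backwardSlice(4)); // prints [1, 2, 4]
        System.out.println(g.forwardSlice(3));  // prints [3, 5]
    }
}
\end{lstlisting}

In this toy graph, statement 3 is absent from the backward slice of statement 4 because no path of dependences leads from 3 to 4, whereas the forward slice of 3 contains only the statements that a change to 3 could reach.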
|
||||
|
||||
\section{Exception handling in Java}
|
||||
\label{sec:intro-exception}
|
||||
|
||||
Exception handling is common in most modern programming languages. \added{Exception handling generally consists in a set of statements that modify the normal execution flow noticing the existence of an abnormal program behaviour (controlled or not), and can be handled manually by the programmer or automatically by the system, depending on the programming language. In our work we focus on the Java programming language, so in the following, we describe the elements that Java uses to represent and handle exceptions:} \deleted{In Java, it
|
||||
consists of the following elements:}
|
||||
Exception handling is common in most modern programming languages. It generally consists of a few new instructions used to modify the normal execution flow and later return to it. Exceptions are used to react to an abnormal program behaviour (controlled or not), and either solve the error and continue the execution, or stop the program gracefully. In our work we focus on the Java programming language, so in the following, we describe the elements that Java uses to represent and handle exceptions:
|
||||
|
||||
\begin{description}
|
||||
\item[Throwable.] A class that encompasses all the exceptions or errors
|
||||
that may be thrown. Its child classes are \texttt{Error} for internal errors in the Java Virtual Machine and \texttt{Exception} for normal errors.
|
||||
Exceptions can be classified as \textit{unchecked}
|
||||
(those that extend \texttt{RuntimeException}\sergio{this overflows the line because of the texttt} or \texttt{Error}) and
|
||||
\textit{checked} (all others, may inherit from \texttt{Throwable}, but typically they do so from \texttt{Exception}). The first kind may be thrown anywhere without warning, whereas
|
||||
the second, if thrown, must be either caught in the same method or declared in the method header.
|
||||
that may be thrown. Its two main subclasses are \texttt{Error} for internal errors in the Java Virtual Machine and \texttt{Exception} for normal errors. Errors of the first kind are generally not caught, as they indicate a critical internal error, such as running out of memory, or overflowing the stack. The second kind encompasses the rest of the exceptions that occur in Java.
|
||||
All exceptions can be classified as either \textit{unchecked}
|
||||
(those that extend \texttt{RuntimeException} or \texttt{Error}) or
|
||||
\textit{checked} (all others, may inherit from \texttt{Throwable}, but typically they do so from \texttt{Exception}). Unchecked exceptions may be thrown anywhere without warning, whereas
|
||||
checked exceptions, if thrown, must be either caught in the same method or declared in the method header.
|
||||
\item[throw.] A statement that activates an exception, altering the normal
|
||||
control-flow of the method. If the statement is inside a \textit{try}
|
||||
block with a \textit{catch} clause for its type or any supertype, the
|
||||
control-flow of the method. If the statement is inside a \texttt{try}
|
||||
block with a \texttt{catch} clause for its type or any supertype, the
|
||||
control flow will continue in the first statement of such clause.
|
||||
Otherwise, the method is exited and the check performed again, until
|
||||
either the exception is caught or the last method in the stack
|
||||
(\textit{main}) is popped, and the execution of the program ends
|
||||
(the \texttt{main} method) is popped, and the execution of the program ends
|
||||
abruptly.
|
||||
\item[try.] This statement is followed by a block of statements and by one
|
||||
or more \textit{catch} clauses. All exceptions thrown in the statements
|
||||
contained or any methods called will be processed by the list of
|
||||
catches. Optionally, after the \textit{catch} clauses a \textit{finally}
|
||||
block may appear.
|
||||
\carlos{Review stopped here.}
|
||||
\item[try.] This statement contains a block of statements and one
|
||||
or more \texttt{catch} clauses and/or a \texttt{finally} block.
|
||||
All exceptions thrown by the contained statements, or by any methods they call, will be processed by the list of \texttt{catch} clauses.
|
||||
\item[catch.] Contains two elements: a variable declaration (the type must
|
||||
be an exception \sergio{exception or exception type?}) and a block of statements to be executed when an
|
||||
exception of the corresponding type (or a subtype) is thrown.
|
||||
|
@ -327,7 +284,6 @@ consists of the following elements:}
|
|||
the program will end. Otherwise, the execution continues in the next
|
||||
statement after the \textit{try-catch-finally} block.
|
||||
\end{description}
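
The interplay between these elements can be seen in a small program. The following is a minimal sketch that is not taken from the examples of this thesis: the class \texttt{ExceptionFlowSketch}, the checked exception \texttt{InvalidInputException} and the methods \texttt{half} and \texttt{run} are hypothetical names chosen for illustration.

\begin{lstlisting}
public class ExceptionFlowSketch {
    // Checked exception: it extends Exception, not RuntimeException.
    static class InvalidInputException extends Exception {
        InvalidInputException(String message) {
            super(message);
        }
    }

    // A checked exception that is not caught here must be declared
    // in the method header.
    static int half(int n) throws InvalidInputException {
        if (n % 2 != 0) {
            throw new InvalidInputException("odd value: " + n);
        }
        return n / 2;
    }

    // Records which blocks are executed for a given input.
    static String run(int n) {
        StringBuilder trace = new StringBuilder();
        try {
            trace.append("try ");
            trace.append(half(n)).append(' ');
        } catch (InvalidInputException e) {
            // Reached only if half(n) throws.
            trace.append("catch ");
        } finally {
            // Executed whether or not an exception was thrown.
            trace.append("finally");
        }
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(4)); // prints: try 2 finally
        System.out.println(run(3)); // prints: try catch finally
    }
}
\end{lstlisting}

With an even argument, the \texttt{try} block completes normally and the \texttt{catch} clause is skipped. With an odd argument, the rest of the \texttt{try} block is abandoned as soon as \texttt{half} throws, and control continues in the \texttt{catch} clause. In both cases the \texttt{finally} block runs last.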
|
||||
\sergio{I really liked the explanations; it is very easy to follow how Java works, it even looks easy to use :D}
|
||||
|
||||
\footnotetext{Introduced in Java 7, see \url{https://docs.oracle.com/javase/7/docs/technotes/guides/language/catch-multiple.html} for more details.}
|
||||
|
||||
|
|
|
@ -3,5 +3,6 @@
|
|||
% !TEX root = ../paper.tex
|
||||
|
||||
\chapter{Conclusion}
|
||||
\label{cha:conclusion}
|
||||
|
||||
\carlos{todo}
|
|
@ -4,29 +4,17 @@
|
|||
|
||||
\chapter{Introduction}
|
||||
\label{cha:introduction}
|
||||
|
||||
\section{Motivation}
|
||||
\label{sec:motivation}
|
||||
|
||||
\carlos{Introduce program slicing rather than define it.}
|
||||
\textit{Program slicing} is a technique for program analysis and transformation whose main objective is to extract from a program the set of statements that affect a specific statement and set of variables, called a \textit{slicing criterion} \cite{Wei81,Tip95}. It answers the question ``Which parts of a program affect a set of variables in a specific statement?'' The program obtained by program slicing is called a \textit{slice}, and it has many uses, such as debugging \cite{DeMPS96}, program specialization \cite{OchSV05}, software maintenance \cite{HajF12}, code obfuscation \cite{MajDT07}, etc. This technique was originally defined \cite{Wei81} for a simple imperative programming language, but now can be used with practically all programming languages and paradigms.
|
||||
|
||||
Program slicing~\cite{Wei81} is a debugging technique that, given a line of
|
||||
code and a set of variables of a program, simplifies such program so that the only parts
|
||||
left of it are those that affect or are affected by the values of the selected variables. \josep{here, before the example, we should say informally what a slice is and what an SC is}
|
||||
|
||||
\sergio{This definition feels short to me, and I miss some of the uses of program slicing: why is it used? It is not really used only for debugging. It has more uses, which also gives us references to cite if we want.}
|
||||
|
||||
\sergio{SAC 2017 folder (3-page poster paper): ``Program slicing is a technique for program analysis and transformation whose main objective is to extract from a program those statements (the slice) that
|
||||
influence or are influenced by the values of one or more variables at some point of interest, often called slicing criterion [13, 12, 1, 9]. This technique has been adapted to practically all programming languages, and it has many applications such as debugging [3], program specialization [8], software maintenance [5], code obfuscation [7], etc.''.}
|
||||
|
||||
\sergio{I would take something from here to build a more complete definition; besides, we already use slicing terminology such as \textit{slice} and \textit{slicing criterion}.}
|
||||
\josep{I agree with Sergio. A couple more things: the introduction dives straight in with a definition. :-) Also, as it is defined (for the lay reader), it seems that a slice is everything that affects OR is affected by the slicing criterion, that is, as if the ``OR'' were part of the definition. I would talk only about backward slicing here, and leave forward slicing for later (just as you have left dynamic slicing for later).}
|
||||
|
||||
\begin{example}[Program slicing \josep{\deleted{in}\added{applied to}} a simple method]
|
||||
\sergio{Consider the code shown below / in Figure XX, containing a simple method written in Java.} If the left program is sliced on (line 5, variable \texttt{x}),\sergio{If we have already used slice and slicing criterion, here we can say which one is the slicing criterion and which one is the slice, and start using the correct terminology more naturally.} the
|
||||
result would be the program on the right, with the \texttt{if} block
|
||||
removed, as it does not affect the value of \texttt{x}.
|
||||
\begin{example}[Program slicing applied to a simple Java method]
|
||||
Consider the code shown on the left side of figure~\ref{fig:program-slicing-code}, which is a simple method written in Java. If that method is sliced with respect to the slicing criterion (line 5, variable \texttt{x}), the slice would be the program on the right. The \texttt{if} and print statements would be excluded from the slice, as they do not affect the value of \texttt{x}. As a test, the execution of line 5 on both programs would yield the same result ---assuming both the original program and the slice are executed with the same input value.
|
||||
\label{exa:program-slicing}
|
||||
\begin{center}
|
||||
|
||||
\begin{figure}[h]
|
||||
\begin{minipage}{0.49\linewidth}
|
||||
\begin{lstlisting}[stepnumber=1]
|
||||
void f(int x) {
|
||||
|
@ -47,39 +35,23 @@ void f(int x) {
|
|||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\end{center}
|
||||
\caption{A simple Java method (left) and its slice w. r. t. slicing criterion (line 5, \texttt{x}).}
|
||||
\label{fig:program-slicing-code}
|
||||
\end{figure}
|
||||
\end{example}
|
||||
|
||||
\carlos{Detail the different uses and avoid linking debugging with executable slices.}
|
||||
As depicted in example~\ref{exa:program-slicing}, slices are subsets of the original program. In the most general form, the execution of slices produces the same values in the slicing criterion as the original program would. In other words, the slicing criterion behaves identically in the slice and in the original. Some uses of program slicing, such as program specialization, require the slices to be executable, which is useful to extract an independent process from a bigger program or software library. Other uses do not, as the slices are used to find the complete set of dependencies of a slicing criterion.
|
||||
|
||||
Slices are executable programs whose execution produces the same values \sergio{Careful! Watch out for this minefield, since the weak slice comes up later.}\josep{you can avoid the minefield by starting the sentence like this: ``In its more general form, slices are...''} \carlos{Alternative: a program that behaves the same (same values or prefix list is defined later.)}
|
||||
for the specified line and variable as the original program, and they are used to
|
||||
facilitate debugging of large and complex programs, where the control and data flow may not
|
||||
be easily understandable. \josep{actually, executable slices are not usually used in debugging, but rather in program specialization...}
|
||||
Though it may seem a really powerful technique, many programming languages lack a mature program slicer which covers the whole language. Even widespread languages like Java do not have a complete program slicer that is publicly available or documented in the literature, which makes it difficult to use program slicing where it may be needed. Nevertheless, there exist commercial program slicers that cover Java, such as CodeSurfer\footnote{Created by GrammaTech. For more information, consult their website at \url{https://www.gramatech.com/}}.
|
||||
|
||||
Building a program slicer is not a simple task, requiring a considerable amount of analysis to obtain a valid slice. Smaller slices are preferable, but even more difficult to create. In Java specifically, several language features, such as arrays, polymorphism and inheritance, and exception handling, lead to scenarios that are quite difficult to analyze. This is the reason why there is no universal solution for all the existing problems in the field of program slicing. Conversely, several approaches are usually proposed to solve the same slicing problem. Program slicing is used in so many applications ---debugging, program comprehension, parallelization, dead code removal--- that any improvement to the state of the art improves those processes.
|
||||
|
||||
|
||||
Though it may seem a really powerful technique, \josep{many languages lack of a mature program slicer or one that covers the whole language. For instance,} the whole Java \sergio{First appearance of Java; mention that the example is written in Java, otherwise Java seems to appear out of the blue.} language is not
|
||||
completely covered by it, and that makes it difficult to apply in practical
|
||||
settings.
|
||||
|
||||
\sergio{I propose something like this to connect program slicing and exceptions:}
|
||||
|
||||
\sergio{Though it may seem a really powerful technique, the amount of analysis that need to be done to properly obtain a correct slice is very considerable. Many situations of the Java language lead to several scenarios (we could give some examples of tricky cases, such as recursion, arrays, objects... to show that not everything has a single or perfect solution, and that many proposals can be improved.) that are quite difficult to analyse, which is the reason because there does not exist a universal solution for all the existent problems in the field of program slicing. Conversely, many different approaches are usually proposed to solution the same slicing problem.}
|
||||
|
||||
\sergio{I know it is wordy, but it makes the text a gentler read xD.}\josep{Carlos goes straight to the point, without beating around the bush. :-). But it is indeed necessary (not later, but here in the introduction it is) to ease the reader in. This is the motivation, and so far there has been no motivation. I miss telling the reader that the technique has been applied and studied in practically all programming languages, and that it is an optimization technique used by compilers and by many static analysis techniques; that it is applied in debugging, program comprehension, parallelization, dead code removal, etc. We still need to motivate that this area is important, and not just an intellectual exercise or one more turn of the screw... all of this for the lay readers (e.g., the committee ;-))}
|
||||
|
||||
\sergio{Inside all this slicing problems, there is } An area that has been investigated, \sergio{but? (to avoid the yet ... yet)}yet does not have a definitive
|
||||
solution yet\sergio{,} \deleted{is}\josep{el is no hay que borrarlo} exception handling. Example~\ref{exa:program-slicing2}
|
||||
demonstrates\josep{shows} how, even using the latest developments \josep{to handle exceptions in} in \sergio{exception handling slicing }program
|
||||
slicing~\cite{AllH03}, the sliced version does not include the catch block \sergio{this approach is not able to include the catch block in the obtained slice}, and
|
||||
therefore does not produce a correct slice.
|
||||
Among others, there is an area that has been investigated, but does not have a definitive solution yet: exception handling. Example~\ref{exa:program-slicing2} shows how, even using the latest developments to handle exceptions in program slicing~\cite{AllH03,JiaZSJ06}, the slice produced is not valid.
|
||||
|
||||
\begin{example}[Program slicing with exceptions]
|
||||
\added{Consider}\deleted{If} the following program \josep{on the left that has been sliced (on the right) using}\deleted{is} sliced using Allen and Horwitz's proposal~\cite{AllH03} with respect to (line 17, variable \texttt{a})\added{. As \josep{it} can be appreciated, t}\deleted{, t}he
|
||||
slice is incomplete, \josep{because}as it lacks the \texttt{catch} block from lines 4-6.
|
||||
Consider figure~\ref{fig:program-slicing2-code}: the Java program on the left has been sliced (on the right) using Allen et al.'s proposal~\cite{AllH03} with respect to the slicing criterion (line 17, variable \texttt{a}).
|
||||
\label{exa:program-slicing2}
|
||||
\begin{center}
|
||||
\begin{figure}[h]
|
||||
\begin{minipage}{0.49\linewidth}
|
||||
\begin{lstlisting}[stepnumber=1]
|
||||
void f(int x) throws Exception {
|
||||
|
@ -124,38 +96,47 @@ void g(int a) throws Exception {
|
|||
}
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\end{center}
|
||||
\sergio{Captions, so that we can refer to them separately as original program (left) and slice (right)? Mark the SC in the code in bold, in another colour, or in some other way to highlight it.}
|
||||
\caption{A simple Java program with exceptions (left) and its slice w. r. t. line 17, variable \texttt{a} (right).}
|
||||
\label{fig:program-slicing2-code}
|
||||
\end{figure}
|
||||
|
||||
When the program is executed \josep{from the call}as \texttt{f(0)}, the execution log \sergio{we need to say that this is the list of instructions that are executed, in the order in which they are executed.}\josep{in program slicing it is called execution history, and you can add a citation to Corel and Lasky} would be: \texttt{1, 2, 3, 13, 14, 15, 4, 5, 8, 10, 13, 14, 17}. In the only execution of line \texttt{17}, variable \texttt{a} has value 1 in that line. However, in the slice produced, the execution log is \texttt{1, 2, 3, 13, 14, 15}. The exception thrown in \texttt{g()} is not caught in \texttt{f()}, so it returns with an exception and line \texttt{17} never executes.
|
||||
As a test of the validity of the slice, we can execute both (with the initial call being \texttt{f(0)}). We can define the \textit{execution history} as the list of instructions executed by a program \cite{KorL88}.
|
||||
As an example, the execution history of \texttt{g(1)} is \texttt{13, 14, 17}, and the execution history of \texttt{g(0)} is \texttt{13, 14, 15}.
|
||||
When the program is executed from the call \texttt{f(0)}, the execution history of the original program (left) is: \texttt{1, 2, 3, 13, 14, 15, 4, 5, 8, 10, 13, 14, 17}.
|
||||
The slicing criterion executes once: \texttt{a} has value 1.
|
||||
In contrast, the execution history for the slice is \texttt{1, 2, 3, 13, 14, 15}.
|
||||
Method \texttt{g} throws an exception, which is not caught, and the program ends with an error, stopping abruptly before reaching the slicing criterion.
|
||||
|
||||
The problem in this example is that the \texttt{catch} block in line \texttt{4} is not included, because ---according to the dependency graph \josep{computed by \cite{} and} shown \josep{in Figure~\ref{}}below--- it does not influence the execution of line \texttt{17}. Two kinds of dependencies among statements are considered: data dependence (a variable is read that may have gotten its value from a given statement) and control dependence (\josep{an} the instruction controls whether another \josep{instruction} executes).
|
||||
In the graph, the \josep{node associated with the} slicing criterion is marked in bold, the nodes that represent the slice are filled in grey\josep{too light. It cannot be seen on my computer}, and dependencies are displayed as edges, with control dependencies in black and data dependencies in red. Nodes with a dashed outline represent elements that are not statements of the program.
|
||||
The problem in this example is that the \texttt{catch} block in line 4 is not included.
|
||||
This is because ---according to the system dependence graph \cite{HorwitzRB88} computed using Allen et al.'s algorithm \cite{AllH03} and shown in Figure~\ref{fig:program-slicing2-graph} below--- it does not influence the execution of line 17.
|
||||
The graph displays the statements of the methods as nodes, and the dependencies between statements as edges. Some nodes have a dashed outline, as they do not correspond to a statement, but are needed by the algorithm.
|
||||
The node associated with the slicing criterion is marked in bold and the nodes that represent the slice are filled in grey. Note that there are some edges between both methods that are not shown. The only relevant ones (the ones traversed to create the slice) are shown, and the rest are hidden for clarity.
|
||||
|
||||
\begin{center}
|
||||
The graph traversal will be explained later, but the basic rule is that edges are traversed backwards starting from the slicing criterion. Any node that is reached is part of the slice; the rest can be disregarded.
|
||||
|
||||
\begin{figure}[h]
|
||||
\includegraphics[width=\linewidth]{img/motivation-example-pdg}
|
||||
\josep{turn all the figures into real (referenceable) figures with captions}
|
||||
\josep{I would find the graph clearer if the call and the callee were connected}
|
||||
\end{center}
|
||||
\caption{The system dependence graph for the method shown in Figure \ref{fig:program-slicing2-code}.}
|
||||
\label{fig:program-slicing2-graph}
|
||||
\end{figure}
|
||||
\end{example}
|
||||
|
||||
\carlos{move all images and code fragments to separate figures} \\
|
||||
\carlos{indicate the connection between graphs} \\
|
||||
\carlos{move the graph and its explanation to after the background; the reason and the solution are presented in section X}
|
||||
\carlos{move the graph and its explanation to after the background; the reason and the solution are presented in section X???}
|
||||
|
||||
Example~\ref{exa:program-slicing2} \josep{is a contribution of this work because it} showcases an important error in the current slicing procedure for programs that handle errors with exceptions\josep{\deleted{; because}\added{where}} the \texttt{catch} block is disregarded. The only way a \texttt{catch} block can be included in the slice is if a statement inside it is needed for another reason. However, Allen and Horwitz~\cite{AllH03} did not encounter\josep{tackle? account for?} this problem in their paper, as the values outputted by method calls are extracted after the \texttt{normal return} and each \texttt{catch}, and in a typical method call with output, the \texttt{catch} is included by default when the outputted value is used. This detail makes the error much smaller, as most \texttt{try-catch} structures are run to obtain a value. \sergio{Add the \textit{out} node so that what you have explained here is easier to understand, showing that the \textit{out} node exists but that the SC does not need it.}
|
||||
Example~\ref{exa:program-slicing2} is a contribution of this thesis, because it showcases an important error in the current state of the art.
|
||||
This example is generalized later (see chapter~\ref{cha:solution}), as under some conditions all \texttt{catch} statements are ignored, regardless of whether they are needed.
|
||||
The only way a \texttt{catch} block can be included in the slice is if a statement inside it is needed for another reason.
|
||||
However, Allen and Horwitz~\cite{AllH03} did not tackle this problem, as in some examples the \texttt{catch} statement is either included or unnecessary.
|
||||
|
||||
\added{There is also another }\deleted{A} notable case where a method that may throw an exception is run and no value is recovered (at least from the point of view of program slicing)\added{. It occurs}\deleted{is} when writing to the filesystem or making connections to servers, such as a database or a webservice to store information. In this case, if no confirmation is outputted signaling whether the storage of information was correct, the \texttt{catch} block \deleted{will be}\added{is} omitted, and the \josep{program} slicer \josep{\deleted{software}} \deleted{will} produce\added{s} an incorrect result.
|
||||
A common real-life instance of Example~\ref{exa:program-slicing2} is writing information to a file or a database, or any other instruction that has no data output (excluding side effects) and may throw an exception.
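A minimal sketch of such an instance follows. The class \texttt{StoreExample}, the method \texttt{store} and the file name \texttt{records.txt} are hypothetical and chosen only for illustration. The call produces no value that later statements use, and the \texttt{catch} block only reports the failure, so no statement after the \texttt{try-catch} has a data dependence on either of them.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class StoreExample {
    // Hypothetical method: it returns nothing, its only effect is the
    // file write (a side effect), and it may throw an IOException.
    public static void store(Path target, String record) throws IOException {
        Files.writeString(target, record);
    }

    public static void main(String[] args) {
        // Illustrative default path; an assumption, not taken from the thesis.
        Path target = Path.of(args.length > 0 ? args[0] : "records.txt");
        try {
            store(target, "id=1");
        } catch (IOException e) {
            // No confirmation value is produced here; the failure is only reported.
            System.err.println("storage failed: " + e.getMessage());
        }
        // A statement after the try-catch that uses no value defined inside it.
        System.out.println("done");
    }
}
```

Because \texttt{store} yields no confirmation value, this is the kind of \texttt{try-catch} structure that, according to the discussion above, can have its \texttt{catch} block left out of a slice.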
|
||||
|
||||
\section{Contributions}
|
||||
|
||||
The main contribution of this paper\carlos{thesis}\sergio{paper?research?work?}\josep{work o research} is a
|
||||
\added{new approach for program slicing with exception handling for Java programs.} \deleted{complete technique for program slicing programs in the presence of exception handling constructs for Java}. \added{Our approach}\deleted{This technique} extends the previous technique \added{proposed} by Allen et al. \cite{AllH03}. It \added{is able to properly slice}\deleted{considers} all cases considered in \deleted{that}\added{their} work, but it also provides a solution to \sergio{some other} cases not \deleted{considered}\added{contemplated}\josep{considered} by them.
|
||||
The main contribution of this thesis is a new approach for program slicing with exception handling for Java programs.
|
||||
Our approach extends the existing technique proposed by Allen and Horwitz~\cite{AllH03}.
|
||||
It is able to generate valid slices for all cases considered in their work, but it also provides a solution to other cases not contemplated by them. For the sake of completeness and in order to explain the process that led us to this solution, we first summarize the fundamentals of program slicing and its terminology, delving deeper into the progress of program slicing techniques related to exception handling.
|
||||
|
||||
For the sake of completeness and in order to understand the process that led us to this solution, we \josep{\deleted{will present}\josep{first summarize the fundamentals or background}} a brief history\sergio{background?} of program slicing \added{terminology}, specifically those changes that have affected exception handling.\sergio{delving deeper in the progress of program slicing techniques related to exception handling.?} Furthermore, we provide a summary of the
|
||||
different contributions each author has made to the field.
|
||||
|
||||
The rest of the paper is structured as follows: chapter~\ref{cha:background} summarizes the theoretical background required in program slicing and exception handling, chapter~\ref{cha:incremental} \josep{analyzes}will analyze each structure used in exception handling, explore\josep{s} the already available solution and propose\josep{s} a new technique that subsumes all of the existing solutions and provides correct slices for each case.\josep{sentence too long}
|
||||
Chapter~\ref{cha:state-art} provides a bird's eye view of the current state of the art, chapter~\ref{cha:solution} provides a summarized description of the new algorithm with all the changes proposed in chapter~\ref{cha:incremental}, and finally, chapter~\ref{cha:conclusion} \josep{concludes?}summarizes the paper\sergio{work?} and explores future avenues of work\sergio{possible improvements?}.
|
||||
The rest of this thesis is structured as follows. Chapter~\ref{cha:background} summarizes the theoretical background required in program slicing and exception handling, and chapter~\ref{cha:incremental} analyzes each structure used in exception handling and explores the existing solution.
|
||||
Chapter~\ref{cha:solution} lists the problems that occur in the state of the art, detailing the scope and importance of each one, and proposes an appropriate solution. Chapter~\ref{cha:state-art} provides a bird's eye view of the current state of the art. Finally, chapter~\ref{cha:conclusion} concludes the thesis and explores future avenues of work, such as improvements or optimizations that have not been explored in our solution.
|
||||
|
||||
% vim: set noexpandtab:tabstop=2:shiftwidth=2:softtabstop=2:wrap
|
||||
|
|
|
@ -55,4 +55,11 @@ subgraph cluster_f {
|
|||
//l3_in -> nr3 -> l4;
|
||||
//l10_in -> nr10 -> fee;
|
||||
}
|
||||
{
|
||||
edge [constraint = false, style = dashed];
|
||||
{l3 l10} -> enter_g [style = bold];
|
||||
{l3_in l10_in} -> a_in;
|
||||
//gee -> {fee l4};
|
||||
//gne -> {nr10 nr3};
|
||||
}
|
||||
}
|
Binary file not shown.
|
@ -1,5 +1,15 @@
|
|||
digraph g {
|
||||
Start [shape=box];
|
||||
End [shape=box];
|
||||
Start -> "int result = 0" -> "while (x > 0)" -> "result += y" -> "x--" -> "while (x > 0)" -> "System.out.println(result)" -> "return result" -> "End";
|
||||
digraph g {
|
||||
subgraph a {
|
||||
E [label = "Entry", shape = box];
|
||||
e [label = "Exit", shape = box];
|
||||
c [label = <x_in = 2<br/>y_in = 3<br/>multiply(2, 3)>];
|
||||
E -> c -> e;
|
||||
}
|
||||
|
||||
subgraph b {
|
||||
Entry [shape=box, label = <Entry<br/>x = x_in<br/>y = y_in>];
|
||||
Exit [shape=box];
|
||||
Entry -> "int result = 0" -> "while (x > 0)" -> "result += y" -> "x--" -> "while (x > 0)" -> "System.out.println(result)" -> "return result" -> Exit;
|
||||
{ rank = same; "while (x > 0)"; "System.out.println(result)"}
|
||||
}
|
||||
}
|
Binary file not shown.
|
@ -1,5 +1,11 @@
|
|||
digraph g { "multiply()" [shape=box, rank=min];
|
||||
digraph g {
|
||||
"main()" [shape=box, rank=min]
|
||||
"main()" -> "multiply(2, 3)" -> {"x_in = 2" "y_in = 3"};
|
||||
|
||||
"multiply()" [shape=box, rank=min];
|
||||
"multiply()" ->
|
||||
// Rank adjustment
|
||||
{ rank = same; "x = x_in" "y = y_in" }
|
||||
{ rank = same; "int result = 0"; "while (x > 0)"; "System.out.println(result)"; "return result"; }
|
||||
{ rank = same; "result += y"; "x--"; }
|
||||
// Control flow
|
||||
|
@ -10,17 +16,17 @@ digraph g { "multiply()" [shape=box, rank=min];
|
|||
"while (x > 0)" -> "result += y" [style=bold];
|
||||
"while (x > 0)" -> "x--" [style=bold];
|
||||
// Data flow
|
||||
"int result = 0" -> "result += y" [color=red];
|
||||
"int result = 0" -> "System.out.println(result)" [color=red];
|
||||
"int result = 0" -> "return result" [color=red];
|
||||
"result += y" -> "result += y" [color=red];
|
||||
"result += y" -> "System.out.println(result)" [color=red];
|
||||
"result += y" -> "return result" [color=red];
|
||||
"x--" -> "x--" [color=red];
|
||||
"x--" -> "while (x > 0)" [color=red];
|
||||
{ edge [color = red]
|
||||
{"int result = 0" "result += y"} -> {"result += y" "System.out.println(result)" "return result"};
|
||||
{"x--" "x = x_in"} -> {"x--" "while (x > 0)"};
|
||||
"y = y_in" -> "result += y";
|
||||
}
|
||||
// Order adjustment
|
||||
"int result = 0" -> "while (x > 0)" [style=invis];
|
||||
"while (x > 0)" -> "System.out.println(result)" [style=invis];
|
||||
"System.out.println(result)" -> "return result" [style=invis];
|
||||
"result += y" -> "x--" [style=invis];
|
||||
{ edge [style = invis];
|
||||
"int result = 0" -> "while (x > 0)";
|
||||
"while (x > 0)" -> "System.out.println(result)";
|
||||
"System.out.println(result)" -> "return result";
|
||||
"result += y" -> "x--";
|
||||
"x = x_in" -> "int result = 0";
|
||||
}
|
||||
}
|
Binary file not shown.
BIN
paper.pdf
Binary file not shown.