Chapter 2 half-reviewed
parent
2547aa7220
commit
fbd0f679b5
1 changed files with 48 additions and 41 deletions
|
@ -6,16 +6,19 @@
|
|||
\label{cha:background}
|
||||
|
||||
\section{Program slicing}
|
||||
\textsl{Program slicing} \cite{Wei81,Sil12} is a debugging technique that
|
||||
\textsl{Program slicing} \cite{Wei81,Sil12}\sergio{is there any reason why \cite{Sil12} is not in the intro? The only citation there is \cite{Wei81}. I propose removing \cite{Sil12} for consistency} is a debugging technique that
|
||||
answers the question: ``which parts of a program affect a given statement and
|
||||
set of variables?'' The statement and the variables are the basic input to create a slice
|
||||
and are called the \textsl{slicing criterion}. The criterion can be more
|
||||
complex, as different slicing techniques may require additional pieces of input.
|
||||
The \textsl{slice} of a program is the list of statements from the original
|
||||
program ---which constitutes a valid program--- whose execution will result in
|
||||
the same values for the variables (selected in the slicing criterion).
|
||||
the same values for the variables (selected in the slicing criterion).
|
||||
There exist two fundamental dimensions along which the problem of slicing can be
|
||||
proposed \cite{Sil12}:
|
||||
|
||||
\sergio{My proposal is to move the naive concept from here to the intro, so that readers understand something of the example, and then either refer back to that earlier definition here or introduce the slicing dimensions directly with a short preamble. A strong reason to define it there is that we use the word slice all the time and only define it after having used it for a while.}
|
||||
|
||||
\begin{itemize}
|
||||
\item \textsl{Static} or \textsl{dynamic}: slicing can be performed
|
||||
statically or dynamically.
|
||||
|
@ -26,17 +29,17 @@ proposed \cite{Sil12}:
|
|||
expanded to include a position in the log that corresponds to one
|
||||
instance of the selected statement, making it much more specific. It may
|
||||
help find a bug related to nondeterministic behavior (such as a random
|
||||
or pseudo-random number generator), but must be recomputed for each case
|
||||
or pseudo-random number generator), but \sergio{, despite selecting the same slicing criterion, the slice }must be recomputed for each case\sergio{different input value/execution considered?}
|
||||
being analyzed.
|
||||
\item \textsl{Backward} or \textsl{forward}: \textsl{backward slicing}
|
||||
\cite{Wei81} is generally more used, because it looks at the statements
|
||||
\cite{Wei81} is generally more used \sergio{we should say what it is before saying that it is used more, right? Change the order and rewrite this sentence. We say what it is and then that it is the one generally studied, or something like that}, because it looks at the statements
|
||||
that affect the slicing criterion. In contrast, \textsl{forward slicing}
|
||||
\cite{BerC85} computes the statements that are affected by the slicing
|
||||
criterion. There also exists a mixed approach called \textsl{chopping}
|
||||
\cite{JacR94}, which is used to find all statements that affect some variables in the slicing criterion and, at the same time, are affected by some other variables in the slicing criterion.
|
||||
\end{itemize}
|
||||
|
||||
Since the definition of program slicing, the most extended form of slicing has
|
||||
Since the definition of program slicing\sergio{Since Weiser defined program slicing in 1981}, the most \deleted{extended form}\added{studied configuration?} of slicing has
|
||||
been \textsl{static backward slicing}, which obtains the list of statements that
|
||||
affect the value of a variable in a given statement, in all possible executions
|
||||
of the program (i.e., for any input data).
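As a minimal illustration of static backward slicing, the following sketch shows a small hypothetical program and one possible slice of it. The function names `original` and `sliced`, and the choice of criterion (the value of `product` at the return statement), are assumptions made for this example and do not come from the cited works.

```python
def original(n):
    """Hypothetical program P: computes both a sum and a product."""
    total = 0
    product = 1
    for i in range(1, n + 1):
        total += i      # does not influence `product`
        product *= i    # influences `product`
    return total, product


def sliced(n):
    """Possible slice of P with respect to <return statement, {product}>.

    Only the statements that may affect `product` are kept.
    """
    product = 1
    for i in range(1, n + 1):
        product *= i
    return product


# Spot check on a few inputs: the slice yields the same value for
# `product` as the original program. A static slice must guarantee
# this for every input, which a finite test cannot prove.
agree = all(original(n)[1] == sliced(n) for n in range(10))
print(agree)
```

The statements that only update `total` are removed because no data or control dependency leads from them to `product`.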
|
||||
|
@ -44,17 +47,18 @@ of the program (i.e., for any input data).
|
|||
\label{def:strong-slice}
|
||||
\carlos{One of the citations is the correct one.}
|
||||
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
|
||||
$s$ is a statement and $v$ is a set of variables in $P$ (the variables may
|
||||
$s$ is a statement and $v$ is a set\sergio{aren't sets usually denoted with capital letters?} of variables in $P$ (the variables may
|
||||
or may not be used in $s$), $S$ is the \textsl{strong slice} of $P$ with
|
||||
respect to $C$ if $S$ has the following properties:
|
||||
respect to $C$ if $S$ has\sergio{fulfils?} the following properties:
|
||||
\begin{enumerate}
|
||||
\item $S$ is an executable program.
|
||||
\item $S \subseteq P$, or $S$ is the result of removing code from $P$.
|
||||
\item $S \subseteq P$, or $S$ is the result of removing code\sergio{code, or 0 or more statements?} from $P$.
|
||||
\item For any input $I$, the values produced on each execution of $s$
|
||||
for each of the variables in $v$ are the same when executing $S$ as
|
||||
when executing $P$. \label{enum:exact-output}
|
||||
\end{enumerate}
|
||||
\end{definition}
|
||||
\sergio{Didn't this definition also require ending with the same error in case the execution does not terminate? If so, consider adding something about it.}
|
||||
|
||||
\begin{definition}[Weak static backward slice \cite{RepY89}]
|
||||
\label{def:weak-slice}
|
||||
|
@ -62,10 +66,10 @@ of the program (i.e., for any input data).
|
|||
Given a program $P$ and a slicing criterion $C = \langle s,v \rangle$, where
|
||||
$s$ is a statement and $v$ is a set of variables in $P$ (the variables may
|
||||
or may not be used in $s$), $S$ is the \textsl{weak slice} of $P$ with
|
||||
respect to $C$ if $S$ has the following properties:
|
||||
respect to $C$ if $S$ has\sergio{fulfils?} the following properties:
|
||||
\begin{enumerate}
|
||||
\item $S$ is an executable program.
|
||||
\item $S \subseteq P$, or $S$ is the result of removing code from $P$.
|
||||
\item $S \subseteq P$, or $S$ is the result of removing code from $P$. \sergio{same as above}
|
||||
\item For any input $I$, the values produced on each execution of $s$
|
||||
for each of the variables in $v$ when executing $P$ are a prefix of
|
||||
those produced while executing $S$ ---which means that the slice
|
||||
|
@ -74,73 +78,76 @@ of the program (i.e., for any input data).
|
|||
\end{enumerate}
|
||||
\end{definition}
|
||||
|
||||
\sergio{$\forall~i~\in~I, v\in~V~\rightarrow~seq(i,v,P)~Pref~seq(i,v,S)$ where $seq(i,a,A)$ denotes the sequence of values obtained for $a$ when running input $i$ on program $A$. $I$ is the set of all possible inputs for $P$. Something along those lines, I think.}
|
||||
|
||||
Both definitions (\ref{def:strong-slice} and~\ref{def:weak-slice}) are
|
||||
used throughout the literature (see, e.g., \cite{pending}\carlos{Which citation? Most papers on exception slicing do not indicate or hint whether they use strong or weak.}), with some cases favoring the first and some the
|
||||
used throughout the literature (see, e.g., \cite{pending}\carlos{Which citation? Most papers on exception slicing do not indicate or hint whether they use strong or weak.}\sergio{Josep?}), with some cases \deleted{favoring}\added{favouring} the first and some the
|
||||
second. Though the definitions come from the corresponding citations, the naming
|
||||
was first used in a control dependency analysis by Danicic~\cite{DanBHHKL11},
|
||||
where slices that produce the same output as the original are named
|
||||
\textsl{strong}, and those where the original is a prefix of the slice,
|
||||
\textsl{weak}. Weak slicing tends to be preferred ---especially for debugging--- for two reasons: the algorithm can be simpler and avoid dealing with termination, and the slices can be smaller, narrowing the focus of the debugger. For some applications, strong slices are preferred, such as extracting a feature from a program, where there is a requirement that the resulting slice behave exactly like the original. In this paper we will indicate which kind of slice is produced with each new technique proposed.
|
||||
\textsl{weak}. Weak slicing tends to be preferred ---especially for debugging--- for two reasons: the algorithm can be simpler and avoid dealing with termination, and the slices can be smaller, narrowing the focus of the debugger. For some applications, \deleted{strong slices are preferred,} such as extracting a feature from a program, where there is a requirement that the resulting slice behave exactly like the original\added{, strong slices are preferred}. In this paper\sergio{??} we will indicate which kind of slice is produced with each new technique proposed. \sergio{Do we ever generate strong slices? Damn, we are good xD}
|
||||
|
||||
\begin{example}[Strong, weak and incorrect slices]
|
||||
\carlos{The table is labeled execution logs of... but the execution log is a different thing.}
|
||||
In table~\ref{tab:slice-weak} we can observe examples for the various
|
||||
definitions. Each row shows the values produced by the execution of a
|
||||
definitions. Each row shows the values \sergio{for a specific variable $v$ in the slicing criterion,} produced by \deleted{the}\added{a particular} execution of \deleted{a}\sergio{the original}
|
||||
program or one of its slices.
|
||||
The first is the original, which computes $3!$.
|
||||
Slice A's execution log is identical to the original and therefore it is a strong slice.
|
||||
Slice B is a weak slice: its execution correctly produces the same values as the original program, but it continues producing values after the original stops.
|
||||
Slice C is incorrect, as the values differ from the original.
|
||||
Some data or control dependency has not been included in the slice and the program produces different results; in this case the slice computes Fibonacci numbers instead of factorials.
|
||||
The first \added{row stands for}\deleted{is} the original \added{program}, which computes $3!$.
|
||||
Slice A's \deleted{execution log}\added{generated sequence of values} is identical to the original and therefore it is a strong slice.
|
||||
Slice B is a weak slice: its execution correctly produces the same \added{sequence of }values as the original program, but it continues producing values after the original stops.
|
||||
Slice C is incorrect, as the \added{generated sequence of} values differ\added{s} from the \added{sequence generated by the }original \added{program}.
|
||||
\sergio{Taking a closer look, one could think that }Some data or control dependency has not been included in the slice and the program produces different results; in this case the slice computes Fibonacci numbers instead of factorials.\sergio{This does not seem very relevant; consider removing it to avoid confusion with Fibonacci.}
|
||||
\end{example}
|
||||
|
||||
\begin{table}
|
||||
\centering
|
||||
\label{tab:slice-weak}
|
||||
\begin{tabular}{r | r | r | r | r | r }
|
||||
Iteration & \textbf{1} & \textbf{2} & \textbf{3} & \textbf{4} & \textbf{5} \\ \hline
|
||||
\deleted{Iteration}\added{Evaluation Number} & \textbf{1} & \textbf{2} & \textbf{3} & \textbf{4} & \textbf{5} \\ \hline
|
||||
Original & 1 & 2 & 6 & - & - \\ \hline
|
||||
Slice A & 1 & 2 & 6 & - & - \\ \hline
|
||||
Slice B & 1 & 2 & 6 & 24 & 120 \\ \hline
|
||||
Slice C & 1 & 1 & 3 & 5 & 8 \\
|
||||
\end{tabular}
|
||||
\caption{Execution logs of different slices and their original program.}
|
||||
\caption{\deleted{Execution logs of different slices and their original program.}\added{Sequence of values obtained for a certain variable of the original program and three different slices A, B and C for a particular input.}}
|
||||
\end{table}
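The comparison made in the example can be sketched as a small check over value sequences. The helper names `is_prefix` and `classify` are hypothetical, and the sequences are the rows of table~\ref{tab:slice-weak}. Note that this only inspects a single execution, whereas definitions~\ref{def:strong-slice} and~\ref{def:weak-slice} quantify over all inputs, so a positive answer here is a necessary condition rather than a proof.

```python
def is_prefix(short, long):
    """Return True if `short` is a (not necessarily proper) prefix of `long`."""
    return len(short) <= len(long) and list(long[:len(short)]) == list(short)


def classify(original_values, slice_values):
    """Classify one slice against the original for a single execution.

    Reports the strongest property that holds: identical sequences are
    'strong', sequences that extend the original are 'weak', anything
    else is 'incorrect'.
    """
    if list(slice_values) == list(original_values):
        return "strong"
    if is_prefix(original_values, slice_values):
        return "weak"
    return "incorrect"


# Rows of the table; '-' entries (no value produced) are simply left out.
original = [1, 2, 6]
slices = {
    "Slice A": [1, 2, 6],
    "Slice B": [1, 2, 6, 24, 120],
    "Slice C": [1, 1, 3, 5, 8],
}

results = {name: classify(original, values) for name, values in slices.items()}
print(results)
```

Every sequence that passes the strong check also passes the weak one, which is why the function tests for equality first.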
|
||||
|
||||
Program slicing is a language--agnostic tool, but the original proposal by
|
||||
|
||||
Program slicing is a language--agnostic tool\sergio{is program slicing a tool or a technique?}, but the original proposal by
|
||||
Weiser~\cite{Wei81} covered a simple imperative programming language.
|
||||
Since then, the literature has been expanded by dozens of authors, who have
|
||||
described and implemented slicing for more complex structures, such as
|
||||
unstructured control flow~\cite{HorwitzRB88}, global variables~\cite{???},
|
||||
exception handling~\cite{AllH03}; and for other programming paradigms, such as
|
||||
object--oriented languages~\cite{???} or functional languages~\cite{???}.
|
||||
\carlos{More can be added; the corresponding citations are missing.}
|
||||
\carlos{More can be added; the corresponding citations are missing.}\sergio{Great, we need to find them and add them. The bibliography looks short to me given all the papers out there; I think that once everything is in place there should be around 30, if not more.}
|
||||
|
||||
\subsection{The System Dependence Graph (SDG)}
|
||||
|
||||
There exist multiple approaches to compute a slice from a given program and
|
||||
There exist multiple approaches to compute a slice\sergio{this sounds odd to me; I would say program representations or data structures that allow the use of program slicing techniques, or something like that. To be discussed} from a given program and
|
||||
slicing criterion, but the most efficient and broadly used data structure is the System
|
||||
Dependence Graph (SDG), first introduced by Horwitz, Reps and
|
||||
Binkley~\cite{HorwitzRB88}. It is computed from the program's statements, and
|
||||
once built, a slicing criterion is chosen, the graph traversed using a specific
|
||||
algorithm, and the slice obtained. Its efficiency resides in the fact that for
|
||||
multiple slices that share the same program, the graph must only be built once.
|
||||
On top of that, building the graph has a complexity of $\mathcal{O}(n^2)$ \carlos{should I use $\mathcal{O}$ or $O$?} with
|
||||
respect to the number of statements in a program, but the traversal is linear
|
||||
Binkley \sergio{in 1988}\sergio{All the authors, or do we cite them with et al.? I ask so that we follow the same rule throughout the document}~\cite{HorwitzRB88}. It is computed from the program's statements\sergio{source code}, and
|
||||
once built, a slicing criterion is chosen, the graph \added{is} traversed using a specific
|
||||
algorithm, and the slice \added{is} obtained. Its efficiency resides in the fact that\added{,} for
|
||||
multiple slices \deleted{that share}\added{calculated for} the same program, the graph \deleted{must only be built}\added{generation process is only performed} once.
|
||||
On top of that, building the graph has a complexity of $\mathcal{O}(n^2)$ \carlos{should I use $\mathcal{O}$ or $O$?}\sergio{Josep?} with
|
||||
respect to the number of statements in \deleted{a}\added{the} program, but the traversal is linear
|
||||
with respect to the number of nodes in the graph (each corresponding to a
|
||||
statement).
|
||||
statement) \sergio{footnote?}.
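The build-once, traverse-many idea can be sketched as a plain backward reachability over a dependence graph. This is a simplified, single-procedure illustration and not the two-phase interprocedural traversal of Horwitz, Reps and Binkley~\cite{HorwitzRB88}; the function name `backward_slice`, the node numbering and the example program are assumptions made for this sketch.

```python
from collections import deque


def backward_slice(dependencies, criterion):
    """Collect every node that the criterion node transitively depends on.

    `dependencies` maps a node to the nodes it depends on (control or data).
    Each node is enqueued at most once, so the work is bounded by the
    number of nodes plus the number of edges.
    """
    visited = {criterion}
    worklist = deque([criterion])
    while worklist:
        node = worklist.popleft()
        for dep in dependencies.get(node, ()):
            if dep not in visited:
                visited.add(dep)
                worklist.append(dep)
    return visited


# Hypothetical program, one node per statement:
#   1: n = input()      4: i = 1            7: product *= i
#   2: total = 0        5: while i <= n:    8: i += 1
#   3: product = 1      6: total += i       9: print(product)
deps = {
    5: [1, 4, 8],           # loop condition uses n and i
    6: [2, 6, 4, 8, 5],     # uses total and i, controlled by the loop
    7: [3, 7, 4, 8, 5],     # uses product and i, controlled by the loop
    8: [4, 8, 5],           # uses i, controlled by the loop
    9: [3, 7],              # uses product
}

# The graph is built once and can be reused for several criteria.
slice_for_print = sorted(backward_slice(deps, 9))
slice_for_total = sorted(backward_slice(deps, 6))
print(slice_for_print)
print(slice_for_total)
```

Slicing on node 9 leaves out statements 2 and 6, which only concern `total`, while slicing on node 6 reuses the same graph and leaves out the statements that only concern `product`.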
|
||||
|
||||
The SDG is a directed graph, and as such it has vertices or nodes, each
|
||||
representing an instruction in the program ---barring some auxiliary nodes
|
||||
representing an \deleted{instruction}\added{statement} in the program ---barring some auxiliary nodes
|
||||
introduced by some approaches--- and directed edges, which represent the
|
||||
dependencies among nodes. Those edges represent various kinds of dependencies
|
||||
---control, data, calls, parameter passing, summary--- which will be defined in
|
||||
dependencies among nodes. Those edges represent various\sergio{several} kinds of dependencies
|
||||
---control, data, calls, parameter passing, summary--- which will be defined\sergio{further explained?} in
|
||||
section~\ref{sec:first-def-sdg}.
|
||||
|
||||
To create the SDG, first a \textsl{control flow graph} (CFG) is built for each method
|
||||
in the program, then its control and data dependencies are computed, resulting
|
||||
in the \textsl{program dependence graph} (PDG). Finally, all the graphs from every
|
||||
method are joined into the SDG. This process will be explained at greater
|
||||
To create the SDG, first \deleted{a}\added{the corresponding} \textsl{control flow graph} (CFG) is built for each method
|
||||
in the program, then\added{,} its \added{associated }control and data dependencies are computed, resulting
|
||||
in \added{a new graph representation known as }the \textsl{program dependence graph} (PDG)\sergio{citation??}. Finally, all the graphs from every
|
||||
method are joined \added{by the appearance of a new kind of inter-procedural arcs, the argument-in argument-out arcs that link function definitions with function calls, obtaining}\deleted{into} the \added{final} SDG. This process will be explained at greater
|
||||
length in section~\ref{sec:first-def-sdg}.
|
||||
%TODO: marked for removal --- this process is repeated later in ref{sec:first-def-sdg}
|
||||
%\begin{description}
|
||||
|
@ -164,10 +171,10 @@ lengths in section~\ref{sec:first-def-sdg}.
|
|||
%method, and an extra type of edge exists: \textsl{summary edges}, which
|
||||
%summarize the data dependencies between input and output variables.
|
||||
%\end{description}
|
||||
An example is provided in figure~\ref{fig:basic-graphs}, where a simple
|
||||
multiplication program is converted to CFG, then PDG and finally SDG. For
|
||||
simplicity, only the CFG and PDG of \texttt{main} are omitted. Control
|
||||
dependencies are black, data dependencies red, and summary edges blue.
|
||||
An example \added{of how an initial CFG is augmented and enhanced with all mentioned dependencies obtaining the corresponding PDG and the final SDG} is provided in figure~\ref{fig:basic-graphs}, where \deleted{a }\added{the process is illustrated for a} simple
|
||||
multiplication program\deleted{ is converted to CFG, then PDG and finally SDG}. For
|
||||
simplicity, only the CFG and PDG of \texttt{main} are omitted\sergio{I don't understand this about main. Where is main?}. Control
|
||||
dependencies are \added{represented with }black \added{arcs}, data dependencies \added{with} red \added{arcs}, and summary edges \added{are depicted with }blue \added{arcs}.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
|
|