%include fhm-preamble.fmt
%include fhm-preamble-extras.fmt
\section{The Type System}
\label{sec:type-sys}
% * Explain the basis of the type system
% - Equational language embedded in the lambda-calculus
% - Restricted system F.
% - Hindley-Milner inference.
% - Notation conventions.
The type system is presented as an embedding of an equation-based
language into the simply-typed $\lambda$-calculus. The type system has
been implemented in the dependently typed programming language Agda
\cite{Agda}, giving us assurances that the algorithm is both total
and terminating.
\newcommand\T{\rule{0pt}{2.6ex}}
\newcommand\B{\rule[-1.2ex]{0pt}{0pt}}
\begin{figure}
\begin{minipage}[b]{0.5\linewidth}
\centering
\begin{tabular}{ll}
\hline
Description \T \B & Symbol \T \B \\
\hline \\
$\lambda$-bound variables & |x|, |y| \\
Expressions ($\lambda$-terms) & |e| $\in\Lambda$ \\
Signal-variables & |z| \\
Balance type-variables ~~~~~~ & |n|,|m|,|o| $\in\mathbb{Z}$ \\
Signal level expressions & |s|
\end{tabular}
\end{minipage}
\begin{minipage}[b]{0.5\linewidth}
\centering
\begin{tabular}{ll}
\hline
Description \T \B & Symbol \T \B \\
\hline \\
Equations & |q| \\
Simple types & |tau| \\
Type schemes & |sigma| \\
Typing environments ~~~~~~ & |Gamma| \\
Constraint sets & |C|
\end{tabular}
\end{minipage}
\caption{Notation Conventions}
\label{fig:notation}
\end{figure}
The notation $\overline{\chi}$ is used to denote a sequence
$\chi_1,\ldots,\chi_n$ without repetition of elements. We will also
allow ourselves to treat $\overline{\chi}$ as sets equipped with the
usual set-theoretic operations. One should also note that |x| (and
|y|) and |z| are \emph{meta-variables}, ranging over the names of
concrete function-level and signal-level variables, respectively.
\subsection{Overview}
% * Motivation for using the type system
% - We can only assume types are known.
% - The structure of SR may also be unknown.
% * Outline of the refinements.
% - SR types indexed by the number of equations they contribute.
% - Both the structure and usage context of a signal relation
% determine the balance constraints generated.
% - Similar to dependent types, but far more restricted.
% - Constraints may contain several balance variables, and as such
% are separated from the ``types''. Adoption of Haskell style
% class constraint syntax.
As signal relations are first-class entities, it cannot be assumed
that components can be flattened in order to determine the
equation-variable balance.
% The only reasonable assumption is that only
% the \emph{type} of a signal relation is known statically.
The only reasonable assumption is that all that is known statically
is the \emph{type} of a relation.
To track the equation-variable balance, the type of a signal relation
is refined by annotating it with the number of equations it is able to
contribute to a system. The contribution of a signal relation may also
depend on the contribution of the parameters to the signal
relation. Hence, signal relations can behave in a polymorphic fashion,
contributing varying numbers of equations depending on the context in
which the relation is used. See Sect. \ref{sec:related} for a
comparative review of alternative type system designs.
Since structural information required to determine a precise
contribution may not always be available, the context in which a
signal relation is applied is used to generate \emph{balance
constraints} (from now on, simply constraints). These constraints
restrict the balance of a component to an interval.
Note that a representation of integers and linear inequalities has
been introduced at the type level. This extension may appear to be a
restricted form of dependent types \cite{McKinna2006}. However, these
type level representations, whilst determined by the structure of
terms, are not value level terms themselves. As such, we do not
consider our system to be dependently typed.
Constraints may mention the contributions of several components, and
hence are not directly associated with a single signal relation. As a
result, the type of a signal relation is restricted to being annotated
by a \emph{balance variable} which is then further refined using
constraints. The type checking algorithm generates a fresh balance
variable for each signal relation, with type equality defined
up to alpha equivalence of balance variables. As an example, the
refined type for |resistor| from Sect.~\ref{sec:mse-fah} is:
> resistor :: (n = 2) => Resistance -> SR (Pin, Pin) n
Haskell's type class constraint syntax has been adopted to express
that the balance type variable |n| is constrained to the value
|2|. This can be verified by first flattening the signal relation
applications to obtain a set of |3| equations over |5| variables (note
that each |Pin| contains two variables), then removing one equation
which must be used to solve for the local variable $u$, giving a
contribution of two equations.
\subsection{Generating Constraints}
\label{sec:type-sys-gs}
% * Explain the purpose/requirement of the constraints
% - What constraints to impose
% - What impact these constraints have on type checking.
% * Define the simplified syntax of types and terms:
% - Superficiality of simplifications
% * Formally define:
% - Equations ``mentioning'' variables
% - Number of concrete equations in a signal relation.
% - Local and interface variables
% - Local, interface and mixed equations
% - Contribution of a signal relation.
% * Provide the 5 criteria.
% * Provide motivation example (par circuit)
% - Show what constraints are generated for the example.
Now let us consider constraint generation. It would be desirable to
catch as many faults in a system of equations as possible. However,
one cannot hope to retain a perfect interpretation of unsolvability of
variables in the type, instead an approximation must be admitted. This
is not a real issue as the goal of the proposed type system is not to
guarantee that a system of equations can be solved, but rather to
detect classes of errors that will definitely lead to structurally
unsound systems, e.g. unbalanced systems.
This leads to the question of what constraints should be generated. It
is conceivable that different application domains could generate
constraints specific to that domain. This is not a problem, as the
system developed is independent of the constraints generated. For the
purposes of this paper, 5 criteria for generating constraints have
been chosen. Before introducing the criteria, a number of definitions
are required.
Fig.~2 and 3 give the syntax of terms and types from which the type
checking algorithm will be developed. A number of simplifications have
been made to the FHM framework in order to keep the presentation of
the type system concise. Note that all simplifications are superficial
and do not fundamentally change the nature of the problem.
\begin{figure}
%% \vspace{-3em}
\begin{minipage}[t]{0.33\linewidth}
\centering
\begin{code}
e ::= x
| e1 e2
| \x . e
| let x = e1 in e2
| sigrel (bar z) where (bar q)
^^
q ::= Atomic (bar z)
| e <> (bar z)
\end{code}
\label{syntax:terms}
\end{minipage}
\begin{minipage}[t]{0.33\linewidth}
\centering
\begin{code}
sigma ::= overline C => tau
^^
tau ::= tau1 -> tau2
| SR rm n
| LEqn n
| IEqn n
| MEqn n
\end{code}
\label{syntax:types}
\end{minipage}
\begin{minipage}[t]{0.32\linewidth}
\centering
\begin{code}
C ::= ce1 = ce2
| ce1 >= ce2
^^
ce ::= n
| INT
| ce + ce
| - ce
\end{code}
\label{syntax:constraints}
\end{minipage}
\vspace{-1.5em}
\caption{Syntax of terms, types, and constraints.}
\end{figure}
We consider the simply-typed $\lambda$-calculus, given by |e|,
augmented with first-class signal relation constructs. Signal
relations abstract over sets of signal variables, denoted $\overline{z}$,
and embed a new syntactic category of |equations| into the calculus,
given by |q|.
Signal relations range over sets of equations, which may take one of
two forms. An atomic equation of the form |s1 = s2| is abstracted to
just the set of distinct signal variables occurring in the signal
expressions |s1| and |s2|. Similarly, an equation of the form |e <>
(bar s)| is abstracted to the expression denoting the applied signal
relation and the set of signal variables that occur on the
right-hand-side of the application. More detailed comments on theses
syntactic categories are given in
Sect. \ref{sec:type-sys-formalising}.
An equation |q| is said to \emph{mention} a signal variable |z| if and
only if |z elem vars(q)|. The function |total| returns the raw number
of atomic equations contributed by an equation. On the other hand, |abs
(overline q)| denotes the cardinality of the set of \emph{modular}
equations. Both |vars| and |total| are also overloaded for sets of
equations.
\begin{minipage}{0.45\linewidth}
\begin{code}
vars (Atomic (bar z)) = (bar z)
vars (_ <> (bar z)) = (bar z)
vars ((bar q)) =
Union { vars (q) | q elem (bar q) }
\end{code}
\end{minipage}
\begin{minipage}{0.45\linewidth}
\begin{code}
total (Atomic _) = 1
total (e : SR _ n <> _) = n
total ((bar q)) =
Sum { total (q) | q elem (bar q) }
\end{code}
\end{minipage}
Given a signal relation |sigrel (bar z) where (bar q)|, the set of
interface variables is defined |subZ I = (bar z)|, and the set of
local variables | subZ L = vars((bar q)) \\ (bar z) |. The set of
equations |(bar q)| can be partitioned into the disjoint subsets of
interface equations | subQ I |, local equations | subQ L |, and mixed
equations | subQ M |, where | subQ I | is the set of equations
mentioning only interface variables, | subQ L | is the set of
equations mentioning only local variables, and | subQ M = ((bar q)
\\ subQ I) \\ subQ L |. Finally, the balance of a signal relation is
written $bal(sr)$. It is now straightforward to specify the constraint
criteria:
%% Given a signal relation |sigrel (bar z) where (bar q)|, over a set of
%% signal variables |Z = (bar z) union vars((bar q))|, we define the set
%% of interface variables |subZ I = (bar z)|, and the set of local
%% variables | subZ L = vars((bar q)) \\ (bar z) |. The set of equations
%% |(bar q)| can be partitioned into the disjoint subsets of interface
%% equations | subQ I |, local equations | subQ L |, and mixed equations
%% | subQ M |, where | subQ I | is the set of equations mentioning only
%% interface variables, | subQ L | is the set of equations mentioning
%% only local variables, and | subQ M = ((bar q) \\ subQ I) \\ subQ L
%% |. Finally, the balance of a signal relation is written $bal(sr)$. It
%% is now straightforward to specify the constraint criteria:
\begin{enumerate}
\item |bal(sigrel (bar z) where (bar q)) = total((bar q)) - abs(subQ
L)|. The balance of a signal relation is an aggregate of the
equations in its body, excluding sufficiently many equations to solve for
the local variables.
\item |abs(subQ L) + abs(subQ M) >= abs(subZ L)|. The local variables
are not under-determined.
\item |abs(subQ L) <= abs(subZ L)|. The local variables are not
over-determined.
\item |abs(subQ I) <= abs(subZ I)|. The interface variables are not
over-determined.
\item |0 <= bal(sr) <= abs(subZ I)|. A signal relation must
contribute equations only for its interface variables. It should not
be capable of removing equations from other components (negative
balance), or adding equations for variables not present in its
interface.
\end{enumerate}
The above criteria produce constraints that give adequate assurances
for detecting structural anomalies. There is potential to further
refine these criteria. However, for the purposes of this paper, these
criteria are sufficient to demonstrate the value of the type system.
To illustrate the application of the above five criteria, consider the
Hydra example |par| that connects two circuit components in
parallel. The type signature gives the type of |par| under the simply
typed approach. The operational details of this example are not
important, the only important concept is that of equations
\emph{mentioning} variables.
\noindent
\begin{minipage}[b]{0.55\linewidth}
\begin{code}
par :: SR (Pin , Pin) -> SR (Pin , Pin) -> SR (Pin , Pin)
par sr1 sr2 =
sigrel ((pi, pv), (ni, nv)) where
sr1 <> ((p1i, p1v), (n1i, n1v))
sr2 <> ((p2i, p2v), (n2i, n2v))
pi + p1i + p2i = 0
ni + n1i + n2i = 0
pv = p1v = p2v
nv = n1v = n2v
\end{code}
\end{minipage}
\begin{minipage}[b]{0.45\linewidth}
\includegraphics{twoParRes.eps}
\vspace{0.9cm}
\end{minipage}
Under the new type system, the signal relations in |par| are annotated
by balance variables, which are then constrained by the criteria
producing the following refined type:
\begin{code}
par :: { m = n + o - 2, 6 >= n + o >= 2, 0 <= m <= 4, 0 <= n <= 4, 0 <= o <= 4 } =>
SR (Pin , Pin) n -> SR (Pin , Pin) o -> SR (Pin , Pin) m
\end{code}
While this type may appear daunting at first, all balance variables
and constraints can be inferred without requiring the programmer to
annotate the terms explicitly. It is also useful to see an example of
a program that fails to type check under the new type system -- a
program that previously would have been accepted, despite being
faulty.
\begin{code}
broken sr = sigrel (a , b) where
sr <> (w + x , y + z)
sr <> (a , b)
x + z = 0
\end{code}
The above function |broken| is flawed in that there is no relation to
which it can be safely applied. The relation |sr| is required to
provide at least 3 equations for local variables, but must not exceed
a contribution of 2 variables as dictated by the second relation
application. As expected, our type system catches this error by
attempting to impose the following inconsistent set of constraints:
\begin{code}
broken :: (m = n + n - 3, 0 <= m <= 2, 0 <= n <= 2, 4 <= n + 1 <= 4)
=> SR (Real , Real) n -> SR (Real , Real) m
\end{code}
There is little use in generating constraints without a mechanism to
check their consistency. To this end, the Fourier-Motzkin elimination
method is employed \cite{Kuhn}. The method allows one to
check not only if a set of linear inequalities is satisfiable, but
also finds a continuous interval for each bound variable. It is
expected that this will be useful when reporting type errors to the
programmer.
The elimination algorithm has worst case exponential time complexity
in the number of balance variables. However, as shown by Pugh
\cite{Pugh91}, the modified variant that searches for integer
solutions is capable of solving most common problem sets in low-order
polynomial time. Furthermore, systems typically involve only a handful
of balance variables, making most exponential cases still feasible to
check.
\subsection{Formalising the Type System}
\label{sec:type-sys-formalising}
% * Brief recap of the strategy.
% * Explain syntax of types:
% - SR p n
% - Eqn n m o
% - Pattern type (R,...,R)
% - Constraint sets
% - Quantification of balance and type variables.
% * Explain aspects of the typing rules
% - Read-only environment I
% - Bottom-up calculation of Eqn indices (m, n, o)
% * Brief overview of the typing algorithm?
% * Properties of the type system
% - Totality, termination
% - Totality, termination of Fourier-Motzkin
% - Correctness w.r.t flattening semantics
% * Implementations of the ty pe system
% - Haskell implementation
% - Ongoing work toward Agda implementation
Fig. \ref{syntax:terms} from Sect. \ref{sec:type-sys-gs} provide the
syntax of terms and types upon which the semantics and typing
judgements presented in this section are based.
Fig. \ref{fig:semantics} presents a small-step semantics for our
calculus by way of a flattening for a system of equations. Values in
our system are closed lambda-terms of the form |\x . e|, signal relations
encapsulating atomic equations, and atomic equations.
The notation $\{\overline{z_1}/\overline{z_2}\}$ denotes the
substitution that occurs when reducing signal relation
application. Our abstract treatment of equations allows us to read
this notation as substituting every variable in $\overline{z_1}$ for
all variables in $\overline{z_2}$, a simplification of the
substitution given in Sect. \ref{sec:mse-aosoe}. The symbol |fresh|
denotes a fresh sequence of signal variables, used in \textsc{S-SigApp2}
to rename local variables to prevent name clashes during flattening.
\newcommand{\sigrel}[2]{\textbf{sigrel}~{#1}~\textbf{where}~{#2}}
\begin{figure}
\begin{mathpar}
\infer
{e_1 \leadsto e_2}
{e_1~e_3 \leadsto e_2~e_3}
\quad \textsc{(S-App1)}
\infer
{ }
{(\lambda x . e_1)~e_2 \leadsto [x \mapsto e_2]~e_1}
\quad \textsc{(S-App2)}
\infer
{ }
{\textbf{let}~x = e_1~\textbf{in}~e_2 \leadsto [x \mapsto e_1]~e_2}
\quad \textsc{(S-Let)}
\infer
{e_1 \leadsto e_2}
{e_1 \diamond \overline{z} \leadsto e_2 \diamond \overline{z}}
\quad \textsc{(S-SigApp1)}
\infer
{\exists q_1 \in \overline{q}.~q_1 \leadsto q_2}
{\sigrel{\overline{z}}{\overline{q}} \leadsto \sigrel{\overline{z}}
{[q_1 \mapsto q_2]~\overline{q}}}
\quad \textsc{(S-SigRel)}
\infer{\overline{q_2} = \{(vars(\overline{q})\backslash\overline{z_1})/
fresh\}~\overline{q_1} }
{(\sigrel{\overline{z_1}}{\overline{q_1}}) \diamond \overline{z_2} \leadsto
\{\overline{z_1}/\overline{z_2}\}~\overline{q_2}}
\quad \textsc{(S-SigApp2)}
\end{mathpar}
\caption{Small-step semantics.}
\label{fig:semantics}
\end{figure}
The simplification of substitution discussed above has introduced a
slight disparity between our abstract formalisation and the concrete
system. In the FHM system, applying a signal relation contributing |n|
equations to a mixed set of variables results in |n| mixed
equations. However, during evaluation, it may be discovered that some
of the equations within the signal relation do not mention both local
and interface variables. Hence, the number of mixed, local, and
interface equations may be refined as a result of evaluation.
This problem is avoided in our semantics by the simplification to
substitution mentioned above. However, this should not pose a real
problem in the concrete system either. The preservation problem is
reminiscent of the record subtyping problem addressed in Peirce
\cite{Pierce}, pages 259--260. It should be possible to adapt the
technique of \emph{stupid casts} used in Pierce to solve the
preservation problems that would be present in a more concrete
semantics. To be more precise, one could allow a \emph{stupid cast} of
local and interface equations back into mixed equations, thus
retaining the same contribution and maintaining the same
constraints. We leave this alteration as future work, as the current
semantics are sufficient for the purposes of this paper.
The syntax of types is similar to that of the simply-typed
$\lambda$-calculus. The usual function type is augmented with 4
additional constructors for signal relations and equations.
Signal relation types mirror signal relation terms by recording the
cardinality of the set of variables, denoted |rm|. Additionally,
relations are annotated with a \emph{balance type variable} which may then
appear in a constraint set. Equation fragments may be assigned one of
the following types: |IEqn|, |LEqn|, or |MEqn|, representing equations
mentioning interface variables, local variables, and a mixture,
respectively. Type schemes are given by |sigma|, allowing types to be
constrained and polymorphic in the number of equations they
contribute. |C| and |ce| describe the syntax of constraints and
constraint expressions, respectively.
\begin{figure}[ht]
\centering
\begin{mathpar}
\inferrule
{\Gamma (x) = C \Rightarrow \tau}
{\Gamma \vdash x : C \Rightarrow \tau}
\quad \textsc{(T--Var)}
\inferrule
{\Gamma,x:C_1 \Rightarrow \tau_1 \vdash e : C_2 \Rightarrow \tau_2}
{\Gamma \vdash \lambda x.e : C_1 \cup C_2 \Rightarrow \tau_1 \rightarrow \tau_2}
\quad \textsc{(T--Abs)}
\inferrule
{\Gamma \vdash e_1 : C_1 \Rightarrow \tau_2 \rightarrow \tau_1
\and
\Gamma \vdash e_2 : C_2 \Rightarrow \tau_2}
{\Gamma \vdash e_1~e_2 : C_1 \cup C_2 \Rightarrow \tau_1}
\quad \textsc{(T--App)}
\inferrule
{\Gamma \vdash e_1 : C_1 \Rightarrow \tau_2
\and
\Gamma,x: C_2 \Rightarrow \tau_2 \vdash e_2 : C_1 \Rightarrow \tau_1 }
{\Gamma \vdash \textbf{let}~x=e_1~\textbf{in}~e_2 : C_1 \cup C_2 \Rightarrow \tau_1}
\quad \textsc{(T--Let)}
\inferrule
{ }
{I\cdot\Gamma \vdash Atomic~\overline{z} : \emptyset \Rightarrow
eqkind_I(\overline{z},1)}
\quad \textsc{(T--Atomic)}
\inferrule
{\Gamma \vdash e : C \Rightarrow |SR|~\rtothem~n
\and
\mathopen{||}\overline{z}\mathclose{||} \geq n
}
{I\cdot\Gamma \vdash e \diamond |(bar z)| : C \Rightarrow eqkind_I(|(bar z)|, n)}
\quad \textsc{(T--RelApp)}
\inferrule
{L=vars(\overline{q})\backslash \ \overline{z}
\and
%% \forall q \in \overline{q}.~\Gamma \vdash q : C_q \Rightarrow \tau_q
%% \overline{\Gamma \vdash {q} : c \Rightarrow \tau}^{~q \in \overline{q}}
\overline{z} \cdot \Gamma \vdash \overline{q} : \overline{C} \Rightarrow \overline{\tau}
\and
n_{X} = \Sigma \{~b~||~X\hspace{-0.3em}Eqn~b \in \overline{\tau}~\}
\\\\
C = \{ n = n_I + n_L + n_M - |abs L| , 0 \leq n \leq \mathopen{||}\overline{z}\mathclose{||}
, n_I \leq \mathopen{||}\overline{z}\mathclose{||}, n_L \leq |abs L| , n_L + n_M \geq |abs L| \}
}
{\Gamma \vdash \textbf{sigrel}~\overline{z}~\textbf{where}~\overline{q}
: \bigcup \overline{C} \cup C \Rightarrow |SR|~ \mathbb{R}^{\mathopen{||}\overline{z}\mathclose{||}} ~n}
\end{mathpar}
\[
eqkind_I(Z,n) =
\begin{cases}
IEqn~n & \text{if}~\emptyset \subset Z \subseteq I \\
LEqn~n & \text{if}~Z \cap I = \emptyset \\
MEqn~n & \text{otherwise}
\end{cases}
\]
\caption{Typing rules.}
\label{fig:rules}
\end{figure}
Fig. \ref{fig:rules} specifies 7 typing judgements for deriving types
in our type system. The rules for $\lambda$-terms are similar to those
of the simply-typed $\lambda$-calculus, with the addition of
constraint sets given by |C|. Operations that render a constraint
sets inconsistent indicate that a term is ill-typed.
\textsc{T-Atomic} and \textsc{T-RelApp} specify the rules for judging
the types of equations. These rules show how the integer indices of
the constructors of the form |XEqn| are calculated. These rules depend
on the read-only environment |I| which stores the set of interface
variables the equations range over. The remaining rule gives a
strategy for checking the types of signal relations, assigning a
balance, and generating constraints on that balance.
Balance type variables are treated in a similar manner to type
variables in the restricted polymorphic $\lambda$-calculus. Type
checking signal relations consists of generating a fresh balance type
variable for the balance, and where necessary, constraining this
variable. Unifying two equations or signal relations proceeds as
expected by unifying the the appropriate balance variables.
The type system has been implemented in the functional programming
language Haskell. Furthermore, the totality and termination of the
algorithm have been mechanically verified using the dependently typed
programming language Agda \cite{Agda}. It should be noted that the
function for computing the \emph{most general unifier} of two types is
postulated, this is not a problem as it is well known that unification
is total and terminating \cite{DamasMilner}. The source code can be
found on the primary authors website at
\texttt{http://cs.nott.ac.uk/\textasciitilde jjc}.