|
Logic and Computation Group: Potential Projects
The following are suggested projects which members of the Logic
and Computation group have indicated they are willing to
supervise. Some of them could be undertaken either briefly for a
summer internship or in greater depth and detail for a Masters
or PhD thesis. Suitability for PhD, Masters, Honours or Summer
Scholarship is indicated in each case.
Students interested in pursuing research at any level in the
group are invited to contact the group members directly,
mentioning any of the advertised projects which they find
interesting.
|
|
|
|
|
-
Optimal policies for stochastic local search
Study of the performance limits of local search methods for
propositional satisfiability problems.
Supervisor: John Slaney
-
Stochastic Local Search (SLS) algorithms for propositional
satisfiability (SAT) problems have been shown to perform
spectacularly well where the problems are fairly
unstructured (random k-SAT, for instance) but to be less
impressive in the cases of more structured problems. In this
project, we treat the search problem as a Markov decision
process (MDP) in which the various moves available to the
SLS solver, such as "flip a random bit" or "flip a random
bit in a random unsatisfied clause", are the actions
determining (stochastic) transitions and in which the
optimal policy is that determining the stochastic shortest
path to a goal state.
Of course, practical algorithms cannot really solve an MDP
with trillions of states in reasonable time, so this is not
proposed as a solution method. The point of the project is
to carry out a theoretical and experimental examination of
the effect of different moves and different strategies with
a view to comparing strategies to the optimum.
Requirements on the part of the student are an interest in
automated reasoning, some mathematical background and a
willingness to undertake a good deal of experimental work.
The project could be a suitable basis for an honours or
maybe a Masters thesis. Alternatively it could form part of
a larger PhD project on local search or on the mathematics
of boolean logic.
-
Son of SCOTT: a high-performance automatic theorem prover
Development of a semantically guided theorem prover with
world-beating performance.
Supervisor: John Slaney
-
The SCOTT project, which has been running intermittently for
about 15 years, is concerned with using (small) models of
sets of formulae to enhance the performance of a theorem
prover for first order logic. Two recent developments have
made it interesting to start a new phase of the
project. Firstly, in 2004 we developed prototype of SOS (Son
of SCOTT), a new prover with a much more efficient way of
using semantic information. Secondly, the underlying theorem
prover Otter was superceded by the much faster
Prover9. Putting all of this together, it is now possible to
construct one of the best systems in the world for general
purpose first order reasoning.
Writing, testing, refining and using an important new piece
of reasoning software is a project requiring creativity as
well as programming skills. Investigating the effectiveness
of semantic guidance as an element of first order deduction
further requires an understanding of logic and proof. For
the right student, this project offers the opportunity to
"push the envelope" of what mechanised reasoning software
can do and thus to make an impact on the applications of
automated reasoning.
This is a PhD project. Fragments of it could be undertaken
as the research components of lower degrees.
-
ARIA: Automated reasoning in algebra
A critical examination of the application of automated
systems to algebraic reasoning.
Supervisors: Tomasz Kowalski, John Slaney
-
Some of the most impressive successes of automatic proof
generation have been in abstract algebra, the most
celebrated being the proof of the Robbins conjecture by
McCune using the theorem prover EQP. Moreover, a brief
examination of the TPTP (Thousands of Problems for Theorem
Provers) library reveals a large collection of problems from
elementary algebra: in addition to the ALG section, there
are whole sections on group theory, ring theory, field
theory, lattice theory and boolean algebra. This emphasis on
algebraic reasoning is perhaps not too surprising given that
algebra naturally consists of theories determined by small
sets of axioms and features symbolic rather than, say,
numerical reasoning. TPTP contains many problems from both
classical and non-classical propositional logic for much the
same reasons. Unlike propositional logic, however, algebra
engages the interest of a large mathematical community, so a
success of automated reasoning such as the settling of the
Robbins conjecture is a significant event. At the same time,
it has been often enough said that the Robbins conjecture -
what we should now call EQP's Theorem - is "just algebra",
as though that makes it less important than the much more
difficult reasoning in analysis or number theory being
attempted in other automated reasoning research
programs. Behind this reaction is perhaps the thought that
algebraic manipulations are expected to be easy for
machines, however painful they may be for human
mathematicians.
An examination of the relationship between automated
reasoning and abstract algebra is overdue. The proposed ARIA
project will survey the past and present state of research
into automatic proof of algebraic theorems and will help to
shape such research by directing it towards what is of
interest to algebraists rather than towards what is easy for
the existing tools. That is, in a sentence, we intend to
substitute mathematical pull for technological push.
This is quite a flexible project. It could be undertaken as
a one-year project by an honours student with a strong
mathematical background, or extended to form the basis for
PhD level research.
-
Real-time Logic
Concepts and techniques for reasoning with unstable data.
Supervisor: John Slaney
-
In all standard views and applications of logic, we are
concerned with the timeless notion of logical closure of a
set (or sometimes a multiset or whatever) of formulae. The
consequences of data just are what they are,
instantaneously, even though in practice it takes time to
enumerate them - indeed, it would take forever to enumerate
them all. In this project, we ask whether the standard
notion is the most appropriate way to think of the matter,
given that sometimes a reasoning agent must make and
continually update its deductions, and base decisions upon
them, against a background of changing data. Does it make
sense to think of consequence as somehow rippling out from
the input data, so that at any given moment the "correct"
answer as to whether a given conclusion follows depends on
its deductive distance from the input? Are there useful
notions of logical robustness (insensitivity to small
changes), for instance, that can be used to qualify proof
systems or databases? Does the reliability of inference
depend on time as well as on logical accuracy?
These questions are very radical - there is virtually no
literature on the subject. We can approach them both from
the standpoint of logical theory, as they may suggest new
foundations for the very notion of consequence, and from
that of practical automated reasoning, as they may lead to
new design features for implementations aimed at real-time
applications.
This is a challenging PhD project, potentially with a high
impact but also with a risk that we find nothing useful. One
I would be keen to explore in collaboration with an
adventurous student!
-
Logical analysis of constraint models
Application of automated deduction to data-independent CSPs
Supervisor: Peter Baumgartner
-
Constraint
programming is a highly successful technology for
solving all kinds of combinatorial problems. It is one of
the closest approaches computer science has yet made to the
Holy Grail of programming: the user states the problem, the
computer solves it. A constraint model consists of a
(declarative) specification of the problem domain (e.g. what
a schedule is) and a data part specifying the concrete
problem.
Now, an interesting situation appears when the data part is
absent. Can we still do useful things then? Yes! Logical
techniques based on automated theorem proving can be used to
analyze the domain specification. For instance, one might
want to optimize the given specification. The situation is
comparable to query optimization in databases, which amounts
to modifying a query (without knowing the database) so that
it can be solved more efficiently.
Here are some possible topics. Each of them admits practical
and/or theoretical focus.
-
Translation of problem descriptions ("models") into
logical formalism.
-
Model optimization via mapping to logic proof tasks.
-
Application of
SMT technology
for problem analysis.
-
Integration and extension of the above mentioned base
technology into the G12 platform.
The work would be embedded in
a larger project
that develops NICTA's software platform for solving large
scale industrial combinatorial optimisation problems.
-
Instance-based methods
(Mainly) extending the theorem prover
Darwin
which is (co-)developed in our group
Supervisor: Peter Baumgartner
-
Instance-based methods
for first-order theorem proving search for proofs by
maintaining a set of instances of clauses and analyzing it
for satisfiability until completion. IBMs are conceptually
different from well established methods like resolution or
free-variable analytic tableaux, with a different search
space and different termination behaviour. This makes them
attractive from a practical point of view as a complementary
method. For instance, IBMs are decision procedures for
function-free clause sets and thus capture the complexity
class NEXPTIME. Topics include:
-
Equality reasoning: reasoning with equations is of
paramount importance in virtually all application
areas. As the theory of doing this within Darwin is
already developed (one could always improve, though) this
is mostly a programming project. Programming language is
OCaml.
-
Numbers: first-order theorem provers are utterly bad with
numbers. This project is about adding and experimenting
with finite-domain integer arithmetic to Darwin. This
could be done by clever encoding arithmetic in the input
and/or extending Darwin.
-
Minimal model computation: Darwin can compute finite
models of first-order logic formulas (provided they
exist). For some applications, e.g., within computational
linguistics or diagnosis, it is not enough to deliver just
any model. One wants to have a minimal model. In
diagnosis, for instance, this corresponds to the
experience that usually only a minimal set of components
of a device breaks at a time.
-
Some domains are necessarily infinite, e.g., the
integers. Instance-based methods are bad at detecting
satisfiability in presence of axiom sets that admit
infinite models only. Can we find a clever mechanism to
improve the situation, i.e. to discover and represent
(certain) infinite models? This is a challenging (also)
theoretical project.
-
Many practically interesting problem classes are
NEXPTIME-complete. Examples include satisfiability of
SHOIQ knowledge bases (important for the Semantic Web),
first-order model expansion (relevant for constraint
solving), satisfiability in the Ackermann-Class with
equality, and First-Order logic with two variables and
counting quantifiers. This suggests a number of
investigations: to look at existing transformations of the
above problems into function-free clause logic, check if
they are feasible, perhaps improve them and use tools such
as our Darwin prover to solve them.
-
Verified compilation
Towards a provably correct compiler for a standard
programming language.
Supervisor: Michael Norrish
-
It is now possible to describe real world languages, such as
C, Java and Ada, using formal methods (typically by giving
the languages an operational semantics). Real world
chip-architectures (such as the ARM instruction set) have
also been formally described. The big, remaining project is
to tie two such descriptions together, producing a compiler
that is proven to produce correct output.
Issues:
-
Modern compilers include many optimisations. Can a
verified compiler also include verified optimisations?
-
The “obvious” way to get a verified compiler
is to define the function that maps from source-code to
target assembly in a logic, and then prove that this
function preserves behaviour (for some value of
“behaviour”). Another approach might also
be possible: write code to prove that a piece of
assembly implemented the same behaviour as the source.
As long as the proof procedure had high coverage (it
might be tailored to only verify the output of a
particular compilation function, which function might
also output hints to help the procedure do the proof),
this would be “just as good”.
Tangents and Variations:
-
Compile a functional or logic language like SML or
Prolog.
-
Give a semantics to the sort of code that appears in
operating system kernels, which mixes C and assembler,
potentially within the same function.
-
Verify sophisticated optimisations on intermediate
languages.
Further reading:
But: Xavier Leroy may well “do”
this whole project in the near future; his existing
verifications go from a largish subset of C to a subset of
the PowerPC instruction set.
-
Formal Protocol Design and Analysis
Contribution to the specification and alanysis of
communication protocols
Supervisor: Michael Norrish
-
Design a language capable of expressing the intricacies of
modern communications protocols (such as TCP), but which
also admits a reasonably efficient checking function, able
to determine if an observed trace could have been generated
by a given protocol description. Such a tool and language
would be extremely valable to the community of
protocol designers and implementors.
This work would form part of NICTA's project "Valildating
Network Semantics", which is a collaboration with
researchers at the University of Cambridge.
Tangents and Variations:
Further reading:
-
Theory of Co-algebraic Types with Binders
Development of new logical theory relevant to programming
language semantics
Supervisor: Michael Norrish
-
Researchers in the area of programming language semantics
have long been exercised by the problems thrown up the
notions of binders and α-equivalence. Some progress
has been made recently, but focussing on types that are
quotients of algebraic or inductive types, where values are
finite in size. Co-algebraic types are potentially infinite
in size, which means that assumptions like the one that a
value only has finitely many free variables will no longer
hold.
Developing a theory of co-algebraic types with binders would
probably involve proving co-induction and co-recursion
principles for these types, and demonstrating their utility
in proofs drawn from the literature. One possible example
would be the Boehm tree, which is used in the theory of the
λ-calculus.
Further reading:
-
Mechanise X, Apply to Y
Contribution both to mechanised mathematics and to automated
reasoning applications.
Supervisor: Michael Norrish
-
A possible structure for a PhD project is to mechanise some
large-ish piece of mathematics (X) and to then use
this to solve some “real” computer science
problem (Y). Mechanisation is likely to require
sustained use of some tool capable of serving as a reasoning
assistant.
To bring this PhD template down to earth, consider a couple of
examples. One might, for instance, mechanise real analysis,
and then use it to verify floating-point algorithms
(
John Harrison).
Or mechanise probability theory, and then use it to verify
probabilistic algorithms
(
Joe Hurd).
-
Theorem-Proving Implementation—Interfaces
Exploring the area where automated reasoning overlaps with
HCI
Supervisor: Michael Norrish
-
How would one take an interactive, but powerful, tool such
as ACL2, HOL, Isabelle, or PVS and make it more
user-friendly, without compromsing its ability to handle
large problems? A graphical user-interface might be
appropriate, but would a GUI put off the power-user?
Perhaps a more significant rethinking of the theorem-proving
work-flow is necessary.
Further reading:
-
Theorem-Proving Implementation—Core Technology
Proving more theorems faster, with or without human intervention
Supervisor: Michael Norrish
-
This would be work on topics such as first-order proof,
rewriting and decision procedures for subject domains like
arithmetic. There is a wealth of expertise in
CSL and the
Logic & Computation programme on these areas, so I
might not necessarily be the best person to act as
supervisor, though work on these technologies in an LCF
setting (which tends to be interesting research) would be
something I would be involved with.
-
Proof Optimisation
Improving low-level proofs extracted from high-level ones
Supervisor: Michael Norrish
-
A proof over an LCF-style kernel is usually expressed (by
the user) in terms of high-level steps or tactics.
Ultimately however, the proof is still executed at the
lowest level of primitive inference steps. If these steps
are recorded, the resulting “proof trace” is
very long, for all that it may be possible to execute it
quickly.
A possible project would be to investigate these traces and
optimise them so that they proved the same results but did
so faster. Many of the tricks of compiler optimisation
might well have proof analogues, and insights from proof
theory may also be relevant.
This may turn out not to involve enough new work to warrant
a PhD, so it might be best undertaken as the research
component of a lower degree first, to act as a pilot project
for something larger if necessary.
-
-
General HOL4 Upgrade
Programming project
Supervisor: Michael Norrish
-
Upgrade the current HOL implementation (“HOL4”)
to run under the MLton compiler. This would require a
fairly substantial re-engineering of the basic HOL
work-flow, and the way it represents theories. Once this
much was done, there’d be scope to play with
connections to Eclipse (see
interfaces project
above).
Further reading:
-
Finding Models of Unsolvable First Order Goals
Implementing a generator of counter-examples for use in a
proof assistant
Supervisors: Peter Baumgartner and Michael Norrish
-
When using an interactive proof assistant, a powerful tactic
for finishing goals is the use of first order proof
procedures. Unfortunately, these procedures are
all-or-nothing affairs: if they succeed, they succeed
completely, but if they fail to prove the goal, the user
knows only that the provided lemmas and assumptions were not
sufficient to allow the procedure to find a proof.
The advent of model generation tools promises a way out of
this painful situation. These tools can be used to generate
what is effectively a counter-example to false goals. This
is much more helpful for the user. Either the tool's output
really is a counter-example and their proof will never
succeed, or the counter-example will be impossible because
the user has forgotten to tell the tools some relevant lemma
from a database of known facts.
The student’s task in this project would be to
implement a connection between two existing tools: the
interactive system HOL4, and the model generation tool
Darwin. This connection would provide a user-friendly
implementation of the scheme described above: when an
invocation of the first order prover in HOL failed, the
connection implementation would translate the HOL goal into
terms that Darwin could understand, call Darwin, and then
finally translate Darwin's provided counter-example back
into a human-readable form for the user’s benefit.
This project would suit a student familiar with functional
programming (HOL is written in SML), and happy with the
concepts of first order logic (terms, formulas,
quantifiers).
|
|