Getting Started
[Déjà Fu is] A martial art in which the user's limbs move in time as well as space, […] It is best described as "the feeling that you have been kicked in the head this way before"
-- Terry Pratchett, Thief of Time
Déjà Fu is a unit-testing library for concurrent Haskell programs. Tests are deterministic and expressive, making it easy and convenient to test your threaded code.
Available on GitHub, Hackage, and Stackage.
Features:
- An abstraction over the concurrency functionality in
IO - Deterministic testing of nondeterministic code
- Both complete (slower) and incomplete (faster) modes
- A unit-testing-like approach to writing test cases
- A property-testing-like approach to comparing stateful operations
- Testing of potentially nonterminating programs
- Integration with HUnit and tasty
There are a few different packages under the Déjà Fu umbrella:
| Version | Summary | |
|---|---|---|
| concurrency | 1.11.0.3 | Typeclasses, functions, and data types for concurrency and STM. |
| dejafu | 2.4.0.8 | Systematic testing for Haskell concurrency. |
| hunit-dejafu | 2.0.0.6 | Deja Fu support for the HUnit test framework. |
| tasty-dejafu | 2.1.0.2 | Deja Fu support for the Tasty test framework. |
See the latest package documentation.
Installation
Install from Hackage globally:
$ cabal install dejafu
Or add it to your cabal file:
build-depends: ...
, dejafu
Or to your package.yaml:
dependencies:
...
- dejafu
Quick start guide
Déjà Fu supports unit testing, and comes with a helper function called
autocheck to look for some common issues. Let's see it in action:
import Control.Concurrent.Classy
myFunction :: MonadConc m => m String
myFunction = do
var <- newEmptyMVar
fork (putMVar var "hello")
fork (putMVar var "world")
readMVar var
That MonadConc is a typeclass abstraction over concurrency, but
we'll get onto that shortly. First, the result of testing:
> autocheck myFunction
[pass] Never Deadlocks
[pass] No Exceptions
[fail] Consistent Result
"hello" S0----S1--S0--
"world" S0----S2--S0--
False
There are no deadlocks or uncaught exceptions, which is good; but the program is (as you probably spotted) nondeterministic!
Along with each result, Déjà Fu gives us a representative execution
trace in an abbreviated form. Sn means that thread n started
executing, and Pn means that thread n pre-empted the previously
running thread.
Why Déjà Fu?
Testing concurrent programs is difficult, because in general they are nondeterministic. This leads to people using work-arounds like running their testsuite many thousands of times; or running their testsuite while putting their machine under heavy load.
These approaches are inadequate for a few reasons:
- How many runs is enough? When you are just hopping to spot a bug by coincidence, how do you know to stop?
- How do you know if you've fixed a bug you saw previously? Because the scheduler is a black box, you don't know if the previously buggy schedule has been re-run.
- You won't get that much scheduling variety! Operating systems and language runtimes like to run threads for long periods of time, which reduces the variety you get (and so drives up the number of runs you need).
Déjà Fu addresses these points by offering complete testing. You can run a test case and be guaranteed to find all results with some bounds. These bounds can be configured, or even disabled! The underlying approach used is smarter than merely trying all possible executions, and will in general explore the state-space quickly.
If your test case is just too big for complete testing, there is also
a random scheduling mode, which is necessarily incomplete. However,
Déjà Fu will tend to produce much more schedule variety than just
running your test case in IO the same number of times, and so bugs
will tend to crop up sooner. Furthermore, as you get execution traces
out, you can be certain that a bug has been fixed by simply following
the trace by eye.
Bibliography
These libraries wouldn't be possible without prior research, which I mention in the documentation. Haddock comments get the full citation, whereas in-line comments just get the shortened name:
-
[BPOR] Bounded partial-order reduction, K. Coons, M. Musuvathi, and K. McKinley (2013)
-
[RDPOR] Dynamic Partial Order Reduction for Relaxed Memory Models, N. Zhang, M. Kusano, and C. Wang (2015)
-
[Empirical] Concurrency Testing Using Schedule Bounding: an Empirical Study, P. Thompson, A. Donaldson, and A. Betts (2014)
-
[RMMVerification] On the Verification of Programs on Relaxed Memory Models, A. Linden (2014)
There are also a few papers on dejafu itself:
-
Déjà Fu: A Concurrency Testing Library for Haskell, M. Walker and C. Runciman (2015)
This details dejafu-0.1, and was presented at the 2015 Haskell Symposium.
-
Déjà Fu: A Concurrency Testing Library for Haskell, M. Walker and C. Runciman (2016)
This is a more in-depth technical report, written between the dejafu-0.2 and dejafu-0.3 releases.
-
Revealing Behaviours of Concurrent Functional Programs by Systematic Testing, M. Walker (2018)
This is my Ph.D thesis, which discusses dejafu and my other research projects.
Typeclasses
We don't use the regular Control.Concurrent and Control.Exception modules,
we use typeclass-generalised ones instead from the concurrency and
exceptions packages.
Porting guide
If you want to test some existing code, you'll need to port it to the
appropriate typeclass. The typeclass is necessary, because we can't peek inside
IO and STM values, so we need to able to plug in an alternative
implementation when testing.
Fortunately, this tends to be a fairly mechanical and type-driven process:
-
Import
Control.Concurrent.Classy.*instead ofControl.Concurrent.* -
Import
Control.Monad.Catchinstead ofControl.Exception -
Change your monad type:
IO abecomesMonadConc m => m aSTM abecomesMonadSTM stm => stm a
-
Parameterise your state types by the monad:
TVarbecomesTVar stmMVarbecomesMVar mIORefbecomesIORef m
-
Some functions are renamed:
forkIO*becomesfork*atomicModifyIORefCASbecomesmodifyIORefCAS*
-
Fix the type errors
If you're lucky enough to be starting a new concurrent Haskell project, you can
just program against the MonadConc interface.
What if I really need I/O?
You can use MonadIO and liftIO with MonadConc, for instance if you need to
talk to a database (or just use some existing library which needs real I/O).
To test IO-using code, there are some rules you need to follow:
-
Given the same set of scheduling decisions, your
IOcode must be deterministic (see below). -
As dejafu can't inspect
IOvalues, they should be kept small; otherwise dejafu may miss buggy interleavings. -
You absolutely cannot block on the action of another thread inside
IO, or the test execution will just deadlock.
[!TIP] Deterministic
IOis only essential if you're using the systematic testing (the default). NondeterministicIOwon't break the random testing, it'll just make things more confusing.
Deriving your own instances
There are MonadConc and MonadSTM instances for many common monad
transformers. In the simple case, where you want an instance for a newtype
wrapper around a type that has an instance, you may be able to derive it. For
example:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE UndecidableInstances #-}
data Env = Env
newtype MyMonad m a = MyMonad { runMyMonad :: ReaderT Env m a }
deriving (Functor, Applicative, Monad)
deriving instance MonadThrow m => MonadThrow (MyMonad m)
deriving instance MonadCatch m => MonadCatch (MyMonad m)
deriving instance MonadMask m => MonadMask (MyMonad m)
deriving instance MonadConc m => MonadConc (MyMonad m)
MonadSTM needs a slightly different set of classes:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE UndecidableInstances #-}
data Env = Env
newtype MyMonad m a = MyMonad { runMyMonad :: ReaderT Env m a }
deriving (Functor, Applicative, Monad, Alternative, MonadPlus)
deriving instance MonadThrow m => MonadThrow (MyMonad m)
deriving instance MonadCatch m => MonadCatch (MyMonad m)
deriving instance MonadSTM m => MonadSTM (MyMonad m)
Don't be put off by the use of UndecidableInstances, it's safe here.
Unit Testing
Writing tests with Déjà Fu is a little different to traditional unit testing, as your test case may have multiple results. A "test" is a combination of your code, and a predicate which says something about the set of allowed results.
Most tests will look something like this:
dejafu "Assert the thing holds" myPredicate myAction
The dejafu function comes from Test.DejaFu. Another useful function is
dejafuWithSettings (see Advanced Usage).
Actions
An action is just something with the type MonadConc m => m a, or (MonadConc m, MonadIO m) => m a for some a that your chosen predicate can deal with.
For example, some users on Reddit found a couple of apparent bugs in the
auto-update package a while ago (thread here). As the
package is simple and self-contained, I translated it to the MonadConc
abstraction and wrote a couple of tests to replicate the bugs. Here they are:
deadlocks :: MonadConc m => m ()
deadlocks = do
auto <- mkAutoUpdate defaultUpdateSettings
auto
nondeterministic :: forall m. MonadConc m => m Int
nondeterministic = do
var <- newIORef 0
let settings = (defaultUpdateSettings :: UpdateSettings m ())
{ updateAction = atomicModifyIORef var (\x -> (x+1, x)) }
auto <- mkAutoUpdate settings
auto
auto
These actions action could be tested with autocheck, and the issues would be
revealed. The use of ScopedTypeVariables in the second is an unfortunate
example of what can happen when everything becomes more polymorphic. But other
than that, note how there is no special mention of Déjà Fu in the actions: it's
just normal concurrent Haskell, simply written against a different interface.
The modified package is included in the test suite, if you want to see the full code.
[!NOTE] The predicates in dejafu-tests are a little confusing, as they're the opposite of what you would normally write! These predicates are checking that the bug is found, not that the code is correct.
If the RTS supports bound threads (the -threaded flag was passed to GHC when
linking), then the main thread of an action given to Déjà Fu will be bound, and
further bound threads can be forked with the forkOS functions. If not, then
attempting to fork a bound thread will raise an error.
Conditions
When a concurrent program of type MonadConc m => m a is executed, it may
produce a value of type a, or it may experience a condition such as
deadlock.
A condition does not necessarily cause your test to fail. It's important to be aware of what exactly your test is testing, to avoid drawing the wrong conclusions from a passing (or failing) test.
Setup and Teardown
Because dejafu drives the execution of the program under test, there are some tricks available to you which are not possible using normal concurrent Haskell.
If your test does some set-up work which is required for your test to work, but which is not the actual thing you are testing, you can define that as a setup action:
withSetup
:: Program Basic n x
-- ^ Setup action
-> (x -> Program Basic n a)
-- ^ Main program
-> Program (WithSetup x) n a
dejafu will save the state at the end of the setup action, and efficiently restore that state in subsequent runs of the same test with a different schedule. This can be much more efficient than dejafu running the setup action normally every single time.
If you want to examine some state you created in your setup action even if your actual test case deadlocks or something, you can define a teardown action:
withSetupAndTeardown
:: Program Basic n x
-- ^ Setup action
-> (x -> Either Condition y -> Program Basic n a)
-- ^ Teardown action
-> (x -> Program Basic n y)
-- ^ Main program
-> Program (WithSetupAndTeardown x y) n a
The teardown action is always executed.
Finally, if you want to ensure that some invariant holds over some shared state, you can define invariants in the setup action, which are checked atomically during the main action:
-- slightly contrived example
let setup = do
var <- newEmptyMVar
registerInvariant $ do
value <- inspectMVar var
when (x == Just 1) (throwM Overflow)
pure var
in withSetup setup $ \var -> do
fork $ putMVar var 0
fork $ putMVar var 1
tryReadMVar var
If the main action violates the invariant, it is terminated with an
InvariantFailure condition, and any teardown action is run.
Predicates
There are a few predicates built in, and some helpers to define your own.
abortsNever | checks that the computation never aborts |
abortsAlways | checks that the computation always aborts |
abortsSometimes | checks that the computation aborts at least once |
An abort is where the scheduler chooses to terminate execution early. If you see it, it probably means that a test didn't terminate before it hit the execution length limit. Aborts are hidden unless you use explicitly enable them, see (see Advanced Usage).
deadlocksNever | checks that the computation never deadlocks |
deadlocksAlways | checks that the computation always deadlocks |
deadlocksSometimes | checks that the computation deadlocks at least once |
Deadlocking is where every thread becomes blocked. This can be, for
example, if every thread is trying to read from an MVar that has been emptied.
exceptionsNever | checks that the main thread is never killed by an exception |
exceptionsAlways | checks that the main thread is always killed by an exception |
exceptionsSometimes | checks that the main thread is killed by an exception at least once |
An uncaught exception in the main thread kills the process. These can be synchronous (thrown in the main thread) or asynchronous (thrown to it from a different thread).
alwaysSame | checks that the computation is deterministic and always produces a value |
alwaysSameOn f | is like alwaysSame, but transforms the results with f first |
alwaysSameBy f | is like alwaysSame, but uses f instead of (==) to compare |
notAlwaysSame | checks that the computation is nondeterministic |
notAlwaysSameOn f | is like notAlwaysSame, but transforms the results with f first |
notAlwaysSameBy f | is like notAlwaysSame, but uses f instead of (==) to compare |
Checking for determinism will also find nondeterministic failures: deadlocking (for instance) is still a result of a test!
alwaysTrue p | checks that p is true for every result |
somewhereTrue p | checks that p is true for at least one result |
These can be used to check custom predicates. For example, you might want all your results to be less than five.
gives xs | checks that the set of results is exactly xs (which may include conditions) |
gives' xs | checks that the set of results is exactly xs (which may not include conditions) |
These let you say exactly what you want the results to be. Your test will fail if it has any extra results, or misses a result.
You can check multiple predicates against the same collection of results using
the dejafus and dejafusWithSettings functions. These avoid recomputing the
results, and so may be faster than multiple dejafu / dejafuWithSettings
calls.
Using HUnit and Tasty
By itself, Déjà Fu has no framework in place for named test groups and parallel execution or anything like that. It does one thing and does it well, which is running test cases for concurrent programs. HUnit and tasty integration is provided to get more of the features you'd expect from a testing framework.
The integration is provided by the hunit-dejafu and tasty-dejafu packages.
There's a simple naming convention used: the Test.DejaFu function dejafuFoo
is wrapped in the appropriate way and exposed as testDejafuFoo from
Test.HUnit.DejaFu and Test.Tasty.DejaFu.
Our example from the start becomes:
testDejafu "Assert the thing holds" myPredicate myAction
The autocheck function is exposed as testAuto.
Refinement Testing
Déjà Fu also supports a form of property-testing where you can check things
about the side-effects of stateful operations. For example, we can assert that
readMVar is equivalent to sequencing takeMVar and putMVar like so:
prop_mvar_read_take_put =
sig readMVar `equivalentTo` sig (\v -> takeMVar v >>= putMVar v)
Given the signature function, sig, defined in the next section. If we check
this, our property fails!
> check prop_mvar_read_take_put
*** Failure: (seed Just 0)
left: [(Nothing,Just 0)]
right: [(Nothing,Just 0),(Just Deadlock,Just 0)]
False
This is because readMVar is atomic, whereas sequencing takeMVar with
putMVar is not, and so another thread can interfere with the MVar in the
middle. The check and equivalentTo functions come from
Test.DejaFu.Refinement (also re-exported from Test.DejaFu).
Signatures
A signature tells the property-tester something about the state your operation acts upon, it has a few components:
data Sig s o x = Sig
{ initialise :: x -> ConcIO s
, observe :: s -> x -> ConcIO o
, interfere :: s -> x -> ConcIO ()
, expression :: s -> ConcIO ()
}
-
sis the state type, it's the thing which your operations mutate. ForreadMVar, the state is someMVar a. -
ois the observation type, it's some pure (and comparable) proxy for a snapshot of your mutable state. ForMVar a, the observation is probably aMaybe a. -
xis the seed type, it's some pure value used to construct the initial mutable state. ForMVar a, the seed is probably aMaybe a. -
ConcIOis just one of the instances ofMonadConcthat Déjà Fu defines for testing purposes. Just write code polymorphic in the monad as usual, and all will work.
The initialise, observe, and expression functions should be
self-explanatory, but the interfere one may not be. It's the job of the
interfere function to change the state in some way; it's run concurrently with
the expression, to simulate the nondeterministic action of other threads.
Here's a concrete example for our MVar example:
sig :: (MVar ConcIO Int -> ConcIO a) -> Sig (MVar ConcIO Int) (Maybe Int) (Maybe Int)
sig e = Sig
{ initialise = maybe newEmptyMVar newMVar
, observe = \v _ -> tryTakeMVar v
, interfere = \v s -> tryTakeMVar v >> maybe (pure ()) (\x -> void $ tryPutMVar v (x * 1000)) s
, expression = void . e
}
The observe function should be deterministic, but as it is run after the
normal execution ends, it may have side-effects on the state. The interfere
function can do just about anything (there are probably some concrete rules for
a good function, but I haven't figured them out yet), but a poor one may result
in the property-checker being unable to distinguish between atomic and nonatomic
expressions.
Properties
A property is a pair of signatures linked by one of three provided functions. These functions are:
| Function | Operator | Checks that... |
|---|---|---|
equivalentTo | === | ... the left and right have exactly the same behaviours |
refines | =>= | ... every behaviour of the left is also a behaviour of the right |
strictlyRefines | ->- | ... left =>= right holds but left === right does not |
The signatures can have different state types, as long as the seed and
observation types are the same. This lets you compare different implementations
of the same idea: for example, comparing a concurrent stack implemented using
MVar with one implemented using IORef.
Properties can have parameters, given in the obvious way:
check $ \a b c -> sig1 ... `op` sig2 ...
Under the hood, seed and parameter values are generated using the leancheck
package, an enumerative property-based testing library. This means that any
types you use will need to have a Listable instance.
You can also think about the three functions in terms of sets of results, where
a result is a (Maybe Failure, o) value. A Failure is something like
deadlocking, or being killed by an exception; o is the observation type. An
observation is always made, even if execution of the expression fails.
| Function | Result-set operation |
|---|---|
refines | For all seed and parameter assignments, subset-or-equal |
strictlyRefines | For at least one seed and parameter assignment, proper subset; for all others, subset-or-equal |
equivalentTo | For all seed and parameter assignments, equality |
Finally, there is an expectFailure function, which inverts the expected result
of a property.
The Déjà Fu testsuite has a collection of refinement properties, which may help you get a feel for this sort of testing.
Using HUnit and Tasty
As for unit testing, HUnit and tasty integration is provided for refinement testing in the hunit-dejafu and tasty-dejafu packages.
The testProperty function is used to check properties. Our example from the
start becomes:
testProperty "Read is equivalent to Take then Put" prop_mvar_read_take_put
Advanced Usage
Déjà Fu tries to have a sensible set of defaults, but there are some times when the defaults are not suitable. There are a lot of knobs provided to tweak how things work.
Execution settings
The autocheckWithSettings, dejafuWithSettings, and dejafusWithSettings let
you provide a Settings value, which controls some of Déjà Fu's behaviour:
dejafuWithSettings mySettings "Assert the thing holds" myPredicate myAction
The available settings are:
-
"Way", how to explore the behaviours of the program under test.
-
Length bound, a cut-off point to terminate an execution even if it's not done yet.
-
Memory model, which affects how non-synchronised operations, such as
readIORefandwriteIORefbehave. -
Discarding, which allows throwing away uninteresting results, rather than keeping them around in memory.
-
Early exit, which allows exiting as soon as a result matching a predicate is found.
-
Representative traces, keeping only one execution trace for each distinct result.
-
Trace simplification, rewriting execution traces into a simpler form (particularly effective with the random testing).
-
Safe IO, pruning needless schedules when your IO is only used to manage thread-local state.
See the Test.DejaFu.Settings module for more information.
Performance tuning
-
Are you happy to trade space for time?
Consider computing the results once and running multiple predicates over the output: this is what
dejafus/testDejafus/ etc does. -
Can you sacrifice completeness?
Consider using the random testing functionality. See the
*WithSettingsfunctions. -
Would strictness help?
Consider using the strict functions in
Test.DejaFu.SCT(the ones ending with a'). -
Do you just want the set of results, and don't care about traces?
Consider using
Test.DejaFu.SCT.resultsSet. -
Do you know something about the sort of results you care about?
Consider discarding results you don't care about. See the
*WithSettingsfunctions inTest.DejaFu,Test.DejaFu.SCT, andTest.{HUnit,Tasty}.DejaFu.
For example, let's say you want to know if your test case deadlocks, but you don't care about the execution trace, and you are going to sacrifice completeness because your possible state-space is huge. You could do it like this:
dejafuWithSettings
( set ldiscard
-- "efa" == "either failure a", discard everything but deadlocks
(Just $ \efa -> Just (if either isDeadlock (const False) efa then DiscardTrace else DiscardResultAndTrace))
. set lway
-- try 10000 executions with random scheduling
(randomly (mkStdGen 42) 10000)
$ defaultSettings
)
-- the name of the test
"Never Deadlocks"
-- the predicate to check
deadlocksNever
-- your test case
testCase
1.x to 2.x
dejafu-2.0.0.0 is a super-major release which breaks compatibility with dejafu-1.x.
Highlights reel:
- Test cases are written in terms of a new
Programtype. - The
Failuretype has been replaced with aConditiontype (actually in 1.12). - Random testing takes an optional length bound.
- Atomically-checked invariants over shared mutable state.
See the changelogs for the full details.
The Program type
The ConcT type is now an alias for Program Basic.
A Program Basic has all the instances ConcT did, defined using the ~
instance trick, so this shouldn't be a breaking change:
instance (pty ~ Basic) => MonadTrans (Program pty)
instance (pty ~ Basic) => MonadCatch (Program pty n)
instance (pty ~ Basic) => MonadThrow (Program pty n)
instance (pty ~ Basic) => MonadMask (Program pty n)
instance (pty ~ Basic, Monad n) => MonadConc (Program pty n)
instance (pty ~ Basic, MonadIO n) => MonadIO (Program pty n)
The dontCheck function has been removed in favour of withSetup:
do x <- dontCheck setup
action x
-- becomes
withSetup setup action
The subconcurrency function has been removed in favour of
withSetupAndTeardown:
do x <- setup
y <- subconcurrency (action x)
teardown x y
-- becomes
withSetupAndTeardown setup teardown action
The dontCheck and subconcurrency functions used to throw runtime errors if
nested. This is not possible with withSetup and withSetupAndTeardown due to
their types:
withSetup
:: Program Basic n x
-- ^ Setup action
-> (x -> Program Basic n a)
-- ^ Main program
-> Program (WithSetup x) n a
withSetupAndTeardown
:: Program Basic n x
-- ^ Setup action
-> (x -> Either Condition y -> Program Basic n a)
-- ^ Teardown action
-> (x -> Program Basic n y)
-- ^ Main program
-> Program (WithSetupAndTeardown x y) n a
Previously, multiple calls to subconcurrency could be sequenced in the same
test case. This is not possible using withSetupAndTeardown. If you rely on
this behaviour, please file an issue.
The Condition type
This is a change in dejafu-1.12.0.0, but the alias Failure = Condition is removed in dejafu-2.0.0.0.
- The
STMDeadlockandDeadlockconstructors have been merged. - Internal errors have been split into the
Errortype and are raised as exceptions, instead of being returned as conditions.
The name "failure" has been a recurring source of confusion, because an individual execution can "fail" without the predicate as a whole failing. My hope is that the more neutral "condition" will prevent this confusion.
Deprecated functions
All the deprecated special-purpose functions have been removed. Use more
general *WithSettings functions instead.
Need help?
- For general help talk to me in IRC (barrucadu in #haskell) or shoot me an email (mike@barrucadu.co.uk)
- For bugs, issues, or requests, please file an issue.
0.x to 1.x
dejafu-1.0.0.0 is a super-major release which breaks compatibility with dejafu-0.x quite significantly, but brings with it support for bound threads, and significantly improves memory usage in the general case.
Highlights reel:
- Most predicates now only need to keep around the failures, rather than all results.
- Support for bound threads (with concurrency-1.3.0.0).
- The
ST/IOinterface duplication is gone, everything is now monadic. - Function parameter order is closer to other testing libraries.
- Much improved API documentation.
See the changelogs for the full details.
ST and IO functions
There is only one set of functions now. Testing bound threads requires being
able to fork actual threads, so testing with ST is no longer possible. The
ConcST type is gone, there is only ConcIO.
For dejafu change:
autocheckIOtoautocheckdejafuIOtodejafudejafusIOtodejafusautocheckWayIOtoautocheckWaydejafuWayIOtodejafuWaydejafusWayIOtodejafusWaydejafuDiscardIOtodejafuDiscardrunTestMtorunTestrunTestWayMtorunTestWay
If you relied on being able to get a pure result from the ConcST functions,
you can no longer do this.
For hunit-dejafu and tasty-dejafu change:
testAutoIOtotestAutotestDejafuIOtotestDejafutestDejafusIOtotestDejafustestAutoWayIOtotestAutoWaytestDejafuWayIOtotestDejafuWaytestDejafusWayIOtotestDejafusWaytestDejafuDiscardIOtotestDejafuDiscard
Function parameter order
Like HUnit, the monadic action to test is now the last parameter of the testing functions. This makes it convenient to write tests without needing to define the action elsewhere.
For dejafu change:
dejafu ma (s, p)todejafu s p madejafus ma pstodejafus ps madejafuWay way mem ma (s, p)todejafuWay way mem s p madejafusWay way mem ma pstodejafuWay way mem ps madejafuDiscard d way mem ma (s, p)todejafuDiscard d way mem s p ma
For hunit-dejafu and tasty-dejafu change:
testDejafu ma s ptotestDejafu s p matestDejafus ma pstotestDejafus ps matestDejafuWay way mem ma s ptotestDejafuWay way mem s p matestDejafusWay way mem ma pstotestDejafusWay way mem ps matestDejafuDiscard d way mem ma s ptotestDejafuDiscard d way mem s p ma
Predicates
The Predicate a type is now an alias for ProPredicate a a, defined like so:
data ProPredicate a b = ProPredicate
{ pdiscard :: Either Failure a -> Maybe Discard
-- ^ Selectively discard results before computing the result.
, peval :: [(Either Failure a, Trace)] -> Result b
-- ^ Compute the result with the un-discarded results.
}
If you use the predicate helper functions to construct a predicate, you do not need to change anything (and should get a nice reduction in your resident memory usage). If you supply a function directly, you can recover the old behaviour like so:
old :: ([(Either Failure a, Trace)] -> Result a) -> ProPredicate a a
old p = ProPredicate
{ pdiscard = const Nothing
, peval = p
}
The alwaysTrue2 helper function is gone. If you use it, use alwaysSameOn or
alwaysSameBy instead.
Need help?
- For general help talk to me in IRC (barrucadu in #haskell) or shoot me an email (mike@barrucadu.co.uk)
- For bugs, issues, or requests, please file an issue.
Contributing
Thanks for caring about Déjà Fu!
Ways to contribute
Déjà Fu is a project under active development, there's always something to do. Here's a list of ideas to get you started:
- Submit bug reports.
- Submit feature requests.
- Got a particularly slow test case which you think should be faster? Open an issue for that too.
- Blog about how and why you use Déjà Fu.
- Check if any bugs which have been open for a while are still bugs.
If you want to contribute code, you could:
- Tackle one of the issues tagged "good first issue".
- Tackle a bigger issue, perhaps one of the roadmap issues!
- Run code coverage and try to fix a gap in the tests.
- Profile the test suite and try to improve a slow function.
Roadmap issues are priority issues (in my opinion), so help with those is especially appreciated.
If you have a support question, you can talk to me on IRC (#haskell on freenode) or send an email (mike@barrucadu.co.uk) rather than opening an issue. But maybe your question is a bug report about poor documentation in disguise!
Making the change
-
Talk to me!
I don't bite, and chances are I can quickly tell you where you should start. It's better to ask what seems like a stupid question than to waste a lot of time on the wrong approach.
-
Make the change.
Figure out what needs to be changed, how to change it, and do it. If you're fixing a bug, make sure to add a minimal reproduction to Cases.Regressions in dejafu-tests.
-
Document the change.
All top-level definitions should have a Haddock comment explaining what it does. If you've added or changed a top-level function, consider commenting its arguments too.
If you've added a top-level definition, or changed one in a backwards-incompatible way, add an
@since unreleasedHaddock comment to it. This is to help prevent me from missing things when I update the changelog ahead of a release. -
Submit a PR.
GitHub Actions will run some checks, which might prompt further action. You should expect a response from me in a day or two.
Don't worry about your PR being perfect the first time. We'll work through any issues together, to ensure that Déjà Fu gets the best code it can.
Coding style
There isn't really a prescribed style. It's not quite the wild west though; keep these three rules in mind:
- Be consistent.
- Run stylish-haskell to format import lists.
- Use hlint and weeder and fix lints unless you have a good reason not to.
GitHub Actions runs stylish-haskell and hlint on all PRs.
Coverage
hpc can generate a coverage report from the execution of dejafu-tests:
$ stack build --coverage
$ stack exec dejafu-tests
$ stack hpc report --all dejafu-tests.tix
This will print some stats and generate an HTML coverage report:
Generating combined report
52% expressions used (4052/7693)
48% boolean coverage (63/129)
43% guards (46/106), 31 always True, 9 always False, 20 unevaluated
68% 'if' conditions (11/16), 2 always True, 3 unevaluated
85% qualifiers (6/7), 1 unevaluated
61% alternatives used (392/635)
80% local declarations used (210/261)
26% top-level declarations used (280/1063)
The combined report is available at /home/barrucadu/projects/dejafu/.stack-work/install/x86_64-linux/nightly-2016-06-20/8.0.1/hpc/combined/custom/hpc_index.html
The highlighted code in the HTML report emphasises branch coverage:
- Red means a branch was evaluated as always false.
- Green means a branch was evaluated as always true.
- Yellow means an expression was never evaluated.
See also the stack coverage documentation.
Performance
GHC can generate performance statistics from the execution of dejafu-tests:
$ stack build --profile
$ stack exec -- dejafu-tests +RTS -p
$ less dejafu-tests.prof
This prints a detailed breakdown of where memory and time are being spent:
Mon Mar 20 19:26 2017 Time and Allocation Profiling Report (Final)
dejafu-tests +RTS -p -RTS
total time = 105.94 secs (105938 ticks @ 1000 us, 1 processor)
total alloc = 46,641,766,952 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
findBacktrackSteps.doBacktrack.idxs' Test.DejaFu.SCT.Internal 21.9 12.0
== Test.DejaFu.Common 12.4 0.0
yieldCount.go Test.DejaFu.SCT 12.1 0.0
dependent' Test.DejaFu.SCT 5.1 0.0
runThreads.go Test.DejaFu.Conc.Internal 2.7 4.1
[...]
Be careful, however! Compiling with profiling can significantly affect the behaviour of a program! Use profiling to get an idea of where the hot spots are, but make sure to confirm with a non-profiled build that things are actually getting faster.
If you compile with -rtsopts you can get some basic stats from a non-profiled
build:
$ stack exec -- dejafu-tests +RTS -s
[...]
86,659,658,504 bytes allocated in the heap
13,057,037,448 bytes copied during GC
13,346,952 bytes maximum residency (4743 sample(s))
127,824 bytes maximum slop
37 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 78860 colls, 0 par 32.659s 32.970s 0.0004s 0.0669s
Gen 1 4743 colls, 0 par 3.043s 3.052s 0.0006s 0.0086s
TASKS: 174069 (174065 bound, 4 peak workers (4 total), using -N1)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.001s ( 0.001s elapsed)
MUT time 98.685s (101.611s elapsed)
GC time 35.702s ( 36.022s elapsed)
EXIT time 0.001s ( 0.007s elapsed)
Total time 134.388s (137.640s elapsed)
Alloc rate 878,145,635 bytes per MUT second
Productivity 73.4% of total user, 73.8% of total elapsed
Heap profiling
GHC can tell you where the memory is going:
$ stack build --profile
$ stack exec -- dejafu-tests +RTS -hc
$ hp2ps -c dejafu-tests.hp
This will produce a graph of memory usage over time, as a postscript file, broken down by cost-centre which produced the data. There are a few different views:
-hmbreaks down the graph by module-hdbreaks down the graph by closure description-hybreaks down the graph by type
I typically find -hd and -hy most useful. If you're feeling particularly
brave, you can try -hr, which is intended to help track down memory leaks
caused by unevaluated thunks.
Supported GHC Versions
Déjà Fu supports the latest four GHC releases, at least. For testing purposes, we use Stackage snapshots as a proxy for GHC versions. The currently supported versions are:
| GHC | Stackage | base |
|---|---|---|
| 9.8 | LTS 23.0 | 4.19.2.0 |
| 9.6 | LTS 22.0 | 4.18.1.0 |
| 9.4 | LTS 21.0 | 4.17.0.0 |
| 9.2 | LTS 20.0 | 4.16.0.0 |
| 9.0 | LTS 19.0 | 4.15.0.0 |
| 8.1 | LTS 17.0 | 4.14.1.0 |
| 8.8 | LTS 15.0 | 4.13.0.0 |
| 8.6 | LTS 14.0 | 4.12.0.0 |
| 8.4 | LTS 12.0 | 4.11.0.0 |
| 8.2 | LTS 10.0 | 4.10.1.0 |
In practice, we may compile with older versions of GHC, but keeping them working is not a priority.
Adding new GHC releases
When a new version of GHC is released, we need to make some changes for everything to go smoothly. In general, these changes should only cause a patch level version bump.
- Bump the upper bound of base and set up any needed conditional compilation
- Add the GHC and base versions to the table.
- Remove any unsupported versions from the table.
- Make a patch release.
A new GHC release won't get a Stackage snapshot for little while. When it does:
- Add the snapshot to the GitHub Actions configuration.
- Update the resolver in the stack.yaml.
- Put the snapshot in the table.
Dropping old GHC releases
When we want to drop an unsupported version of GHC, we need to bump the version bound on base to preclude it. This is a backwards-incompatible change which causes a major version bump.
- Remove the dropped GHC version from the GitHub Actions configuration.
- Bump the lower bound of base.
- Look through the other dependencies. Some may not work with our new lower bound on base, so we should bump those too.
- Remove any now-irrelevant conditional compilation (mostly CPP, but there may also be some cabal file bits).
- Make whatever change required the bump.
- Make a major release.
GHC versions shouldn't be dropped just because we can, but here are some good reasons to do it:
- We want to bump the lower bounds of a dependency to a version which doesn't support that GHC.
- We want to add a new dependency which doesn't support that GHC.
- The conditional compilation needed to keep that GHC working is getting confusing.
Release Process
[!WARNING] If it's early in the year, make sure you put down the right year in the CHANGELOG!
-
Figure out what the next version number is. See the PVP_ page if unsure.
-
Update version numbers in the relevant cabal files:
- Update the
versionfield - Update the
tagin thesource-repositoryblock
- Update the
-
Fill in all
@since unreleasedHaddock comments with the relevant version number. -
Update version numbers in the tables in the README page.
-
Ensure the relevant CHANGELOG files have all the entries they should.
-
Add the release information to the relevant CHANGELOG files:
- Change the
unreleasedtitle to the version number - Add the current date
- Add the git tag name
- Add the Hackage URL
- Add the contributors list
- Change the
-
Commit.
-
Push to GitHub and wait for GitHub Actions to confirm everything is OK. If it's not OK, fix what is broken before continuing.
-
Merge the PR.
-
Tag the merge commit. Tags are in the form
<package>-<version>, and the message is the changelog entry. -
Push tags to GitHub.
When the merge commit successfully builds on master the updated packages will
be pushed to Hackage by Concourse.
Pro tips
-
If a release would have a combination of breaking and non-breaking changes, if possible make two releases: the non-breaking ones first, and then a major release with the breaking ones.
This makes it possible for users who don't want the breaking changes to still benefit from the non-breaking improvements.
-
Before uploading to Hackage, check you have no changes to the files (for example, temporarily changing the GHC options, or adding
tracecalls, for debugging reasons).stack uploadwill upload the files on the disk, not the files in version control, so your unwanted changes will be published!