Introducing simple-store and simple-cell

June 23, 2017
haskelldb

simple-store [1] provides persistence, atomicity and consistency for a shared data type. simple-cell [2] uses simple-store and directed-keys [3] to create multiple atomic values with unique keys. We will review a few concepts before discussing these packages further.

Review

Concurrency control

Concurrency control [4] ensures that concurrent operations are safe and fast, and solves the following problems:

We will only discuss one type here, optimistic concurrency control.

Optimistic concurrency control (OCC)

OCC [5] assumes that multiple transactions can frequently complete without interfering with each other. OCC does not use locks [6]. Rather, before committing, each transaction verifies that no other transaction has modified the data it has read. If the check reveals that modifications have occurred, the committing transaction rolls back and can be restarted. OCC is a good fit for environments with low data contention.
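The read, validate, commit and retry cycle described above can be sketched with a version counter. This is only an illustration under stated assumptions: the names Versioned, tryCommit and optimisticUpdate are hypothetical and are not part of simple-store or simple-cell, and a single IORef stands in for the shared data.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical shared value tagged with a version number.
data Versioned a = Versioned { version :: Int, payload :: a }

-- Validate and commit in one atomic step: the write succeeds only if
-- the version is still the one the transaction originally read.
tryCommit :: IORef (Versioned a) -> Int -> a -> IO Bool
tryCommit ref readVersion newValue =
  atomicModifyIORef' ref $ \cur ->
    if version cur == readVersion
      then (Versioned (readVersion + 1) newValue, True)
      else (cur, False)

-- Read, compute, then try to commit; on conflict, start over.
optimisticUpdate :: IORef (Versioned a) -> (a -> a) -> IO ()
optimisticUpdate ref f = do
  Versioned v x <- readIORef ref
  ok <- tryCommit ref v (f x)
  if ok then pure () else optimisticUpdate ref f

-- Simulate a conflict: A reads, B commits first, A's stale commit fails.
demoOcc :: IO (Bool, Int, Int)
demoOcc = do
  ref <- newIORef (Versioned 0 (10 :: Int))
  Versioned v0 x0 <- readIORef ref          -- transaction A reads
  _ <- tryCommit ref v0 (x0 + 1)            -- transaction B commits first
  staleOk <- tryCommit ref v0 (x0 * 2)      -- A validates against a stale version
  optimisticUpdate ref (* 2)                -- A restarts and succeeds
  Versioned v1 x1 <- readIORef ref
  pure (staleOk, v1, x1)

main :: IO ()
main = demoOcc >>= print
```

No lock is held while a transaction computes its new value; the only synchronized step is the short validate-and-commit, which is why the approach suits low-contention workloads.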

OCC phases

Software transactional memory (STM)

STM [7] is a type of OCC. A thread modifies shared memory without concern for what other threads may be doing to that memory. Instead, STM makes the reader responsible: after completing a transaction, it verifies that no other thread has concurrently changed the memory it accessed, and retries if one has.
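As a small illustration, here is a sketch using the stm package that ships with GHC. The helper names increment and runCounter are made up for this example; several threads update one TVar, and each update runs as its own transaction.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (TVar, atomically, modifyTVar', newTVarIO, readTVarIO)
import Control.Monad (forM_, replicateM_)

-- Add one to the shared counter inside a single transaction.
increment :: TVar Int -> IO ()
increment counter = atomically (modifyTVar' counter (+ 1))

-- Fork `workers` threads that each increment `perWorker` times,
-- wait for all of them, then return the final count.
runCounter :: Int -> Int -> IO Int
runCounter workers perWorker = do
  counter <- newTVarIO 0
  done <- newEmptyMVar
  forM_ [1 .. workers] $ \_ ->
    forkIO $ do
      replicateM_ perWorker (increment counter)
      putMVar done ()
  replicateM_ workers (takeMVar done)
  readTVarIO counter

main :: IO ()
main = runCounter 10 1000 >>= print
```

No thread takes an explicit lock; if two transactions touch the counter at the same time, the runtime reruns the one whose view became stale, so no increment is lost.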

Concurrent Haskell and STM

MVar

MVar t [8] is a mutable location that is either empty or contains a value of type t.
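A minimal sketch of the two states, using only Control.Concurrent.MVar from base (demoMVar is a name chosen for this example):

```haskell
import Control.Concurrent.MVar
  (MVar, isEmptyMVar, modifyMVar_, newEmptyMVar, putMVar, takeMVar)

-- Walk an MVar through its empty and full states and report what we saw.
demoMVar :: IO (Bool, Int, Bool)
demoMVar = do
  box <- newEmptyMVar :: IO (MVar Int)
  emptyAtStart <- isEmptyMVar box     -- a fresh MVar holds nothing
  putMVar box 41                      -- fill it (would block if already full)
  modifyMVar_ box (pure . (+ 1))      -- take, apply the function, put back
  value <- takeMVar box               -- empty it again (would block if empty)
  emptyAtEnd <- isEmptyMVar box
  pure (emptyAtStart, value, emptyAtEnd)

main :: IO ()
main = demoMVar >>= print
```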

TMVar

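TMVar t [9] is the STM version of MVar. It is thread safe, and because it is read and written inside STM transactions, it can be combined atomically with other transactional variables.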
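The sketch below uses a TMVar () as a lock token and combines it with a TVar in one transaction. The helper names are hypothetical, and this is not how simple-store itself uses its lock; it only shows the composability that MVar lacks.

```haskell
import Control.Concurrent.STM
  ( TMVar, TVar, atomically, modifyTVar', newTMVarIO, newTVarIO
  , putTMVar, readTVarIO, takeTMVar, tryTakeTMVar )

-- Take the token and bump a counter in ONE transaction:
-- either both effects happen or neither does.
acquireAndCount :: TMVar () -> TVar Int -> IO ()
acquireAndCount lock counter = atomically $ do
  takeTMVar lock
  modifyTVar' counter (+ 1)

-- Put the token back so another thread can acquire it.
release :: TMVar () -> IO ()
release lock = atomically (putTMVar lock ())

demoTMVar :: IO (Int, Bool)
demoTMVar = do
  lock <- newTMVarIO ()
  counter <- newTVarIO 0
  acquireAndCount lock counter
  second <- atomically (tryTakeTMVar lock)   -- token is already held
  release lock
  n <- readTVarIO counter
  pure (n, second == Nothing)

main :: IO ()
main = demoTMVar >>= print
```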

Dependencies

cereal

cereal [10] is a package that performs binary serialization. By declaring an instance of the Serialize type class, we can serialize and deserialize values of a type. cereal can also use GHC.Generics to derive a Serialize instance automatically.
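A short sketch of the generic route, assuming the cereal package is installed. The record below is made up for illustration, and its age field is an assumption that does not appear elsewhere in this post.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Serialize (Serialize, decode, encode)
import GHC.Generics (Generic)

-- Illustrative record; deriving Generic is what enables the empty instance.
data User = User
  { name :: String
  , age  :: Int
  } deriving (Eq, Generic, Show)

-- No methods needed: cereal fills in put and get through GHC.Generics.
instance Serialize User

-- Encode to a strict ByteString and decode it again.
roundTrip :: User -> Either String User
roundTrip = decode . encode

main :: IO ()
main = print (roundTrip (User "alice" 30))
```

decode returns Either String, so a corrupted or truncated file shows up as a Left value rather than an exception.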

directed-keys

directed-keys provides a data type and functions to serialize data to and from Base64 [11].

simple-store

The main data type is SimpleStore. You do not need to manipulate SimpleStore records directly. simple-store provides a set of functions to save and retrieve data via the SimpleStore data type and filesystem.

data SimpleStore st = 
  SimpleStore
    { storeFP     :: FilePath
    , storeState  :: TVar st
    , storeLock   :: TMVar StoreLock
    , storeHandle :: Maybe Handle
    }
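To see how fields like these can cooperate, here is a toy store built only on base, stm and directory. It is a sketch under loud assumptions: ToyStore and its functions are hypothetical, they are not the simple-store API, and show/read stands in for real binary serialization. The state lives in a TVar, and a TMVar token ensures only one checkpoint writes the file at a time.

```haskell
import Control.Concurrent.STM
  ( TMVar, TVar, atomically, modifyTVar', newTMVarIO, newTVarIO
  , putTMVar, readTVarIO, takeTMVar )
import Control.Exception (bracket_)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO (hClose, openTempFile)

-- Hypothetical, simplified analogue of the record above.
data ToyStore st = ToyStore
  { toyFP    :: FilePath   -- where checkpoints are written
  , toyState :: TVar st    -- in-memory state
  , toyLock  :: TMVar ()   -- token held while writing to disk
  }

newToyStore :: FilePath -> st -> IO (ToyStore st)
newToyStore fp st = ToyStore fp <$> newTVarIO st <*> newTMVarIO ()

-- Update the in-memory state atomically; nothing touches the disk here.
modifyToy :: ToyStore st -> (st -> st) -> IO ()
modifyToy store f = atomically (modifyTVar' (toyState store) f)

-- Take the token, write the current state, and always return the token.
checkpointToy :: Show st => ToyStore st -> IO ()
checkpointToy store =
  bracket_
    (atomically (takeTMVar (toyLock store)))
    (atomically (putTMVar (toyLock store) ()))
    (readTVarIO (toyState store) >>= writeFile (toyFP store) . show)

demoToyStore :: IO Int
demoToyStore = do
  tmp <- getTemporaryDirectory
  (fp, h) <- openTempFile tmp "toy-store.txt"
  hClose h
  store <- newToyStore fp (0 :: Int)
  modifyToy store (+ 5)
  checkpointToy store
  saved <- readFile fp
  let n = read saved :: Int
  n `seq` removeFile fp    -- force the read before deleting the file
  pure n

main :: IO ()
main = demoToyStore >>= print
```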

The most important functions are:

simple-cell

A SimpleCell takes a function to retrieve keys and a function to turn those keys into filenames. It maintains a mapping from filename to SimpleStore. By convention, simple-cell uses a type with the suffix Store to wrap an entity together with its DB-specific properties (like external keys).

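{-# LANGUAGE DeriveGeneric #-}

import GHC.Generics (Generic)

newtype User = User
  { name :: String
  }
  deriving (Eq,Generic,Show)

-- Store wrapper: the entity plus its DB-specific key.
data UserStore = UserStore
  { userKey   :: !Int
  , userValue :: !(Maybe User)
  }
  deriving (Eq,Generic,Show)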

The general work flow is:

We need to define three functions for a CellKey: one to look up the DirectedKeyRaw of a stored value, one to encode that key as a filename, and one to decode a filename back into a key.

data CellKey k src dst tm st = CellKey 
  { getKey                :: st -> DirectedKeyRaw k src dst tm
  , codeCellKeyFilename   :: DirectedKeyRaw k src dst tm -> Text
  , decodeCellKeyFilename :: (Text -> Either Text (DirectedKeyRaw k src dst tm)) 
  }
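The shape of these three functions can be sketched without the real libraries. Everything below is a stand-in: ToyKey replaces DirectedKeyRaw, ToyCellKey replaces CellKey, and the "user-" filename scheme and userName field are invented for the example. The real packages encode keys differently.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import Text.Read (readMaybe)

-- Hypothetical stand-in for DirectedKeyRaw.
newtype ToyKey = ToyKey Int deriving (Eq, Show)

-- Hypothetical stand-in for CellKey, with the same three roles.
data ToyCellKey st = ToyCellKey
  { toyGetKey         :: st -> ToyKey
  , toyEncodeFilename :: ToyKey -> Text
  , toyDecodeFilename :: Text -> Either Text ToyKey
  }

data UserStore = UserStore
  { userKey  :: Int
  , userName :: String
  }

userCellKey :: ToyCellKey UserStore
userCellKey = ToyCellKey
  { toyGetKey = ToyKey . userKey
  , toyEncodeFilename = \(ToyKey n) -> T.pack ("user-" ++ show n)
  , toyDecodeFilename = \t ->
      -- Strip the prefix, then parse the remainder as an Int.
      case T.stripPrefix "user-" t >>= readMaybe . T.unpack of
        Just n  -> Right (ToyKey n)
        Nothing -> Left ("not a user filename: " <> t)
  }

-- Look up a value's key, encode it as a filename, and decode it back.
roundTripKey :: UserStore -> Either Text ToyKey
roundTripKey u =
  toyDecodeFilename userCellKey
    (toyEncodeFilename userCellKey (toyGetKey userCellKey u))

main :: IO ()
main = do
  print (roundTripKey (UserStore 7 "alice"))
  print (toyDecodeFilename userCellKey "garbage")
```

The property worth checking for any real CellKey is the same one this toy has: decoding an encoded key gives back the original key.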

Then we use Template Haskell to produce a set of type specific functions.

$(makeStoreCell 'userStoreCellKey 'initUser ''User)

makeStoreCell generates the following functions, but you must provide a type signature for each of them to help Template Haskell.

We generally do not need to manipulate the SimpleCell data type directly, but it is helpful to know what it contains.

data SimpleCell k src dst tm stlive stdormant = SimpleCell
  { cellCore     :: !(TCellCore k src dst tm (SimpleStore stlive) stdormant)
  , cellKey      :: !(CellKey k src dst tm stlive)
  , cellParentFP :: !FilePath
  , cellRootFP   :: !FilePath
  } deriving (Typeable,Generic)

Types to remember for simple-cell:

For a complete example, take a look at the simple-cell tests.

References