Introducing simple-store and simple-cell
simple-store1 provides persistence, atomicity and consistency for a shared data type. simple-cell2 uses simple-store and directed-keys3 to create multiple atomic values with the unique keys. We will review a few concepts before discussing these packages further.
Review
Concurrency control
Concurrency control4 ensures that concurrent operations are safe and fast and solves the following problems:
- lost update
- dirty read
- incorrect summary
We will only discuss one type here, optimistic concurrency control.
Optimistic concurrency control (OCC)
OCC5 assumes that multiple transactions can be made frequently without interfering with each other. OCC does not use locks6. Rather, before committing each transaction, it verifies that no other transaction has modified the data it has read. If the check reveals that modifications have occurred, the committing transaction rolls back and can be restarted. OCC is good for environments with low data contention.
OCC phases
- begin: record a timestamp which marks the beginning of a transaction.
- modify: read db values, tentatively write changes.
- validate: check if other transactions have modified the data that the transaction has read/wrote.
- commit/rollback: if there is no conflict, make all changes take effect, otherwise abort the changes.
Software transactional memory (STM)
STM7 is a type of OCC. A thread modifies shared memory without concern for what other threads may be doing to that memory. STM makes the reader responsible for making sure nothing is operating on the shared memory.
Concurrent Haskell and STM
MVar
MVar t8 is a mutable location that is empty
or contains a value t.
putMVarsets the value in anMVar.readMVargets the value in anMVarand sets it to the value it just took.takeMVargets the value in anMVarand sets the value to empty.
TMVar
TMVar t9 is the STM version of
MVar and is thread safe.
putTMVarsets the value in anTMVar.readTMVargets the value in anTMVarand sets it to the value it just took.takeTMVargets the value in anTMVarand sets the value to empty.
Dependencies
cereal
cereal10 is a package the performs binary
serialization. By declaring an instance of the Serialize
type class, we can perform serialization and deserialization on a type.
It can also use GHC.Generics to automatically declare a
Serialize instance.
directed-keys
directed-keys provides a data type and functions to serialize data to and from Base6411.
DirectedKeyRawis data type that has a unique key, a source url/file path, a destination url/file path and a time. They all need an instance ofHashableandSerialize.DirectedKeyis aByteStringof the data inDirectedKeyRaw.encodeKeyRawis likeDKeyRawconstructor, but it enforces theSerializerestriction on the keys.encodeKeyanddecodeKeyto an from a base64ByteString.parseFilenameanddecodeFilenameescape and unescape the necesssary Unix characters in a file path.
simple-store
The main data type is SimpleStore. You do not need to
manipulate SimpleStore records directly. simple-store
provides a set of functions to save and retrieve data via the
SimpleStore data type and filesystem.
data SimpleStore st =
SimpleStore
{ storeFP :: FilePath
, storeState :: TVar st
, storeLock :: TMVar StoreLock
, storeHandle :: Maybe Handle
}The most important functions are:
makeSimpleStoresave a serializable type to a fileopenSimpleStoreread a deserializable type from a filegetSimpleStoreget the data type value from theSimpleStore.modifySimpleStoreupdateSimpleStore
simple-cell
A SimpleCell takes a function to retrieve keys and a
function to make those keys into filenames. It maintains a key-value
pair of filename to SimpleStore. By convention, simple-cell
uses a type suffixed Store as a newtype wrapped entity of
an entity with DB specific properties (like external keys).
newtype User = User
{ name :: String
}
data UserStore = UserStore
{ userKey :: !Int
, userValue :: !Maybe User
}
deriving (Eq,Generic,Show)The general work flow is:
- Declare CellKey value for the data type we want to store.
- Generate functions with Template Haskell.
- Initialize a
SimpleStore. - Pass the cell around manually or with
ReaderT. - Insert data with
insert<Type>SC. - Perform batch operations with
foldlWithKey<Type>SC - Get stores out the cell with
get<Type>SC.
We need to define three functions for CellKey for
looking up, decoding and encoding a DirectedKeyRaw.
data CellKey k src dst tm st = CellKey
{ getKey :: st -> DirectedKeyRaw k src dst tm
, codeCellKeyFilename :: DirectedKeyRaw k src dst tm -> Text
, decodeCellKeyFilename :: (Text -> Either Text (DirectedKeyRaw k src dst tm))
}Then we use Template Haskell to produce a set of type specific functions.
$(makeStoreCell 'userStoreCellKey 'initUser ''User)makeStoreCell generates the following functions, but you
have to provide the type signature for each of them to help the Template
Haskell.
get<Type>SCupdate<Type>SCcreateCheckpointAndClose<Type>SCtraverseWithKey<Type>SC_insert<Type>SCdelete<Type>SCfoldlWithKey<Type>SCinitialize<Type>SC
We generally do not need to manipulate the SimpleCell
data type directly, but it is helpful to know what it contains.
data SimpleCell k src dst tm stlive stdormant = SimpleCell {
{ cellCore :: !(TCellCore k src dst tm (SimpleStore stlive) stdormant )
, cellKey :: !(CellKey k src dst tm stlive)
, cellParentFP :: !FilePath
, cellRootFP :: !FilePath
} deriving (Typeable,Generic)cellCoreis an in-memory representation of the map of keys andSimpleStores.cellKeyis the key provided by the user.cellParentFPis the file path that the root of the project is in.cellRootFPis the file path that the cell occupies.
Types to remember for simple-cell:
DirectedKeyRawCellKeySimpleCellSimpleStore
For a complete example, take a look at the simple-cell tests.