Minimal type-safe static pointers
GHC 7.10 introduced a new language construct called static pointers. The goal is to allow safe and efficient sharing of morally top-level definitions across process boundaries. This is most commonly used for sending functions, which are not serializable on their own in Haskell. In this post we will look at how static pointers are currently unsafe (as of GHC 8.4.2), examine a proposed specification to make them safer, and show how we can achieve the same guarantees with the current GHC version.
Static pointers are often mixed up with Cloud Haskell and various high-level libraries that make use of them, but in this post we will concentrate on the minimal API provided by GHC, see how to make it safe, and how powerful features can be built on top of it.
The main use of static pointers is sharing functions, but GHC generalizes the concept to work for arbitrary expressions. The basic idea is simple: we annotate an expression known at compile time with a new keyword static (enabled by -XStaticPointers), and GHC generates a unique name for it and adds it to a global table where it can be looked up. This table is called the Static Pointer Table, or SPT for short.
The name of a static pointer is known as its key. Keys are generated using a hash of the static expression, thus making static pointers a form of content-addressable values.
The basic API (from GHC.StaticPtr) looks like this:
data StaticPtr a
deRefStaticPtr :: StaticPtr a -> a
staticKey :: StaticPtr a -> StaticKey
staticPtrInfo :: StaticPtr a -> StaticPtrInfo
Unlike the pointer itself, the StaticKey is serializable, so we can share it with an arbitrary process, which in turn can look it up in its own SPT to obtain a pointer to a local value created from the same static expression. This will only work if both processes have the pointer in their SPT, that is, if they were compiled by the same compiler and the source code defining the given static expression was identical (including its dependencies). The easiest way to ensure this is to only send StaticPtrs between processes compiled at the same time.
StaticPtrInfo provides the source file and location of the expression that defined the pointer, and is intended to be used in error messages.
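As a minimal sketch of the three accessors listed above, the following program creates one static pointer and inspects it. The name answer is an assumption made for illustration and does not come from the post. The printed key and info depend on the compiler and module, so no output is shown for them.

```haskell
{-# LANGUAGE StaticPointers #-}
module Main (main) where

import GHC.StaticPtr (StaticPtr, deRefStaticPtr, staticKey, staticPtrInfo)

-- A static pointer to a closed expression (hypothetical example value).
answer :: StaticPtr Int
answer = static (6 * 7)

main :: IO ()
main = do
  -- Dereferencing gives back the value of the static expression: 42.
  print (deRefStaticPtr answer)
  -- The key is the serializable name of the pointer.
  print (staticKey answer)
  -- The info record describes where the static form was defined.
  print (staticPtrInfo answer)
```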
Because static pointers are content-addressable by default, the serialization is lookup safe, i.e. a bogus value sent by an attacker cannot lead to the wrong value being injected in the remote process (up to hash collisions). This also minimizes the risk of collision within the set of communicating processes.
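To make the serialization step concrete, here is a small sketch of one possible wire encoding. In base, StaticKey is a synonym for Fingerprint, which holds two Word64 values. The helpers encodeKey and decodeKey, and the pointer greeting, are hypothetical names introduced for this example; a real program would likely use a proper serialization library instead.

```haskell
{-# LANGUAGE StaticPointers #-}
module Main (main) where

import Data.Word (Word64)
import GHC.Fingerprint (Fingerprint (..))
import GHC.StaticPtr (StaticKey, StaticPtr, staticKey)

-- A static pointer whose key we want to send (hypothetical example value).
greeting :: StaticPtr String
greeting = static "Hello, world!"

-- Flatten a key into two machine words (assumed wire format).
encodeKey :: StaticKey -> (Word64, Word64)
encodeKey (Fingerprint hi lo) = (hi, lo)

-- Rebuild the key on the receiving side.
decodeKey :: (Word64, Word64) -> StaticKey
decodeKey (hi, lo) = Fingerprint hi lo

main :: IO ()
main = do
  let key  = staticKey greeting
      wire = encodeKey key
  print wire
  -- The round trip preserves the key, so this prints True.
  print (decodeKey wire == key)
```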
Creating and serializing Static Pointers
We can think of the static keyword as a function of type Typeable a => a -> StaticPtr a. Apart from the Typeable constraint, the value pointed to can be anything, including functions and IO actions.
sayHello :: StaticPtr (IO ())
sayHello = static (putStrLn "Hello, world!")
A static expression must not refer to variables outside its own scope (excluding the top-level scope). For example, this is disallowed:
sayHelloTo :: String -> StaticPtr (IO ())
sayHelloTo x = static (putStrLn $ "Hello, world!" ++ x)
The following is fine.
sayHelloTo :: StaticPtr (String -> IO ())
sayHelloTo = static (\x -> putStrLn $ "Hello, " ++ x)
In other words, all static expressions can be trivially factored out to the top-level.
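The sketch below illustrates this factoring under assumed names (greet, greetPtr, and greetWith are not from the post). The static form only mentions a top-level function, and the value that would otherwise have been captured is passed as an ordinary argument after dereferencing.

```haskell
{-# LANGUAGE StaticPointers #-}
module Main (main) where

import GHC.StaticPtr (StaticPtr, deRefStaticPtr)

-- An ordinary top-level function (hypothetical name).
greet :: String -> String
greet name = "Hello, " ++ name

-- A static form may freely mention top-level names such as greet.
greetPtr :: StaticPtr (String -> String)
greetPtr = static greet

-- The would-be captured variable is supplied only after dereferencing.
greetWith :: String -> String
greetWith name = deRefStaticPtr greetPtr name

main :: IO ()
main = putStrLn (greetWith "world")
```

In a distributed setting this suggests sending the key of greetPtr together with the serialized argument, rather than trying to send a closure.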
Polymorphic values
As mentioned, the type of the expression must be Typeable. These will all work:
foo :: StaticPtr Int
foo = static (2 :: Int)
idInt :: StaticPtr (Int -> Int)
idInt = static id
This will fail with No instance for (Typeable a).
idPoly :: StaticPtr (a -> a)
idPoly = static id
We can fix this by adding the constraint to the scope, though note that in order to actually serialize these values (via staticKey), we need to instantiate the type.
idT :: Typeable a => StaticPtr (a -> a)
idT = static id
constT :: (Typeable a, Typeable b) => StaticPtr (b -> a -> b)
constT = static const
-- bad1 = staticKey idT -- Doesn't typecheck
good1 = staticKey (idT :: StaticPtr (Bool -> Bool)) -- OK
Because of this restriction, it would seem that sending polymorphic functions is impossible. However, we can get around this with some clever (if ugly) wrapping: each polymorphic function is hidden inside a monomorphic data type.
-- forall a . a -> a
data IdFunc where
  IdFunc :: (forall a . a -> a) -> IdFunc
  deriving (Typeable)
-- forall a b. a -> b -> a
data ConstFunc where
  ConstFunc :: (forall a b . a -> b -> a) -> ConstFunc
  deriving (Typeable)
-- forall c a . c a => a -> a -> b
data PredFunc :: (Type -> Constraint) -> Type -> Type where
  PredFunc :: (forall a . c a => a -> a -> b) -> PredFunc c b
  deriving (Typeable)
id' :: StaticPtr IdFunc
id' = static (IdFunc id)
const' :: StaticPtr ConstFunc
const' = static (ConstFunc const)
eq' :: StaticPtr (PredFunc Eq Bool)
eq' = static (PredFunc (==))
Type safety
As we mentioned earlier, GHC static pointers are lookup safe. Unfortunately, the lookup function is currently not type safe. This is because the type representation of a static pointer is not stored in the SPT, so the receiving process has no way of verifying that the type requested by the caller matches the type of the pointer in the table.
In other words, lookups might lead to segfaults and runtime crashes (the lookup function is called unsafeLookupStaticPtr for this reason). A type-safe API has been proposed but not implemented.
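The following sketch shows where the unsafety enters. The pointer double and the helper lookupDouble are hypothetical names chosen for this example. The result type of unsafeLookupStaticPtr is picked by the caller, and nothing compares it with the entry stored in the table. The program below only asks for the correct type; requesting any other type would compile just as well, which is exactly the problem.

```haskell
{-# LANGUAGE StaticPointers #-}
module Main (main) where

import GHC.StaticPtr
  ( StaticKey
  , StaticPtr
  , deRefStaticPtr
  , staticKey
  , unsafeLookupStaticPtr
  )

-- A static pointer to a function (hypothetical example value).
double :: StaticPtr (Int -> Int)
double = static (* 2)

-- The caller fixes the result type here; GHC does not check it against the SPT entry.
lookupDouble :: StaticKey -> IO (Maybe (StaticPtr (Int -> Int)))
lookupDouble = unsafeLookupStaticPtr

main :: IO ()
main = do
  found <- lookupDouble (staticKey double)
  case found of
    Nothing  -> putStrLn "key not found in the SPT"
    Just ptr -> print (deRefStaticPtr ptr 21)
```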
It turns out that we can actually implement the safe API as a library, with the single infelicity that we have to list all of our static expressions manually at the top level of our program. (Side note: the lookup function is also in IO because of dynamic linking. We will ignore this in our implementation. For the real library, the solution proposed here is to make the default function pure and provide an escape hatch for the dynamic loading case.)
First we define an internal type to wrap up a StaticPtr along with its Typeable context.
-- Used internally
data WrappedPtr :: Type where
  WrappedPtr :: Typeable a => StaticPtr a -> WrappedPtr
We can now easily define our own version of ...lookup and staticPtrKeys. Note the safe API for lookupStaticPtr: it returns Nothing if the given key does not exist in the SPT, and otherwise calls the handler, where the caller can do their own casting or type detection logic.
lookupStaticPtr
  :: StaticKey
  -> (forall a . Typeable a => StaticPtr a -> b)
  -> Maybe b
lookupStaticPtr key k = f <$> Prelude.lookup key __ptrs__ where
  f (WrappedPtr x) = k x
staticPtrKeys' :: [StaticKey]
staticPtrKeys' = fst <$> __ptrs__
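As an illustration of the kind of type detection a handler can perform, the sketch below reports the type of the value behind a key. It is self-contained, so it repeats a cut-down version of the wrapper and lookup with a one-entry table. The names table and describeKey are assumptions made for this example, and the lookup is written with a case expression rather than the where clause used above.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE StaticPointers #-}
module Main (main) where

import Data.Typeable (Typeable, typeRep)
import GHC.StaticPtr (StaticKey, StaticPtr, staticKey)

-- A static pointer paired with evidence that its pointee is Typeable.
data WrappedPtr where
  WrappedPtr :: Typeable a => StaticPtr a -> WrappedPtr

sayHello :: StaticPtr (IO ())
sayHello = static (putStrLn "Hello, world!")

-- Hand-written stand-in for the type-safe pointer table (hypothetical name).
table :: [(StaticKey, WrappedPtr)]
table = [(staticKey sayHello, WrappedPtr sayHello)]

lookupStaticPtr
  :: StaticKey
  -> (forall a. Typeable a => StaticPtr a -> b)
  -> Maybe b
lookupStaticPtr key k = case lookup key table of
  Nothing             -> Nothing
  Just (WrappedPtr p) -> Just (k p)

-- Handler that inspects the pointee's type; StaticPtr a serves as the proxy for a.
describeKey :: StaticKey -> Maybe String
describeKey key = lookupStaticPtr key (\ptr -> show (typeRep ptr))

main :: IO ()
main = print (describeKey (staticKey sayHello))
```

Because the handler is polymorphic in a, it cannot assume anything about the pointee beyond the Typeable instance, which is what keeps the lookup type safe.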
The mysterious value __ptrs__ is the new type-safe SPT that we want the compiler to write for us. For now, we can write it manually, like this:
__ptrs__ :: [(StaticKey, WrappedPtr)]
__ptrs__ =
  [ wrapStatic sayHello
  , wrapStatic sayHelloTo
  , wrapStatic id'
  , wrapStatic const'
  , wrapStatic eq'
  ]
  where
    wrapStatic x = (staticKey x, WrappedPtr x)
Let’s also define some utility functions. These are built on top of the minimal API defined above. Note that lookupStaticPtrAs makes no distinction between a static pointer not existing in the table and existing in the table with the wrong type.
derefAs :: (Typeable a, Typeable b) => StaticPtr a -> Maybe b
derefAs = cast . deRefStaticPtr
lookupStaticPtrAs :: Typeable a => StaticKey -> Maybe a
lookupStaticPtrAs k = join $ lookupStaticPtr k derefAs
lookupStaticPtrInfo :: StaticKey -> Maybe StaticPtrInfo
lookupStaticPtrInfo k = lookupStaticPtr k staticPtrInfo
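The sketch below exercises both failure modes of lookupStaticPtrAs against a one-entry table. It is self-contained and therefore repeats the definitions above in a cut-down form. The names answer and table, and the use of Fingerprint 0 0 as a key that is assumed to be absent from the table, are choices made for this example only.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE StaticPointers #-}
module Main (main) where

import Control.Monad (join)
import Data.Typeable (Typeable, cast)
import GHC.Fingerprint (Fingerprint (..))
import GHC.StaticPtr (StaticKey, StaticPtr, deRefStaticPtr, staticKey)

data WrappedPtr where
  WrappedPtr :: Typeable a => StaticPtr a -> WrappedPtr

answer :: StaticPtr Int
answer = static 42

-- Hand-written stand-in for the type-safe pointer table (hypothetical name).
table :: [(StaticKey, WrappedPtr)]
table = [(staticKey answer, WrappedPtr answer)]

lookupStaticPtr
  :: StaticKey
  -> (forall a. Typeable a => StaticPtr a -> b)
  -> Maybe b
lookupStaticPtr key k = case lookup key table of
  Nothing             -> Nothing
  Just (WrappedPtr p) -> Just (k p)

-- Dereference and cast; Nothing if the requested type differs from the stored one.
derefAs :: (Typeable a, Typeable b) => StaticPtr a -> Maybe b
derefAs = cast . deRefStaticPtr

lookupStaticPtrAs :: Typeable a => StaticKey -> Maybe a
lookupStaticPtrAs key = join (lookupStaticPtr key derefAs)

main :: IO ()
main = do
  let key = staticKey answer
  print (lookupStaticPtrAs key :: Maybe Int)               -- Just 42
  print (lookupStaticPtrAs key :: Maybe String)            -- Nothing: wrong type
  print (lookupStaticPtrAs (Fingerprint 0 0) :: Maybe Int) -- Nothing: unknown key
```

A caller that needs to tell the two failures apart can use lookupStaticPtr directly and keep the outer and inner Maybe separate instead of joining them.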
Finally, let us test our implementation by “sending” values of various complexity. We simulate sending by just converting the values to the key and back.
test = do
  -- Send "hello"
  let helloKey :: StaticKey = staticKey sayHelloTo
  case lookupStaticPtrAs helloKey of
    Nothing -> pure ()
    Just (hello :: String -> IO ()) -> hello "everybody!"
  -- Send "id"
  let idKey :: StaticKey = staticKey id'
  case lookupStaticPtrAs idKey of
    Nothing -> pure ()
    Just (IdFunc id) -> print $ (id succ) (id 22)
  -- Send "const"
  let constKey :: StaticKey = staticKey const'
  case lookupStaticPtrAs constKey of
    Nothing -> pure ()
    Just (ConstFunc const) -> print $ const False (const [] ())
  -- Send "(==)"
  let eqKey :: StaticKey = staticKey eq'
  case lookupStaticPtrAs eqKey :: Maybe (PredFunc Eq Bool) of
    Nothing -> pure ()
    Just (PredFunc eq) -> print $ 2 `eq` 2
Conclusion
We have added a safe lookup function to the GHC.StaticPtr API, replacing unsafeLookupStaticPtr. Notably, our implementation is fully type safe. This is possible because it does not make use of unsafeLookupStaticPtr, but uses its own pointer table. Our API is equivalent to the one presented in the aforementioned proposal, except that it keeps the wrapper type hidden, adding only lookupStaticPtr and staticPtrKeys'. This also corresponds with SPJ’s [argument to keep the trusted code-base minimal][spj].
Further reading
- Cloud Haskell paper (which includes the original StaticPtr design): https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/remote.pdf
- GHC implementation status: https://ghc.haskell.org/trac/ghc/wiki/StaticPointers
- GHC type safe decoding proposal: https://ghc.haskell.org/trac/ghc/wiki/StaticPointers/TypesafeDecoding
[spj]: https://ghc.haskell.org/trac/ghc/blog/simonpj/StaticPointers