Minimal type-safe static pointers
GHC 7.10 introduced a new language construct called static pointers. The goal is to allow safe and efficient sharing of morally top-level definitions across process boundaries. This is most commonly used for sending functions (which are not serializable on their own in Haskell). In this post we will look at how static pointers are currently unsafe (as of GHC 8.4.2), at a proposed specification to make them safer, and at how we can achieve the same guarantees with the current GHC version.
Static pointers are often mixed up with Cloud Haskell and various high-level libraries that make use of them, but in this post we will concentrate on the minimal API provided by GHC, see how to make it safe, and how powerful features can be built on top of it.
The main use of static pointers is sharing functions, but GHC generalizes the concept to work for arbitrary expressions. The basic idea is simple: we annotate an expression known at compile time with a new keyword, static (enabled by -XStaticPointers). GHC generates a unique name for the expression and adds it to a global table where it can be looked up. This table is called the Static Pointer Table, or SPT for short.
The name of a static pointer is known as its key. Keys are generated using a hash of the static expression, thus making static pointers a form of content-addressable values.
The basic API (from GHC.StaticPtr) looks like this:

    data StaticPtr a  -- abstract

    deRefStaticPtr :: StaticPtr a -> a
    staticKey      :: StaticPtr a -> StaticKey
    staticPtrInfo  :: StaticPtr a -> StaticPtrInfo

Unlike the pointer itself, the StaticKey is serializable, so we can share it with an arbitrary process, which in turn can look it up in its own SPT to obtain a pointer to a local value created from the same static expression. This will only work if both processes have the pointer in their SPT, for example if they were compiled by the same compiler and the source code defining the given static expression was identical (including its dependencies). The easiest way to ensure this is to only send StaticPtrs between processes compiled at the same time.
StaticPtrInfo provides the source file and location of the expression that defined the pointer, and is intended to be used in error messages.
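As a minimal sketch of the API above (the names answer, answerKey, answerValue and describe are made up for illustration), the following creates a static pointer, takes its key, dereferences it, and formats its origin. It assumes the spInfoModuleName and spInfoSrcLoc fields of StaticPtrInfo as exported by GHC.StaticPtr in base.

```haskell
{-# LANGUAGE StaticPointers #-}
import GHC.StaticPtr

-- A static pointer to a closed expression (hypothetical example value).
answer :: StaticPtr Int
answer = static (6 * 7)

-- The key is the serializable name of the pointer.
answerKey :: StaticKey
answerKey = staticKey answer

-- Dereferencing gives back the local value.
answerValue :: Int
answerValue = deRefStaticPtr answer

-- Human-readable origin of a pointer, e.g. for error messages.
describe :: StaticPtr a -> String
describe p =
  let info        = staticPtrInfo p
      (line, col) = spInfoSrcLoc info
  in spInfoModuleName info ++ ":" ++ show line ++ ":" ++ show col
```

The definitions can be loaded into GHCi or called from your own main; only the key would ever need to leave the process.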
Because static pointers are content-addressable by default, the serialization is lookup safe, i.e. a bogus value sent by an attacker cannot lead to the wrong value being injected into the remote process (up to hash collisions). This also minimizes the risk of collision within the set of communicating processes.
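To make "serializable" concrete: in base, StaticKey is a type synonym for Fingerprint, which is a pair of 64-bit words. The sketch below (encodeKey and decodeKey are hypothetical helper names, not part of GHC.StaticPtr) converts a key to a plain tuple and back; a real program would more likely use a binary serialization library on top of this.

```haskell
import Data.Word (Word64)
import GHC.Fingerprint (Fingerprint (..))
import GHC.StaticPtr (StaticKey)

-- StaticKey is a synonym for Fingerprint: two 64-bit words.
encodeKey :: StaticKey -> (Word64, Word64)
encodeKey (Fingerprint hi lo) = (hi, lo)

-- Rebuild a key from its two words; whether the key exists in the
-- receiver's SPT is only known once it is looked up.
decodeKey :: (Word64, Word64) -> StaticKey
decodeKey (hi, lo) = Fingerprint hi lo
```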
Creating and serializing Static Pointers
We can think of the static keyword as a function of type Typeable a => a -> StaticPtr a. Otherwise, the value pointed to can be anything, including functions and IO actions:

    sayHello :: StaticPtr (IO ())
    sayHello = static (putStrLn "Hello, world!")

A static expression must not refer to variables bound outside the expression itself, with the exception of top-level bindings. For example, this is disallowed because x is a local variable:

    sayHelloTo :: String -> StaticPtr (IO ())
    sayHelloTo x = static (putStrLn $ "Hello, world!" ++ x)

The following is fine:

    sayHelloTo :: StaticPtr (String -> IO ())
    sayHelloTo = static (\x -> putStrLn $ "Hello, " ++ x)

In other words, all static expressions can be trivially factored out to the top-level.
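One way to read this: a static pointer only ever names something that could have been a top-level binding, and anything that varies at runtime has to travel as ordinary data next to the key. The sketch below illustrates that reading; greet, greetPtr, Message and mkMessage are hypothetical names, and pairing the key with a String argument is an assumption about how a caller might package a request, not something GHC prescribes.

```haskell
{-# LANGUAGE StaticPointers #-}
import GHC.StaticPtr

-- An ordinary top-level function...
greet :: String -> IO ()
greet name = putStrLn ("Hello, " ++ name)

-- ...and a static pointer that merely names it.
greetPtr :: StaticPtr (String -> IO ())
greetPtr = static greet

-- Runtime arguments are sent alongside the key, as plain data.
type Message = (StaticKey, String)

mkMessage :: String -> Message
mkMessage name = (staticKey greetPtr, name)
```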
As mentioned, the type of the expression must be Typeable. These will all work:

    foo :: StaticPtr Int
    foo = static (2 :: Int)

    idInt :: StaticPtr (Int -> Int)
    idInt = static id

This will fail with No instance for (Typeable a):

    -- Named idP rather than id to avoid clashing with Prelude.id
    idP :: StaticPtr (a -> a)
    idP = static id

We can fix this by adding the constraint to the scope, though note that in order to actually serialize these values (via staticKey), we need to instantiate the type.

    idT :: Typeable a => StaticPtr (a -> a)
    idT = static id

    constT :: (Typeable a, Typeable b) => StaticPtr (b -> a -> b)
    constT = static const

    -- bad1 = staticKey idT                           -- Doesn't typecheck
    good1 = staticKey (idT :: StaticPtr (Bool -> Bool)) -- OK

Because of this restriction, it would seem that sending polymorphic functions is impossible. However, we can get around this with some clever (if ugly) wrapping: each polymorphic function is hidden inside a monomorphic data type.

    -- forall a . a -> a
    data IdFunc where
      IdFunc :: (forall a . a -> a) -> IdFunc
      deriving (Typeable)

    -- forall a b . a -> b -> a
    data ConstFunc where
      ConstFunc :: (forall a b . a -> b -> a) -> ConstFunc
      deriving (Typeable)

    -- forall a . c a => a -> a -> b
    data PredFunc :: (Type -> Constraint) -> Type -> Type where
      PredFunc :: (forall a . c a => a -> a -> b) -> PredFunc c b
      deriving (Typeable)

    id' :: StaticPtr IdFunc
    id' = static (IdFunc id)

    const' :: StaticPtr ConstFunc
    const' = static (ConstFunc const)

    eq' :: StaticPtr (PredFunc Eq Bool)
    eq' = static (PredFunc (==))

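To see why the wrapping is enough, here is a small sketch of the receiving side. It redefines IdFunc as a newtype in ordinary (non-GADT) syntax to stay self-contained; idPtr and useId are hypothetical names. After unwrapping, the function is fully polymorphic again and can be used at several types.

```haskell
{-# LANGUAGE RankNTypes, StaticPointers #-}
import GHC.StaticPtr

-- Monomorphic wrapper around a polymorphic function.
newtype IdFunc = IdFunc (forall a. a -> a)

idPtr :: StaticPtr IdFunc
idPtr = static (IdFunc id)

-- Unwrapping restores the polymorphism: f is used at Int and at String.
useId :: StaticPtr IdFunc -> (Int, String)
useId p = case deRefStaticPtr p of
  IdFunc f -> (f 1, f "one")
```

The wrapper type itself has no type parameters, so it is Typeable without any constraint and its key can be taken directly.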
As we mentioned earlier, GHC static pointers are lookup safe. Unfortunately, the lookup function is currently not type safe. This is because the type representation of a static pointer is not stored in the SPT, so the receiving process has no way of verifying that the type requested by the caller matches the type of the pointer in the table.
In other words, lookups might lead to segfaults and runtime crashes (the lookup function is called unsafeLookupStaticPtr for this reason). A type-safe API has been proposed but not implemented.
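The problem can be sketched with two lookups that differ only in their type signature. The names counter, lookupInt and lookupString are hypothetical. Both functions compile, because the result type of unsafeLookupStaticPtr is chosen freely by the caller; only the first is correct for the key of counter, and the second must never be evaluated on that key.

```haskell
{-# LANGUAGE StaticPointers #-}
import GHC.StaticPtr

counter :: StaticPtr Int
counter = static 41

-- Correct use: the requested type matches the static expression.
lookupInt :: StaticKey -> IO (Maybe Int)
lookupInt key = fmap deRefStaticPtr <$> unsafeLookupStaticPtr key

-- Incorrect use: identical code at a different type. GHC accepts it,
-- but forcing the result for the key of 'counter' would reinterpret
-- an Int as a String, which is undefined behaviour.
lookupString :: StaticKey -> IO (Maybe String)
lookupString key = fmap deRefStaticPtr <$> unsafeLookupStaticPtr key
```

Nothing in the types distinguishes the two functions, which is exactly the gap a type-safe lookup has to close.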
It turns out that we can actually implement the safe API as a library, with the single infelicity that we have to list all of our static expressions manually at the top level of our program. (Side note: the lookup function is also in IO because of dynamic linking. We will ignore this in our implementation. For the real library, the solution proposed here is to make the default function pure and provide an escape hatch for the dynamic loading case.)
First we define an internal type to wrap up a StaticPtr along with its Typeable instance:

    -- Used internally
    data WrappedPtr :: Type where
      WrappedPtr :: Typeable a => StaticPtr a -> WrappedPtr

We can now easily define our own versions of staticPtrKeys and of the lookup function. Note the safe API of lookupStaticPtr: it returns Nothing if the given key does not exist in the table, and otherwise calls the handler with the pointer, where the caller can do their own casting or type detection logic.

    lookupStaticPtr :: StaticKey -> (forall a . Typeable a => StaticPtr a -> b) -> Maybe b
    lookupStaticPtr key k = f <$> Prelude.lookup key __ptrs__
      where
        f (WrappedPtr x) = k x

    staticPtrKeys' :: [StaticKey]
    staticPtrKeys' = fst <$> __ptrs__

The mysterious value __ptrs__ is the new type-safe SPT that we want the compiler to write for us. For now, we can write it manually, like this:

    __ptrs__ :: [(StaticKey, WrappedPtr)]
    __ptrs__ =
      [ wrapStatic sayHello
      , wrapStatic sayHelloTo
      , wrapStatic id'
      , wrapStatic const'
      , wrapStatic eq'
      ]
      where
        wrapStatic x = (staticKey x, WrappedPtr x)

Let’s also define some utility functions, all built on top of the minimal API defined above. Note that lookupStaticPtrAs makes no distinction between a static pointer not existing in the table and existing in the table with the wrong type.

    -- Requires: import Control.Monad (join) and import Data.Typeable (cast)
    derefAs :: (Typeable a, Typeable b) => StaticPtr a -> Maybe b
    derefAs = cast . deRefStaticPtr

    lookupStaticPtrAs :: Typeable a => StaticKey -> Maybe a
    lookupStaticPtrAs k = join $ lookupStaticPtr k derefAs

    lookupStaticPtrInfo :: StaticKey -> Maybe StaticPtrInfo
    lookupStaticPtrInfo k = lookupStaticPtr k staticPtrInfo

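If a caller does need to tell a missing key apart from a type mismatch, the same handler-based design supports it. The following is a self-contained sketch under stated assumptions: LookupError, greeting, table and lookupEither are hypothetical names, and the one-entry table stands in for the hand-written pointer table used in this post.

```haskell
{-# LANGUAGE GADTs, ScopedTypeVariables, StaticPointers #-}
import Data.Proxy (Proxy (..))
import Data.Typeable (TypeRep, Typeable, cast, typeRep)
import GHC.StaticPtr

data WrappedPtr where
  WrappedPtr :: Typeable a => StaticPtr a -> WrappedPtr

data LookupError
  = UnknownKey
  | TypeMismatch TypeRep TypeRep  -- expected type, actual type
  deriving (Eq, Show)

greeting :: StaticPtr String
greeting = static "hello"

-- Hand-written table, standing in for a compiler-generated one.
table :: [(StaticKey, WrappedPtr)]
table = [(staticKey greeting, WrappedPtr greeting)]

lookupEither :: forall b. Typeable b => StaticKey -> Either LookupError b
lookupEither key = case lookup key table of
  Nothing -> Left UnknownKey
  Just (WrappedPtr p) ->
    case cast (deRefStaticPtr p) of
      Just v  -> Right v
      -- StaticPtr a serves as the proxy for the stored type a.
      Nothing -> Left (TypeMismatch (typeRep (Proxy :: Proxy b)) (typeRep p))
```

Reporting both type representations makes the error usable in logs, at the cost of a slightly larger API than the Maybe-based version.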
Finally, let us test our implementation by “sending” values of various complexity. We simulate sending by just converting the values to the key and back.

    test :: IO ()
    test = do
      -- Send "hello"
      let helloKey :: StaticKey = staticKey sayHelloTo
      case lookupStaticPtrAs helloKey of
        Nothing -> pure ()
        Just (hello :: String -> IO ()) -> hello "everybody!"

      -- Send "id"
      let idKey :: StaticKey = staticKey id'
      case lookupStaticPtrAs idKey of
        Nothing -> pure ()
        Just (IdFunc id) -> print $ (id succ) (id 22)

      -- Send "const"
      let constKey :: StaticKey = staticKey const'
      case lookupStaticPtrAs constKey of
        Nothing -> pure ()
        Just (ConstFunc const) -> print $ const False (const ())

      -- Send "(==)"
      let eqKey :: StaticKey = staticKey eq'
      case lookupStaticPtrAs eqKey :: Maybe (PredFunc Eq Bool) of
        Nothing -> pure ()
        Just (PredFunc eq) -> print $ 2 `eq` 2

We have added a safe lookup function to the GHC.StaticPtr API, replacing unsafeLookupStaticPtr. Notably, our implementation is fully type safe. This is possible because it does not make use of unsafeLookupStaticPtr, but uses its own pointer table. Our API is equivalent to the one presented in the aforementioned proposal, except that it keeps the wrapper type hidden, adding only ptrKeys. This also corresponds with SPJ’s argument to keep the trusted code base minimal.
- Cloud Haskell paper (which includes the original StaticPtr design): https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/remote.pdf
- GHC implementation status: https://ghc.haskell.org/trac/ghc/wiki/StaticPointers
- GHC type safe decoding proposal: https://ghc.haskell.org/trac/ghc/wiki/StaticPointers/TypesafeDecoding