Minimal type-safe static pointers

GHC 7.10 introduced a new language construct called static pointers. The goal is to allow safe and efficient sharing of morally top-level definitions across process boundaries. This is most commonly used for sending functions, which are not serializable on their own in Haskell. In this post we will look at how static pointers are currently unsafe (as of GHC 8.4.2), look at a proposed specification to make them safer, and show how we can achieve the same guarantees with the current GHC version.

Static pointers are often mixed up with Cloud Haskell and various high-level libraries that make use of them, but in this post we will concentrate on the minimal API provided by GHC, see how to make it safe, and how powerful features can be built on top of it.

The main use of static pointers is sharing functions, but GHC generalizes the concept to work for arbitrary expressions. The basic idea is simple: we annotate an expression known at compile time with the new keyword static (enabled by -XStaticPointers), and GHC generates a unique name for it and adds it to a global table where it can be looked up. This table is called the Static Pointer Table, or SPT for short.

The name of a static pointer is known as its key. Keys are generated using a hash of the static expression, thus making static pointers a form of content-addressable values.

The basic API (from GHC.StaticPtr) looks like this:

data StaticPtr a
deRefStaticPtr  :: StaticPtr a -> a
staticKey       :: StaticPtr a -> StaticKey
staticPtrInfo   :: StaticPtr a -> StaticPtrInfo

Unlike the pointer itself, the StaticKey is serializable, so we can share it with an arbitrary process, which in turn can look it up in its own SPT to obtain a pointer to a local value created from the same static expression. This will only work if both processes have the pointer in their SPT, i.e. if they were compiled by the same compiler, and the source code defining the given static expression was identical (including its dependencies). The easiest way to ensure this is to only send StaticPtrs between processes compiled at the same time.
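As a rough sketch of what "serializable" means here: in current versions of base, StaticKey is a synonym for Fingerprint, which is just a pair of 64-bit words. The names shout, encodeKey and decodeKey below are made up for illustration, and the pair of words is only a hypothetical wire format, not something GHC prescribes.

```haskell
{-# LANGUAGE StaticPointers #-}

import Data.Word (Word64)
import GHC.Fingerprint (Fingerprint (..))
import GHC.StaticPtr

-- A static pointer to a closed function (hypothetical example value).
shout :: StaticPtr (String -> String)
shout = static (\s -> s ++ "!")

-- Hypothetical wire format: the two 64-bit halves of the key.
encodeKey :: StaticKey -> (Word64, Word64)
encodeKey (Fingerprint hi lo) = (hi, lo)

-- Rebuild the key on the receiving side.
decodeKey :: (Word64, Word64) -> StaticKey
decodeKey (hi, lo) = Fingerprint hi lo
```

A real program would more likely go through a binary serialization library, but the round trip decodeKey . encodeKey is all that is needed to move a key between processes.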

StaticPtrInfo provides the source file and location of the expression that defined the pointer, intended to be used in error messages.
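For instance, a helper along these lines could turn a pointer into a short location string for an error message. This is only a sketch: describePtr and sample are hypothetical names, and we use only the spInfoModuleName and spInfoSrcLoc fields of StaticPtrInfo.

```haskell
{-# LANGUAGE StaticPointers #-}

import GHC.StaticPtr

-- Render "Module:line:column" for the static form behind a pointer.
describePtr :: StaticPtr a -> String
describePtr p =
  let info        = staticPtrInfo p
      (line, col) = spInfoSrcLoc info
  in  spInfoModuleName info ++ ":" ++ show line ++ ":" ++ show col

-- Hypothetical pointer to try the helper on.
sample :: StaticPtr Bool
sample = static True
```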

Because static pointers are content-addressable by default, serialization is lookup safe: a bogus key sent by an attacker cannot cause the wrong value to be injected into the remote process (up to hash collisions). This also minimizes the risk of key collisions within the set of communicating processes.

 Creating and serializing Static Pointers

We can think of the static keyword as a function of type Typeable a => a -> StaticPtr a. Beyond the Typeable constraint, the value pointed to can be anything, including functions and IO actions.

sayHello :: StaticPtr (IO ())
sayHello = static (putStrLn "Hello, world!")

A static expression must not refer to variables outside its own scope (excluding the top-level scope). For example, this is disallowed:

sayHelloTo :: String -> StaticPtr (IO ())
sayHelloTo x = static (putStrLn $ "Hello, world!" ++ x)

The following is fine.

sayHelloTo :: StaticPtr (String -> IO ())
sayHelloTo = static (\x -> putStrLn $ "Hello, " ++ x)

In other words, all static expressions can be trivially factored out to the top-level.
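To illustrate, here is a sketch of the factored-out form, together with one way to handle the disallowed case from above: the run-time argument travels next to the pointer instead of inside the static expression. All names here (greeting, greetingPtr, greetingFor, runGreeting) are hypothetical, and we use a pure function rather than an IO action to keep the example easy to check.

```haskell
{-# LANGUAGE StaticPointers #-}

import GHC.StaticPtr

-- The body of the static form, written as an ordinary top-level function.
greeting :: String -> String
greeting x = "Hello, " ++ x

-- Pointing at the top-level name is equivalent to wrapping the lambda.
greetingPtr :: StaticPtr (String -> String)
greetingPtr = static greeting

-- The argument is not captured; it is paired with the pointer.
greetingFor :: String -> (StaticPtr (String -> String), String)
greetingFor name = (greetingPtr, name)

-- The receiver dereferences the pointer and applies it to the argument.
runGreeting :: (StaticPtr (String -> String), String) -> String
runGreeting (ptr, arg) = deRefStaticPtr ptr arg
```

Here runGreeting (greetingFor "world") evaluates to "Hello, world".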

 Polymorphic values

As mentioned, the type of the expression must be Typeable. These will all work:

foo :: StaticPtr Int
foo = static (2 :: Int)

idInt :: StaticPtr (Int -> Int)
idInt = static id

This will fail with No instance for (Typeable a):

idPoly :: StaticPtr (a -> a)
idPoly = static id

We can fix this by adding the constraint to the scope, though note that in order to actually serialize these values (via staticKey), we need to instantiate the type.

idT :: Typeable a => StaticPtr (a -> a)
idT = static id

constT :: (Typeable a, Typeable b) => StaticPtr (b -> a -> b)
constT = static const

-- bad1 = staticKey idT                             -- Doesn't typecheck
good1 = staticKey (idT :: StaticPtr (Bool -> Bool)) -- OK

Because of this restriction, it would seem that sending polymorphic functions is impossible. However, we can get around this with some clever (if ugly) wrapping.

-- forall a . a -> a
data IdFunc where
  IdFunc :: (forall a . a -> a) -> IdFunc 
  deriving (Typeable)

-- forall a b. a -> b -> a
data ConstFunc where
  ConstFunc :: (forall a b . a -> b -> a) -> ConstFunc 
  deriving (Typeable)

-- forall c a . c a => a -> a -> b
data PredFunc :: (Type -> Constraint) -> Type -> Type where
  PredFunc :: (forall a . c a => a -> a -> b) -> PredFunc c b
  deriving (Typeable)

id' :: StaticPtr IdFunc 
id' = static (IdFunc id)

const' :: StaticPtr ConstFunc 
const' = static (ConstFunc const)

eq' :: StaticPtr (PredFunc Eq Bool)
eq' = static (PredFunc (==))

 Type safety

As we mentioned earlier, GHC static pointers are lookup safe. Unfortunately, the lookup function is currently not type safe. This is because the type representation of a static pointer is not stored in the SPT, so the receiving process has no way of verifying that the type requested by the caller matches the type of the pointer in the table.

In other words, lookups might lead to segfaults and runtime crashes (the lookup function is called unsafeLookupStaticPtr for this reason). A type-safe API has been proposed but not implemented.
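To make the problem concrete, here is a small sketch of a lookup through the unsafe function. The names counter and lookupAsInt are made up for illustration. The point is that the result type is picked by the caller alone: nothing in the lookup compares it with the type of the expression that was actually stored.

```haskell
{-# LANGUAGE StaticPointers #-}

import GHC.StaticPtr

-- A static pointer whose real type is Int (hypothetical example value).
counter :: StaticPtr Int
counter = static (7 :: Int)

-- Correct use: we happen to ask for the type the pointer really has.
lookupAsInt :: StaticKey -> IO (Maybe Int)
lookupAsInt key = fmap deRefStaticPtr <$> unsafeLookupStaticPtr key

-- The following would typecheck just as well, but using its result for
-- counter's key is undefined behaviour, so it is left commented out:
--
-- lookupAsString :: StaticKey -> IO (Maybe String)
-- lookupAsString key = fmap deRefStaticPtr <$> unsafeLookupStaticPtr key
```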

It turns out that we can actually implement the safe API as a library, with the single infelicity that we have to list all of our static expressions manually at the top level of our program. (Side note: the lookup function is also in IO because of dynamic linking. We will ignore this in our implementation. For the real library the solution proposed here is to make the default function pure and provide an escape hatch for the dynamic loading case.)

First we define an internal type to wrap up a StaticPtr along with its Typeable context.

-- Used internally
data WrappedPtr :: Type where
  WrappedPtr :: Typeable a => StaticPtr a -> WrappedPtr

We can now easily define our own versions of ...lookup and staticPtrKeys. Note the safe API of lookupStaticPtr: it returns Nothing if the given key does not exist in the SPT, and otherwise calls the handler, where the caller can do their own casting or type-detection logic.

lookupStaticPtr 
    :: StaticKey 
    -> (forall a . Typeable a => StaticPtr a -> b) 
    -> Maybe b
lookupStaticPtr key k = f <$> Prelude.lookup key __ptrs__ where
  f (WrappedPtr x) = k x

staticPtrKeys' :: [StaticKey]
staticPtrKeys' = fst <$> __ptrs__

The mysterious value __ptrs__ is the new type-safe SPT that we want the compiler to write for us. For now, we can write it manually, like this:

__ptrs__ :: [(StaticKey, WrappedPtr)]
__ptrs__ =
  [ wrapStatic sayHello
  , wrapStatic sayHelloTo
  , wrapStatic id'
  , wrapStatic const'
  , wrapStatic eq'
  ]
  where
    wrapStatic x = (staticKey x, WrappedPtr x)
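Since the table is written by hand, it is easy to forget an entry. One possible sanity check, sketched below, is to compare the keys of the manual table against GHC's own SPT using staticPtrKeys from GHC.StaticPtr. The names listed, forgotten, manualKeys and unlistedKeys are hypothetical, and the two-pointer setup only stands in for a real program.

```haskell
{-# LANGUAGE StaticPointers #-}

import GHC.StaticPtr

-- A pointer we remembered to put in the manual table.
listed :: StaticPtr Int
listed = static (1 :: Int)

-- A pointer we forgot about.
forgotten :: StaticPtr Int
forgotten = static (2 :: Int)

-- Keys of the hand-written table (here just one entry).
manualKeys :: [StaticKey]
manualKeys = [staticKey listed]

-- Keys that GHC registered in its SPT but that are absent from ours.
unlistedKeys :: [StaticKey] -> IO [StaticKey]
unlistedKeys manual = filter (`notElem` manual) <$> staticPtrKeys
```

Running unlistedKeys manualKeys at program start-up and logging any result (for example via staticPtrInfo) would at least turn a forgotten entry into a visible warning.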

Let’s also define some utility functions, built on top of the minimal API defined above. Note that lookupStaticPtrAs makes no distinction between a static pointer that is missing from the table and one that exists in the table with the wrong type.

derefAs :: (Typeable a, Typeable b) => StaticPtr a -> Maybe b
derefAs = cast . deRefStaticPtr

lookupStaticPtrAs :: Typeable a => StaticKey -> Maybe a
lookupStaticPtrAs k = join $ lookupStaticPtr k derefAs

lookupStaticPtrInfo :: StaticKey -> Maybe StaticPtrInfo
lookupStaticPtrInfo k = lookupStaticPtr k staticPtrInfo
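If the caller does need to tell these two failures apart, a variant returning Either is one option. The sketch below is self-contained, so it repeats the wrapper type and uses a tiny hypothetical table; LookupError, lookupEither, table and the three example pointers are names invented for this illustration and not part of the API above.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StaticPointers #-}

import Data.Typeable (TypeRep, Typeable, cast, typeOf)
import GHC.StaticPtr

-- Same idea as the wrapper above: a pointer plus its Typeable evidence.
data WrappedPtr where
  WrappedPtr :: Typeable a => StaticPtr a -> WrappedPtr

numPtr :: StaticPtr Int
numPtr = static (1 :: Int)

strPtr :: StaticPtr String
strPtr = static "one"

-- Deliberately left out of the table below.
unlistedPtr :: StaticPtr Bool
unlistedPtr = static True

-- Hypothetical hand-written table with two entries.
table :: [(StaticKey, WrappedPtr)]
table = [wrap numPtr, wrap strPtr]
  where
    wrap :: Typeable a => StaticPtr a -> (StaticKey, WrappedPtr)
    wrap x = (staticKey x, WrappedPtr x)

-- Either the key is unknown, or the entry has a different type
-- (whose TypeRep we report).
data LookupError = NoSuchKey | WrongType TypeRep
  deriving (Eq, Show)

lookupEither :: Typeable b => StaticKey -> Either LookupError b
lookupEither key =
  case lookup key table of
    Nothing -> Left NoSuchKey
    Just (WrappedPtr p) ->
      let v = deRefStaticPtr p
      in  maybe (Left (WrongType (typeOf v))) Right (cast v)
```

With this table, looking up the key of strPtr at type Int gives a WrongType error, while the key of unlistedPtr gives NoSuchKey.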

Finally, let us test our implementation by “sending” values of various complexity. We simulate sending by just converting the values to the key and back.

test = do
  -- Send "hello"
  let helloKey :: StaticKey = staticKey sayHelloTo
  case lookupStaticPtrAs helloKey of
    Nothing -> pure ()
    Just (hello :: String -> IO ()) -> hello "everybody!"

  -- Send "id"
  let idKey :: StaticKey = staticKey id'
  case lookupStaticPtrAs idKey of
    Nothing -> pure ()
    Just (IdFunc id) -> print $ (id succ) (id 22)

  -- Send "const"
  let constKey :: StaticKey = staticKey const'
  case lookupStaticPtrAs constKey of
    Nothing -> pure ()
    Just (ConstFunc const) -> print $ const False (const [] ())

  -- Send "(==)"
  let eqKey :: StaticKey = staticKey eq'
  case lookupStaticPtrAs eqKey :: Maybe (PredFunc Eq Bool) of
    Nothing -> pure ()
    Just (PredFunc eq) -> print $ 2 `eq` 2

 Conclusion

We have added a safe lookup function to the GHC.StaticPtr API, replacing unsafeLookupStaticPtr. Notably, our implementation is fully type safe. This is possible because it does not make use of unsafeLookupStaticPtr, but uses its own pointer table. Our API is equivalent to the one presented in the aforementioned proposal, except that it keeps the wrapper type hidden, adding only lookupStaticPtr and staticPtrKeys'. This also corresponds with SPJ’s argument to keep the trusted code base minimal.

