A bit of context to prevent xy:
I want to build a cache for my system. There are Type pairs consisting of IdType
and EndpointType
in a 1:1 relationship. Every IdType
only ever refers to one EndpointType
and vice versa. The Id
trait (see code below) is not required, it only exists in my current iteration of trying to make this work.
There are way too many different types of endpoints - and usually an application will only use a small subset of them - to justify statically building a cache per endpoint and holding that in memory. Additionally, I want to be able to add new endpoints without touching any code related to caching or the client itself.
The Endpoint
trait objects are not object safe and there is no way to make them object safe due to associated const
s, Sized
and methods not using self
to make use of compile time optimizations.
I came up with the idea of storing them as kind of Any
type. Now I like type safety, so I tried to restrict it a bit more. After failing to find a satisfying solution, I still have several issues with my idea:
- How do I make this approach work?
- Is there a better solution?
- Is this sound? Why?
- Is there a way to make it sound?
- Can I achieve this without unsafe?
use std::sync::Mutex;
use std::collections::HashMap;
pub trait Endpoint: Sized {
type Id;
}
pub trait Id {
type Endpoint;
}
pub struct Client {
cache: Mutex<Cache>,
}
impl Client {
fn get<T: Endpoint>(&self, id: T::Id) -> T {
if let Some(result) = self.cache.lock().unwrap().get(id) {
return result;
}
todo!()
}
}
pub struct Cache {
map: HashMap<Box<dyn Id>, Box<dyn Endpoint>>,
}
impl Cache {
fn get<T: Id>(&self, id: T) -> Option<T::Endpoint> {
if let Some(endpoint) = self.map.get(Box::new(id)) {
let endpoint: Box<T::Endpoint> = unsafe { std::mem::transmute(endpoint) };
Some(endpoint)
} else {
None
}
}
}
CodePudding user response:
I'd strongly recommend going with Box<dyn Any>
over std::mem::transmute
. A fairly common pattern is to have a HashMap<TypeId, Box<dyn Any>>
. TypeId
is a type that can be used to distinguish between other types, and it's Eq
and Hash
impls work how you'd expect: if the types are the same, the type ids are equal. So you could have something roughly like:
struct Cache {
map: HashMap<TypeId, Box<dyn Any>>,
}
impl Cache {
fn get<T>(&self) -> Option<&T> {
let type_id = TypeId::of::<T>();
let any = self.map.get(&type_id)?;
any.downcast_ref()
}
}
This can act like a rough lookup table that associates a type with a single value of that type. If there is a Box<dyn Any>
that is associated with the type id for T
, but points to a value of some other type, you won't get UB, you'll just get None
.
This sort of primitive can be used to build a more complex cache around it. For example, you could have a struct that wraps this that provides access, but it only supports keys of type (T, <T as Id>::Endpoint)
.
The key point is to use Any
and downcast_*
methods to avoid unsafe.
std::mem::transmute
is one of the more dangerous unsafe functions, and should largely be seen as a last resort. From the docs:
transmute
is incredibly unsafe. There are a vast number of ways to cause undefined behaviour with this function.transmute
should be the absolute last resort
I'd also add, this pattern is common enough that there's probably a crate that provides a type safe interface, a quick search yielded this: https://crates.io/crates/anymap, though I can't speak for this crate's quality in particular
Edit:
If you also want to be able to distinguish between multiple instances of the same Id
/Endpoint
pair type, you could modify it to store a hashmap with the keys having a type (TypeId, u64)
, where the u64
is the result of hashing the original key (a bit like a SQL composite key):
struct Cache {
map: HashMap<(TypeId, u64), Box<dyn Any>>,
}
impl Cache {
fn insert<T>(&mut self, id: T::Id, endpoint: T)
where
T: Endpoint 'static,
T::Id: Hash,
{
let type_id = TypeId::of::<T>();
let hash = {
let mut hasher = DefaultHasher::new();
id.hash(&mut hasher);
hasher.finish()
};
self.map.insert((type_id, hash), Box::new(endpoint));
}
fn get<T>(&self, id: T::Id) -> Option<&T>
where
T: Endpoint 'static,
T::Id: Hash,
{
let type_id = ...;
let hash = ...;
let option_any = self.map.get(&(type_id, hash));
option_any.and_then(|any| any.downcast_ref())
}
}
This lets you have multiple FooEndpoint
s (with FooId
s), as well as BarEndpoint
s (and BarId
s), while avoiding transmute/unsafe, all with a single map lookup. Hopefully I read your question more accurately this time ;p
P.S. I have no idea if this particular way of obtaining a u64
hash is "good", I've never actually had to do this before (I've only ever used the Hash
trait with std::collections::HashMap/HashSet
). Might be worth doing some googling to make sure this isn't doing something horrible