Can we deserialize automagically with simdjson::ondemand #29
Comments
Will start investigating this a bit and will keep updates posted here (@lemire).
I will open a branch on https://github.com/simdjson/simdjson for it.
Let's keep this open. Once we have the official C++26 spec and compiler support, we can add it to simdjson, using a version check like the one @lemire recently used to leverage concepts for serialization of vector.
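To illustrate the kind of version check mentioned above, here is a minimal sketch. It assumes the compiler advertises reflection support through the `__cpp_impl_reflection` feature-test macro proposed in P2996; `EXAMPLE_HAS_REFLECTION` and `deserialization_mode()` are hypothetical names made up for this example and are not part of simdjson.

```cpp
#include <string_view>

// Assumption: reflection support is signalled by __cpp_impl_reflection (P2996).
// EXAMPLE_HAS_REFLECTION is a hypothetical macro used only in this sketch.
#if defined(__cpp_impl_reflection)
#  define EXAMPLE_HAS_REFLECTION 1
#else
#  define EXAMPLE_HAS_REFLECTION 0
#endif

// Reports which code path this translation unit was compiled with.
constexpr std::string_view deserialization_mode() {
#if EXAMPLE_HAS_REFLECTION
  return "reflection";          // automatic, reflection-based deserialization
#else
  return "manual tag_invoke";   // fall back to hand-written customization
#endif
}
```

With a check like this, older compilers keep the existing hand-written path while newer ones can opt into the automatic one.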
The extractor syntax should help. It will be interesting to find out how it interacts with C++ reflection.
Extract should work without modification. In C++26, we could simply, instead of blowing up in the user's face with So, we simply need a OR or AND: We could use a little helper function with
struct users {
  int id;
  std::string username;
  std::ifstream profile_image; // a field that cannot be deserialized; it gets ignored
} user;
struct admins {
  int id;
  std::string username;
} admin;
obj.extract(
  to{"user", ignore_unknown_fields(user)},
  to{"admin", admin} // no need to do anything, the global `tag_invoke` should be able to handle this
);
So, I don't think we need to do much about the Note:
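As a rough illustration of the "ignore what cannot be deserialized" idea above, here is a minimal sketch that does not use simdjson at all. `fake_object`, `is_deserializable`, `assign` and `extract_or_ignore` are hypothetical names invented for this example; the real `ignore_unknown_fields` helper discussed above may look quite different.

```cpp
#include <fstream>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a parsed JSON object: key -> raw text.
using fake_object = std::map<std::string, std::string>;

// Hypothetical trait: in this sketch only int and std::string are supported.
template <typename T> struct is_deserializable : std::false_type {};
template <> struct is_deserializable<int> : std::true_type {};
template <> struct is_deserializable<std::string> : std::true_type {};

inline void assign(int &out, const std::string &raw) { out = std::stoi(raw); }
inline void assign(std::string &out, const std::string &raw) { out = raw; }

// Writes obj[key] into `out` when T is supported and the key exists.
// Unsupported field types are skipped instead of causing a compile error.
// Returns true only when the field was actually written.
template <typename T>
bool extract_or_ignore(const fake_object &obj, const std::string &key, T &out) {
  if constexpr (is_deserializable<T>::value) {
    auto it = obj.find(key);
    if (it == obj.end()) return false;
    assign(out, it->second);
    return true;
  } else {
    return false; // e.g. std::ifstream: silently ignored
  }
}

struct users {
  int id = 0;
  std::string username;
  std::ifstream profile_image; // cannot be deserialized; gets ignored
};
```

Calling `extract_or_ignore(obj, "profile_image", u.profile_image)` compiles and simply returns `false`, which is the behaviour the helper is meant to provide.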
@lemire, @FranciscoThiesen Here's some mock code of what I'm thinking:
namespace simdjson {
  template <typename T>
  consteval auto is_convertible_type() {
    for (std::meta::info field : nonstatic_data_members_of(^T)) {
      // a single non-deserializable field disqualifies the whole type
      if (!deserializable(type_of(field))) return false;
    }
    return true;
  }
  template <typename T>
    requires (is_convertible_type<T>()) // all of its fields MUST be deserializable as well
  constexpr error_code tag_invoke(deserialize_tag, auto& value, T& out) {
    // fetch the object once, before iterating over the members
    simdjson::ondemand::object obj;
    if (auto error = value.get_object().get(obj); error) {
      return error;
    }
    template for (constexpr auto e : std::meta::nonstatic_data_members_of(^T)) {
      // it might be better if we could somehow call .extract once
      obj.extract(
        to{name_of(e), out.[:e:]}
      );
    }
    return SUCCESS;
  }
}
It's my first time writing C++26 reflection code, so I don't know whether reflection really works the way the mock code above assumes. Honestly, this could be laughably easy to implement if C++26's reflection is what I think it is.
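The "every field must be deserializable" constraint from the mock code can be tried out today without reflection. The sketch below is only an approximation: each struct lists its field types by hand in a hypothetical `field_types` alias, which is exactly the information reflection would supply automatically. `is_deserializable` and `all_deserializable` are names invented for this example.

```cpp
#include <fstream>
#include <string>
#include <tuple>
#include <type_traits>

// Hypothetical trait: in this sketch only int and std::string are supported.
template <typename T> struct is_deserializable : std::false_type {};
template <> struct is_deserializable<int> : std::true_type {};
template <> struct is_deserializable<std::string> : std::true_type {};

// True only when every listed field type is deserializable.
template <typename Tuple> struct all_deserializable;
template <typename... Fields>
struct all_deserializable<std::tuple<Fields...>>
    : std::bool_constant<(is_deserializable<Fields>::value && ...)> {};

struct admins {
  int id;
  std::string username;
  // Hand-written field list; reflection would generate this.
  using field_types = std::tuple<int, std::string>;
};

struct users {
  int id;
  std::string username;
  std::ifstream profile_image;
  using field_types = std::tuple<int, std::string, std::ifstream>;
};

static_assert(all_deserializable<admins::field_types>::value);
static_assert(!all_deserializable<users::field_types>::value); // ifstream disqualifies it
```

The fold expression plays the role of the `for` loop over `nonstatic_data_members_of`: one unsupported field makes the whole type ineligible.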
I looked at the simdjson PR and it looks amazing. The issue with deserialization and reflection is the need to recurse into objects that are not trivial, so sometimes you need to call Here is the idea that I used to make a simple deserializer for yaml-cpp, where it provides a map-like structure via Node:
template <typename T> void from_node(auto const &node, T &t) {
  util::for_range<0, util::number_of_members<T>()>([&]<auto I>() { // loop over members
    constexpr auto mem = util::member_info<T>(I);
    auto name = std::string{util::name_of(mem)};
    if constexpr (std::is_constructible_v<std::formatter<[:type_of(mem):]>>) { // check if type is trivial like double, int, char ...
      t.[:mem:] = node[name].template as<[:type_of(mem):]>();
    } else {
      from_node(node[name], t.[:mem:]); // call recursion when member is some other structure
    }
  });
}
Also, when you use reflection to deserialize, it is not clear how to handle custom cases like this one
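The scalar-versus-nested dispatch in `from_node` can be approximated without reflection. The following is a minimal sketch under stated assumptions: `Node` is a hypothetical map-like type (not yaml-cpp's), and each struct exposes a hand-written `members()` list of (name, member pointer) pairs in place of what reflection would enumerate.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical map-like node: a scalar plus named children (keys[i] names values[i]).
struct Node {
  std::string scalar;
  std::vector<std::string> keys;
  std::vector<Node> values;

  Node &child(const std::string &key) { // find or create a child
    for (std::size_t i = 0; i < keys.size(); ++i)
      if (keys[i] == key) return values[i];
    keys.push_back(key);
    values.emplace_back();
    return values.back();
  }
  const Node &operator[](const std::string &key) const { // read-only lookup
    for (std::size_t i = 0; i < keys.size(); ++i)
      if (keys[i] == key) return values[i];
    throw std::out_of_range("missing key: " + key);
  }
};

inline void read_scalar(const Node &n, int &out) { out = std::stoi(n.scalar); }
inline void read_scalar(const Node &n, std::string &out) { out = n.scalar; }

template <typename T>
inline constexpr bool is_scalar_field =
    std::is_arithmetic_v<T> || std::is_same_v<T, std::string>;

struct Point {
  int x = 0;
  int y = 0;
  static auto members() { // hand-written stand-in for reflection
    return std::make_tuple(std::make_pair("x", &Point::x),
                           std::make_pair("y", &Point::y));
  }
};

struct Shape {
  std::string label;
  Point origin;
  static auto members() {
    return std::make_tuple(std::make_pair("label", &Shape::label),
                           std::make_pair("origin", &Shape::origin));
  }
};

template <typename T>
void from_node(const Node &node, T &out) {
  std::apply(
      [&](const auto &...member) {
        auto read_one = [&](const auto &m) {
          auto &field = out.*(m.second);
          using Field = std::remove_reference_t<decltype(field)>;
          if constexpr (is_scalar_field<Field>) {
            read_scalar(node[m.first], field); // leaf: convert the scalar text
          } else {
            from_node(node[m.first], field);   // nested struct: recurse
          }
        };
        (read_one(member), ...);
      },
      T::members());
}
```

Calling `from_node(root, shape)` fills `shape.label` directly and recurses into `shape.origin`, which mirrors the `if constexpr` branch structure of the reflection version.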
We could even have another helper function or class for
obj.extract(
to{"user", ignore_fields(user, "password", "ip_address")}
);
Wait, that might be a good idea for now too in case we have a This could be cool, but it wouldn't be the most performant way that the user could implement But that requires the implementers of It might be a good idea to have a concept for the types we can extract an object from (currently document and value), so that future helper functions can wrap them.
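To make the `ignore_fields` idea a bit more concrete, here is a minimal sketch of filtering keys by name. It is not tied to simdjson: `fake_object`, `field_filter` and `extract_filtered` are hypothetical names invented for this example, and a real helper would wrap the target struct rather than copy a map.

```cpp
#include <algorithm>
#include <initializer_list>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed JSON object: key -> raw text.
using fake_object = std::map<std::string, std::string>;

// Remembers which keys should be skipped during extraction.
class field_filter {
 public:
  field_filter(std::initializer_list<std::string> ignored) : ignored_(ignored) {}

  bool allows(const std::string &key) const {
    return std::find(ignored_.begin(), ignored_.end(), key) == ignored_.end();
  }

 private:
  std::vector<std::string> ignored_;
};

// Copies every key/value pair that the filter does not reject.
inline fake_object extract_filtered(const fake_object &source,
                                    const field_filter &filter) {
  fake_object result;
  for (const auto &[key, value] : source) {
    if (filter.allows(key)) result.emplace(key, value);
  }
  return result;
}
```

The linear search is fine for a handful of ignored names; as noted above, a hand-written extraction would likely be faster than any generic filter.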
@Yaraslaut The whole idea of Meaning, even though this is possible:
obj.extract(
  to{"user", sub{
    to{"id", user.id},
    to{"username", user.username}
  }}
);
But this is more general when you give a
struct users {
  friend error_code tag_invoke ... {
    // ...
    obj.extract(
      to{"id", out.id},
      to{"username", out.username}
    );
  }
};
// now, anywhere in your code, you simply pass `user`:
std::vector<user> admins;
obj.extract(
  to{"user", user},
  to{"admins", admins}
);
We probably should provide
std::vector<
  unique_ptr<shared_ptr<atomic<unique_ptr<user>>>> // and however deep you'd like to go
> admins;
obj.extract(
  to{"admins", admins}
);
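The point about arbitrarily nested wrappers can be sketched with plain overloads that peel one layer at a time. This is a minimal illustration under stated assumptions: `fake_value` is a hypothetical stand-in for a parsed JSON value, `deserialize` is a name invented for this example, and only `std::vector`, `std::unique_ptr` and `std::shared_ptr` are handled.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed JSON value: a scalar or a list of values.
struct fake_value {
  std::string scalar;
  std::vector<fake_value> items;
};

// Base cases.
inline void deserialize(const fake_value &v, int &out) { out = std::stoi(v.scalar); }
inline void deserialize(const fake_value &v, std::string &out) { out = v.scalar; }

// A user type provides its own overload, like a hand-written tag_invoke.
struct user {
  int id = 0;
  std::string username;
};
inline void deserialize(const fake_value &v, user &out) {
  deserialize(v.items.at(0), out.id);       // assumes [id, username] order
  deserialize(v.items.at(1), out.username);
}

// Forward declarations so the wrappers can nest in any order.
template <typename T> void deserialize(const fake_value &v, std::unique_ptr<T> &out);
template <typename T> void deserialize(const fake_value &v, std::shared_ptr<T> &out);
template <typename T> void deserialize(const fake_value &v, std::vector<T> &out);

// Each wrapper allocates or resizes, then delegates to the inner type.
template <typename T>
void deserialize(const fake_value &v, std::unique_ptr<T> &out) {
  out = std::make_unique<T>();
  deserialize(v, *out);
}

template <typename T>
void deserialize(const fake_value &v, std::shared_ptr<T> &out) {
  out = std::make_shared<T>();
  deserialize(v, *out);
}

template <typename T>
void deserialize(const fake_value &v, std::vector<T> &out) {
  out.clear();
  out.reserve(v.items.size());
  for (const fake_value &item : v.items) {
    T element{};
    deserialize(item, element);
    out.push_back(std::move(element));
  }
}
```

Because every wrapper only knows about its immediate inner type, a `std::vector<std::unique_ptr<std::shared_ptr<user>>>` works with no extra code once `user` itself is deserializable.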
@the-moisrex Ideally, we'd call
@the-moisrex Turns out that calling
The simdjson::ondemand API is begging for reflection-based deserialization. We have this ugly approach:
https://github.com/simdjson/simdjson/blob/master/doc/basics.md#adding-support-for-custom-types
It works, but it requires too much work from our users. It should be automagical. :-)