Commit 27f33fe

fix(ai): add spatial hashing
1 parent a70c3db commit 27f33fe

3 files changed: +263 -0 lines changed

courses/algorithms/03-dynamic-data/README.md

+1
@@ -116,6 +116,7 @@ public:
## Homework

For both, implement the following functions:
+
- `T* find(const T& value)`: returns a pointer to the first occurrence of the value in the collection, or nullptr if the value is not found.
- `bool contains(const T& value)`: returns true if the value is found in the collection, false otherwise.
- `T& at(size_t index)`: returns a reference to the element at the specified index. If the index is out of bounds, throw an `std::out_of_range` exception.
@@ -0,0 +1,261 @@
# Spatial Hashing

Spatial hashing is a common technique to speed up queries in a multidimensional space. It is a data structure that lets you quickly find all objects within a given region of space. It is commonly used in games and simulations to speed up artificial intelligence world queries, collision detection, visibility testing and other spatial queries.

Advantages of spatial hashing:

- simple to implement;
- very fast: finding a bucket costs about as much as hashing its key;
- easy to parallelize;
- a good choice for big worlds;

Problems with spatial hashing:

- it is not precise: a lookup returns every object in a cell, so the candidates still have to be filtered;
- it is not a good fit for small worlds;
- the cell size needs fine-tuning;
- you have to update the bucket when an object moves;
- finding the nearest objects is not trivial: you have to query the adjacent cells as well;

## Buckets

The core of spatial hashing is the bucket: a container that holds all the objects located inside one cell's area (2D) or volume (3D). The terms cell and bucket are used interchangeably in this context.

To find buckets, you need a way to quantize the world space into a grid of cells. It is hard to define the best cell size, but a good rule of thumb is to make it a couple of times bigger than the biggest object in the world. The cell size defines the precision of the spatial hashing: the bigger the cell, the less precise it is.

## Spatial quantization

Spatial quantization is the process of converting a continuous space into a discrete one. It is the core step in finding the right bucket for an object. Let's assume we have a 2D space and want to find the bucket for a given object.

```c++
#include <cmath> // for std::floor

// assuming Vector2f is a 2D vector with float components,
// and Vector2i is a 2D vector with integer components,
// the quantization function will be:
Vector2i quantize(Vector2f position, float cellSize) {
  // floor rounds down; a plain (int) cast truncates toward zero and would
  // put -0.5 and 0.5 in the same cell
  return Vector2i{
    static_cast<int>(std::floor(position.x / cellSize)),
    static_cast<int>(std::floor(position.y / cellSize))};
}
```
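
One detail worth checking when quantizing is how negative coordinates are rounded: a plain integer cast truncates toward zero, while `std::floor` rounds down. The sketch below compares both and shows how the cell size changes the resulting index. `Vec2f`, `Vec2i`, `quantizeFloor` and `quantizeTrunc` are hypothetical names used only for this illustration.

```cpp
#include <cmath>

// Hypothetical minimal vector types, assumed to resemble Vector2f / Vector2i.
struct Vec2f { float x, y; };
struct Vec2i { int x, y; };

// Floor-based quantization: cell k covers [k * cellSize, (k + 1) * cellSize).
Vec2i quantizeFloor(Vec2f p, float cellSize) {
  return Vec2i{static_cast<int>(std::floor(p.x / cellSize)),
               static_cast<int>(std::floor(p.y / cellSize))};
}

// Truncating variant, shown only for comparison: values in
// (-cellSize, cellSize) all land in cell 0, so that cell is twice as wide.
Vec2i quantizeTrunc(Vec2f p, float cellSize) {
  return Vec2i{static_cast<int>(p.x / cellSize),
               static_cast<int>(p.y / cellSize)};
}
```

With a cell size of 1, the position (-0.5, 0.5) lands in cell (-1, 0) with the floor version and in cell (0, 0) with the truncating one. With a cell size of 2, the position (7.5, 3.0) lands in cell (3, 1); with a cell size of 4, the same position lands in cell (1, 0), which is the loss of precision that comes with bigger cells.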

## Data structures

### Data structure for the bucket

First, we have to decide which data structure the bucket will use to store the objects. The common choices are:

- `vector<GameObject*>` - a vector of pointers to game objects;
- `set<GameObject*>` - a set of pointers to game objects;
- `unordered_set<GameObject*>` - an unordered_set of pointers to game objects;

- The problem with a `vector` is that removing or finding an object in it is inefficient: `O(n)`. Adding is efficient (amortized `O(1)`), and so is iterating over it (random access is `O(1)`).
- The underlying data structure of `set` and `map` is a binary search tree, so finding, adding and removing objects is efficient: `O(lg(n))`. Iterating over it is less efficient, because it has to follow pointers from node to node.
- `unordered_set` and `unordered_map` are hash tables, so finding, adding and removing objects is `O(1)` on average, and iterating over all elements is linear in their number. The overhead of a hash table is the memory usage and the hashing function: it will be as fast as your hashing function.

In our use case, we will frequently list all the elements in a bucket, and add and remove elements as they move around the world. So the best choice is an `unordered_set` of pointers to game objects.

So let's define the bucket:

```cpp
using bucket_t = std::unordered_set<GameObject*>;
```
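
As a rough sketch of the operations that drive this choice, the block below moves an object from one bucket to another and iterates over a bucket. The `GameObject` stand-in and the helper names `moveBetweenBuckets` and `sumIds` are assumptions made for this example only.

```cpp
#include <unordered_set>

// Hypothetical stand-in for the real game object type.
struct GameObject { int id; };

using bucket_t = std::unordered_set<GameObject*>;

// Moves an object from one bucket to another.
// Returns false (and changes nothing) if the object was not in `from`.
bool moveBetweenBuckets(bucket_t& from, bucket_t& to, GameObject* obj) {
  if (from.erase(obj) == 0) return false;  // erase by key: O(1) on average
  to.insert(obj);                          // insert: O(1) on average
  return true;
}

// Visits every object in a bucket, here just to add up the ids.
int sumIds(const bucket_t& bucket) {
  int sum = 0;
  for (const GameObject* obj : bucket) sum += obj->id;
  return sum;
}
```

The same move on a `vector` would need a linear search before the removal, which is the cost the hash-based bucket avoids.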

### Data structure for indexing buckets

Ideally, we want a data structure that gives us the bucket for a given position. We have some candidates for this job:

- `bucket_t[width][height]` - a 2D array of buckets;
- `vector<vector<bucket_t>>` - a 2D vector of buckets;
- `map<Vector2i, bucket_t>` - an ordered map of buckets;
- `unordered_map<Vector2i, bucket_t>` - a hash map of buckets;

- `array`s and `vector`s are the fastest data structures to use, but they are not good choices if you have a sparse world, because they allocate a bucket for every cell, even the empty ones;
- `map` is a binary search tree, so a lookup is `O(lg(n))`;
- `unordered_map` is a hash table, so a lookup is `O(1)` on average and only the occupied cells take memory.

The `unordered_map` is the best choice for this use case.

```c++
// quantized world
// requires a std::hash specialization for Vector2i
unordered_map<Vector2i, bucket_t> world;
```
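
To illustrate why a hash map suits a sparse world, here is a minimal sketch of a grid keyed by cell coordinates. It passes a hash functor as a template argument instead of specializing `std::hash`, which is just one possible design. `Cell`, `CellHash`, `SparseGrid` and `countInCell` are hypothetical names, and the buckets hold plain `int` ids to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical cell key with 32-bit integer coordinates.
struct Cell {
  std::int32_t x, y;
  bool operator==(const Cell& o) const { return x == o.x && y == o.y; }
};

// Packs both coordinates into one 64-bit value and hashes that value.
// The casts through uint32_t avoid sign extension of negative coordinates.
struct CellHash {
  std::size_t operator()(const Cell& c) const {
    const std::uint64_t packed =
        (static_cast<std::uint64_t>(static_cast<std::uint32_t>(c.x)) << 32) |
        static_cast<std::uint32_t>(c.y);
    return std::hash<std::uint64_t>{}(packed);
  }
};

// Only cells that were written to exist in the map.
using SparseGrid = std::unordered_map<Cell, std::vector<int>, CellHash>;

// Read-only lookup: find() does not create an empty bucket for a missing
// cell, unlike operator[].
std::size_t countInCell(const SparseGrid& grid, Cell c) {
  const auto it = grid.find(c);
  return it == grid.end() ? 0 : it->second.size();
}
```

Two objects that are a million cells apart cost two map entries here, while a dense 2D array would have to allocate every cell in between.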

### Iterating over the whole world at once

Sometimes we just want to iterate over all the objects in the world, or add and remove elements globally. In this case, we can keep one extra `unordered_set` that stores all the game objects.

```c++
// all game objects for faster global world iteration and cleanup
bucket_t worldObjects;
```

## Implementation

This sample is a bit complex because I added a bunch of support code to make it more complete; feel free to simplify it to your needs.

```cpp
#include <iostream> // for cout
#include <unordered_map> // for unordered_map
#include <unordered_set> // for unordered_set
#include <random> // for random_device and default_random_engine
#include <cmath> // for floor and float_t
#include <cstdint> // for int32_t and uint64_t
#include <memory> // for shared_ptr
#include <type_traits> // for is_same_v

using namespace std;

// to allow derived structs to be used as keys in sorted containers and binary search algorithms
template<typename T>
struct IComparable { virtual bool operator<(const T& other) const = 0; };
// to allow derived structs to be used as keys in hash based containers and linear search algorithms
template<typename T>
struct IEquatable { virtual bool operator==(const T& other) const = 0; };

// generic Vector2
// requires that T is a int32_t or float_t
template<typename T>
#ifdef __cpp_concepts
requires std::is_same_v<T, int32_t> || std::is_same_v<T, float_t>
#endif
struct Vector2: public IComparable<Vector2<T>>, public IEquatable<Vector2<T>> {
  T x, y;
  Vector2(): x(0), y(0) {}
  Vector2(T x, T y): x(x), y(y) {}
  // operator equals
  bool operator==(const Vector2& other) const {
    return x == other.x && y == other.y;
  }
  // operator < for being able to use it as a key in a map or set
  bool operator<(const Vector2& other) const {
    return x < other.x || (x == other.x && y < other.y);
  }

  // quantize the vector to a 2d index
  // the cellSize/2 offset centers each cell on a multiple of cellSize
  // you may want to simplify this function to use less instructions
  Vector2<int32_t> quantized(float_t cellSize=1.0f) const {
    return Vector2<int32_t>{
      static_cast<int32_t>(std::floor((x + cellSize/2) / cellSize)),
      static_cast<int32_t>(std::floor((y + cellSize/2) / cellSize))
    };
  }
};

// specialized Vector2 for int and float
using Vector2i = Vector2<int32_t>;
// float32_t is only available in c++23, so we use float_t instead
using Vector2f = Vector2<float_t>;

// helper struct to generate unique id for game objects
// mostly debug purposes
struct uid_type {
private:
  static inline size_t nextId = 0; // to be used as a counter
  size_t uid; // to be used as a unique identifier
public:
  // not thread safe, but it is not a problem for this example
  uid_type(): uid(nextId++) {}
  inline size_t getUid() const { return uid; }
};

// generic game object implementation
// replace this with your own data that you want to store in the world
class GameObject: public uid_type, public enable_shared_from_this<GameObject> {
  Vector2f position;
public:
  GameObject(): uid_type(){};
  // note: a copy keeps the uid of the original object
  GameObject(const GameObject& other): uid_type(other), position(other.position) {}
  // todo: add your other custom data here
  // when it moves, it should check if it needs to update its bucket in the world
  void setPosition(const Vector2f& newPosition);
  Vector2f getPosition() const { return position; }
};

// hashing
namespace std {
  // Hash specialization for Vector2i, the key type of the world map
  template<>
  struct hash<Vector2i> {
    size_t operator()(const Vector2i& v) const {
      // x and y are 32 bits each, so we can pack them side by side into one 64-bit value
      // the problem of this approach is that neighboring cells get similar hashes
      // to fix that, you might want to pass the packed value through std::hash:
      // hash<uint64_t>{}(packed)
      // the casts go through uint32_t to avoid sign extension of negative coordinates
      const uint64_t packed = (static_cast<uint64_t>(static_cast<uint32_t>(v.x)) << 32)
                            | static_cast<uint32_t>(v.y);
      return static_cast<size_t>(packed);
    }
  };
}

// game object pointer
// shared pointer is used to avoid memory leaks
using GameObjectPtr = std::shared_ptr<GameObject>;
// alias for the game object bucket
using go_bucket_t = std::unordered_set<GameObjectPtr>;
// alias for the world type
using world_t = std::unordered_map<Vector2i, go_bucket_t>;

// singletons here are being used to avoid global variables and to allow the world to be used in a visible scope
// you should use better wrappers and abstractions in a real project
// singleton world
world_t& world() {
  static world_t world;
  return world;
}
// singleton world objects
go_bucket_t& worldObjects(){
  static go_bucket_t worldObjects;
  return worldObjects;
}

// this function requires the world to be in a visible scope like this or change it to access through a singleton
void GameObject::setPosition(const Vector2f& newPosition) {
  const Vector2i oldCell = position.quantized();
  const Vector2i newCell = newPosition.quantized();
  // always store the new position, even when the bucket stays the same
  position = newPosition;
  // check if it needs to update its bucket in the world
  if (oldCell == newCell)
    return;
  // remove from old bucket
  world()[oldCell].erase(shared_from_this());
  // add to new bucket
  world()[newCell].insert(shared_from_this());
}

// random vector2f
Vector2f randomVector2f(float_t min, float_t max) {
  static random_device rd;
  static default_random_engine re(rd());
  // not static: a static distribution would keep the range of the first call
  uniform_real_distribution<float_t> dist(min, max);
  return Vector2f{dist(re), dist(re)};
}

int main() {
  // fill the world with some random game objects
  for (int i = 0; i < 121; i++) {
    auto obj = std::make_shared<GameObject>();
    obj->setPosition(randomVector2f(-5, 5));
    // setPosition only touches the world when the cell changes, so insert explicitly
    // to cover objects that stay in the cell of the default position (0, 0)
    world()[obj->getPosition().quantized()].insert(obj);
    worldObjects().insert(obj);
  }
  // randomly move the game objects
  // this will update their position and their bucket in the world
  for (auto& obj: worldObjects()) {
    obj->setPosition(randomVector2f(-5, 5));
  }

  // print the bucket id and every object in it
  for (auto& bucket: world()) {
    cout << "bucket " << bucket.first.x << ", " << bucket.first.y << endl;
    for (auto& obj: bucket.second)
      cout << "  object " << obj->getUid() << " at " << obj->getPosition().x << ", " << obj->getPosition().y << endl;
  }

  return 0;
}
```
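
Finding nearby objects means looking at the adjacent cells as well as the object's own cell. The following is a minimal, independent sketch of such a query, not part of the sample above: it uses a simplified grid of plain points, and it assumes the query radius is never larger than the cell size, so the 3x3 block of cells around the center is enough. `Point`, `Grid`, `cellKey`, `toCell` and `queryRadius` are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical object: just a position.
struct Point { float x, y; };

// Packs two 32-bit cell coordinates into one 64-bit key.
std::uint64_t cellKey(std::int32_t cx, std::int32_t cy) {
  return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32) |
         static_cast<std::uint32_t>(cy);
}

struct Grid {
  float cellSize;
  std::unordered_map<std::uint64_t, std::vector<Point>> cells;

  // Floor-based quantization of one coordinate.
  std::int32_t toCell(float v) const {
    return static_cast<std::int32_t>(std::floor(v / cellSize));
  }

  void insert(Point p) {
    cells[cellKey(toCell(p.x), toCell(p.y))].push_back(p);
  }

  // Returns all points within `radius` of `center`.
  // Assumption: radius <= cellSize, so only the 3x3 neighborhood is visited.
  std::vector<Point> queryRadius(Point center, float radius) const {
    std::vector<Point> result;
    const std::int32_t cx = toCell(center.x);
    const std::int32_t cy = toCell(center.y);
    for (std::int32_t dx = -1; dx <= 1; ++dx) {
      for (std::int32_t dy = -1; dy <= 1; ++dy) {
        const auto it = cells.find(cellKey(cx + dx, cy + dy));
        if (it == cells.end()) continue;  // empty cell, nothing stored
        for (const Point& p : it->second) {
          // the cell lookup is coarse, so filter by the exact distance
          const float ex = p.x - center.x;
          const float ey = p.y - center.y;
          if (ex * ex + ey * ey <= radius * radius) result.push_back(p);
        }
      }
    }
    return result;
  }
};
```

For a larger radius, the loop bounds would have to grow to `ceil(radius / cellSize)` cells in each direction, which is one reason the cell size needs tuning against the typical query size.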

## Homework

1. Implement spatial hashing for a 3D world;
2. Implement another space partitioning technique, such as a quadtree, octree or k-d tree, and compare:
    1. the performance of both in scenarios with moving objects, searching for objects and adding / removing objects;
    2. memory consumption;
    3. which one slows down faster as the world grows;

mkdocs.yml

+1
@@ -13,6 +13,7 @@ nav:
- AI:
- courses/artificialintelligence/README.md
- Spatial Quantization: courses/artificialintelligence/readings/spatial-quantization.md
+ - Spatial Hashing: courses/artificialintelligence/04-spatialhashing/README.md
- Maze Data Structure: blog/posts/MazeDataStructure/MazeDataStructures.md
- Assignments:
- courses/artificialintelligence/assignments/README.md
