move bounds checks to constructor #32

robertbuessow · 2024-10-28T13:13:01Z

I moved the bounds check from getindex to the Blobs constructors. We now guarantee that a constructed blob is valid (has enough memory for its data), so we no longer need bounds checks on every dereference. This gets rid of all the bounds-check code.

RAICode needs some fixes to work with this, namely in places where we created blobs with offset > limit. See: https://github.com/RelationalAI/raicode/pull/21062
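
For readers following along, here is a minimal sketch of the idea in the description above: validate once in the constructor, then dereference without a check. `CheckedBlob`, `load`, and `demo` are hypothetical names invented for illustration, and `sizeof(T)` stands in for the package's `self_size(T)`; this is not the code in the PR.

```julia
# Hypothetical stand-in for a blob type; field names mirror the discussion only.
struct CheckedBlob{T}
    base::Ptr{Nothing}
    offset::Int64
    limit::Int64

    function CheckedBlob{T}(base::Ptr{Nothing}, offset::Int64, limit::Int64) where {T}
        # Validate once: the value must lie entirely within the first `limit` bytes.
        if offset < 0 || offset + sizeof(T) > limit
            throw(ArgumentError("Blob{$T} at offset $offset does not fit in $limit bytes"))
        end
        new{T}(base, offset, limit)
    end
end

# No bounds check here: the constructor already established that the read is in range.
load(blob::CheckedBlob{T}) where {T} =
    unsafe_load(convert(Ptr{T}, blob.base + blob.offset))

function demo()
    buf = zeros(UInt8, 16)
    GC.@preserve buf begin
        base = convert(Ptr{Nothing}, pointer(buf))
        unsafe_store!(convert(Ptr{Int64}, base + 8), 42)
        value = load(CheckedBlob{Int64}(base, 8, 16))
        # An Int64 at offset 12 would end at byte 20, past the 16-byte limit.
        rejected = try
            CheckedBlob{Int64}(base, 12, 16)
            false
        catch err
            err isa ArgumentError
        end
        return value, rejected
    end
end

value, rejected = demo()
```

The trade-off is that every code path that creates a blob must go through a checked constructor; anything that bypasses it loses the guarantee.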

kuszmaul

This is an interesting idea. I'm not sure whether I'm sold on it.

src/bit_vector.jl

.vscode/settings.json

kuszmaul · 2024-10-29T16:55:44Z

src/blob.jl

+            if offset < 0 || offset + self_size(T) > limit
+                throw(InvalidBlobError(Blob{T}, base, offset, limit, 1))
+            end
+            @assert base != Ptr{Nothing}(0) "Null pointer dereference in $(T)"


This assert against the null pointer seems out of place. There's no way to protect against bad pointers, and maybe the user wants to make such a blob.

rgankema · 2024-10-30

Maybe we should allow null pointers if their size is also set to 0?

robertbuessow · 2024-10-30

Sounds good to me. (I also removed the @assert, since it caused an allocation.)

kuszmaul · 2024-10-29T16:58:26Z

src/blob.jl

+Base.@propagate_inbounds function Blob(ref::Base.RefValue{T}) where T
    Blob{T}(pointer_from_objref(ref), 0, sizeof(T))


This @propagate_inbounds seems unneeded and possibly wrong. We know that there are no bounds errors here, right? Also, this is going to be a problem if self_size(T) != sizeof(T), so maybe it should use self_size rather than sizeof. Or maybe there should be an error.

Suggested change

-Base.@propagate_inbounds function Blob(ref::Base.RefValue{T}) where T
-    Blob{T}(pointer_from_objref(ref), 0, sizeof(T))
+function Blob(ref::Base.RefValue{T}) where T
+    @assert self_size(T) == sizeof(T)
+    @inbounds Blob{T}(pointer_from_objref(ref), 0, self_size(T))

robertbuessow · 2024-10-30

That @assert actually breaks the test. Basically, this operation only works if the memory layouts of the Julia struct and the Blob are the same. That's not always the case because of alignment. The best option would be to get rid of this constructor altogether. Otherwise, the @assert is probably good enough, assuming that Julia only takes more memory because of alignment.
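
To make the alignment point concrete, here is a small sketch of how the two sizes can diverge. `Padded` and `packed_size` are hypothetical; `packed_size` assumes a layout that stores fields back to back with no padding, which is how the comment above describes the blob layout, and it is not the package's `self_size`.

```julia
# Hypothetical struct: one byte followed by an 8-byte field.
struct Padded
    tag::UInt8
    value::Int64
end

# Assumption: a packed encoding stores the fields back to back, with no padding.
packed_size(::Type{T}) where {T} = sum(sizeof, fieldtypes(T))

julia_size = sizeof(Padded)       # includes alignment padding after `tag`
blob_size = packed_size(Padded)   # 1 + 8 bytes
println((julia_size, blob_size))
```

When the two sizes differ, reinterpreting the memory behind a `Ref` as a packed value would read fields at the wrong offsets, which is why an equality assertion fails for such types.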


kuszmaul · 2024-10-29T17:08:53Z

src/blob.jl

@@ -32,7 +56,7 @@ function Base.pointer(blob::Blob{T}) where T
    convert(Ptr{T}, getfield(blob, :base) + getfield(blob, :offset))
 end

-function Base.:+(blob::Blob{T}, offset::Integer) where T
+Base.@propagate_inbounds function Base.:+(blob::Blob{T}, offset::Integer) where T


It's traditional in the C world to allow people to construct a pointer that goes just off the end of an array (but not to dereference that pointer).

This code prohibits the construction of such a pointer. Is that where we want to go?

robertbuessow · 2024-10-29

You can create a Blob{Nothing}(<someptr>, 0, 0). That actually happens. I don't think a Blob is an array, so I don't see a need for anything else.
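
As a worked illustration of the exchange above, the predicate from the constructor check quoted earlier (`offset < 0 || offset + self_size(T) > limit`) can be evaluated for a few cases. `fits` is a hypothetical helper, and the sizes are plain integers rather than calls to `self_size`.

```julia
# Hypothetical helper: true when the constructor check quoted earlier would pass.
fits(offset, size, limit) = !(offset < 0 || offset + size > limit)

zero_size_at_end = fits(16, 0, 16)   # zero-size value sitting exactly at the limit
eight_bytes_at_end = fits(16, 8, 16) # 8-byte value starting at the limit
eight_bytes_inside = fits(8, 8, 16)  # 8-byte value ending exactly at the limit
```

Under this check, a one-past-the-end position is constructible only for a zero-size type such as `Nothing`; a blob of any non-empty type at that position is rejected.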


kuszmaul · 2024-11-14T16:39:49Z

src/blob.jl

+    end
+end
+
+@noinline function _bounds_check(


What's the rationale for making this @noinline? Julia doesn't do it for checkbounds(::AbstractArray, I...), for example.

robertbuessow · 2024-11-15

The large amount of generated LLVM code came from the bounds check (building the strings, etc.). I therefore want to keep it from being inlined into each specialized function.
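
A common variant of the same idea is to keep the comparison inline and move only the throwing path out of line. The sketch below is not the `_bounds_check` from this PR; `OutOfRange`, `throw_out_of_range`, and `check_range` are hypothetical names.

```julia
# Hypothetical error type, for illustration only.
struct OutOfRange <: Exception
    offset::Int64
    limit::Int64
end

# Kept out of line so the error-construction code is not copied into every caller.
@noinline function throw_out_of_range(offset::Int64, limit::Int64)
    throw(OutOfRange(offset, limit))
end

# The hot path stays small: two comparisons and a branch to the cold helper.
@inline function check_range(offset::Int64, size::Int64, limit::Int64)
    (offset < 0 || offset + size > limit) && throw_out_of_range(offset, limit)
    return nothing
end

ok = check_range(0, 8, 8) === nothing
failed = try
    check_range(4, 8, 8)
    false
catch err
    err isa OutOfRange
end
```

Whether to mark the whole check or only the throw as `@noinline` depends on how much code the error path generates; inspecting the output of `@code_llvm` on a caller is one way to compare.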

kuszmaul · 2024-11-14T16:50:37Z

src/blob.jl

+    offset::Int64,
+    limit::Int64,
+    self_size_T::Int64,
+    @nospecialize(T::DataType))


Does this @nospecialize actually do anything?

Julia doesn't specialize on arguments that look like t::Type. (It does specialize on arguments like ::Type{T} where T.)

julia> foo(base::Int64, t::DataType) = base+sizeof(t)
foo (generic function with 1 method)

julia> Base.specializations(@which foo(3,Int))
Base.MethodSpecializations(MethodInstance for foo(::Int64, ::DataType))

julia> Base.specializations(@which foo(3,UInt8))
Base.MethodSpecializations(MethodInstance for foo(::Int64, ::DataType))

julia> bar(base::Int64, ::Type{T}) where T = base+sizeof(T)
bar (generic function with 1 method)

julia> Base.specializations(@which bar(3,Int))
Base.MethodSpecializations(MethodInstance for bar(::Int64, ::Type{Int64}))

julia> Base.specializations(@which bar(3,UInt8))
Base.MethodSpecializations(svec(MethodInstance for bar(::Int64, ::Type{Int64}), MethodInstance for bar(::Int64, ::Type{UInt8}), nothing, nothing, nothing, nothing, nothing))

robertbuessow · 2024-11-15

No, good catch. I added it because I was surprised that the constructor was still generating extra code (I believe). But it didn't change anything, for the reason you point out.

robertbuessow added 4 commits October 28, 2024 13:12

move bounds checks to constructor

e11a7f9

fix

0dbcae5

rename allocated_size -> available_size

0aff1f0

remove .vscoce/settings.json

35b22f6

robertbuessow marked this pull request as ready for review October 29, 2024 17:14

kuszmaul requested changes Oct 29, 2024


robertbuessow and others added 5 commits October 29, 2024 18:19

Update src/bit_vector.jl

0627afa

Co-authored-by: Bradley C. Kuszmaul <[email protected]>

Code review comments, documentation, tests for Ref blobs

7bf6830

Merge branch 'rb-blobs-nobounds' of github.com:robertbuessow/Blobs.jl…

effbc95

… into rb-blobs-nobounds

Move boundscheck out of constructor

d70026c

small improvement

f2e114c

kuszmaul reviewed Nov 14, 2024


Merge branch 'main' into rb-blobs-nobounds

03ec596


