-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Sketch]Arbitrary-precision BigInt #84
Conversation
@@ -143,6 +143,18 @@ HEADER_SHIM float libm_log10f(float x) { | |||
return __builtin_log10f(x); | |||
} | |||
|
|||
HEADER_SHIM float libm_ceilf(float x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These don't appear to be used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I only use the double
version in the BigInt implementation, but I thought for the sake of completeness I would add all the variants. If that's wrong, I'd be happy to take 'em out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted at the use sites, you don't need any of these.
Sources/BigInt/BigInt.swift
Outdated
} | ||
|
||
public init?<T>(exactly source: T) where T : BinaryFloatingPoint { | ||
if libm_ceil(Double(source)) != Double(source) { return nil } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check can be source.rounded(.towardZero) != source
. But you also need to eliminate infinity, right?
Sources/BigInt/BigInt.swift
Outdated
var float = Double(isNegative ? -source : source) | ||
var words = Words() | ||
while float > 0.0 { | ||
let digit = UInt(libm_remainder(float, Double(UInt.max) + 1.0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't do what you want it to; remainder(a,b)
produces a value in [-b/2, b/2]
, but you need a result in [0,b)
. This will trap for many inputs when you try to init UInt
with a negative value. The standard library gives you everything you need here:
let radix = T(sign: .plus, exponent: T.Exponent(UInt.bitWidth), significand: 1)
repeat {
digit = UInt(float.truncatingRemainder(dividingBy: radix))
words.append(digit)
float = (float / radix).rounded(.towardZero)
} while float != 0
You need to handle a bunch of edge cases that this currently ignores, too: infinity, NaN, and the case where UInt.max > T.greatestFiniteMagnitude (in which case you get a single word that is just UInt(float)
.)
@@ -143,6 +143,18 @@ HEADER_SHIM float libm_log10f(float x) { | |||
return __builtin_log10f(x); | |||
} | |||
|
|||
HEADER_SHIM float libm_ceilf(float x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted at the use sites, you don't need any of these.
Ok, new commit to address the floating-point feedback. There doesn't seem to be any easy way to test the one with the |
@rothomp3 Friendly reminder that you can "bulk" manage "access control" using extensions. Extension also give more clear groupings (especially when used with I recommend declaring So: public extension BigInt {
func foo() {}
func bar() {}
} Which you can at any point change to extension BigInt {
public func foo() {}
public func bar() {}
} |
@rothomp3 I have not yet looked through your PR in detail, I'm sure it is great! But I just wanted to give you two references - which you surely know about, but if not they might be useful: |
So both of those use additional storage for the sign, and all the extra math required for that, instead of using 2's complement to store the BigInts the same way the language/CPU already store signed integers of 64 bits or less. I believe my way more effectively leverages the hardware. |
Do we have benchmarks that demonstrate this? (The BigInt implementations I’ve seen use sign-magnitude, and I don’t think it’s principally for ease of implementation.) |
It makes (approximately) no difference. This is pretty easy to see even without benchmarks. First off, note that converting from one to the other is O(n), and every arithmetic and bitwise operation defined on FixedWidthInteger is at least O(n), so there can be no difference in asymptotic performance--you could always just convert, do the operation, and convert back. So we're only interested in the the constants.
This approach is fine, and most closely matches the semantics of @rothomp3, sorry, I've been busy this week, but I'll give this a more thorough review next week and once we're to a good place I'll set up a branch to merge it onto for further development. |
We deliberately don't do this in the standard library, and I'm going to follow that style with Swift Numerics to facilitate moving code into the standard library if evolution wants some of these features in the future. I know that people have strong opinions about it, but ultimately it doesn't matter very much. Please keep explicit access control for now for simplicity. |
I was looking at the code for the
I think the initializer should be modified to something along the lines of the following: public init?(_ description: String) {
guard let firstCharacter = description.first else { return nil }
var description = description
var isNegative = false
if firstCharacter == "-" {
guard description.count > 2 else { return nil }
isNegative = true
description.removeFirst()
} else if firstCharacter == "+" {
guard description.count > 2 else { return nil }
description.removeFirst()
}
let validDigits = Set("0123456789")
guard description.allSatisfy({ validDigits.contains($0) }) else { return nil }
var result: BigInt = 0
for (i, char) in description.drop(while: { $0 == "0" }).reversed().enumerated() {
result += BigInt(char.wholeNumberValue!) * pow(10, BigInt(i))
}
if isNegative {
result = -result
}
words = result.words
} |
Eliminate unnecessary math function calls, and also handle floating point edge cases better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Not a full review)
Package.swift
Outdated
|
||
.testTarget(name: "ComplexTests", dependencies: ["Complex", "NumericsShims"]), | ||
.testTarget(name: "RealTests", dependencies: ["Real"]), | ||
.testTarget(name: "BigIntTests", dependencies: ["BigInt"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing products, targets, dependencies are in alphabetical order.
Sources/BigInt/BigInt.swift
Outdated
// BigInt.swift | ||
// swift-numerics | ||
// | ||
// Created by Robert Thompson on 2/4/19. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing swift files in Sources and Tests:
- begin with a standard copyright/license comment.
- use an indent width of two spaces.
Sources/BigInt/BigInt.swift
Outdated
// Created by Robert Thompson on 2/4/19. | ||
// | ||
|
||
public struct BigInt: BinaryInteger, SignedNumeric, SignedInteger, CustomStringConvertible, LosslessStringConvertible, Hashable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this list can be reduced (if needed):
- SignedInteger inherits from BinaryInteger and SignedNumeric.
- BinaryInteger inherits from CustomStringConvertible and Hashable.
- LosslessStringConvertible inherits from CustomStringConvertible.
public struct BigInt: BinaryInteger, SignedNumeric, SignedInteger, CustomStringConvertible, LosslessStringConvertible, Hashable { | |
public struct BigInt: SignedInteger, LosslessStringConvertible { |
Sources/BigInt/BigInt.swift
Outdated
|
||
@inlinable | ||
public func signum() -> BigInt { | ||
return _isNegative ? -1 : 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this return zero when self == 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
As per benrimmington's excellent suggestion Co-Authored-By: Ben Rimmington <[email protected]>
Includes reformatting to use 2-space indentation and the correct file headers, among other simplifications.
as per further suggestions from @Wildchild9, adding in the removing leading zeros part I missed.
Let me just say I think it's hilarious how much scrutiny the |
Hi @rothomp3, I created a "biginteger" branch for this work, since there's more development to be done than fits neatly in a single PR. Can you update this PR to point to that branch (I can do this for you if you need help, but it will need to wait a couple days). |
Retargeted PR to |
This comment has been minimized.
This comment has been minimized.
UInt64, Int64, etc have more than one word on 32-bit architectures
Sources/BigInt/BigInt.swift
Outdated
|
||
public var bitWidth: Int { words.count * UInt.bitWidth } | ||
|
||
public var trailingZeroBitCount: Int { words.first?.trailingZeroBitCount ?? 0 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn’t that be something like
var trailingZeroBitCount: Int {
var totalZeros = 0
for word in words {
let zeros = word.trailingZeroBitCount
totalZeros += zeros
if zeros < UInt.bitWidth {
break
}
}
return totalZeros
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, imagine that the entire array of Uint
s is one giant integer. Only the trailing zeros in the least significant word are actually trailing, the rest are in the middle!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the least significant word is all zeroes (e.g. UInt.max + 1
), you'll need to keep going up the chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doh! You're right! I feel suddenly stupid, I'll fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it and added some quick tests for it too.
Also add a couple tests for it
Sources/BigInt/BigInt.swift
Outdated
public var trailingZeroBitCount: Int { words.first?.trailingZeroBitCount ?? 0 } | ||
public var trailingZeroBitCount: Int { | ||
var totalZeros = 0 | ||
for word in words { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don’t need to invoke trailingZeroBitCount
on every word and then compare to bitWidth
. That’s just asking if word == 0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, I think. I'll take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, updated to just explicitly check for word == 0
, as I agree that's likely to be slightly faster.
Sources/BigInt/BigInt.swift
Outdated
let lhsWord = lhs.words[0] | ||
let rhsWord = rhs.words[0] | ||
|
||
let (result, isOverflow) = lhsWord.addingReportingOverflow(rhsWord) | ||
|
||
if !isOverflow && result < Int.max { | ||
lhs.words[0] = result | ||
return | ||
} | ||
let knownNegativeResult = lhsWord > Int.max && rhsWord > Int.max | ||
|
||
if lhsWord > Int.max || rhsWord > Int.max, !knownNegativeResult { | ||
// positive + negative is always smaller, so overflow is a red herring | ||
lhs.words[0] = result | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you evaluated using pattern matches to clarify some of this code and make it a bit easier to spot any edge cases. For instance, here I believe you could do:
if lhs.words.count == 1, rhs.words.count == 1 {
let lhsWord = lhs.words[0]
let rhsWord = rhs.words[0]
let lhsNegative = lhsWord > Int.max
let rhsNegative = rhsWord > Int.max
let sum = lhsWord &+ rhsWord
let sumNegative = sum > Int.max
switch (lhsNegative, rhsNegative, sumNegative) {
case (true, true, false);
lhs.words = [sum, UInt.max]
return
case (false, false, true);
lhs.words = [sum, 0]
return
default:
lhs.words[0] = sum
return
}
}
We definitely could do something like that, I’d need to try this code and make sure it actually catches all the cases correctly (looks fine at first glance).
… On Mar 28, 2020, at 6:18 AM, David Waite ***@***.***> wrote:
@dwaite commented on this pull request.
In Sources/BigInt/BigInt.swift <#84 (comment)>:
> + let lhsWord = lhs.words[0]
+ let rhsWord = rhs.words[0]
+
+ let (result, isOverflow) = lhsWord.addingReportingOverflow(rhsWord)
+
+ if !isOverflow && result < Int.max {
+ lhs.words[0] = result
+ return
+ }
+ let knownNegativeResult = lhsWord > Int.max && rhsWord > Int.max
+
+ if lhsWord > Int.max || rhsWord > Int.max, !knownNegativeResult {
+ // positive + negative is always smaller, so overflow is a red herring
+ lhs.words[0] = result
+ return
Have you evaluated using pattern matches to clarify some of this code and make it a bit easier to spot any edge cases. For instance, here I believe you could do:
if lhs.words.count == 1, rhs.words.count == 1 {
let lhsWord = lhs.words[0]
let rhsWord = rhs.words[0]
let lhsNegative = lhsWord > Int.max
let rhsNegative = rhsWord > Int.max
let sum = lhsWord &+ rhsWord
let sumNegative = sum > Int.max
switch (lhsNegative, rhsNegative, sumNegative) {
case (true, true, false);
lhs.words = [sum, UInt.max]
return
case (false, false, true);
lhs.words = [sum, 0]
return
default:
lhs.words[0] = sum
return
}
}
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#84 (review)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AA6NZ2R5V7N54VYEIXZV32LRJXFGJANCNFSM4J37T7WA>.
|
See also: apple#97
Hey @rothomp3, are you still working on this issue? I've been exploring a design where we keep the lowest word inline. It seems promising and I'd like to work on it more, but don't want to step on any ongoing plans you have. |
Hi all, I've shared an alternative implementation at #120 which starts from a different approach. |
Sorry I missed this message at the time, I am happy to keep working on it, but hadn’t heard anything in quite a while from the people responsible for merging it or telling me why they can’t ;) Anyway, I actually explored this option myself long before I made this PR, and in my testing it was _dramatically_ worse. It’s possible I’m doing it wrong though!
Robert Thompson
Platform Software Engineer
WillowTree
https://willowtreeapps.com
… On Jun 2, 2020, at 2:12 PM, Xiaodi Wu ***@***.***> wrote:
Hey @rothomp3 <https://github.com/rothomp3>, are you still working on this issue? I've been exploring a design where we keep the lowest word inline. It seems promising and I'd like to work on it more, but don't want to step on any ongoing plans you have.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#84 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AA6NZ2WJXSFS4PMGISEM2SDRUU6HPANCNFSM4J37T7WA>.
|
Yup. Keeping the lowest word inline sure does make things worse! Check out some of the other ideas I've been exploring over at the other PR though :) |
I don't even have a good explanation for why, other than perhaps "the people who wrote |
[BigInt] Rename module
The short version is pretty much "special cases make things slower". For some input distributions, it should be a win to have a special small representation (like we do for String), but that's definitely something that we would take about adding down the road--we want to get the general implementation right first. |
What's the status on these changes? |
Well, I’m still here, I’m not sure what the next step should be, though! |
The main thing that's missing at this point is that the tests need to be bulked up pretty enormously, and we should be looking at better benchmarks as well. That's something that's OK to do on a branch, though, so I think that it's pretty reasonable to give this a final review and merge it in the near future. I will probably do a fairly invasive restructuring of the code once it's merged to split it out into somewhat finer-grained files to bring it more in line with the rest of the project, FWIW, which will cause you some hassle if you have changes on other branches that you're waiting to merge after this. |
Nope, go for it! |
As far as the tests go: there is Violet - Python VM written in Swift which has its own implementation of
Which means that technically we have 3 different implementations of
Also, in most of the cases we went with property based testing with means that we test millions of inputs to check if the general rule holds (for example: [*] There is an 64-bit assumption for the whole project. Violet is more-or-less a closed system where we write with the assumption that we are the only consumers. This allows us to concentrate on things that we need, instead of designing for every use case (for |
What are we doing about tests? I can create a pull request(s) with tests from Violet - Python VM written in Swift. In each pull request we would decide if we want them or not. The code has already been written, so this cost was already paid. If we decide not to merge them then no biggie. I think that it is worth to run them, to check if something fails. |
Btw. It may be a good idea to write an 'Int semantics' article for the official Swift blog. This would include common pitfalls:
|
Fixes #5
This is the step 2 "Sketch" PR. It's actually very complete, but obviously perhaps not ideal in style and documentation (since there is none of the latter). Probably needs more tests too, though there are some.
The basic architecture is based on Knuth's algorithms for this purpose. An integer is broken up into "digits" where each digit is in base 64 (or base 32 on 32-bit systems, which are supported). This maps nicely onto the
BinaryInteger
protocol since we can simply use an array ofUint
for the words. We're also using 2's complement internally, which is super convenient because then for any pair ofBigInt
s that are each only one word long, we can simply delegate to the regular operators for a minimal overhead.