Bugfix/closeable seq finalize #55
GC on a cell of the closeable-seq can trigger the finalize method to close before the underlying channel/stream is actually done. Also adds closing logic to the File->seq of BB conversion, to ensure they're automatically closed when the last byte is read. Unit tests added, too.
Add some unit tests for untested File->*Channel conversions
@arnaudgeiser, any interest in looking at this?
I'm fine with both changes in principle:
- Using java.nio directly
- Removing the finalize method to prevent closing the resources when GC occurs
I haven't taken the time to play with the code yet.
@@ -576,35 +577,51 @@
;; file => readable-channel
(def-conversion ^{:cost 0} [File ReadableByteChannel]
  [file]
  (.getChannel (FileInputStream. file)))
  (-> file
The good news is that the previous behavior was already fine [1].
However, the .getChannel call seems to have been there for legacy reasons, to be able to leverage java.nio from java.io. This is a welcome change!
Ahh, ok. The javadocs said they were synced in terms of things like reading position, but didn't actually say anything specifically about closing, so I didn't want to assume.
src/clj_commons/byte_streams.clj
Outdated
(let [^RandomAccessFile raf (RandomAccessFile. file (if writable? "rw" "r"))
      ^FileChannel fc (.getChannel raf)
Shall we try to use FileChannel/open here instead of relying on java.io (as above)?
Yeah, we could. I don't think the RAF is used for anything but getting a channel. Working with Java object arrays, like the one needed to pass the options, is annoying, though.
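For reference, a minimal sketch of what the FileChannel/open variant could look like. The helper name open-file-channel and the choice of options are assumptions for illustration, not the code in this PR; CREATE is included only to mimic how RandomAccessFile in "rw" mode creates a missing file.

```clojure
(import '(java.io File)
        '(java.nio.channels FileChannel)
        '(java.nio.file OpenOption StandardOpenOption))

(defn open-file-channel
  "Hypothetical helper: opens a FileChannel on `file` without going through java.io."
  ^FileChannel [^File file writable?]
  (let [opts (if writable?
               [StandardOpenOption/READ StandardOpenOption/WRITE StandardOpenOption/CREATE]
               [StandardOpenOption/READ])]
    ;; FileChannel/open takes Java varargs, so the options must be passed
    ;; as an explicitly typed OpenOption array.
    (FileChannel/open (.toPath file) (into-array OpenOption opts))))
```

The into-array call is the awkward part mentioned above: Clojure does not expand varargs automatically, so the array has to be built by hand.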
buf-seq (fn buf-seq [offset]
          (when-not (<= (.size fc) offset)
            (let [remaining (- (.size fc) offset)]
              (when (and (.isOpen fc)
I guess adding an isOpen check on the FileChannel causes no harm here.
It's required, in case the user manually closed the returned closeable-seq. If they close it, that will also .close the RAF and channel, but the seq is still usable, and can be read from up until the point of closing. Without the check, attempts to read further would result in Exceptions; this way the seq just ends.
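To illustrate the behavior described above, here is a simplified, self-contained sketch. It is not the PR's implementation: file->buffers and read-chunk are hypothetical names, and the real conversion returns a closeable-seq rather than a [channel seq] pair. It shows a lazy seq of buffers that ends quietly if the channel was closed by the caller, and closes the channel itself once the file is exhausted.

```clojure
(import '(java.io File)
        '(java.nio ByteBuffer)
        '(java.nio.channels FileChannel)
        '(java.nio.file OpenOption StandardOpenOption))

(defn read-chunk
  "Reads up to `n` bytes from `fc` starting at `offset` into a fresh buffer."
  ^ByteBuffer [^FileChannel fc offset n]
  (let [^ByteBuffer buf (ByteBuffer/allocate (int n))]
    (loop [pos (long offset)]
      (when (.hasRemaining buf)
        (let [r (.read fc buf pos)]
          (when (pos? r)
            (recur (+ pos r))))))
    (.flip buf)
    buf))

(defn file->buffers
  "Hypothetical stand-in for the File -> (seq-of ByteBuffer) conversion.
  Returns [channel lazy-seq-of-buffers]."
  [^File file chunk-size]
  (let [^FileChannel fc (FileChannel/open
                          (.toPath file)
                          (into-array OpenOption [StandardOpenOption/READ]))
        buf-seq (fn buf-seq [offset]
                  (lazy-seq
                    ;; isOpen is checked first: .size throws on a closed channel
                    (if (and (.isOpen fc) (< offset (.size fc)))
                      (let [n (min chunk-size (- (.size fc) offset))]
                        (cons (read-chunk fc offset n)
                              (buf-seq (+ offset n))))
                      ;; exhausted, or closed by the caller: close (a no-op if
                      ;; already closed) and end the seq by returning nil
                      (.close fc))))]
    [fc (buf-seq 0)]))
```

Because each cell is produced lazily, closing the channel after reading the first buffer makes the next cell evaluate to nil instead of throwing.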
(finalize [_]
  (close-fn))
We'll have to move away from finalizers anyway, better today than tomorrow. :)
clojure.lang.Seqable
(seq [this] this)

clojure.lang.ISeq
Was moving this line intentional?
Kind of. It throws off kondo when a protocol/interface isn't followed by the methods it declares.
Also, the equiv method there isn't part of any of the protocols/interfaces used. It's part of the IPersistentCollection interface. reify is more tolerant than I thought; it happily added the equiv method to the output class, despite not being required. I don't see anything calling it, but I figured I'd leave it alone for now.
(when close-after-reading?
  (close-fn))
It's been updated to automatically close the file upon exhaustion (Maybe this should be an option? We wouldn't want to stop tailing a growing log file just because we temporarily caught up to the end...)
To me, as we open the resource, we have the responsibility to close it.
I agree, but that comment's not about responsibility, but default behavior. For a static file, we should close upon reading the last byte, since with the File -> (seq-of ByteBuffer) conversion, there's no outside resource being supplied that the user knows they have to close. Otherwise, they have to close the seq when done with it (which isn't so bad, if it came down to it, but it's not intuitive.)
The catch is, what if you're reading a file that's growing, like tailing a log file? You effectively want an infinite seq. As it is now, if the lazy-seq ever catches up to the current end of the file, it'll stop and close, even if the file is still growing.
The more I think about it, the more I think there should be an option to turn off auto-closing if you don't want it.
Hrrm. Unfortunately, a lot more of the code in that conversion assumes the file is static. In particular, the (> (.size fc) offset) test will end the seq, even if we don't .close the channel. If we catch up to the end of the file, it'll return nil and the lazy-seq stops, when we would prefer it block if the file is growing.
That test also means shrinking or truncating the file will make it fail to .close, though to be honest, a seq across a shrinking file doesn't make logical sense anyway.
I think I'll just add a comment for now. This will require reworking to add blocking/async behavior if we wanted it to handle growing files.
LGTM!
Fixes #37 by removing the finalize method from the closeable-seq that closes input streams/channels prematurely.
Since the existing behavior was already broken, and many other conversions already require you to manage/close resources yourself anyway, I'm comfortable making that the requirement.
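As a rough illustration of what managing the resource yourself means for a caller, the sketch below uses plain Java interop rather than this library's API; chunk-seq and total-bytes are hypothetical names. The point is that a lazy seq backed by a stream has to be fully consumed inside the with-open scope, since nothing will close (or keep open) the stream on the caller's behalf.

```clojure
(import '(java.io ByteArrayInputStream InputStream)
        '(java.util Arrays))

(defn chunk-seq
  "Hypothetical helper: lazy seq of byte arrays read from `in`."
  [^InputStream in chunk-size]
  (lazy-seq
    (let [^bytes buf (byte-array chunk-size)
          n (.read in buf)]
      ;; .read returns -1 at end of stream, which ends the seq
      (when (pos? n)
        (cons (Arrays/copyOf buf n) (chunk-seq in chunk-size))))))

(defn total-bytes
  "Consumes the whole stream inside with-open, then closes it."
  [^InputStream stream]
  (with-open [in stream]
    ;; reduce forces the lazy seq before with-open closes the stream
    (reduce + (map count (chunk-seq in 4)))))
```

For example, `(total-bytes (ByteArrayInputStream. (byte-array 10)))` reads three chunks (4, 4, and 2 bytes) and closes the stream afterwards.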
While doing this, I looked over the other conversions, to see if any of them leaked file descriptors. The File -> (seq-of ByteBuffer) conversion was also subject to this problem, and did not close its own resources. It's been updated to automatically close the file upon exhaustion (Maybe this should be an option? We wouldn't want to stop tailing a growing log file just because we temporarily caught up to the end...)
I also updated the File -> WritableByteChannel and File -> ReadableByteChannel conversions. It wasn't 100% clear from the javadocs whether the linked streams would close when their channels did, so to avoid that, I bypassed the File*Streams.
Finally, a bunch of tests were added to prevent regressions, including one of a large file.