Skip to content

Bugfix/closeable seq finalize #55

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 21, 2022
Merged

Conversation

KingMob
Copy link
Collaborator

@KingMob KingMob commented May 16, 2022

Fixes #37 by removing the finalize method from the closeable-seq that closes input streams/channels prematurely.

Since the existing behavior was already broken, and many other conversions already require you manage/close resources yourself anyway, I'm comfortable making that the requirement.

While doing this, I looked over the other conversions, to see if any of them leaked file descriptors. The File -> (seq-of ByteBuffer) conversion was also subject to this problem, and did not close its own resources. It's been updated to automatically close the file upon exhaustion (Maybe this should be an option? We wouldn't want to stop tailing a growing log file just because we temporarily caught up to the end...)

I also updated the File -> WritableByteChannel and File -> ReadableByteChannel conversions. It wasn't 100% clear from the javadocs whether the linked streams would close when their channels did, so to avoid that, I bypassed the File*Streams.

Finally, a bunch of tests were added to prevent regressions, including one of a large file.

KingMob added 2 commits May 16, 2022 23:08
GC on a cell of the closeable-seq can trigger the finalize method to close
before the underlying channel/stream is actually done.

Also adds closing logic to the File->seq of BB conversion, to ensure they're
automatically closed when the last byte is read.

Unit tests added, too.
Add some unit tests for untested File->*Channel conversions
@KingMob
Copy link
Collaborator Author

KingMob commented May 16, 2022

@arnaudgeiser, any interest in looking at this?

Copy link
Collaborator

@arnaudgeiser arnaudgeiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with both changes on the principle:

  • Using java.nio directly
  • Removing the finalize method to prevent closing the resources when GC occurs

I haven't take the time to play with the code yet.

@@ -576,35 +577,51 @@
;; file => readable-channel
(def-conversion ^{:cost 0} [File ReadableByteChannel]
[file]
(.getChannel (FileInputStream. file)))
(-> file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The good news was the previous behavior was fine already [1].
However, the .getChannel seemed to be present for legacy reason to be able to leverage java.nio from java.io. This is a welcome change!

[1] : https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/ch/FileChannelImpl.java#L201-L208

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, ok. The javadocs said they were synced in terms of things like reading position, but didn't actually say anything specifically about closing, so I didn't want to assume.

Comment on lines 599 to 600
(let [^RandomAccessFile raf (RandomAccessFile. file (if writable? "rw" "r"))
^FileChannel fc (.getChannel raf)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we try to use FileChannel/open here instead of relying java.io instead? (as above)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could, I don't think the RAF is used for anything but getting a channel. Working with Java object arrays like providing the options is annoying, though.

buf-seq (fn buf-seq [offset]
(when-not (<= (.size fc) offset)
(let [remaining (- (.size fc) offset)]
(when (and (.isOpen fc)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess adding a isOpen on the FileChannel causes no harm here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's required, in case the user manually closed the returned closeable-seq. If they close it, that will also .close the RAF and channel, but the seq is still usable, and can be read from up until the point of closing. Attempts to read further would result in Exceptions without checking, this way the seq just ends.

Comment on lines -205 to -206
(finalize [_]
(close-fn))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have to move away from finalizers anyway, better today than tomorrow. :)

https://openjdk.java.net/jeps/421

clojure.lang.Seqable
(seq [this] this)

clojure.lang.ISeq
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it expected to have moved this line?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of, it throws off kondo when a protocol/interface isn't followed by the methods it declares.

Also, the equiv method there isn't part of any of protocol/interfaces used. It's part of the IPersistentCollection interface. reify is more tolerant than I thought; it happily added the equiv method to the output class, despite not being required. I don't see anything calling it, but I figured I'd leave it alone for now.

Comment on lines +617 to +618
(when close-after-reading?
(close-fn))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been updated to automatically close the file upon exhaustion (Maybe this should be an option? We wouldn't want to stop tailing a growing log file just because we temporarily caught up to the end...)

To me, as we open the resource, we have the responsibility to close it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but that comment's not about responsibility, but default behavior. For a static file, we should close upon reading the last byte, since with the File -> (seq-of ByteBuffer) conversion, there's no outside resource being supplied that the user knows they have to close. Otherwise, they have to close the seq when done with it (which isn't so bad, if it came down to it, but it's not intuitive.)

The catch is, what if you're reading a file that's growing, like tailing a log file? You effectively want an infinite seq. As it is now, if the lazy-seq ever catches up to the current end of the file, it'll stop and close, even if the file is still growing.

The more I think about it, the more I think there should be an option to turn off auto-closing if you don't want it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrrm. Unfortunately, a lot more of the code in that conversion assumes the file is static. In particular, (> (.size fc) offset) will end the seq, even if we don't .close the channel. If we catch up to the end of the file, it'll return nil and the lazy-seq stops, when we would prefer it block if the file is growing.

That test also means shrinking or truncating the file will make it fail to .close, though to be honest, a seq across a shrinking file doesn't make logical sense anyway.

I think I'll just add a comment for now. This will require reworking to add blocking/async behavior if we wanted it to handle growing files.

KingMob added 3 commits May 17, 2022 13:33
Add usage note on File conversion
Also appease Eastwood
Clean up ns and references
@KingMob KingMob marked this pull request as ready for review May 17, 2022 06:34
@KingMob KingMob requested a review from slipset as a code owner May 17, 2022 06:34
@KingMob KingMob removed the request for review from slipset May 17, 2022 06:35
Copy link
Collaborator

@arnaudgeiser arnaudgeiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@KingMob KingMob merged commit 5d4b945 into master May 21, 2022
@KingMob KingMob deleted the bugfix/closeable-seq-finalize branch May 21, 2022 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

closeable-seq may end prematurely after GC?
2 participants