-
Notifications
You must be signed in to change notification settings - Fork 9
Rework Core ArcTable towards better memory efficiency and performance #537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/Core/OntologyAnnotation.fs
Outdated
@@ -4,7 +4,7 @@ open ARCtrl.Helper | |||
open Fable.Core | |||
|
|||
[<AttachMembers>] | |||
type OntologyAnnotation(?name,?tsr,?tan, ?comments) = | |||
type OntologyAnnotation(?name,?tsr,?tan, ?tanInfo, ?comments) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uhh thats a new one! What does it mean?
@@ -19,17 +19,30 @@ type OntologyAnnotation(?name,?tsr,?tan, ?comments) = | |||
| tan -> tan | |||
let mutable _comments : ResizeArray<Comment> = defaultArg comments <| ResizeArray() | |||
|
|||
let mutable _tanInfo : {| IDSpace : string; LocalID : string |} option = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohh i see what it is. But isn't this bad practise? Say we add a tsr and tan then tanInfo gets auto completed?
Why do we need this inside the constructor instead of a GET only member?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not my preferred solution either, but the TanInfo is relatively expensive to compute but quite important as a basis for the HashCode.
By allowing this info to be set from the outside we can omit multiple computations of this value in some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then there must be clear indication what takes priority over which and how it behaves. say i use the following:
OntologyAnnotation(name = "test",tsr = "TEST",tan = "00001", tanInfo = {|IDSpace = "TST"; LocalID = "22222"|})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By allowing this info to be set from the outside we can omit multiple computations of this value in some cases.
And can you give me an example for that, because currently i do not really see it
src/Core/Table/ArcTable.fs
Outdated
/// column * row index | ||
with get() = _headers | ||
and set(newHeaders) = | ||
// SanityChecks.validate newHeaders values true |> ignore // TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this meant to stay "TODO"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, fixing
@@ -372,42 +391,49 @@ type ArcTable(name: string, headers: ResizeArray<CompositeHeader>, values: Syste | |||
|
|||
// - Column API - // | |||
// GetColumnAt? | |||
member this.GetColumn(columnIndex:int) : CompositeColumn = | |||
member this.GetColumn(columnIndex:int, ?fillDefault : bool) : CompositeColumn = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fillDefault behaves different to my expectations. Did you discuss this with someone else in the department and decided that we fail if a cell does not exist inside row/column count? I thought we only fail for trying to access outside these params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, with the fillDefault
flag every user can decide. I could add some comment to make it more clear. Or would you wish to reverse the default behavious, and have a failIfMissing
flag instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would completly remove this flag. Just return the filled columns, then check if the array contains empty cells if you want.
In my opinion it makes using the function more obscure.
Do you have a valid usecase for this?
And one more thing. F# 9 does increase f# support for null. This could allow us to fill missing cells with null, making its behavior closer to js and python while also increase computation time
src/Core/Table/ArcTable.fs
Outdated
| Some c -> c | ||
|] | ||
let cells = | ||
[| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be optimized by first checking if columnIndex points to a Constant
columnRef, then we can just fill with the same value
src/Core/Table/ArcTable.fs
Outdated
let index = defaultArg index this.RowCount | ||
let cells = | ||
let cells = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this behaves different then my expecatation. Did we not decide on a sparse approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will wait for your answers before continuing the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True that, this should just increase the rowCount of the "Values"
@@ -19,17 +19,30 @@ type OntologyAnnotation(?name,?tsr,?tan, ?comments) = | |||
| tan -> tan | |||
let mutable _comments : ResizeArray<Comment> = defaultArg comments <| ResizeArray() | |||
|
|||
let mutable _tanInfo : {| IDSpace : string; LocalID : string |} option = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then there must be clear indication what takes priority over which and how it behaves. say i use the following:
OntologyAnnotation(name = "test",tsr = "TEST",tan = "00001", tanInfo = {|IDSpace = "TST"; LocalID = "22222"|})
@@ -19,17 +19,30 @@ type OntologyAnnotation(?name,?tsr,?tan, ?comments) = | |||
| tan -> tan | |||
let mutable _comments : ResizeArray<Comment> = defaultArg comments <| ResizeArray() | |||
|
|||
let mutable _tanInfo : {| IDSpace : string; LocalID : string |} option = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By allowing this info to be set from the outside we can omit multiple computations of this value in some cases.
And can you give me an example for that, because currently i do not really see it
src/Core/Table/ArcTable.fs
Outdated
/// column * row index | ||
with get() = _headers | ||
and set(newHeaders) = | ||
SanityChecks.validateArcTableValues newHeaders _values true |> ignore // TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the //TODO
member this.RowCount | ||
with get() = _values.RowCount | ||
and set (newRowCount) = | ||
_values.RowCount <- newRowCount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we set a rowCount we must delete all cells below that rowcount?
Say we have a cell at row 20 and we set row <- 10, then we must delete cell at row 20?
src/Core/Table/ArcTable.fs
Outdated
and set (newRowCount) = | ||
_values.RowCount <- newRowCount | ||
|
||
static member getRowCount (table:ArcTable) = table.RowCount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
member this.ColumnCount
and static member columnCount (table:ArcTable)
, then why do we have member this.RowCount
and static member getRowCount (table:ArcTable)
| None -> CompositeCell.emptyTerm | ||
| Some (CompositeCell.Unitized _) -> CompositeCell.emptyUnitized | ||
| _ -> failwith "[extendBodyCells] This should never happen, IsTermColumn header must be paired with either term or unitized cell." | ||
let removeRowCells_withIndexChange (rowIndex:int) (values:ArcTableValues) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be good to have a function like this for multiple, this would increase performance for "RemoveRows" by quite a bit
src/Core/Table/ArcTableAux.fs
Outdated
/// Therefore we need to start with the last column to not overwrite any values we still need to shift | ||
let increaseColumnIndices() = | ||
/// Get last column index | ||
let lastColumnIndex = System.Math.Max(startColCount - 1, 0) // If there are no columns. We get negative last column index. In this case just return 0. | ||
// start with last column index and go down to `index` | ||
for columnIndex = lastColumnIndex downto index do | ||
for rowIndex in 0 .. startRowCount do | ||
moveCellTo(columnIndex,rowIndex,columnIndex+numberOfNewColumns,rowIndex) values | ||
moveColumnCellsTo (columnIndex) (columnIndex + numberOfNewColumns) values | ||
/// Then we can set the new column at `index` | ||
let setNewCells() = | ||
// Not sure if this is intended? If we for example `forceReplace` a single column table with `Input`and 5 rows with a new column of `Input` .. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this comment is outdated to the previous model
setCellAt (columnIndex,rowIndex,lastValue) values | ||
|
||
let addRow (index:int) (newCells:CompositeCell []) (headers: ResizeArray<CompositeHeader>) (values:Dictionary<int*int,CompositeCell>) = | ||
let fillMissingCells (headers: ResizeArray<CompositeHeader>) (values:ArcTableValues) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be deleted? Where do we use this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still kept the function, it will fill in default cells like before. Might be interesting in cases where we expect to make many random access calls and want to evaluate default values beforehand. Not used by any of our own functions.
src/Core/Table/ArcTables.fs
Outdated
|> Seq.toArray | ||
|> Array.collect ArcTable.SplitByProtocolREF | ||
|> Array.map (fun t -> | ||
|> ResizeArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rmv? ResizeArray.collect
should work on a seq and returns an resizearray?
|
||
|
||
let dictionary (keyDecoder : Decoder<'key>) (valueDecoder : Decoder<'value>) : Decoder<Dictionary<'key,'value>> = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have no idea what happens here, trust!
These changes were necessary for much better memory efficiency, allowing Swate to depict and edit much bigger tables.
In principle this was done in two parts:
Calling functions were adjusted accordingly