Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DLFA-230: add pkg/index/IndexGitCommit() #79

Open
wants to merge 49 commits into
base: develop
Choose a base branch
from

Conversation

jgpawletko
Copy link
Member

@jgpawletko jgpawletko commented Mar 27, 2025

Overview

This PR includes two major changes:

  • the addition of pkg/index/IndexGitCommit() functionality
  • a change to the pkg/git/ListEADFilesForCommit() from:
  from:
  func ListEADFilesForCommit(string, string) (map[string]IndexerOperation, error)

  to:
  func ListEADFilesForCommit(string, string) ([]IndexerStep, error)

Other changes include:

  • pkg/index
    • the addition of a missing Solr Commit() call in DeleteEADFileDataFromIndex()
    • enhancements to the SolrClientMock functionality, including event-based assertions
    • increased test coverage
    • the implementation of the dot-git strategy for the git-repo fixture (originally implemented by David Arjanik in the debug package)
  • pkg/git
    • the addition of EADPathToEADID(string) string

Add EADPathToEADID() function
Insert "_" in certain test cases to make it easier to understand the
  purpose of the tests
Rename ErrorEvent type member "CallerName" to "FuncName"
Add    Event      type
Add    Events []Event to SolrClientMock
Add/update types and functions to support a more generalized
  "event"-based testing strategy.  The event-based test strategy is
  required to test IndexGitCommit() functionality.
Continue migrating pkg/index tests to use event-based methodology.
Update pkg/index/testutils sc.InitMockForDelete() to accept "sut"
  argument.
Continue migrating pkg/index to event-based testing
Extract goldenHash initialization into its own function
Add event-related functions to set expectations in preparation for
  IndexGitCommit() testing
Add `func EventsToString(events []Event) string`
Add `func (sc *SolrClientMock) GoldenFileHashesToString() string`
Add `func (sc *SolrClientMock)
  UpdateMockForDeleteEADFileDataFromIndex(eadid string) error`

Perform housekeeping:
  * standardize struct member initialization order in `add*Event` functions
  * alphabetize functions
Add TestIndexEADFile_Success_Events()

Add initial TestIndexGitCommit_Success_Events() skeleton to test that
  the sc.ExpectedEvents and sc.ActualEvents would work over multiple
  file-level IndexEADFile() and DeleteEADFileDataFromIndex() operations.

TestIndexEADFile_Success_Events(t *testing.T) {
+
+       repositoryCode := "fales"
+       eadid := "mss_460"
+       testEAD := filepath.Join(repositoryCode, eadid)
+       var eadPath = eadtestutils.EadFixturePath(testEAD)
+
+       sc := testutils.GetSolrClientMock()
+       sc.Reset()
+       err := sc.UpdateMockForIndexEADFile(testEAD, eadid)
+       if err != nil {
+               t.Errorf("Error updating the SolrClientMock: %s", err)
+               t.FailNow()
+       }
+
+       // Set the Solr client
+       SetSolrClient(sc)
+
+       // Index the EAD file
+       err = IndexEADFile(eadPath)
+       if err != nil {
+               t.Errorf("Error indexing EAD file: %s", err)
+       }
+
+       err = sc.CheckAssertionsViaEvents()
+       if err != nil {
+               t.Errorf("Assertions failed: %s", err)
+       }
+
+       if !sc.IsComplete() {
+               t.Errorf("not all files were added to the Solr index. Remaining values: %v", sc.GoldenFileHashes)
+       }
+}
+
 func TestIndexGitCommit_SolrClientNotSet(t *testing.T) {

        sut := "IndexGitCommit"
@@ -547,6 +581,83 @@ func TestIndexGitCommit_SolrClientMissingOriginURL(t *testing.T) {
        testutils.AssertErrorMessageContainsString(t, sut, err, expectedErrStringFragment)
 }

+func TestIndexGitCommit_Success_Events(
Snapshot of gen-repo.bash script that sets up complex git repo commits
  for IndexGitCommit() testing.
Update pkg/index/testdata/fixtures/git-repo with more complicated git
 repo as specified in https://jira.nyu.edu/browse/DLFA-230 ticket
Add git-repo test fixture init, setup, and teardown functions to
  pkg/index/index_test.go
Add TestIndexGitCommit_AddOne()
Add TestIndexGitCommit_DeleteOne()
Change the function signature of git.ListEADFilesForCommit() from:
    ListEADFilesForCommit(string, string) (map[string]IndexerOperation, error)
  to
    ListEADFilesForCommit(string, string) ([]IndexerStep, error)

  The modified function returns an ordered slice of IndexerSteps
  instead of a map because the index.IndexGitCommit() needs to process
  file operations in the order they were added to the Git commit.  The
  map could transpose the order of operations, and the order of
  operations is significant when the operations affect the same EAD.
  i.e.,
  (git rm ead.xml --> git add ead.xml) != (git add ead.xml --> git rm ead.xml)

In order to maintain atomicity of commits, multiple files in multiple
  packages were modified:

  pkg/debug
    update DumpSolrIndexerHTTPRequestsForGitCommit() to process the
      []IndexerStep value

  pkg/git
    change the ListEADFilesForCommit() signature as explained above
    add the IndexerStep type
    move the EADPathToEADID() function to be in alphabetical order
    update the TestListEADFilesForCommit() scenarios to use
      []IndexerStep

  pkg/index
    change IndexGitCommit() to process the []IndexerStep value
    remove outdated TestIndexGitCommit_Success_Events()
    add TestIndexGitCommit_AddAll()

  pkg/index/testutils
    tweak EventsToString() formatting and output
update gen-repo.bash to interleave the "add 3, delete 2" commit, i.e.,
  git add, git rm, git add, git rm, git add
Update comments with latest hashes for the git-repo test fixture

Add hash string variables at the top of the index_test.go file for
  each IndexGitCommit() scenario so that updating the hash values is
  more straightforward.

Replace hard-coded hash values in tests with hash string variables

Add TestIndexGitCommit_AddThreeDeleteTwo().
  NOTE:
  This test has a workaround because the order of add/rm operations
  returned by Go Git DIFFERS from the order in which the operations
  were performed on the repo.  It appears that the Go Git commit
  listing groups all of the "add" operations together followed by all
  of the "rm" operations.  (The relative order of the add and rm
  operations are correct within each group,
  i.e.,
    "add A,  rm B, add C, rm D, add E"
  turns into:
    "add A, add B, add C, rm B, rm  D"

Add TestIndexGitCommit_AddTwo()
Add scenario for removing, modifying, and adding the same EAD before a commit
Reorder commit scenarios so that EADs are NOT in alphabetical order
Update commit hashes to reflect latest git-repo values

Add TestIndexGitCommit_DeleteModifyAdd() to simulate pre-commit git operations

Modify mock initialization order to match Git behavior:
  * Go Git returns the changes in lexicographic order by file path,
    therefore indexing changes for a commit will also be processed
    in lexicographic order by file path
dd gitdiff alias for import
  github.com/go-git/go-git/v5/plumbing/format/diff
Restore return values of git.ListEADFilesForCommit() to:
  (map[string]IndexerOperation, error)

Update affected packages/functions:
  * pkg/debug DumpSolrIndexerHTTPRequestsForGitCommit()
  * pkg/git   ListEADFilesForCommit()
              TestListEADFilesForCommit()
  * pkg/index IndexGitCommit()
Rewrite TestCheckout() to perform precise checks of the repository
  contents after checking out each commit hash. The previous
  implementation was inadequate.
Add error capture to DeleteEADFileDataFromIndex() calls in
  * TestDeleteEADFileDataFromIndex_RollbackOnBadDelete()
  * TestDeleteEADFileDataFromIndex_ErrorOnRollback()

@@ -100,7 +100,10 @@ func TestDeleteEADFileDataFromIndex_RollbackOnBadDelete(t *testing.T) {
        SetSolrClient(sc)

        // Delete the data for the EADID
-       DeleteEADFileDataFromIndex(eadid)
+       err = DeleteEADFileDataFromIndex(eadid)
+       if err == nil {
+               t.Errorf("Expected an error, but got nil")
+       }

        // check that all expectations were met
        err = sc.CheckAssertionsViaEvents()
@@ -202,7 +205,10 @@ func TestDeleteEADFileDataFromIndex_ErrorOnRollback(t *testing.T) {
        SetSolrClient(sc)

        // Delete the data for the EADID
-       DeleteEADFileDataFromIndex(eadid)
+       err = DeleteEADFileDataFromIndex(eadid)
+       if err == nil {
+               t.Errorf("Expected an error, but got nil")
+       }

capture errors in pkg/index test functions
Rename got.Checkout() to git.CheckoutMergeReset()
Add comment warning developers of the git.CheckoutMergeReset()
  behavior with respect to files in the repo that are not
  under version control, (they will be deleted)
Update pkg/git git.go, git_test.go with new function name
Update pkg/index IndexGitCommit()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant