@ianwallen ianwallen commented Sep 23, 2025

Add support for including the metadata resource JSON object in the index.
Changed parts of the XML-to-JSON and JSON-to-XML conversion tools so that they support JSON arrays and other JSON nodes. Also fixed an issue where the indexer was not triggered after resources were uploaded.

Indexing resource information such as size and last-modified date makes it possible to search for recently modified resources and to quickly search or summarize the size of metadata content.

This is a sample of the JSON that would be added to the index:

                    "fileStore": [
                        {
                            "lastModification": "2025-09-23T15:41:21.592+00:00",
                            "size": 450041,
                            "url": "http://localhost:8080/geonetwork/srv/api/records/37aecae5-7783-4274-b595-df02aa003ac3/attachments/test.txt",
                            "visibility": "PUBLIC"
                        },
                        {
                            "lastModification": "2023-01-12T14:40:34.000+00:00",
                            "size": 152545,
                            "url": "http://localhost:8080/geonetwork/srv/api/records/37aecae5-7783-4274-b595-df02aa003ac3/attachments/ecdc_thumbnail_radar-composite.jpg",
                            "visibility": "PUBLIC"
                        },
                        {
                            "lastModification": "2025-09-23T15:41:08.212+00:00",
                            "size": 13754,
                            "url": "http://localhost:8080/geonetwork/srv/api/records/37aecae5-7783-4274-b595-df02aa003ac3/attachments/test.json",
                            "visibility": "PUBLIC"
                        }
                    ],
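
To illustrate the kind of summary these indexed fields make possible, here is a minimal Java sketch over the three sample entries above. The FileStoreEntry and FileStoreSummary classes are hypothetical and exist only for this illustration; in practice the same questions would presumably be answered by queries against the index rather than in application code.

```java
import java.time.OffsetDateTime;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical in-memory model of one "fileStore" entry (not a GeoNetwork class).
final class FileStoreEntry {
    final OffsetDateTime lastModification;
    final long size;
    final String url;

    FileStoreEntry(String lastModification, long size, String url) {
        // The sample timestamps are ISO-8601 with an explicit offset.
        this.lastModification = OffsetDateTime.parse(lastModification);
        this.size = size;
        this.url = url;
    }
}

// Hypothetical helper showing two summaries the indexed fields allow.
final class FileStoreSummary {
    private FileStoreSummary() {}

    // Total size in bytes of all attachments of a record.
    static long totalSize(List<FileStoreEntry> entries) {
        return entries.stream().mapToLong(e -> e.size).sum();
    }

    // URLs of attachments modified strictly after the cutoff.
    static List<String> modifiedSince(List<FileStoreEntry> entries, OffsetDateTime cutoff) {
        return entries.stream()
            .filter(e -> e.lastModification.isAfter(cutoff))
            .map(e -> e.url)
            .collect(Collectors.toList());
    }

    // The three entries from the sample JSON above.
    static List<FileStoreEntry> sample() {
        String base = "http://localhost:8080/geonetwork/srv/api/records/"
            + "37aecae5-7783-4274-b595-df02aa003ac3/attachments/";
        return List.of(
            new FileStoreEntry("2025-09-23T15:41:21.592+00:00", 450041, base + "test.txt"),
            new FileStoreEntry("2023-01-12T14:40:34.000+00:00", 152545,
                base + "ecdc_thumbnail_radar-composite.jpg"),
            new FileStoreEntry("2025-09-23T15:41:08.212+00:00", 13754, base + "test.json"));
    }
}
```

For the sample data, totalSize returns 616340 bytes, and modifiedSince with a cutoff of 2025-01-01 returns the URLs of test.txt and test.json.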

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chunks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

@ianwallen ianwallen force-pushed the support_for_metadata_filestore_properties_in_index branch from 81b1b65 to d584984 Compare September 23, 2025 17:51
@ianwallen ianwallen force-pushed the support_for_metadata_filestore_properties_in_index branch from d584984 to bf60288 Compare September 23, 2025 18:29
@ianwallen ianwallen force-pushed the support_for_metadata_filestore_properties_in_index branch from bf60288 to 52750a8 Compare September 23, 2025 19:16
@ianwallen ianwallen marked this pull request as ready for review September 23, 2025 19:54
@tylerjmchugh tylerjmchugh left a comment

Tested, works as expected

@ianwallen ianwallen added this to the 4.4.10 milestone Sep 30, 2025
@ianwallen ianwallen added the index structure change Indicate that this work introduces an index change. label Sep 30, 2025
…into support_for_metadata_filestore_properties_in_index
@juanluisrp juanluisrp self-requested a review October 16, 2025 13:20
@juanluisrp juanluisrp left a comment

I've added some comments and suggestions. Could you check whether they make sense, please?

.checkForSimilar()
.build();
assertFalse(diff.toString(), diff.hasDifferences());
}
I'm adding a new test for a JSON object here.

Suggested change
}
}
/**
 * Tests the conversion of a JSON object to an XML representation and validates
 * the resulting XML against an expected XML representation.
 */
@Test
public void testObjectConversion() throws Exception {
    String jsonString = "{\"Test\":\"value\", \"nestedObject\": {\"nestedField\": 433}}";
    String expectedElement = "<root><Test>value</Test><nestedObject><nestedField>433</nestedField></nestedObject></root>";
    Element xmlFromJSON = Xml.getXmlFromJSON(jsonString);
    Diff diff = DiffBuilder
        .compare(Input.fromString(expectedElement))
        .withTest(Input.fromString(Xml.getString(xmlFromJSON)))
        .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byName))
        .normalizeWhitespace()
        .ignoreComments()
        .checkForSimilar()
        .build();
    assertFalse(diff.toString(), diff.hasDifferences());
}

*/
public static String getString(Node data) {
try {
TransformerFactory tf = TransformerFactory.newInstance();
I don't think it's a good idea, performance-wise, to create a new TransformerFactory every time the method is called. Shouldn't we use TransformerFactoryFactory.getTransformerFactory() instead, which uses a singleton?

Suggested change
TransformerFactory tf = TransformerFactory.newInstance();
TransformerFactory tf = TransformerFactoryFactory.getTransformerFactory();
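
A minimal sketch of the reuse pattern suggested here, using only the JDK. SharedTransformerFactory is a hypothetical stand-in, not GeoNetwork's TransformerFactoryFactory, whose internals are not shown in this thread. Because JAXP does not guarantee that a TransformerFactory is thread-safe, the sketch synchronizes access to the shared instance; that choice is an assumption of the example, not something taken from the PR.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration only.
final class SharedTransformerFactory {
    // Created once per class load instead of once per call;
    // TransformerFactory.newInstance() performs a provider lookup each time it runs.
    private static final TransformerFactory FACTORY = TransformerFactory.newInstance();

    private SharedTransformerFactory() {}

    // Guard the shared factory, since thread safety is not guaranteed by JAXP.
    static synchronized Transformer newTransformer() throws TransformerException {
        return FACTORY.newTransformer();
    }

    // Serializes an XML string through an identity transformer obtained from the shared factory.
    static String serialize(String xml) {
        try {
            Transformer transformer = newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XML serialization failed", e);
        }
    }
}
```

Each call still gets its own Transformer, which is cheap compared with the factory lookup; only the factory itself is shared.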

Comment on lines +693 to +709
ObjectMapper objectMapper = new ObjectMapper();
objectMapper.setSerializationInclusion(JsonInclude.Include.NON_NULL);
objectMapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);

// Exclude fields with the index ignore annotation
objectMapper.setAnnotationIntrospector(new JacksonAnnotationIntrospector() {
    @Override
    public boolean hasIgnoreMarker(AnnotatedMember member) {
        return member.hasAnnotation(IndexIgnore.class) || super.hasIgnoreMarker(member);
    }

    @Override
    public PropertyName findNameForSerialization(Annotated annotated) {
        if (annotated.hasAnnotation(IndexIgnore.class)) return null;
        return super.findNameForSerialization(annotated);
    }
});
Creating an ObjectMapper every time this method is called is inefficient. Maybe we can move it to be a member of the class?

Comment on lines +53 to +54
// Index the record so that the resources are included
v.indexMetadata(0);
Why is v.indexMetadata() called in visit() in MEFVisitor, but in handleXml() in MEF2Visitor?

It just needed to follow the handleBin() call, which was already in visit() for MEFVisitor and in handleXml() for MEF2Visitor before these changes.
