@florian-glombik (Contributor) commented Jan 24, 2026

Checklist

General

Server

  • Important: I implemented the changes with a very good performance and prevented too many (unnecessary) and too complex database calls.
  • I strictly followed the principle of data economy for all database calls.
  • I strictly followed the server coding and design guidelines.
  • I added multiple integration tests (Spring) related to the features (with a high test coverage).
  • I added pre-authorization annotations according to the guidelines and checked the course groups for all new REST Calls (security).
  • I documented the Java code using JavaDoc style.

Client

  • Important: I implemented the changes with a very good performance, prevented too many (unnecessary) REST calls and made sure the UI is responsive, even with large data (e.g. using paging).
  • I strictly followed the principle of data economy for all client-server REST calls.
  • I strictly followed the client coding guidelines.
  • I strictly followed the AET UI-UX guidelines.
  • Following the theming guidelines, I specified colors only in the theming variable files and checked that the changes look consistent in both the light and the dark theme.
  • I added multiple integration tests (Jest) related to the features (with a high test coverage), while following the test guidelines.
  • I added authorities to all new routes and checked the course groups for displaying navigation elements (links, buttons).
  • I documented the TypeScript code using JSDoc style.
  • I added multiple screenshots/screencasts of my UI changes.
  • I translated all newly inserted strings into English and German.

Changes affecting Programming Exercises

  • High priority: I tested all changes and their related features with all corresponding user types on a test server configured with the integrated lifecycle setup (LocalVC and LocalCI).
  • I tested all changes and their related features with all corresponding user types on a test server configured with LocalVC and Jenkins.

Motivation and Context

Description

Steps for Testing

Prerequisites:

  • 1 Instructor
  • 2 Students
  • 1 Programming Exercise with Complaints enabled
  1. Log in to Artemis
  2. Navigate to Course Administration
  3. ...

Exam Mode Testing

Prerequisites:

  • 1 Instructor
  • 2 Students
  • 1 Exam with a Programming Exercise
  1. Log in to Artemis
  2. Participate in the exam as a student
  3. Make sure that the UI of the programming exercise in the exam mode stays unchanged. You can use the exam mode documentation as reference.
  4. ...

Testserver States

You can manage test servers using Helios. Check environment statuses in the environment list. To deploy to a test server, go to the CI/CD page, find your PR or branch, and trigger the deployment.

Review Progress

Performance Review

  • I (as a reviewer) confirm that the client changes (in particular related to REST calls and UI responsiveness) are implemented with a very good performance even for very large courses with more than 2000 students.
  • I (as a reviewer) confirm that the server changes (in particular related to database calls) are implemented with a very good performance even for very large courses with more than 2000 students.

Code Review

  • Code Review 1
  • Code Review 2

Manual Tests

  • Test 1
  • Test 2

Exam Mode Test

  • Test 1
  • Test 2

Performance Tests

  • Test 1
  • Test 2

Test Coverage

Warning: Server tests failed. Coverage could not be fully measured. Please check the workflow logs.

Last updated: 2026-01-24 17:21:47 UTC

Screenshots

Summary by CodeRabbit

  • New Features

    • Added Weaviate vector database integration with configurable connection settings (host, port, secure connection)
    • Implemented automatic schema validation on startup with strict and non-strict modes
    • Added health check functionality for Weaviate connectivity
  • Chores

    • Added Weaviate client dependency and configuration support
    • Updated documentation with integration setup guide


@florian-glombik florian-glombik requested review from a team and krusche as code owners January 24, 2026 17:11
@github-project-automation github-project-automation bot moved this to Work In Progress in Artemis Development Jan 24, 2026
@github-actions github-actions bot added the server, documentation, config-change, and core labels Jan 24, 2026
@github-actions

@florian-glombik Your PR description needs attention before it can be reviewed:

Issues Found

  1. No checkboxes are checked in the PR description
  2. Motivation/Context section is missing or needs improvement
  3. Description section is missing or needs improvement
  4. Testing instructions are missing or need more specific steps

How to Fix

  • Check the boxes that apply to your changes in the Checklist section
  • Provide a brief motivation/context explaining why this change is needed.
  • Add a real description under the Description section.
  • Replace the 'Steps for Testing' placeholders ('...') with concrete, step-by-step actions testers can follow.

This check validates that your PR description follows the PR template. A complete description helps reviewers understand your changes and speeds up the review process.

Note: This description validation is an experimental feature. If you observe false positives, please send a DM with a link to the wrong comment to Patrick Bassner on Slack. Thank you!

@github-actions

@florian-glombik Test coverage could not be fully measured because some tests failed. Please check the workflow logs for details.

@coderabbitai coderabbitai bot (Contributor) commented Jan 24, 2026

Walkthrough

This PR introduces Weaviate vector database integration for the Artemis platform, including Spring configuration for conditional client initialization, schema definitions for five collections (Lectures, LectureTranscriptions, LectureUnitSegments, LectureUnits, Faqs) mirroring Iris schemas, a service managing collection creation and data operations, and supporting infrastructure.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Build & Dependency Configuration | gradle.properties, build.gradle | Added Weaviate client dependency version 6.0.0 via Gradle configuration |
| Spring Configuration | src/main/java/de/tum/cit/aet/artemis/core/config/weaviate/WeaviateClientConfiguration.java, WeaviateConfigurationProperties.java | Created conditional Spring configuration class for Weaviate client bean with support for secure/non-secure connections; introduced properties class for host, port, gRPC, and schema validation settings |
| Schema Definition Framework | src/main/java/de/tum/cit/aet/artemis/core/config/weaviate/schema/WeaviateDataType.java, WeaviatePropertyDefinition.java, WeaviateReferenceDefinition.java, WeaviateCollectionSchema.java, WeaviateSchemas.java | Defined core schema types (enum for data types, records for properties/references/collections) and centralized schema registry with five collection schemas (LECTURES, LECTURE_TRANSCRIPTIONS, LECTURE_UNIT_SEGMENTS, LECTURE_UNITS, FAQS) aligned with Iris Python definitions |
| Weaviate Service | src/main/java/de/tum/cit/aet/artemis/core/service/weaviate/WeaviateService.java | Implemented service for startup collection initialization, lecture and FAQ data insertion, and health checks |
| Spring Integration | src/main/resources/META-INF/spring.factories | Registered WeaviateSchemaValidationFailureAnalyzer in Spring Boot failure analysis pipeline |
| Documentation | documentation/docs/developer/weaviate-setup.mdx, PR_DESCRIPTION.md | Added Weaviate setup guide with configuration options and comprehensive PR documentation including schema validation behavior and test coverage planning |

Sequence Diagram(s)

sequenceDiagram
    participant App as Artemis App Startup
    participant Config as WeaviateClientConfiguration
    participant Client as WeaviateClient
    participant Server as Weaviate Server
    participant Service as WeaviateService

    App->>Config: Initialize Spring beans (artemis.weaviate.enabled=true)
    Config->>Client: Create WeaviateClient with config (host, port, secure)
    Client->>Server: Connect to Weaviate instance
    Server-->>Client: Connection established
    Client-->>Config: WeaviateClient bean ready
    Config-->>App: Configuration complete
    App->>Service: Instantiate WeaviateService
    Service->>Service: `@PostConstruct`: initializeCollections()
    loop For each schema in ALL_SCHEMAS
        Service->>Server: Check if collection exists
        alt Collection missing
            Service->>Server: Create collection with properties and references
            Server-->>Service: Collection created
        else Collection exists
            Server-->>Service: Confirmed
        end
    end
    Service-->>App: Collections initialized
    App->>Service: isHealthy()
    Service->>Server: List collections
    Server-->>Service: Success/Failure
    Service-->>App: Health status
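
As a rough illustration of the startup loop in the diagram above, the following sketch checks each schema name and creates only the collections that are missing. `CollectionStore`, `InMemoryStore`, and `initializeCollections` are hypothetical names introduced here for illustration; the sketch does not use the Weaviate client API and is not the implementation from this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class CollectionInitSketch {

    /** Hypothetical abstraction over the vector store; NOT the Weaviate client API. */
    interface CollectionStore {

        boolean exists(String collectionName);

        void create(String collectionName);
    }

    /** In-memory stand-in that only exists to make the sketch runnable. */
    static final class InMemoryStore implements CollectionStore {

        private final Set<String> collections = new LinkedHashSet<>();

        @Override
        public boolean exists(String collectionName) {
            return collections.contains(collectionName);
        }

        @Override
        public void create(String collectionName) {
            collections.add(collectionName);
        }
    }

    /** Creates every missing collection and returns the names that were newly created. */
    static List<String> initializeCollections(CollectionStore store, List<String> schemaNames) {
        List<String> created = new ArrayList<>();
        for (String name : schemaNames) {
            // Existing collections are left untouched, so repeated startups are idempotent.
            if (!store.exists(name)) {
                store.create(name);
                created.add(name);
            }
        }
        return created;
    }
}
```

Calling `initializeCollections` a second time with the same names would return an empty list, which matches the "Collection exists" branch of the diagram.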
sequenceDiagram
    participant Client as Application Code
    participant Service as WeaviateService
    participant Schema as WeaviateSchemas
    participant Weaviate as Weaviate Server

    Client->>Service: insertLecturePageChunk(courseId, ..., pageTextContent, ...)
    Service->>Schema: getSchema("Lectures")
    Schema-->>Service: LECTURES_SCHEMA with properties
    Service->>Weaviate: getCollection("Lectures")
    Weaviate-->>Service: CollectionHandle
    Service->>Weaviate: Insert document with mapped properties
    Weaviate-->>Service: Document inserted
    Service-->>Client: Operation complete

    Client->>Service: insertFaq(courseId, ..., questionTitle, questionAnswer)
    Service->>Schema: getSchema("Faqs")
    Schema-->>Service: FAQS_SCHEMA with properties
    Service->>Weaviate: getCollection("Faqs")
    Weaviate-->>Service: CollectionHandle
    Service->>Weaviate: Insert FAQ document
    Weaviate-->>Service: Document inserted
    Service-->>Client: Operation complete

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately describes the main change: inserting lecture metadata into Weaviate. However, it focuses on one aspect (data insertion) while the PR encompasses broader Weaviate integration including schema validation, configuration, and supporting infrastructure. |


@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@src/main/java/de/tum/cit/aet/artemis/core/service/weaviate/WeaviateService.java`:
- Around lines 131-140: insertLecturePageChunk currently builds the payload with
Map.of, which throws a NullPointerException on any null value (e.g., baseUrl or
other nullable fields); change the code to validate required parameters up-front
and construct the insert map using a mutable map (e.g., new HashMap<>();
Map.ofEntries is not an alternative because it also rejects nulls) so you only
put optional fields when non-null, then pass that sanitized map to
collection.data.insert; apply the same pattern to the other insert methods
referenced (lines ~161-169) to avoid NPEs from Map.of.
🧹 Nitpick comments (9)
src/main/java/de/tum/cit/aet/artemis/core/config/weaviate/schema/WeaviateReferenceDefinition.java (1)

21-23: Consider removing redundant factory method or adding validation.

The of() factory method simply delegates to the canonical constructor without adding any value. Either remove it and use the constructor directly, or add input validation to justify its existence.

Option 1: Remove redundant factory method
-    /**
-     * Creates a reference definition.
-     *
-     * `@param` name             the reference name
-     * `@param` targetCollection the target collection name
-     * `@param` description      the description
-     * `@return` the reference definition
-     */
-    public static WeaviateReferenceDefinition of(String name, String targetCollection, String description) {
-        return new WeaviateReferenceDefinition(name, targetCollection, description);
-    }
Option 2: Add validation to justify factory method
     public static WeaviateReferenceDefinition of(String name, String targetCollection, String description) {
+        if (name == null || name.isBlank()) {
+            throw new IllegalArgumentException("Reference name must not be null or blank");
+        }
+        if (targetCollection == null || targetCollection.isBlank()) {
+            throw new IllegalArgumentException("Target collection must not be null or blank");
+        }
         return new WeaviateReferenceDefinition(name, targetCollection, description);
     }

Based on learnings, input validation is recommended for server-side Java code.

src/main/java/de/tum/cit/aet/artemis/core/config/weaviate/schema/WeaviateCollectionSchema.java (2)

16-38: Defensively copy/validate schema lists to avoid accidental mutation.

The record stores the provided Lists as-is; if callers mutate them later, schema lookups can drift. A compact constructor can copy and null-check once.

♻️ Proposed fix
 public record WeaviateCollectionSchema(String collectionName, List<WeaviatePropertyDefinition> properties, List<WeaviateReferenceDefinition> references) {
+
+    public WeaviateCollectionSchema {
+        java.util.Objects.requireNonNull(collectionName, "collectionName");
+        properties = List.copyOf(java.util.Objects.requireNonNull(properties, "properties"));
+        references = List.copyOf(java.util.Objects.requireNonNull(references, "references"));
+    }
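
To make the mutation risk concrete, here is a minimal sketch of a record with a defensive-copying compact constructor. The `Schema` record is a hypothetical, reduced stand-in for WeaviateCollectionSchema (its property list is simplified to strings), not the class from this PR.

```java
import java.util.List;
import java.util.Objects;

final class DefensiveCopySketch {

    /** Hypothetical, simplified stand-in for WeaviateCollectionSchema. */
    record Schema(String collectionName, List<String> properties) {

        Schema {
            Objects.requireNonNull(collectionName, "collectionName");
            // List.copyOf returns an unmodifiable copy and rejects null lists and null elements.
            properties = List.copyOf(Objects.requireNonNull(properties, "properties"));
        }
    }
}
```

With this constructor, a caller that keeps adding to the list it passed in no longer changes the schema, and `schema.properties().add(...)` throws an UnsupportedOperationException.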

74-75: Provide clearer feedback on duplicate property names.

Collectors.toMap throws an IllegalStateException on duplicates; a merge function can fail fast with a targeted message.

♻️ Proposed fix
-        return properties.stream().collect(Collectors.toMap(WeaviatePropertyDefinition::name, p -> p));
+        return properties.stream().collect(Collectors.toMap(
+                WeaviatePropertyDefinition::name,
+                p -> p,
+                (a, b) -> {
+                    throw new IllegalArgumentException("Duplicate property name: " + a.name());
+                }
+        ));
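
The effect of the merge function can be tried in isolation with the following sketch. `Property` is a hypothetical two-field stand-in for WeaviatePropertyDefinition and `byName` is an illustrative helper name, not a method from this PR.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class DuplicatePropertySketch {

    /** Hypothetical stand-in for WeaviatePropertyDefinition, reduced to two fields. */
    record Property(String name, String dataType) {
    }

    /** Indexes properties by name and fails with a targeted message on duplicates. */
    static Map<String, Property> byName(List<Property> properties) {
        return properties.stream().collect(Collectors.toMap(Property::name, p -> p, (a, b) -> {
            // The merge function is only invoked when two elements map to the same key.
            throw new IllegalArgumentException("Duplicate property name: " + a.name());
        }));
    }
}
```

Two properties that share the name `course_id` would then fail with "Duplicate property name: course_id" instead of the generic IllegalStateException.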
src/main/java/de/tum/cit/aet/artemis/core/config/weaviate/schema/WeaviatePropertyDefinition.java (1)

14-50: Add basic null/blank validation for schema definitions.

Fail fast on invalid names/data types to avoid obscure errors during schema creation.

♻️ Proposed fix
 public record WeaviatePropertyDefinition(String name, WeaviateDataType dataType, boolean indexSearchable, boolean indexFilterable, String description) {
+
+    public WeaviatePropertyDefinition {
+        if (name == null || name.isBlank()) {
+            throw new IllegalArgumentException("name must not be blank");
+        }
+        java.util.Objects.requireNonNull(dataType, "dataType");
+        description = (description == null) ? "" : description;
+    }
Based on learnings, validate inputs early in server-side code.
src/main/java/de/tum/cit/aet/artemis/core/config/weaviate/WeaviateConfigurationProperties.java (1)

21-73: Validate host/port values to fail fast on misconfiguration.

Blank hosts or invalid ports currently pass through and only fail later during connection setup. Guard clauses keep failures actionable.

♻️ Proposed fix
     public void setHost(String host) {
-        this.host = host;
+        if (host == null || host.isBlank()) {
+            throw new IllegalArgumentException("artemis.weaviate.host must not be blank");
+        }
+        this.host = host;
     }
@@
     public void setPort(int port) {
-        this.port = port;
+        if (port <= 0 || port > 65535) {
+            throw new IllegalArgumentException("artemis.weaviate.port must be between 1 and 65535");
+        }
+        this.port = port;
     }
@@
     public void setGrpcPort(int grpcPort) {
-        this.grpcPort = grpcPort;
+        if (grpcPort <= 0 || grpcPort > 65535) {
+            throw new IllegalArgumentException("artemis.weaviate.grpc-port must be between 1 and 65535");
+        }
+        this.grpcPort = grpcPort;
     }
Based on learnings, validate inputs early in server-side code.
PR_DESCRIPTION.md (2)

11-11: Consider replacing “very good performance” to satisfy style checks.

✏️ Proposed fix
- - [ ] **Important**: I implemented the changes with a [very good performance](https://docs.artemis.cit.tum.de/dev/guidelines/performance/) and prevented too many (unnecessary) and too complex database calls.
+ - [ ] **Important**: I implemented the changes with [excellent performance](https://docs.artemis.cit.tum.de/dev/guidelines/performance/) and prevented too many (unnecessary) and too complex database calls.
@@
- - [ ] I (as a reviewer) confirm that the server changes (in particular related to database calls) are implemented with a very good performance even for very large courses with more than 2000 students.
+ - [ ] I (as a reviewer) confirm that the server changes (in particular related to database calls) are implemented with excellent performance even for very large courses with more than 2000 students.

Also applies to: 121-121


90-110: Use headings instead of bolded emphasis for test sections (MD036).

✏️ Proposed fix
-**Test 1: Basic Startup with Weaviate Enabled**
+#### Test 1: Basic Startup with Weaviate Enabled
@@
-**Test 2: Schema Validation Failure (Strict Mode)**
+#### Test 2: Schema Validation Failure (Strict Mode)
@@
-**Test 3: Schema Validation Warning (Non-Strict Mode)**
+#### Test 3: Schema Validation Warning (Non-Strict Mode)
@@
-**Test 4: Disabled Weaviate**
+#### Test 4: Disabled Weaviate
src/main/java/de/tum/cit/aet/artemis/core/config/weaviate/WeaviateClientConfiguration.java (1)

34-52: Add destroyMethod = "close" to the @Bean annotation to ensure the Weaviate client closes its connections on shutdown.

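The WeaviateClient implements AutoCloseable and its close() method properly closes both the REST transport and gRPC channels. Wire it as a destroy method in the bean definition to prevent resource leaks. Note that Spring's @Bean already infers a public no-argument close() or shutdown() method as the destroy callback by default, so the explicit attribute mainly documents the intent.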

♻️ Proposed fix
-    `@Bean`
+    `@Bean`(destroyMethod = "close")
     public WeaviateClient weaviateClient() {
src/main/java/de/tum/cit/aet/artemis/core/config/weaviate/schema/WeaviateSchemas.java (1)

295-303: Consider returning Optional<WeaviateCollectionSchema> instead of nullable.

Returning Optional instead of null would make the API safer and more explicit about the possibility of missing schemas, following modern Java conventions.

♻️ Suggested improvement
+import java.util.Optional;
+
 /**
  * Gets a schema by collection name.
  *
  * `@param` collectionName the collection name
- * `@return` the schema, or null if not found
+ * `@return` the schema wrapped in Optional, or empty if not found
  */
-public static WeaviateCollectionSchema getSchema(String collectionName) {
-    return SCHEMAS_BY_NAME.get(collectionName);
+public static Optional<WeaviateCollectionSchema> getSchema(String collectionName) {
+    return Optional.ofNullable(SCHEMAS_BY_NAME.get(collectionName));
 }
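
On the caller side, the Optional-returning lookup could be consumed roughly as follows. This is a sketch under assumptions: the registry values are simplified to strings, and `requireSchema` is a hypothetical helper that does not exist in this PR.

```java
import java.util.Map;
import java.util.Optional;

final class OptionalLookupSketch {

    // Simplified registry; the real map would hold WeaviateCollectionSchema values.
    private static final Map<String, String> SCHEMAS_BY_NAME = Map.of("Lectures", "lectures-schema", "Faqs", "faqs-schema");

    /** Returns the schema for the given collection name, or empty if none is registered. */
    static Optional<String> getSchema(String collectionName) {
        return Optional.ofNullable(SCHEMAS_BY_NAME.get(collectionName));
    }

    /** Hypothetical convenience for callers that treat a missing schema as a programming error. */
    static String requireSchema(String collectionName) {
        return getSchema(collectionName).orElseThrow(() -> new IllegalArgumentException("Unknown collection: " + collectionName));
    }
}
```

Callers are then forced to decide what a missing schema means, for example `getSchema("Faqs").ifPresent(...)` for optional handling or `requireSchema("Faqs")` when absence should fail loudly.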

Comment on lines +131 to +140
public void insertLecturePageChunk(long courseId, String courseLanguage, long lectureId, long lectureUnitId, String pageTextContent, int pageNumber, String baseUrl,
        int attachmentVersion) {

    var collection = getCollection(WeaviateSchemas.LECTURES_COLLECTION);

    try {
        collection.data.insert(Map.of(WeaviateSchemas.LecturesProperties.COURSE_ID, courseId, WeaviateSchemas.LecturesProperties.COURSE_LANGUAGE, courseLanguage,
                WeaviateSchemas.LecturesProperties.LECTURE_ID, lectureId, WeaviateSchemas.LecturesProperties.LECTURE_UNIT_ID, lectureUnitId,
                WeaviateSchemas.LecturesProperties.PAGE_TEXT_CONTENT, pageTextContent, WeaviateSchemas.LecturesProperties.PAGE_NUMBER, pageNumber,
                WeaviateSchemas.LecturesProperties.BASE_URL, baseUrl, WeaviateSchemas.LecturesProperties.ATTACHMENT_VERSION, attachmentVersion));

⚠️ Potential issue | 🟠 Major

Guard against nulls before Map.of insertion.

Map.of rejects null keys and values; passing null for any nullable field (e.g., courseDescription or baseUrl) throws a NullPointerException, so the insertion is skipped. Consider validating required fields explicitly and including optional ones only when they are non-null.

🐛 Proposed fix (pattern)
-            collection.data.insert(Map.of(WeaviateSchemas.LecturesProperties.COURSE_ID, courseId, WeaviateSchemas.LecturesProperties.COURSE_LANGUAGE, courseLanguage,
-                    WeaviateSchemas.LecturesProperties.LECTURE_ID, lectureId, WeaviateSchemas.LecturesProperties.LECTURE_UNIT_ID, lectureUnitId,
-                    WeaviateSchemas.LecturesProperties.PAGE_TEXT_CONTENT, pageTextContent, WeaviateSchemas.LecturesProperties.PAGE_NUMBER, pageNumber,
-                    WeaviateSchemas.LecturesProperties.BASE_URL, baseUrl, WeaviateSchemas.LecturesProperties.ATTACHMENT_VERSION, attachmentVersion));
+            Map<String, Object> payload = new java.util.HashMap<>();
+            payload.put(WeaviateSchemas.LecturesProperties.COURSE_ID, courseId);
+            payload.put(WeaviateSchemas.LecturesProperties.COURSE_LANGUAGE, courseLanguage);
+            payload.put(WeaviateSchemas.LecturesProperties.LECTURE_ID, lectureId);
+            payload.put(WeaviateSchemas.LecturesProperties.LECTURE_UNIT_ID, lectureUnitId);
+            payload.put(WeaviateSchemas.LecturesProperties.PAGE_TEXT_CONTENT, pageTextContent);
+            payload.put(WeaviateSchemas.LecturesProperties.PAGE_NUMBER, pageNumber);
+            if (baseUrl != null) {
+                payload.put(WeaviateSchemas.LecturesProperties.BASE_URL, baseUrl);
+            }
+            payload.put(WeaviateSchemas.LecturesProperties.ATTACHMENT_VERSION, attachmentVersion);
+            collection.data.insert(payload);
@@
-            collection.data.insert(Map.of(WeaviateSchemas.FaqsProperties.COURSE_ID, courseId, WeaviateSchemas.FaqsProperties.COURSE_NAME, courseName,
-                    WeaviateSchemas.FaqsProperties.COURSE_DESCRIPTION, courseDescription, WeaviateSchemas.FaqsProperties.COURSE_LANGUAGE, courseLanguage,
-                    WeaviateSchemas.FaqsProperties.FAQ_ID, faqId, WeaviateSchemas.FaqsProperties.QUESTION_TITLE, questionTitle, WeaviateSchemas.FaqsProperties.QUESTION_ANSWER,
-                    questionAnswer));
+            Map<String, Object> payload = new java.util.HashMap<>();
+            payload.put(WeaviateSchemas.FaqsProperties.COURSE_ID, courseId);
+            payload.put(WeaviateSchemas.FaqsProperties.COURSE_NAME, courseName);
+            if (courseDescription != null) {
+                payload.put(WeaviateSchemas.FaqsProperties.COURSE_DESCRIPTION, courseDescription);
+            }
+            payload.put(WeaviateSchemas.FaqsProperties.COURSE_LANGUAGE, courseLanguage);
+            payload.put(WeaviateSchemas.FaqsProperties.FAQ_ID, faqId);
+            payload.put(WeaviateSchemas.FaqsProperties.QUESTION_TITLE, questionTitle);
+            payload.put(WeaviateSchemas.FaqsProperties.QUESTION_ANSWER, questionAnswer);
+            collection.data.insert(payload);
Based on learnings, validate and sanitize inputs early in server-side code.

Also applies to: 161-169
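
The null behavior described in this comment can be reproduced without Weaviate. The sketch below is a minimal illustration: the property keys (`course_id`, `page_text_content`, `base_url`) and the method names are assumptions made for the example and are not taken from WeaviateSchemas.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class NullSafePayloadSketch {

    /** Builds an insert payload; baseUrl is treated as optional and skipped when null. */
    static Map<String, Object> buildLecturePayload(long courseId, String pageTextContent, String baseUrl) {
        // Required field: fail early with a descriptive message.
        Objects.requireNonNull(pageTextContent, "pageTextContent must not be null");

        Map<String, Object> payload = new HashMap<>();
        payload.put("course_id", courseId); // hypothetical key name
        payload.put("page_text_content", pageTextContent); // hypothetical key name
        if (baseUrl != null) {
            payload.put("base_url", baseUrl); // hypothetical key name
        }
        return payload;
    }

    /** Returns true if Map.of rejects the given value with a NullPointerException. */
    static boolean mapOfRejects(Object value) {
        try {
            Map.of("key", value);
            return false;
        }
        catch (NullPointerException e) {
            return true;
        }
    }
}
```

Here `mapOfRejects(null)` returns true, while `buildLecturePayload(1L, "text", null)` returns a two-entry map without the optional key.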

@github-project-automation github-project-automation bot moved this from Work In Progress to Ready For Review in Artemis Development Jan 24, 2026
@florian-glombik florian-glombik changed the title from "Feature/general/global search/add startup schema validation" to "Development: Insert lecture metadata into Weaviate" Jan 24, 2026
@github-actions

End-to-End (E2E) Test Results Summary

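|  | Tests | Passed ✅ | Skipped ⚠️ | Failed | Time ⏱ |
| --- | --- | --- | --- | --- | --- |
| End-to-End (E2E) Test Report | 223 ran | 222 passed | 1 skipped | 0 failed | 1h 37m 4s 142ms |

| Test | Result | Time ⏱ |
| --- | --- | --- |
| No test annotations available |  |  |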

@florian-glombik florian-glombik marked this pull request as draft January 24, 2026 20:06