fix: duplicate course ID detection was not working #38030

Open
bradenmacdonald wants to merge 4 commits into openedx:master from open-craft:braden/fix-case-sensitivity-issues

Contributor

@bradenmacdonald bradenmacdonald commented Feb 19, 2026

Description

This fixes the following bugs:

  • In Studio, you can create a course whose ID differs from the ID of an existing course only in capitalization (a "case duplicate"). This creates two entries in SplitModulestoreCourseIndex but there can only be a single CourseOverview, so the two different courses are essentially merged and only one can be seen or accessed from Studio. This likely results in all kinds of problems.
  • In Studio, you can re-run a course to create a new course whose ID differs from the ID of an existing course only in capitalization (a "case duplicate"). You can even overwrite the original course, triggering a series of errors that make the course disappear entirely from the Studio course listing.

Bug Explanation

We're seeing this bug because this line isn't working. Django's __iexact lookup, which is supposed to force case-insensitive matching, doesn't actually force it: on MySQL it just uses the LIKE operator, which is usually case-insensitive but not always, depending on the collation of the strings in question.

In this case, because our course_id field is case-sensitive (to match MongoDB), the __iexact lookup is not working properly and actually behaves like __exact (case-sensitive). Surprisingly, there is no warning about this in the Django docs. None of our test cases caught it either, even though it has fundamentally broken the has_course(... ignore_case=True) modulestore API, because we haven't been running the tests against MySQL and the issue doesn't occur with SQLite.
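
To illustrate how collation alone can change the outcome of the same lookup, here is a minimal sketch. It is not code from this PR and does not touch MySQL or Django: it uses SQLite's built-in BINARY and NOCASE collations as a stand-in for a case-sensitive versus case-insensitive column, and the table name, helper name, and sample course IDs are made up for the example.

```python
import sqlite3


def find_course(collation: str, stored_id: str, probe_id: str) -> bool:
    """Store one course ID in a column with the given collation, then look up probe_id."""
    # The collation name is interpolated into SQL, so restrict it to known values.
    if collation not in {"BINARY", "NOCASE"}:
        raise ValueError(f"unsupported collation: {collation}")
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE course_index (course_id TEXT COLLATE {collation})")
        conn.execute("INSERT INTO course_index VALUES (?)", (stored_id,))
        # The query text is identical in both cases; only the column collation differs.
        row = conn.execute(
            "SELECT COUNT(*) FROM course_index WHERE course_id = ?", (probe_id,)
        ).fetchone()
        return row[0] > 0
    finally:
        conn.close()


stored = "course-v1:UniversityX+PHYS835+run"
probe = "course-v1:universityx+phys835+run"

# Case-sensitive column: the differently-capitalized ID is not found.
found_case_sensitive = find_course("BINARY", stored, probe)
# Case-insensitive column: the same query finds it.
found_case_insensitive = find_course("NOCASE", stored, probe)
```

The point is only that a duplicate check relying on the database to ignore case is at the mercy of the column's collation, which is why a test that passes on one backend can miss the bug on another.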

Testing instructions

Try reproducing the bugs described above.

How could we have caught this sooner?

There are tests like ContentStoreTest::test_course_with_different_cases which are designed to catch this issue, but it seems like we are only running the test suite using SQLite, and it obviously won't catch issues that are specific to MySQL.

Operator Notice

This includes a migration to make sure the database enforces case-insensitive course uniqueness going forward. If any Open edX platform instance has "case duplicate" courses that were accidentally created, the migration will fail to apply with an error like this:

django.db.utils.IntegrityError: (1062, "Duplicate entry 'course-v1:universityx+phys835+run' for key 'split_modulestore_django_splitmodulestorecourseindex.splitmodulestorecourseindex_courseid_unique_ci'")

Such courses will likely already be broken to some extent, so it should be safe to delete one of the duplicates. To fix this, go to the Django admin at (studio)/admin/split_modulestore_django/splitmodulestorecourseindex/ and delete the duplicate course indexes (or rename their course_id value).
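
Before applying the migration, an operator could scan for case duplicates along these lines. This is a minimal sketch, not part of this PR: find_case_duplicates is a hypothetical helper, the sample IDs are made up, and it assumes that lowercasing approximates the database's case-insensitive comparison. On a real instance the input would be the course_id values from SplitModulestoreCourseIndex.

```python
from collections import defaultdict


def find_case_duplicates(course_ids):
    """Group course IDs that differ only in capitalization.

    Returns a dict mapping the lowercased ID to the list of original IDs,
    keeping only groups with more than one member.
    """
    groups = defaultdict(list)
    for course_id in course_ids:
        groups[course_id.lower()].append(course_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Made-up sample data for illustration only.
sample_ids = [
    "course-v1:UniversityX+PHYS835+run",
    "course-v1:universityx+phys835+run",
    "course-v1:OtherX+CS101+2026",
]
duplicates = find_case_duplicates(sample_ids)
```

Any group this returns would trigger the IntegrityError above and needs to be resolved in the Django admin first.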

@openedx-webhooks added the labels open-source-contribution (PR author is not from Axim or 2U) and core contributor (PR author is a Core Contributor, who may or may not have write access to this repo) Feb 19, 2026

openedx-webhooks commented Feb 19, 2026

Thanks for the pull request, @bradenmacdonald!

This repository is currently maintained by @openedx/wg-maintenance-openedx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@github-project-automation github-project-automation bot moved this to Needs Triage in Contributions Feb 19, 2026
@bradenmacdonald bradenmacdonald force-pushed the braden/fix-case-sensitivity-issues branch from eab585b to 78ef29a Compare February 19, 2026 23:44
Member

kdmccormick commented Feb 20, 2026

Do we know if the test cases didn't catch it because they're using SQLite?

@bradenmacdonald
Contributor Author

@kdmccormick Ah, yeah, probably. For some reason I assumed all this time that our tests ran on MySQL on CI... they never get run on MySQL at all?

@bradenmacdonald bradenmacdonald force-pushed the braden/fix-case-sensitivity-issues branch from 78ef29a to 73ef990 Compare February 20, 2026 18:09
@bradenmacdonald
Contributor Author

@kdmccormick Yes, there are tests like ContentStoreTest::test_course_with_different_cases which are designed to catch this issue, but if they aren't being run against MySQL, they won't catch bugs that only occur on MySQL.

Member

kdmccormick commented Feb 20, 2026

@bradenmacdonald It seems to me that we misleadingly install MySQL before running unit tests, and then just use SQLite instead. My hypothesis is that this enabled the MySQL 5.7->8 upgrade team to run tests with MySQL on an ad-hoc basis, which makes sense, but it's unfortunate that we didn't either fully switch over to MySQL tests, or at least clean up the unit-tests.yml GitHub workflow to be less confusing.

We suspect that running migrations in MySQL would be horrendously slow, but maybe we could fix that by squashing a lot of migrations. I have a draft PR here just to see what testing with MySQL would look like: #38033

@bradenmacdonald bradenmacdonald marked this pull request as ready for review February 20, 2026 18:58
@bradenmacdonald bradenmacdonald requested review from kdmccormick and ormsbee and removed request for farhan, irtazaakram and salman2013 February 20, 2026 18:58
Contributor Author

bradenmacdonald commented Feb 20, 2026

@kdmccormick Even if running tests on MySQL is slow, we should at least be doing it on the master branch or (if it's extremely slow) on a daily basis, either of which is easy to do with GitHub Actions.

Thanks for opening that PR to check into it.

Labels

core contributor: PR author is a Core Contributor (who may or may not have write access to this repo).
open-source-contribution: PR author is not from Axim or 2U

Projects

Status: Needs Triage

3 participants