Add more info to inheritance chain #20701

Open: wants to merge 4 commits into base: dev

@arash77 (Member) commented Jul 28, 2025

This PR adds a same_user flag to dataset inheritance metadata so that it is possible to tell whether a copied dataset came from the same user. The goal is to improve plagiarism detection while avoiding false positives, and to do so without exposing user IDs. I’m open to feedback on whether this approach is acceptable or whether exposing user_id would be preferred for flexibility.
This is needed because existing endpoints don’t reliably provide user info for inherited datasets, and user IDs can only be reached via the history if that history is public or shared, which is often not the case.
With this change, if a user copied a dataset from someone else, the response from this endpoint should show that it did not come from that user, for example:

[
  {
    "id": "3f5830403180d620",
    "name": "test",
    "dep": "Copy of 'test'",
    "same_user": true
  },
  {
    "id": "ebfb8f50c6abde6d",
    "name": "test",
    "dep": "test",
    "same_user": false
  }
]
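
To make the idea concrete, here is a minimal sketch of how such a flag could be derived and consumed. It is not the PR's actual implementation: the function names, the internal `owner_id` field, and the choice of which user to compare against (`reference_user_id`) are assumptions made purely for illustration. Only the `id`, `name`, `dep`, and `same_user` keys come from the example above.

```python
from typing import Any


def annotate_chain(
    records: list[dict[str, Any]], reference_user_id: int
) -> list[dict[str, Any]]:
    """Build public chain entries from internal records.

    `records` and their `owner_id` key are hypothetical. The owner is
    compared server-side and only the boolean result is emitted, so no
    user ID appears in the output.
    """
    return [
        {
            "id": rec["id"],
            "name": rec["name"],
            "dep": rec["dep"],
            "same_user": rec["owner_id"] == reference_user_id,
        }
        for rec in records
    ]


def entries_from_other_users(chain: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return entries explicitly marked as coming from a different user.

    Entries without a `same_user` key are skipped rather than guessed at.
    """
    return [entry for entry in chain if entry.get("same_user") is False]
```

A client checking for copied work would then only need to look at the entries returned by `entries_from_other_users`, without ever seeing who the other user is.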

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@github-actions github-actions bot added this to the 25.1 milestone Jul 28, 2025
@arash77 arash77 changed the title from "Add id inheritance" to "Add more info to inheritance chain" Jul 28, 2025

@jmchilton (Member) left a comment

Not sold on the end goal exactly but this particular implementation and augmentation to the API seems perfectly fine to me.

@mvdbeek (Member) left a comment

This is so awkward; I don't think this should be part of the API.

@mvdbeek mvdbeek dismissed their stale review July 29, 2025 08:31

It's part of an API that is already terribly awkward, so I guess it doesn't make things worse.

@arash77 (Member, Author) commented Jul 29, 2025

Yeah, it does look awkward to me too, but honestly I’m not sure what the ideal structure should be either.

@mvdbeek (Member) commented Jul 29, 2025

"user_id would be preferred for flexibility."

I think I would prefer that

@arash77 arash77 force-pushed the add-id-Inheritance branch from a1b4166 to d7b01fd July 31, 2025 09:55
3 participants