Skip to content

Conversation

@pgm
Copy link
Contributor

@pgm pgm commented Jul 1, 2025

In group_entry the exact_match == False allows us to provide a domain name so that all users with that domain email address have access to the group. However, the check is looser than I'm comfortable with and in its previous form just checks the email address suffix.

These changes require that the suffix that's checked starts with a "@" so that it's always checking the entire domain part of the email address.

@pgm pgm requested review from jessica-cheng and snwessel July 1, 2025 16:28
Copy link
Member

@snwessel snwessel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I'm glad to have this restriction in place

# This means that only the public group which is possible to be viewable by _all_ users. Similarly all users need to be
# able to access the transient group.
if group.id in [PUBLIC_GROUP_ID, TRANSIENT_GROUP_ID] and write_access is False:
return True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You make a good point about the public and transient groups being a special case. As a heads up, the updates you made to the add_group_entry endpoint above may specifically break our implementation of the _populate_minimal_data (here), which creates the public and transient groups if they don't already exist. This function is run whenever you either:

  1. Run the recreate_dev_db command
  2. Upgrade the database (as we do on deployments)

We should clean up that entry because I think it'd be better if it doesn't exist

I'll also just mention, when we do clean up those entries, the _populate_minimal_data method might be a good place to make the change. We've been using it as a way of ensuring that these two group entries are always defined the way we want them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the current updates made in the api layer for add_group_entry endpoint will break our implementation of _populate_minimal_data since that seems to call the crud's add_group_entry function. However, this makes me realize that my earlier suggestion of adding the extra validation to GroupEntryIn would break _populate_minimal_data.

I agree that we should remove those special entries for transient and public groups since they don't seem necessary and we'd no longer need to consider a case where the email could be "" (with added length constraint on the db side for extra precaution if we desire)

@snwessel
Copy link
Member

snwessel commented Jul 1, 2025

Fyi, I also just double checked that the script I used to migrate groups from our old private dataset system did append the @ to the beginning of the email domains when it added them to breadbox (here), so our existing private group definitions should be compliant with this new restriction.

raise HTTPException(404)
raise HTTPException(403)

if not group_entry.exact_match and not group_entry.email.startswith("@"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a personal preference but for validations related to the request body, I tend to put them in the request body's Pydantic model as a validator. This has the added benefit that it will return an error earlier if the request body is malformed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: Pydantic has a special type EmailStr which I thought we could use for the GroupEntryIn model but it requires that the email be formatted as [name]@[domain]. However, this makes me realize that perhaps there should be more specific rules to think about such as proper formatting of the domain to include only one @ and . symbol. I don't think we need a super stringent check for now since we currently control groups but something to think about in the future

# This means that only the public group which is possible to be viewable by _all_ users. Similarly all users need to be
# able to access the transient group.
if group.id in [PUBLIC_GROUP_ID, TRANSIENT_GROUP_ID] and write_access is False:
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the current updates made in the api layer for add_group_entry endpoint will break our implementation of _populate_minimal_data since that seems to call the crud's add_group_entry function. However, this makes me realize that my earlier suggestion of adding the extra validation to GroupEntryIn would break _populate_minimal_data.

I agree that we should remove those special entries for transient and public groups since they don't seem necessary and we'd no longer need to consider a case where the email could be "" (with added length constraint on the db side for extra precaution if we desire)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants