Don't :require_levelbuilder_mode to edit datasets#64779
Conversation
|
What about an option where we publish datasets from levelbuilder as an artifact to S3(?), and then on DTx we load the latest artifact into the datablock storage for that environment? This would keep the levelbuilder-user experience the same, and move us to a model where dataset updates are considered part of levelbuilder updates, with a new version entering the pipeline to production at night with the rest of the levelbuilder changes. The thing I don't like about this approach is that it introduces (compared to the pre-datablocks implementation) a likely discrepancy between Levelbuilder and Production. The system will not enforce that Production be "behind" Levelbuilder, or that Levelbuilder be a "draft/unpublished" version - they are simply different and their consistency is enforced only by human decisions. This is on a per-dataset basis, too, so we will find ourselves in a situation where Levelbuilder and Production datasets are unintentionally different. I'd argue that option 1 should have been the MVP of the datablocks project, and Option 3 wouldn't have been on the table, so I actually prefer 1. |
|
The coupling in firebase seemed to be a UX surprise for curriculum, it hadn't broken the world because we'd gotten lucky vs it being a pattern we want to perpetuate. I'm closing this PR given its reliance on an access grants ("levelbuilder on prod") that already have been handed out more widely than I'd like. Its possibly relevant to revisit it if we independently address that issue in a way that frees up levelbuilder on prod to more literally mean "levelbuilder access on prod". |
Datablock Storage (PR) unintentionally broke the ability for curriculum to self-service in updating applab datasets.
This PR is a one-liner (it removes an unnecessary (?) security check) that restores that functionality by allowing users with levelbuilder permissions on prod to access existing endpoints https://studio.code.org/datasets and https://studio.code.org/datasets/manifest/edit.
This PR restores the same editing flow as prior to datablock storage, but curriculum should edit both levelbuilder and prod to make an update as they are no longer joined at the DB level.
History of the issue
Prior to Datablock Storage (which uses our mysql DB), we stored applab datasets in Firebase.
These were typically edited by curriculum accessing https://levelbuilder-studio.code.org/datasets/manifest/edit and https://levelbuilder-studio.code.org/datasets, which use the DatasetsController.
Behind the scenes, levelbuilder was actually sharing applab datasets with production, and therefore when you edited levelbuilder, prod was instantly updated. This bypassed the usual "updates on levelbuilder are content scooped, stored in git, and re-hydrated on prod" flow, but nobody noticed/remembered this was the case.
With Datablock Storage, we did not catch this impicit coupling between levelbuilder and prod. As a result, we broke the ability for curriculum to self-service.
Current status quo is that curriculum must request updates from engineering, who can run a hacky / error-prone script I created to do the updates manually using console access (relevant slack thread).
Possible fixes
rejected: use the same model firebase used: levelbuilder edits the prod dataset definition. We won't do this for obvious security reasons: we will not give levelbuilder access to the prod DB.
add a content scoop hook to dump both datasets and the dataset library manifest into git, and rehydrate from there when seeding/re-seeding the DB. this matches how levels/scripts work.
UserPermission::LEVELBUILDERAND the server is in levelbuilder_mode. If we reduce this to simply requiringUserPermission::LEVELBUILDER, curriculum authors can self-service again.Really the choice today is between #3, or continuing to edit SQL by hand / use scripts, which is relatively disempowering for curriculum, and relatively risk prone.
Security Implications
Currently ability.rb restricts access to the datasets controller to users with
user.permission?(UserPermission::LEVELBUILDER).Additionally, datasets_controller.rb limits usage of the datasets controller to levelbuilder intsances, which prevents accessing /datasets on production instances.
This PR removes the second check. Users still require levelbuilder permissions, this time on prod, but based on empirical testing, this does not open editing datasets up to random users.
If somebody unauthorized were able to escalate to levelbuilder permissions for their user, they would not have access to PII, but could disrupt applab usage by deleting our datasets (which would be pretty easy to restore, say it would take a few hours).
Additionally, if such an exploit were found, it would be a much more serious issue anyway, as it would allow deleting ALL our curriculum by using the exploit (with a content-scoop hrs delay).
UPDATE: security issues
We've been granting levelbuilder access on prod more freely than levelbuilder access to levelbuilder. It was a very bad assumption that it would be a subset of levelbuilder access on levelbuilder, the reverse is closer to true. The situation is that levelbuilder access on prod has been considered safe to grant, because it doesn't let you edit (due to checking if levelbuilder_mode is enabled), and therefore to let external reviewers and partners access courses / scripts with a
published_stateofin_development, we've historically granted them levelbuilder access on prod (but not on levelbuilder).If we wanted to proceed down this route, which may not be worth doing in light of this, we'd need to create a new perms set that permits access to in_development courses (relevant code in ability.rb), without granting full levelbuilder access, and ALSO migrate existing accounts over to this (with some heuristic for preserving those, if any, that need levelbuilder access to prod, potentially for other reasons?).