Skip to content

Use custom max depth for parsing of Blockly JSON#66769

Merged
fisher-alice merged 8 commits into
stagingfrom
alice/json-parsing
Jun 27, 2025
Merged

Use custom max depth for parsing of Blockly JSON#66769
fisher-alice merged 8 commits into
stagingfrom
alice/json-parsing

Conversation

@fisher-alice

@fisher-alice fisher-alice commented Jun 27, 2025

Copy link
Copy Markdown
Contributor

This is a follow-up to #66614 which added profanity/privacy filtering for open-ended project types geared for young users.

We started getting HoneyBadger reports of JSON::NestingError: nesting of 101 is too deep. These are caused by Sprite Lab programs whose JSON has nesting greater than 100 levels.

I checked a couple programs which triggered this error and the number of levels was between 100-300. Thus, I increased max_nesting level of JSON.parse from the default value of 100 to 999. I added a DCDO key in case we need to update this value quickly (thinking about when classes begin in the fall).

Links

Testing story

Deployment strategy

Follow-up work

Privacy

Security

Caching

PR Checklist:

  • Tests provide adequate coverage
  • Privacy and Security impacts have been assessed
  • Code is well-commented
  • New features are translatable or updates will not break translations
  • Relevant documentation has been added or updated
  • User impact is well-understood and desirable
  • Pull Request is labeled appropriately
  • Follow-up work items (including potential tech debt) are tracked and linked

@fisher-alice fisher-alice marked this pull request as ready for review June 27, 2025 16:06
@fisher-alice fisher-alice requested review from a team and molly-moen June 27, 2025 16:10
Comment thread lib/test/cdo/test_share_filtering.rb Outdated
ShareFiltering.should_filter_program(with_indicator, 'playlab')
end

def test_extract_text_blockly_handles_deeply_nested_json

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure these tests are necessary. They are basically just testing whether the max depth parameter works. Do they take a long time since they have to generate a long string?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran these locally and they didn't take long at all. I don't have a strong preference!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think they are necessary!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although if we add a state of "project too big" for a share failure they may make sense!

Comment thread lib/cdo/share_filtering.rb Outdated

# Texts will include field values, text values in block inputs, comments, and variable names.
json = JSON.parse(stripped)
json = JSON.parse(stripped, max_nesting: DCDO.get('share_filtering_blockly_json_max_depth', JSON_MAX_DEPTH))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this fails, we will fail to load the project, right? This is maybe a longer-term fix but I wonder if we should still allow the project to load, but not allow it to be shared?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or could we catch the error and just return that there was a share failure, something like "project too big"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah! I was actually thinking about allowing the loading. But not sure about not allowing it to be shared...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as an intermediate step having it load but not be shareable (if that is easy to do here) seems like the right call. If we can't parse it, we don't know if it's shareable.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense. I'll add a new FailureType for PROGRAM_TOO_LARGE.
Just noting that in activites_controller, we have a rescue for WebPurify::TextTooLongError and the program is still allowed.

if @level.game.sharing_filtered?
project_type = 'playlab'
begin
share_failure = ShareFiltering.find_share_failure(params[:program], locale, project_type)
rescue WebPurify::TextTooLongError, OpenURI::HTTPError, IO::EAGAINWaitReadable => exception
# If WebPurify or Geocoder fail, the program will be allowed, and we
# retain the share_filtering_error to log it alongside the level_source
# ID below.
share_filtering_error = exception
end
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe worth checking with Sam?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the commit that made programs too large not shareable: 2639841
However, I reverted since we checked with @samantha-code on Slack, and she said to let these programs still be shareable.

@fisher-alice fisher-alice requested a review from molly-moen June 27, 2025 18:41
begin
json = JSON.parse(stripped, max_nesting: DCDO.get('share_filtering_blockly_json_max_depth', JSON_MAX_DEPTH))
rescue JSON::NestingError
CDO.log.warn "ShareFiltering.extract_text_blockly: JSON too deep after #{JSON_MAX_DEPTH} levels"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for my information: where does this go?

@fisher-alice fisher-alice Jun 27, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not exactly sure! @bencodeorg I see you recently added a log in https://github.com/code-dot-org/code-dot-org/pull/65870/files. Do you know where we can access these afterwards?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought CloudWatch Logs? But that could be wrong.

@molly-moen molly-moen left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@fisher-alice fisher-alice merged commit 30fa6c8 into staging Jun 27, 2025
6 checks passed
@fisher-alice fisher-alice deleted the alice/json-parsing branch June 27, 2025 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants