App Lab - extract user text from source code before checking for profanity#70649

Merged: fisher-alice merged 3 commits into staging from alice/applab-libraries-moderation on Feb 5, 2026

@fisher-alice fisher-alice commented Feb 4, 2026

Contributor

This PR extracts user text from App Lab JavaScript source code before the text is sent to the profanity filter.
In App Lab, the only project code moderated by our current text moderation service (WebPurify) is code that is shared as a library with other students.

We have received repeated reports of false positives, where a project is flagged incorrectly, such as this one: https://studio.code.org/projects/applab/07b62fe6-797f-4c19-8193-645188f76389/edit

Before update

Screen.Recording.2026-02-04.at.5.19.23.PM.mov

This same example came up in a Slack thread a couple of years ago and was reported again recently in this Zendesk ticket: https://codeorg.zendesk.com/agent/tickets/575371

Although we added the offending word to the allowlist in WebPurify, the text is still being flagged. Apparently, adding a parenthesis affects the result of filtering against the allowlist.

When I tested the phrase with a parenthesis, ‘if(artistlist’, a violation was found; without the parenthesis, no violation was found.

Without parentheses: no violation found
[image: without-paren]

With parentheses: violation found
[image: with-paren]

Thus, I added a call to a function that extracts user text from the JavaScript source code, which includes removing parentheses from the text.
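To illustrate the idea, here is a minimal sketch of this kind of extraction. It is not the `extractTextFromCode` function added in this PR: the name `sketchExtractText`, the regular expression, and the rule of dropping one-character tokens are all assumptions made for illustration, and the sketch makes no attempt to treat comments or string literals specially.

```javascript
// Hypothetical sketch; NOT the extractTextFromCode merged in this PR.
// Assumption: every run of non-word characters is a separator, and
// one-character tokens (such as a loop index) carry no user text.
function sketchExtractText(code) {
  return code
    .split(/[^A-Za-z0-9_]+/) // parentheses, brackets, and operators become separators
    .filter(token => token.length > 1) // drops empty strings and single characters like `i`
    .join(' ');
}

console.log(sketchExtractText('if(artistList[i]) == artist'));
// prints "if artistList artist"
```

With the parenthesis turned into a separator, the filter sees `if` and `artistList` as separate words rather than the single token `if(artistList`.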

This is similar to the approach we took when we added moderation of open-ended K-5 project types (Blockly), where we had been receiving many reports of false positives because block ids were included in the code. See #67024, #66614.

Links

Testing story

Added unit tests.

I tested locally with the source code above, and this is the text returned by extractTextFromCode:

"artistInList - Takes an artist and returns the number of times the artist appears in the list artist string - music artist like "The Beatles" "Nirvana" "Taylor Swift" etc return number - the number of times the artist appears in the list artistsInYear - Takes in a year and returns a list of artists that released an album during that year year number - year an album was released return list - the list of artists that released an album during the given year test to see if it works test to see if it works test to see if it works console log artistsInYear 2028 console log artistsInYear 1993 test to see if it works test to see if it works test to see if it works console log artistInList "The Beatles" console log artistInList "Taylor Swift" RollingStone 500 Albums Artist RollingStone 500 Albums Year RollingStone 500 Albums Artist No artist found function artistInList artist var artistList getColumn var filteredArtistList for var artistList length if artistList artist appendItem filteredArtistList artist return filteredArtistList length function artistsInYear year var yearList getColumn var filteredArtists var artistList getColumn for var yearList length if yearList year appendItem filteredArtists artistList if filteredArtists length return else return filteredArtists"

Deployment strategy

Follow-up work

Privacy

Security

Caching

PR Creation Checklist:

  • Tests provide adequate coverage
  • Privacy impacts have been documented
  • Security impacts have been documented
  • Code is well-commented
  • New features are translatable or updates will not break translations
  • Relevant documentation has been added or updated
  • User impact is well-understood and desirable
  • Follow-up work items (including potential tech debt) are tracked and linked

@fisher-alice fisher-alice marked this pull request as ready for review February 4, 2026 23:30
@fisher-alice fisher-alice requested a review from a team February 4, 2026 23:30
Comment thread apps/src/utils.js Outdated
* @param {string} code JavaScript source code
* @returns {string} Extracted text content separated by spaces
*/
export const extractTextFromCode = code => {

Contributor

Could this function go somewhere more specific than a generic utils file? Maybe within the libraries folder for now, since it's only used there?

Contributor Author

That makes sense...will move!

Comment thread apps/test/unit/utilsTest.js Outdated
it('extracts multi-line comments', () => {
const code = '/* This is a\nmulti-line comment */\nlet x = 1;';
const result = extractTextFromCode(code);
expect(result).to.include('This is a');

Contributor

nit: should we just have these tests be something like expect(result).to.equal(<whatever the resulting string is>)? That would make it clearer what the util does.

Contributor Author

Updated! It was a good exercise for me to update to use equals 😁

@molly-moen molly-moen left a comment

Contributor

Nice!

const code = 'if(artistList[i]) == artist';
const result = extractTextFromCode(code);
expect(result).to.equal('if artistList artist');
expect(result).to.not.include('if(artlistList');

Contributor

nit: I don't think we need the not include when we have the to equal above

@@ -0,0 +1,82 @@
/**

Contributor

nit: I would name this file extractTextFromCode to make it easier to find.

@fisher-alice fisher-alice merged commit 54ed237 into staging Feb 5, 2026
5 checks passed
@fisher-alice fisher-alice deleted the alice/applab-libraries-moderation branch February 5, 2026 22:09

2 participants