Skip to content

Deprecating Firebase#56276

Closed
snickell wants to merge 160 commits into
stagingfrom
unfirebase
Closed

Deprecating Firebase#56276
snickell wants to merge 160 commits into
stagingfrom
unfirebase

Conversation

@snickell

@snickell snickell commented Feb 5, 2024

Copy link
Copy Markdown
Collaborator

Tracking issue for the firebase deprecation project: https://github.com/orgs/code-dot-org/projects/4

This PR will probably not be merged, but we're writing a fair bit of code to build a bulk importer etc that we'll probably throw away (see #55189 ). We could have opened a new repo, but it seemed reasonable just to keep it on this branch.

snickell and others added 30 commits November 1, 2023 13:01
…t block

Co-authored-by: Cassi Brenci <cnbrenci@users.noreply.github.com>
Co-authored-by: Cassi Brenci <cnbrenci@users.noreply.github.com>
Co-authored-by: Cassi Brenci <cnbrenci@users.noreply.github.com>
snickell and others added 27 commits January 29, 2024 21:16
Extract DatablockStorageKvp.set_kvps
This commit causes the firebase and datablock storage APIs to diverge a little.

Co-authored-by: Cassi Brenci <cnbrenci@users.noreply.github.com>
…rageBackend() which returns the right backend.

Co-authored-by: Cassi Brenci <cnbrenci@users.noreply.github.com>
@snickell snickell requested a review from a team as a code owner February 5, 2024 10:09
@snickell

snickell commented Feb 5, 2024

Copy link
Copy Markdown
Collaborator Author

This PR is a continuation of #54643, which was mis-closed by the Git LFS migration (#55759).

Previous Comments:

  • snickell commented - Another way of looking at this is not just where we use "firebase" in method names, but where do we actually use the firebase(TM) library. That's limited to two files:
image
  • snickell commented - This is the one function used outside of apps/storage/firebaseMetadata.js that I wonder if we could easily handle through SQL, it feels like a firebase "subscription".... is it?
export function onColumnsChange(database, tableName, callback) {
  getColumnsRef(database, tableName).on('value', snapshot => {
    const columnsData = snapshot.val() || {};
    const columnNames = _.values(columnsData).map(column => column.columnName);
    callback(columnNames);
  });
}

In applab.j... more

The only thing that looks like it might need subscription like functionality is onRecordEvent:

onRecordEvent("MyNewTable", function(record, eventType) {
  console.log(eventType);
  console.log(record);
});
  • All of our APIs update one record at a time, using the record ID, so its easy on the backend to know exactly what change message to publis... more
  • snickell commented - Idea: could we /just/ use redis? Like actually use that as our DB?
  • snickell commented - Discussing with @cnbrenci we naively might consider four approaches:
  1. Back our data feature with MySQL and deprecate support for onRecordEvent (see question 1)
  2. Back our data feature with MySQL, applab projects hold an open websocket (or poll???), and use a redis pubsub channel to broadcast (tableId, recordId) change events (or potentially the whole row value, either way), and then rails backend to the websocket sends the new row data to the client along with a changeEvent
  3. Like store sprite.value again, use it properly in setSpriteSize #2,... more
  • snickell commented - Per first convo with product, we may be willing to just deprecate onRecordEvent, breaking existing apps that use it. Conventional wisdom is that its mainly used by chat apps that are technically not allowed under our terms of service anyway, which might explain some of the high cost issues with firebase.

I'd personally like to see how widely used the onRecordEvent block is in Applab projects that have been used/run/loaded in, say, the last 3 months. Ideally used would mean loaded even jus... more

require_relative './block_finder'
require 'active_support/time'
blockFinder(blockName: 'onRecord', game: 'applab', lastActiveAfter: 90.days.ago.to_date)

@cnbrenci a... more

  • snickell commented - Say we start by writing a script to determine percent_active_applabs_using_on_record_event():
def sample_random_student_source(game, last_active_after)
  # TO IMPLEMENT
end

def source_contains_block(block_type)
  # TO IMPLEMENT
end

def days_to_seconds (days)
  return days * 24 * 60 * 60
end

def percent_applabs_actively_using_on_record_event()
  num_samples = 1000
  num_samples_using = 0
  for i in O.num_samples do
    source = sample_random_student_source(game: appla... [more](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1800824525)
- **snickell** [commented](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1803024430) - The "Projects" table (project.rb, Project model) has all the data we need:
- The projects table includes both updated_at and project_type="applab"
- the ID column of the Projects table is the "storage_app_id", which is the SECOND number in the S3 paths
- the storage_id column of the projects table is the FIRST number in the S3 paths

So we can go:

project = Project.where(project_type: 'applab', updated_at: 30.days.ago...).take!
s3_path_prefix = "sources" # or in dev: "sources_develop... more

  • cnbrenci commented - > oject = Project.where(project_type: 'applab', updated_at: 30.days.ago...).take!

s3_path_prefix = "sources" # or in dev: "sources_development"
s3_path = "#{s3_path_prefix}/#{project.storage_id}/#{project.id}/main.json"
s3_bucket = "cdo-v3-sources"
source_code = AWS::S3.download_from_bucket(‘cdo-v3-sources’, s3_path)

Confirmed that this does work (I successfully downloaded the source)

  • cnbrenci commented - open Q to look into: can Project.state be used to determine if the project is active rather than updated time? (I see in my dev env all my projects are state='active')
  • cnbrenci commented - First pass on full script after everything we figured out yesterday.

The last step would be we need to run it against the prod dashboard DB with the "sources" s3_path_prefix instead of "sources_development"

#!/usr/bin/env ruby
require_relative('../config/environment')

# Gets a sampling of applab source code from s3
# computes the percentage of those sources that use the onRecordEvent block

def percent_projects_actively_using_block(project_type, block_type, last_active_afte... [more](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1804922481)
- **snickell** [commented](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1805128299) - Nice!! Curious what percent we're gonna end up with 🤞 when we run on production.

- We should probably use random samples rather than whatever MySQL happens to give for a default ordering. I generally avoid using DB-specific SQL, but since we're just a one-off script.... `Project.where(...).order('RANDOM()').take(sample_size)` (https://www.geeksforgeeks.org/sql-select-random/). This does require the DB to internally walk the entire set of matching rows, join with a random number each, and then... [more](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1805128299)
- **snickell** [commented](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1809461422) - We sampled the prod DB to find instances of the `onRecordEvent` block being used in applab projects. Where we found instances, we read the code to see if they were broken or TOS violations.

TL;DR: **0.002% of applab projects (that aren't stubs or chat apps) created in the last 2 months use the `onRecordEvent` block, with a sample size of 100k.**

1. Sampling applab projects created in the last 2 months, 6 out of 100000 applab projects created in the last 2 months used `onRecordEvent`. We re... [more](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1809461422)
- **snickell** [commented](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1813699715) - Product noticed (TY @moneppo) we hadn't considered that students use different blocks throughout the year as they complete different curriculum. 

So we went back to the sampling, and grabbed 105k samples covering all of 2022...

We scanned 2022 (samples from each month), and **we found 3 "valid" apps in 2022 out of 105k samples (=0.003%, about the same as the "last two months" sample)**. "Valid" meant apps that weren't either chat apps, or stub code (e.g. "drag some blocks out"). We found 1... [more](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1813699715)
- **snickell** [commented](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1815763762) - We couldn't figure out how people were writing gamelab based chat apps. I found the "secret" gamelab APIs thanks to Hannah Nees giving me a great sample of "banned chat apps", including samples from Gamelab. These use "setKeyValue" and "getKeyValue" functions.... which aren't exposed as blocks (?!?).

Relevant code is here: https://github.com/code-dot-org/code-dot-org/blob/e6905b0c899383ca2c7e43885f847cedf84478b2/apps/src/p5lab/gamelab/commands.js/#L25-L47

The PR that added them is here: ht... [more](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1815763762)
- **davidsbailey** [commented](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1815816682) - > @davidsbailey any chance you remember why we were adding these APIs without adding them as blocks?
 
 no, sadly I do not think I was involved in this decision.
- **snickell** [commented](https://github.com/code-dot-org/code-dot-org/pull/54643#issuecomment-1899550847) - - Next we should start implementing more methods from the firebase interface we extracted, and get corresponding tests passing.
 - We should also consider adding tests that actually create student source using the data blocks, and thereby test end to end, from JS api through the Rails controller. I think these tests, while a little more bulky, will actually catch much more and more helpful potential breakages in the future.

### Previous Reviews:

@snickell snickell closed this Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants