GitHub - codeque-co/codeque at 58d4fcbd300f8d6b3357f77ba93e97103b98ee8b

MVP CLI

Names:

magic search (more catchy for search)
code query (the best for set of tools)
code magic (taken :/)
magic code search (kinda too long)
Quecode
- quecode.dev if free
CodeQ
CodeQue
- codeque.dev if free

✅ Fix bug with <$>$</$>; matching too much - JSX text wildcard acts like $$ o.O

✅ restrict more than 2 wildcards on query parse level

✅ Adjust formatting of multiline code that is starting after some tokens

✅ Make CLI a product

✅ codeframe from babel
✅ investigate results formatting query :<Text $="ellipsis" ></Text>
- how we can present original code instead of generated one
- ✅ fix problem with 0 padding
✅ commander
✅ spinner while search
✅ results limit param
✅ convenient multiline input
- ✅ find better tokenizer (fixed js-tokens)
✅ file path query
✅ runs in cwd

❌ Explore types matching and types literals -> tests on custom file

❌ Bug with JSX text compare -> found only once instead of two

<Text
  textAlign="center"
  mb={4}
  fontWeight={600}
  lineHeight={1.3}
  fontSize="xl"
>
  Please complete the short and completely confidential personality
  questionnaire.
</Text>

❌ Try to make type declarations optional in include mode, right now if code has types, eg return type, it cannot be found without it

❌ Add search by files changed since last commit

✅ Investigate why node_modules search does not work

❌ add tests for block statement search (queue example in catch block and function block)

✅ parse errors should not crash whole search

✅ see if async can speed up files search

✅ Market research on eslint and babel auto plugins

https://eslint.org/docs/rules/no-restricted-syntax - one must know ASTs
Didn't find any useful resource for eslint, assuming that none exist for babel either

✅ Try

tsc --extendedDiagnostics - how long stuff took and how big/complex it is
tsc --listFiles - list of every file included in the compilation
tsc --explainFiles - list of every file and why it's included
https://github.com/amcasey/ts-analyze-trace

❌ add search by dependencies using dpdm

✅ Think if we can solve problems with exports using rev-dep

how to get root files in rev-dep
how many roots it would find
- root can be an orphan
look for dependencies analysis extensions
how we could use rev-dep in vscode ext
I NEED babel-plugin-undo-reexport
- done :D

✅ ensure create debug and other trash is not in pkg, install fresh pkg on linux to save space

✅ Bug with code generation for <$ $={`${$$}`} />; // use-case: probably redundant template literal (value can be not a string)

implement tests for this
implement tests for <$ $={`fdsgg`} />; // use-case: redundant template literal
template literal seems to not work properly in exact mode

✅ Support wildcards in JSXText

✅ Support for case insensitive search

only for wildcards for now
actually it might be easy, we should check if primitive value is string
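
A minimal sketch of that check, assuming a hypothetical helper `comparePrimitives` (the name and signature are not taken from the codebase): case folding applies only when both values are strings, everything else falls back to strict equality.

```typescript
type Primitive = string | number | boolean | null | undefined

// Hypothetical helper: compares a value from the query with a value from the file.
// Case is ignored only when both values are strings.
function comparePrimitives(
  queryValue: Primitive,
  fileValue: Primitive,
  caseInsensitive: boolean,
): boolean {
  if (
    caseInsensitive &&
    typeof queryValue === 'string' &&
    typeof fileValue === 'string'
  ) {
    return queryValue.toLowerCase() === fileValue.toLowerCase()
  }

  return queryValue === fileValue
}
```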

✅ Support json

✅ Bundle/minify/obfuscate

❌ Invent / Implement license mechanism

✅ try webassembly
✅ Cleanup rust code
✅ cleanup rust deps
✅ cleanup build chunks
✅ add wasm files to package script
✅ obfuscate wasm identifiers
RSA-SHA256 License
- ability to renew seamlessly
- ideally can validate license with public key, but create with private key
- non-ideally - both keys (or one common key) are used to create and validate
- license is stored on device and not removed on update
- ✅ consider calling RSA crypto from rust
  - key can be stored on js or rust
  - result can be obtained in rust
    - to intercept attacker would have to modify / override crypto impl
    - Consideration result: Let's use AES for now and stop overthinking this security :D it might not be the issue if software would not sell well
- investigate how to integrate tools like Paddle, Stripe, Gumroad, Ko-fi for payments / memberships
temp key: dSgVkXp2s5v8y/B?
❌ local license key store
- to survive lib update
- to survive vscode update
- to survive vscode ext update
- need to save in user home directory
  - need to find package to handle that
    - we can use os.homedir() and .codeque file
💡 cli set license
cli authorize via github (later)
One key can be shared between many users -> company key does not make sense if they don't want to use eslint
- maybe we can generate key with device footprint, then we could validate footprint
  - footprint could be embedded in signature, so we need footprint either from JS or from Rust
  - footprint could be sha2 from some os properties - this could be reliable
- one license = up to 3 active footprints
- can we use sha identity for this?
  - we can, it would be useful for CI servers
  - for human users we would require to sign in with github, which would return license key
    - some can still figure out that key is stored locally and they can copy it
      - if we replace key frequently, that wouldn't be worth cheating
      - footprint is a good idea
  - sha for CI could be used by some users to get access to search for many ppl of the company
    - sha access could be granted for 1h or some short-period of time
  - license key is same as it is now
- how we can register footprint from CI server running eslint checks, would that be stable ?
Each account could have its own .wasm generated with custom key
- problem with versioning/updates
- we could generate wasm build on the fly if needed
- wasm build could be loaded async and cached
  - It's async already
- act like a 2 factor auth. needs matching key and lock
  - it would kind of secure flaky AES on wasm
- how we would fetch proper .wasm ?
  - organization id + user id/email + npm pkg version
  - anyone can (not easily) fetch some .wasm
  - if someone fetches .wasm, they need to decompile it to get AES key
  - if they have AES key and .wasm they can generate key and use software
  - decompile of .wasm to get AES would be different for every user and version
    - harder than just one AES key for all versions and users
- cost of generation of .wasm assuming 10k customers and 5 minutes per build and 512RAM ~ $25 // 0.0000000083 * 1000 * 60 * 5 * 10000
  - assuming we have container with rust installed - should be possible - need PoC
✅ Each version/build to have different AES key?
- what are the implications ?
  - user would have to change key with each new version (we can add postinstall step)
  - we could verify if user even can have key for this new version (safer than checking dates on local machine)
  - a key still can be shared among many ppl, but due to updates (auto updates in vscode!) it would be frequently replaced
    - if we add fingerprint that would be safe enough, cannot easily copy-paste key
  - we could do nightly builds to force to replace key more often
  - each key get request would give you new, one-time refresh token
    - impossible to share refresh token with others
- harder to generate fake license (needs to disassemble the build to get the key every time)
- what's the purpose of generating this key if we would have to use refresh token to get it?
  - software features are locked until you get the key
  - having a refresh token does not mean that you will be able to get a key (might have outdated account)
What if we would generate key on user device
- we would have to generate .wasm on demand
✅ Will partial .wasm impl be maintainable?
- let's not overcomplicate wasm part
  - some really greedy cheaters would just lose their time
  - blocking key copy-paste is good enough - we will use fingerprint
- maybe we should build just JS on demand in the cloud?
- ✅ how we differentiate operations like search, eslint, replace on wasm side and still having nice API
  - remember codeQue can be used as a npm module
  - wasm would have to control the flow of the program - pain in the ass ?
    - too much work
  - we would have just different functions to do different things
    - maybe we can somehow pass current stack trace to authorize xD ?
    - if someone would try to overuse regular license to have company/project features - we don't care
      - we can obfuscate license checks, so it's harder to use "search" check in place of "eslint" check
License v1.0 - alpha
- shared AES key and on demand 6 months license gen
License v1.1 - beta
- each release changes the AES key
- license generated using account on server (auth via github)
- device fingerprint
- each license key valid for device & version, github auth/my server refresh token to refresh key
License v2 - with version for companies (eslint etc)
- fingerprint
- unique AES key for each organization/user
- cloud builds of .wasm
- sha keys for CI
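
A rough sketch of the fingerprint idea from the notes above: a SHA-256 digest over a few OS properties, plus the "up to 3 active footprints" rule. Which OS properties are stable enough is still open, so the `DeviceProps` fields are an assumption, and all names here (`fingerprintFrom`, `canActivate`) are hypothetical.

```typescript
import { createHash } from 'node:crypto'
import * as os from 'node:os'

// OS properties used as fingerprint input (assumed set, not validated for stability).
type DeviceProps = {
  hostname: string
  platform: string
  arch: string
  cpuModel: string
}

function readDeviceProps(): DeviceProps {
  return {
    hostname: os.hostname(),
    platform: os.platform(),
    arch: os.arch(),
    cpuModel: os.cpus()[0]?.model ?? 'unknown',
  }
}

// Deterministic SHA-256 hex digest of the device properties.
function fingerprintFrom(props: DeviceProps): string {
  const input = [props.hostname, props.platform, props.arch, props.cpuModel].join('|')

  return createHash('sha256').update(input).digest('hex')
}

const MAX_ACTIVE_FINGERPRINTS = 3

// A device may use the license if it is already registered
// or if the license still has a free slot.
function canActivate(activeFingerprints: string[], fingerprint: string): boolean {
  return (
    activeFingerprints.includes(fingerprint) ||
    activeFingerprints.length < MAX_ACTIVE_FINGERPRINTS
  )
}
```

On a real device the input would come from `fingerprintFrom(readDeviceProps())`; the list of active fingerprints would have to live on the license server.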

To release vscode ext

✅ figure out how to store key in user home dir
❌ implement storing key in home dir
❌ implement module API
❌ release npm alpha pkg protected by AES key
❌ vscode ext implementation
- list features like
  - select code to search
  - include / exclude files dirs
  - mimic normal search
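
Storing the key in the home dir could look roughly like this. It follows the `os.homedir()` + `.codeque` file idea from the license notes; the plain-text file format and the function names are assumptions. `baseDir` is a parameter only so the sketch can be tried against a temp directory.

```typescript
import * as fs from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'

// File name taken from the notes; storing the key as plain text is an assumption.
const LICENSE_FILE_NAME = '.codeque'

function getLicensePath(baseDir: string = os.homedir()): string {
  return path.join(baseDir, LICENSE_FILE_NAME)
}

function saveLicenseKey(key: string, baseDir?: string): void {
  // mode 0o600: readable and writable by the current user only
  fs.writeFileSync(getLicensePath(baseDir), key.trim() + '\n', {
    encoding: 'utf8',
    mode: 0o600,
  })
}

// Returns null when no key was stored yet.
function readLicenseKey(baseDir?: string): string | null {
  const licensePath = getLicensePath(baseDir)

  if (!fs.existsSync(licensePath)) {
    return null
  }

  return fs.readFileSync(licensePath, 'utf8').trim()
}
```

Because the file lives outside the package and the extension directories, it is not touched by lib, vscode, or vscode ext updates.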

❌ PoC / Implement vscode extension - mostly to understand how to license

MVP needs to be vscode extension, cli is not convenient for users

✅ Add support for proposal syntaxes

✅ Add support for multiple wildcards

($$, $$) => {} is invalid while parsing function
$_refN - currently without ref analysis
$$_refN - currently without ref analysis

✅ Implement tests

✅ Add literal wildcards

string literal cannot be replaced with identifier in some scenarios eg import
we should be able to always use identifier wildcard in place of number
we still need number wildcard for some cases (we want to have number, not any identifier)

✅ Add support for regexp identifier matches (on$ -> onClick, onHover etc)
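
A sketch of how such a pattern could be turned into a regex. It assumes each `$` stands for one or more arbitrary characters, which may differ from the actual matching rules; `wildcardToRegExp` is a hypothetical name.

```typescript
// Escapes regex metacharacters in a literal fragment ('$' never reaches this
// function as a wildcard, because the pattern is split on it first).
function escapeRegExp(fragment: string): string {
  return fragment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Assumption: each '$' in the pattern matches one or more characters.
function wildcardToRegExp(pattern: string): RegExp {
  const source = pattern.split('$').map(escapeRegExp).join('.+')

  return new RegExp('^' + source + '$')
}
```

For example `wildcardToRegExp('on$')` accepts `onClick` and `onHover`, but not `icon` or a bare `on`.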

✅ Better handling of query errors

return outside a function
await outside async fn
explore parse result errors

✅ Regex matching of identifier seems to be slow

✅ one perf issue was caused by prettier - fixed!
double the time on mac for "import { $Plus } from 'react-icons$'"
maybe instead of "." regex we could be more specific
✅ it might be caused by lack of keywords for initial search
- try to use keywords regexes in tokens search
- ✅ try to escape "$" from tokens - should be faster than several regex
- try to use language keywords like import, for,as
  - might not help much

✅ improve query parsing

first try to parse without brackets, then add brackets and parse once again
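
The two-step parse can be sketched like this. `parse` is injected to keep the sketch parser-agnostic (in practice it would wrap the real parser call and must throw on invalid code); `parseQuery` is a hypothetical name.

```typescript
// Any function that returns an AST or throws on invalid code.
type Parse<Ast> = (code: string) => Ast

function parseQuery<Ast>(code: string, parse: Parse<Ast>): Ast {
  try {
    return parse(code)
  } catch (firstError) {
    try {
      // e.g. `function () {}` is invalid as a statement,
      // but `(function () {})` is a valid expression
      return parse('(' + code + ')')
    } catch {
      // Report the error for the code the user actually typed
      throw firstError
    }
  }
}
```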

✅ Add support for nested gitignore

✅ Do benchmark (done)

mac 1.4s
desktop 2.6s
laptop 4.5s

✅ Do profiling

maybe we can optimize by identifiers search
- there is probably an optimal number of identifiers to search for: a few save time, but if we search for too many, we will lose time
- just one identifier is a good starting point
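
A sketch of the single-identifier pre-filter, assuming case-sensitive search: pick one wildcard-free identifier from the query and skip files that do not contain it as a substring. Function names are hypothetical.

```typescript
// Picks the longest identifier-like token in the query that contains no
// wildcard. Returns null when the query offers nothing to filter by.
function pickSearchToken(query: string): string | null {
  const tokens = query.match(/[A-Za-z_$][A-Za-z0-9_$]*/g) || []
  let best: string | null = null

  for (const token of tokens) {
    if (token.indexOf('$') !== -1) continue // wildcard, cannot be matched literally
    if (best === null || token.length > best.length) best = token
  }

  return best
}

// Cheap substring check that runs before the expensive parse and AST compare.
function mayContainMatch(fileContent: string, query: string): boolean {
  const token = pickSearchToken(query)

  return token === null || fileContent.indexOf(token) !== -1
}
```

A `false` result means the file can be skipped without parsing; `true` only means the full search still has to run.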

Get files edited since last commit echo $(git diff --name-only HEAD)
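
The same command called from Node, as a sketch (function names are hypothetical). The parsing is kept separate from the git call so it can be tried without a repository.

```typescript
import { execFileSync } from 'node:child_process'

// Splits the output of `git diff --name-only` into a list of file paths.
function parseChangedFiles(stdout: string): string[] {
  return stdout
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
}

// Runs `git diff --name-only HEAD` in the given directory.
function getFilesChangedSinceLastCommit(cwd: string): string[] {
  const stdout = execFileSync('git', ['diff', '--name-only', 'HEAD'], {
    cwd,
    encoding: 'utf8',
  })

  return parseChangedFiles(stdout)
}
```

Note that `git diff` does not list untracked files, and the paths it prints are relative to the repository root, not to `cwd`.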

❌ Notion this Readme !

❌ Think of strategy

1st make a tool and test it among friends and Dweet
2nd start youtube channel / blog / your other media here and speak about tooling, bundling etc
- make a list of videos with ToC that I would like to record

Further product development

💡 Feature import-based search

search in file and all files imported by a file
eg. your test failed
- you search for test based on name
- you specify a query to find failing code patterns in files imported by test

💡 Think of negation syntax and sense (just to make it future proof for now)

could be something like: $not('asd')
it might execute 2 (or more) searches and filter results if there are 2 the same

💡 Think of and, or syntax and sense (just to make it future proof for now)

could be something like: $and('asd', $not(() => {}))
jsx excluding some prop: $and(<somejsx>, $not(<somejsx prop={$$}/>))
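
One way to read "execute 2 (or more) searches and filter results": run each sub-query separately and combine the result lists by match location. A sketch under that assumption; the `Match` shape and function names are hypothetical.

```typescript
// Minimal match shape for the sketch; a real search result would carry more data.
type Match = { file: string; start: number; end: number }

// Two matches are "the same" when they cover the same range of the same file.
const matchKey = (match: Match): string =>
  `${match.file}:${match.start}:${match.end}`

// $and: keep matches that were found by both queries.
function andMatches(left: Match[], right: Match[]): Match[] {
  const rightKeys = new Set(right.map(matchKey))

  return left.filter((match) => rightKeys.has(matchKey(match)))
}

// $not: keep matches of the positive query that the negated query did not find.
function notMatches(positive: Match[], negated: Match[]): Match[] {
  const negatedKeys = new Set(negated.map(matchKey))

  return positive.filter((match) => !negatedKeys.has(matchKey(match)))
}
```

With this reading, the JSX example becomes `notMatches(search('<somejsx>'), search('<somejsx prop={$$}/>'))`, where `search` stands for the existing search function.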

💡 Think of support for ref matching

user should be able to indicate that two wildcards are the same identifier
eg. const $_ref1 = 'string'; call($_ref1)
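
A minimal sketch of the bookkeeping this needs: the first occurrence of a ref binds it to an identifier name, later occurrences must resolve to the same name. `RefBindings` is a hypothetical class, and the sketch compares names only, without any scope or shadowing analysis.

```typescript
// Tracks which identifier each `$_refN` wildcard matched within one candidate match.
class RefBindings {
  private bindings = new Map<string, string>()

  // Returns true when `identifier` is consistent with earlier matches of `ref`.
  match(ref: string, identifier: string): boolean {
    const bound = this.bindings.get(ref)

    if (bound === undefined) {
      // First occurrence: remember what the ref points at
      this.bindings.set(ref, identifier)
      return true
    }

    return bound === identifier
  }
}
```

For the query above, `const a = 'string'; call(a)` passes, while `const a = 'string'; call(b)` fails on the second `$_ref1`.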

💡 Add query extensions

$type() - to create type matcher
- can be only used top-level
$exact(), $include(), $includeWithOrder() - to change mode in given code path
- <$ $={() => {}} /> will match functions with body, which we don't want
$fn(() => {}) - alias for 3 types of function definition
- effectively executes 3 queries
It might be useful to search for expressions within nested structures inside functions to make it more useful
- it might need special operator like $nested()
$jsx() - for jsx tags when children can be ignored
- executes 2 queries, one for self-closing and one for non-self-closing tags

💡 Think of other use cases for the matching functionality (call the whole product code-magic)

should the product be a licensed cli ?
vscode search extension
- other editors extensions (how to, which languages)
- Webstorm
  - Java, but can execute JS somehow - need more reading
cli search - why not
standalone desktop app
eslint plugin restricted syntax
- check in autozone if custom plugins could be replaced
- check which of the existing plugins could be replaced
- plugin should have reference analysis (user should be able to mark that two identifiers should be the same, eg using $_ref1)
- there might be a problem to show error in specific line where it happens, since we usually need to outline more context in query to capture the problem.
  - eg. prop in JSX, you need to add some JSX code
  - we need a way to mark the place where the error is actually
  - maybe there should be 2 code queries, one for the whole pattern with error, and second, smaller, to highlight error in query itself
  - it could potentially be a comment like
```
  function(param) {
    someCode // code-que-error-here
  }
```
    As we are formatting agnostic, we can break up the code so that the actual problem is on one line (or on multiple lines, using many comments)
- Market research
automated codemod - this one needs a PoC
- check some codemods
- program should be able to get diff of AST
- 3 steps
  - implement query
  - implement transformed query -> generate AST diff and use it as a transform (try use json-diff with removed misc keys)
  - show example result
predefined codemod snippets to apply on file
- eg. transform props into 1{prop1, prop2} based on which keys are used
- a) it could be eslint plugin / no need for code-magic for that
- b) it might be impossible to implement with current approach to codemod
for codemod and eslint we need to be able to reference a variable by identifier, to be able to track references for more complex cases
track duplicated code - how (eg. pattern to match all DB queries, then exact compare of AST)
- this could be integrated into editor, so it could search duplicates as you type code
- predefined patterns to find in current file
- if pattern is found in given file, search for exact code in other files
metrics: project has 1000 DB queries, project has 3000 react components
check what SonarQube can measure
tool like rev-dep could be part of code-magic toolset
- think how it could improve refactoring
- it should not only resolve imports, but references in code as well, so it would be more accurate (should resolve like stack trace)
- it helps with
  - refactoring & finding all refactored views to test them
  - saves a lot of time spent on manual references lookup
Feature: get all values of given property
- eg. to assert unique test-ids across all files
Feature import-based search
- search in file and all files imported by a file
- eg. your test failed
  - you search for test based on name
  - you specify a query to find failing code patterns in files imported by test
Feature - get unique values of $_ref/$$_ref in query
Feature: ast-based diff to outline what actually changed in code logic
- needs more reading on how to integrate that into git
Tool : "Import hygiene" - dependency graph summary and statistics, assertions
- need research on how it could improve codebase on daily basis
- need research how to present information in consumable way
- sort imports to solve css ordering problem
- make assertions to not import certain file in certain paths
  - like file with api keys that should only be on server side
  - a given file should have list of allowed entry points
- some data on how convoluted and hard to maintain your dependency graph is
  - mostly for myself so I know if project is in a good shape
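
The "allowed entry points" assertion could be sketched on top of a plain import graph like this. How the graph gets built (rev-dep, dpdm, or something else) is out of scope here, and all names are hypothetical.

```typescript
// Import graph: file path -> files it imports directly.
type ImportGraph = Record<string, string[]>

// Collects every file reachable from `entry` by following imports.
function reachableFrom(graph: ImportGraph, entry: string): Set<string> {
  const visited = new Set<string>()
  const stack = [entry]

  while (stack.length > 0) {
    const file = stack.pop() as string

    if (visited.has(file)) continue // also protects against import cycles
    visited.add(file)

    for (const dependency of graph[file] ?? []) {
      stack.push(dependency)
    }
  }

  return visited
}

// Returns entry points that reach `restrictedFile` without being on the allow list.
function findForbiddenEntryPoints(
  graph: ImportGraph,
  entryPoints: string[],
  restrictedFile: string,
  allowedEntryPoints: string[],
): string[] {
  return entryPoints.filter(
    (entry) =>
      !allowedEntryPoints.includes(entry) &&
      reachableFrom(graph, entry).has(restrictedFile),
  )
}
```

An empty result means the assertion holds, e.g. a file with api keys is reachable only from server-side entry points.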

💡 Add support for suggestions based on equivalent/similar syntax

user input: <$ prop={"5"} />, suggestion: <$ prop="5" />
user input: <$ prop={$+$} />, suggestion: <$ prop={$-$} />

💡 Add hints based on first node

user input: {a:b}, hint: You probably need ({a:b}), right now it is a block statement
user input: "some string", hint: You probably need ("some string"), right now it is a directive
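
A sketch of how these two hints could be derived from the parsed query. `ProgramLike` is a minimal structural subset of a Babel `Program` node (statements in `body`, directives in `directives`); `getQueryHint` is a hypothetical name.

```typescript
// Only the fields the sketch needs from the parsed program.
type ProgramLike = {
  body: { type: string }[]
  directives: { type: string }[]
}

function getQueryHint(query: string, program: ProgramLike): string | null {
  const wrapped = '(' + query.trim() + ')'
  const firstStatement = program.body[0]

  if (firstStatement !== undefined && firstStatement.type === 'BlockStatement') {
    return 'You probably need ' + wrapped + ', right now it is a block statement'
  }

  // A lone string literal is parsed as a directive, so the body stays empty
  if (program.body.length === 0 && program.directives.length > 0) {
    return 'You probably need ' + wrapped + ', right now it is a directive'
  }

  return null
}
```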

💡 To secure the code we should

verify license in WASM
implement parts of the algorithm in WASM
implemented parts do not work if license is not verified

💡 Add support for flow

Probably needs a refactor similar to different language refactor
- maybe we could look for @flow comment and configure babel based on that

💡 Pricing

if fingerprinting is added, each seat can have 3 fingerprints
Free only exact mode, no wildcards, no other features
- code stats
Paid $19 / year (dev)
- search with all features
- code stats
- exclude replace, ref analysis, import resolution
Paid $29 / year (pro)
- search + replace + ref analysis + import resolution
- code stats
Company/project $29 / month
- up to 10 users (+$3 for each additional user)
  - limiting users amount does not make sense, since we cannot validate that
  - we can if each user would have unique key + on CI we would use ssh identity to receive license key
- all the above + eslint rules

💡 Product website

home
- landing
- showcase
- pricing
docs
playground
examples

⌛ Code smells check script

more than 5 if statements in the block
spread overuse
react literal prop values
some others based on common eslint rules

⌛ Marketing Implement stats script and encourage ppl to share their results on Twitter

N files
N JS/TS files
N import statements
N require statements
N string literals
N empty strings
N zeros
N functions
N arrays
N objects literals

⌛ Marketing use-cases for search

You want to find how a component is used across the codebase to see examples without going to docs
You want to track where the piece of code is duplicated across the codebase
You spot a code pattern that can cause issues (eg. react falsy event listener) and you want to check where else it is used
You want to check places where a component with specific set of props is used while refactoring to test changes properly
You are curious about usage statistics of some patterns in the codebase (count of components, functions)
- there could be predefined set of measurements to run
More specific
- You are adding i18n and you want to check where in codebase a specific text string is used

⌛ Marketing use-cases for eslint rule

⌛ Marketing use-cases for eslint codemod

"Pay tech debt quicker"
after changing prisma data model we can find interface/class for changed entity and adjust fields

💡 Variable's binding tracking needed for refactoring (replace)

don't compare scope for non $_ref identifiers
check scope only for refs
- find a way to track scope of the ref
  - read about how to find shadowed identifiers
  - https://github.com/jamiebuilds/babel-handbook/blob/master/translations/nl/plugin-handbook.md#toc-scopes
- remember scope of the first ref occurrence
- match ref for every identifier in subtree
- mark if identifier is redeclared
  - create scope analysis object
- don't start scope analysis if identifier is not declared only once in query
- if there is identifier redeclaration that is not a part of query stop checking further
to start working on it we need to have multi body statements queries (extensively considered in search & replace test file)

Some use cases for replace fn

From

import React from 'react'

$nested(
  React.useState($$)
)

To

import React, {useState} from 'react'

$nested(
  useState($$)
)

From

<Box>
  <Inner prop="val"/>
</Box>

To

<Inner prop="val"/>

From

<>
  {isMobile && (
    <SetInitialPageType
      pageType={pageType}
      setPageType={setPageType}
    />
  )}
</>

To

<>
  <SetInitialPageType
    pageType={pageType}
    setPageType={setPageType}
    isMobile={isMobile}
  />
</>

AST builder and AST finder - JScodeshift helpers

https://rajasegar.github.io/ast-finder/
https://rajasegar.github.io/ast-builder/
corresponding npm packages,
not used, around 0 downloads

/**
 * How to replace?
 * 
 * 1. We have to treat each body item in query as a sub-query, join them with logical 'AND'
 *    - sub-queries would be required only for non-exact mode
 *    - we should treat all nested block statements similarly to sub-queries
 *      - we cannot rely on index number e.g to delete a node from body, since actual file body might have more elements than query
 * 2. For each sub-query we generate hash, and we add that hash into file AST to mark node as "to replace" and link to the sub-query
 * 3. For each replacement sub-query we generate the same hash (hash has to be deterministic)
 * 4. We generate a diff for each sub-query
 *    - diff should be agnostic to custom matchers ($nested, $jsx etc.)
 *        - think of cases where it wouldn't be, maybe some $or $and $not ??
 *        - there should be a rule that you cannot change custom matchers in the "replace"
 *            - same applies to wildcard matches
 *              - maybe somehow we could support replacements in partial wildcards
 *                - like `some-path/$` to `new-path/$`
 *                - that's for later, can be done with regex, seems not be often used 
 *    - There are problems
 *       - what if a given subquery would be totally removed?
 *          - maybe we should have remove()
 *       - what if a given subquery is totally new?
 *          - maybe we should have $add()
 *       - similar problem occurs in nested block statements
 * 5. Once we have a diff (to add, to delete, to update) we should find node in file with hash matching the subquery hash
 * 6. We start to traverse the code to apply diff changes
 *    - since we are based on the diff we don't touch any props that are outside of the search match
 *    - we should take into consideration, that there would almost always be nodes nested to our query
 *      - eg. nested JSX in our diff which we cannot remove, some object expressions
 * 
 * Important note: Implementing replace for non-exact mode would be super hard
 *   - we would need custom diff algorithm to detect removal of some intermediate nodes
 *     - eg. removing <Box> from <Flex><Box><Text>Abc</Text></Box></Flex>
 *     - if <Text> contains some additional props not listed in query, we would just remove them with deep-object-diff based approach
 *     - need to think of how to reconcile that kind of change
 *     - needs custom diff algorithm that would traverse removed path to find a node with matching shape
 *       - matching shape means matching identifiers ??? maybe similar to how validateMatch works for different types
 *       - maybe it should run another deep-object-diff if it would find node with matching type
 *         - that could work 
 *         - we could call it 'replace with linked node' or 'short circuit node' or 'remove intermediate node'
 *     - case with removal of an if () {} but keeping part of block content. 
 *        - we should handle that so we need to check if removal was done in block context (or maybe more generic in nodes array)
 * Next steps 
 * 1. Implement a PoC with just one sub query for exact mode
 */
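
A sketch of the deterministic hash from steps 2 and 3: serialize the sub-query AST with sorted keys and without position-related keys, then hash the string. The `MISC_KEYS` list is an assumption, and the FNV-1a style hash is used only because it is short and deterministic; any stable hash would do.

```typescript
// Keys that describe source position or formatting rather than code shape (assumed list).
const MISC_KEYS = new Set(['loc', 'start', 'end', 'range', 'extra'])

// Serializes a node with sorted keys and without misc keys, so the same
// code shape always produces the same string.
function canonicalize(node: unknown): string {
  if (node === undefined) return 'undefined'
  if (Array.isArray(node)) return '[' + node.map(canonicalize).join(',') + ']'

  if (node !== null && typeof node === 'object') {
    const record = node as Record<string, unknown>
    const entries = Object.keys(record)
      .filter((key) => !MISC_KEYS.has(key))
      .sort()
      .map((key) => JSON.stringify(key) + ':' + canonicalize(record[key]))

    return '{' + entries.join(',') + '}'
  }

  return JSON.stringify(node)
}

// 32-bit FNV-1a style hash over UTF-16 code units, returned as 8 hex chars.
function hashSubQuery(node: unknown): string {
  const text = canonicalize(node)
  let hash = 0x811c9dc5

  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0
  }

  return hash.toString(16).padStart(8, '0')
}
```

Since positions are ignored, a search sub-query and its replacement counterpart get the same hash whenever their code shape is equal, regardless of formatting.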

Code duplication research

bookmarks folder code duplication research
video summary in ./Movies/code-duplicates.mp4

✅ ❌ ⌛ 💡
