Skip to content

Add debounce to import tracking#4875

Merged
rchiodo merged 8 commits into
masterfrom
rchiodo/debounce_import
Mar 21, 2019
Merged

Add debounce to import tracking#4875
rchiodo merged 8 commits into
masterfrom
rchiodo/debounce_import

Conversation

@rchiodo

@rchiodo rchiodo commented Mar 21, 2019

Copy link
Copy Markdown

Add steps on how to profile
Skip having test-results.xml checked in.

For #4852

  • Pull request represents a single change (i.e. not fixing disparate/unrelated things in a single PR)
  • Title summarizes what is changing
  • Has a news entry file (remember to thank yourself!)
  • Has sufficient logging.
  • Has telemetry for enhancements.
  • Unit tests & system/integration tests are added/updated
  • Test plan is updated as appropriate
  • package-lock.json has been regenerated by running npm install (if dependencies have changed)
  • The wiki is updated with any design decisions/details.

rchiodo added 2 commits March 21, 2019 11:01
Add steps on how to profile
Skip having test-results.xml checked in.
@rchiodo rchiodo self-assigned this Mar 21, 2019

@IanMatthewHuff IanMatthewHuff left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Comment thread CONTRIBUTING.md Outdated
Comment thread news/3 Code Health/4852.md Outdated
Comment thread src/client/common/utils/decorators.ts Outdated
Comment thread src/client/telemetry/importTracker.ts Outdated
}

// Debounce every 5 seconds. We don't need that much telemetry.
@debounce(5000, true)

@DonJayamanne DonJayamanne Mar 21, 2019

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the right place for debounce and i don't think we can use this approach either.

  • User opens file a.py saves it
  • Next opens file b.py saves it

Only file b.py gets procecssed.
#Resolved

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's true. One message max every 5 seconds. We don't get 'every' import.

Where else would I put it? It doesn't work on the onDocumentSaved because the debounce doesn't work for the TextDocument object.


In reply to: 267895392 [](ancestors = 267895392)

@DonJayamanne DonJayamanne Mar 21, 2019

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you want a.py to get processed? If you do, using the debounce decorator will not work.
As mentioned with the current approach, only b.py will get processed and not a.py.
If a user opens 10 files one after another, only the last file gets processed. #Resolved

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. That's on purpose. That's what the debounce will do. The other alternative is not debounce but just delay each request.


In reply to: 267904335 [](ancestors = 267904335)

@DonJayamanne DonJayamanne Mar 21, 2019

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. That's on purpose. That's what the debounce will do.

If you're happy with the requirements, thats fine. I'm not aware of the requirements.
To me this doesn't feel right, not processing all other files.

The other alternative is not debounce but just delay each request.

That will still not work,
ideally we need to debounce per file.

#Resolved

@codecov

codecov Bot commented Mar 21, 2019

Copy link
Copy Markdown

Codecov Report

Merging #4875 into master will increase coverage by 1%.
The diff coverage is 84%.

@@          Coverage Diff           @@
##           master   #4875   +/-   ##
======================================
+ Coverage      62%     62%   +1%     
======================================
  Files         372     372           
  Lines       14591   14605   +14     
  Branches     1153    1154    +1     
======================================
+ Hits         9044    9055   +11     
- Misses       5340    5343    +3     
  Partials      207     207

Comment thread src/client/common/utils/decorators.ts Outdated
Comment thread CONTRIBUTING.md Outdated
1. npm run clean <- This will clean your output directory
1. npm run compile
1. npm run compile-webviews
1. Copy the 'out' directory from your source code overtop of the 'out' directory in your user/.vscode/extensions/ms-python directory

@DonJayamanne DonJayamanne Mar 21, 2019

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confused, didn't understand where out is to be copied from and into where #Resolved

@DonJayamanne DonJayamanne Mar 21, 2019

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note, we have source maps generated when building the extension they are just renamed from *.js.map to *.js.map.disabled. #Resolved

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the maps work in the profiler. At least I couldn't get them to work.


In reply to: 267899221 [](ancestors = 267899221)

Comment thread CONTRIBUTING.md Outdated
rchiodo added 3 commits March 21, 2019 12:24
Use lineAt instead of getText.splitLines
Implement per file timeout instead
Comment thread src/client/telemetry/importTracker.ts Outdated
private getDocumentLines(document: TextDocument) : string [] {
return Array.apply(null, {length: Math.min(document.lineCount, MAX_DOCUMENT_LINES)}).map((a, i) => {
const text = document.lineAt(i).text.trim();
if (text && text.length) {

@DonJayamanne DonJayamanne Mar 21, 2019

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use isEmptyOrWhitespace instead of trimming. #Resolved

Comment thread src/client/telemetry/importTracker.ts Outdated

private getDocumentLines(document: TextDocument) : string [] {
return Array.apply(null, {length: Math.min(document.lineCount, MAX_DOCUMENT_LINES)}).map((a, i) => {
const text = document.lineAt(i).text.trim();

@DonJayamanne DonJayamanne Mar 21, 2019

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible text is undefined? #Resolved

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. I added a check for that.


In reply to: 267935085 [](ancestors = 267935085)

Fix timeout to be when finally done typing
@@ -18,9 +19,14 @@ import { IImportTracker } from './types';
const ImportRegEx = /^(?!['"#]).*from\s+([a-zA-Z0-9_\.]+)\s+import.*(?!['"])|^(?!['"#]).*import\s+([a-zA-Z0-9_\., ]+).*(?!['"])/;

@brettcannon brettcannon Mar 21, 2019

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const ImportRegEx = /^(?!['"#]).*from\s+([a-zA-Z0-9_\.]+)\s+import.*(?!['"])|^(?!['"#]).*import\s+([a-zA-Z0-9_\., ]+).*(?!['"])/;
const ImportRegEx = /^\s*(?from|import)\s+([^.]+)/;

This should capture only absolute imports and only the top-level package which is all we would be interested in. #WontFix

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would miss something like
import sklearn.foo

And it would not handle the case where an import was in a string or a comment.


In reply to: 267956790 [](ancestors = 267956790)

@brettcannon brettcannon Mar 21, 2019

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right because I forgot to strip out the trailing \s. 😄

As for missing imports in a string or comment, why would we want that? If the imports aren't being used then why would we care about that case? And that should be rare enough of a case that it wouldn't skew learning what the most popular public packages used are. #WontFix

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I meant the opposite, your string would capture ones in a string I believe. Oh maybe it wouldn't. Let me try it.


In reply to: 267960402 [](ancestors = 267960402)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it does work with strings.

But makes it more difficult to parse 'from x import b' statements. I have to second parse to get rid of the 'import' value. I think I'll just leave the one I have there.


In reply to: 267961343 [](ancestors = 267961343,267960402)

private pendingDocs = new Set<string>();
private sentMatches: Set<string> = new Set<string>();
// tslint:disable-next-line:no-require-imports
private hashFn = require('hash.js').sha256;

@brettcannon brettcannon Mar 21, 2019

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you aren't using HMAC? Either way you need to add a salt to the hash. #Pending

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would the salt be stored? Seems like just obfuscation.


In reply to: 267956915 [](ancestors = 267956915)

@brettcannon brettcannon Mar 21, 2019

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use HMAC you can initialize it with the salt. #Pending

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do that if we deem it necessary to add another layer of obfuscation. We're internally going to generate a table of known hashes, so generating the table with the salt in them shouldn't be any more work.


In reply to: 267961221 [](ancestors = 267961221)

@rchiodo rchiodo merged commit 4e7df1e into master Mar 21, 2019
@rchiodo rchiodo deleted the rchiodo/debounce_import branch March 21, 2019 23:20
@lock lock Bot locked as resolved and limited conversation to collaborators Jul 30, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants