-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Add debounce to import tracking #4875
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
051eeae
655c8fd
c154aed
eb82e25
a40ef11
7835e39
bc43d3b
10b1353
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,4 +28,5 @@ obj/** | |
| tmp/** | ||
| .python-version | ||
| .vs/ | ||
| test-results.xml | ||
| !build/ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1 @@ | ||
| Hash imports so we can get every import and not just known ones. We have to unhash them on the other side to determine if a specific package or not. | ||
| Hash imports so we can get every import and not just known ones. Should happen at most once every 5 seconds. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,8 +8,9 @@ import * as path from 'path'; | |
| import { TextDocument } from 'vscode'; | ||
|
|
||
| import { sendTelemetryEvent } from '.'; | ||
| import { noop, sleep } from '../../test/core'; | ||
| import { noop } from '../../test/core'; | ||
| import { IDocumentManager } from '../common/application/types'; | ||
| import { isTestExecution } from '../common/constants'; | ||
| import { IHistoryProvider } from '../datascience/types'; | ||
| import { ICodeExecutionManager } from '../terminals/types'; | ||
| import { EventName } from './constants'; | ||
|
|
@@ -18,9 +19,14 @@ import { IImportTracker } from './types'; | |
| const ImportRegEx = /^(?!['"#]).*from\s+([a-zA-Z0-9_\.]+)\s+import.*(?!['"])|^(?!['"#]).*import\s+([a-zA-Z0-9_\., ]+).*(?!['"])/; | ||
| const MAX_DOCUMENT_LINES = 1000; | ||
|
|
||
| // Capture isTestExecution on module load so that a test can turn it off and still | ||
| // have this value set. | ||
| const testExecution = isTestExecution(); | ||
|
|
||
| @injectable() | ||
| export class ImportTracker implements IImportTracker { | ||
|
|
||
| private pendingDocs = new Map<string, NodeJS.Timer>(); | ||
| private sentMatches: Set<string> = new Set<string>(); | ||
| // tslint:disable-next-line:no-require-imports | ||
| private hashFn = require('hash.js').sha256; | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any reason you aren't using HMAC? Either way you need to add a salt to the hash. #Pending
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Where would the salt be stored? Seems like just obfuscation. In reply to: 267956915 [](ancestors = 267956915)
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you use HMAC you can initialize it with the salt. #Pending
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll do that if we deem it necessary to add another layer of obfuscation. We're internally going to generate a table of known hashes, so generating the table with the salt in them shouldn't be any more work. In reply to: 267961221 [](ancestors = 267961221) |
||
|
|
@@ -44,21 +50,51 @@ export class ImportTracker implements IImportTracker { | |
| } | ||
|
|
||
| public async activate(): Promise<void> { | ||
| // Act like all of our open documents just opened. Don't do this now though. We don't want | ||
| // to hold up the activate. | ||
| await sleep(1000); | ||
| // Act like all of our open documents just opened. Our timeout will make sure this is delayed | ||
| this.documentManager.textDocuments.forEach(d => this.onOpenedOrSavedDocument(d)); | ||
| } | ||
|
|
||
| private getDocumentLines(document: TextDocument) : string [] { | ||
| return Array.apply(null, {length: Math.min(document.lineCount, MAX_DOCUMENT_LINES)}).map((a, i) => { | ||
| const line = document.lineAt(i); | ||
| if (line && !line.isEmptyOrWhitespace) { | ||
| return line.text; | ||
| } | ||
| return undefined; | ||
| }).filter((f: string | undefined) => f); | ||
| } | ||
|
|
||
| private onOpenedOrSavedDocument(document: TextDocument) { | ||
| // Make sure this is a python file. | ||
| if (path.extname(document.fileName) === '.py') { | ||
| // Parse the contents of the document, looking for import matches on each line | ||
| const lines = document.getText().splitLines({ trim: true, removeEmptyEntries: true }); | ||
| this.lookForImports(lines.slice(0, Math.min(lines.length, MAX_DOCUMENT_LINES)), EventName.KNOWN_IMPORT_FROM_FILE); | ||
| this.scheduleDocument(document); | ||
| } | ||
| } | ||
|
|
||
| private scheduleDocument(document: TextDocument) { | ||
| // If already scheduled, cancel. | ||
| if (this.pendingDocs.has(document.fileName)) { | ||
| clearTimeout(this.pendingDocs.get(document.fileName)); | ||
| this.pendingDocs.delete(document.fileName); | ||
| } | ||
|
|
||
| // Now schedule a new one. | ||
| if (testExecution) { | ||
| // During a test, check right away. It needs to be synchronous. | ||
| this.checkDocument(document); | ||
| } else { | ||
| // Wait five seconds to make sure we don't already have this document pending. | ||
| this.pendingDocs.set(document.fileName, setTimeout(() => this.checkDocument(document), 5000)); | ||
| } | ||
| } | ||
|
|
||
| private checkDocument(document: TextDocument) { | ||
| this.pendingDocs.delete(document.fileName); | ||
| const lines = this.getDocumentLines(document); | ||
| this.lookForImports(lines, EventName.KNOWN_IMPORT_FROM_FILE); | ||
| } | ||
|
|
||
| private onExecutedCode(code: string) { | ||
| const lines = code.splitLines({ trim: true, removeEmptyEntries: true }); | ||
| this.lookForImports(lines, EventName.KNOWN_IMPORT_FROM_EXECUTION); | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should capture only absolute imports and only the top-level package which is all we would be interested in. #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would miss something like
import sklearn.foo
And it would not handle the case where an import was in a string or a comment.
In reply to: 267956790 [](ancestors = 267956790)
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right because I forgot to strip out the trailing
\s. 😄As for missing imports in a string or comment, why would we want that? If the imports aren't being used then why would we care about that case? And that should be rare enough of a case that it wouldn't skew learning what the most popular public packages used are. #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I meant the opposite, your string would capture ones in a string I believe. Oh maybe it wouldn't. Let me try it.
In reply to: 267960402 [](ancestors = 267960402)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it does work with strings.
But makes it more difficult to parse 'from x import b' statements. I have to second parse to get rid of the 'import' value. I think I'll just leave the one I have there.
In reply to: 267961343 [](ancestors = 267961343,267960402)