Skip to content

asm2wasm debuginfo#895

Merged
kripken merged 7 commits into
masterfrom
asm2wasm-debuginfo
Feb 7, 2017
Merged

asm2wasm debuginfo#895
kripken merged 7 commits into
masterfrom
asm2wasm-debuginfo

Conversation

@kripken

@kripken kripken commented Feb 3, 2017

Copy link
Copy Markdown
Member

Users have complained that debugging wasm is hard, and it'll be a while til we have proper source maps or such. Meanwhile, this pr might help a little, it makes asm2wasm use debug info when present, and emit it as comments in the wast. For example, hello world might look like this:

  (func $_main (result i32)
    ;; tests/hello_world.c:4
    (drop
      (call $_printf
        (i32.const 1144)
        (i32.const 1160)
      )
    )
    ;; tests/hello_world.c:5
    (return
      (i32.const 0)
    )
  )

Technically, the debug info is translated into intrinsic calls. This is for efficiency, so that they incur no overhead when not using it (so we don't make each node have a debug info field or anything like that). Then we compile (and possibly optimize), and then we translate those intrinsics into annotations on the nodes, which are a map on the side per function of node => annotation. Then the printer just prints the annotation with the node, if there is one.

@kripken

kripken commented Feb 3, 2017

Copy link
Copy Markdown
Member Author

Some error on the bots and locally other.test_binaryen_names fails - posting to remind myself to look at this tomorrow.

@kripken kripken force-pushed the asm2wasm-debuginfo branch 3 times, most recently from d46b672 to 23c5752 Compare February 3, 2017 18:46
ignore --debuginfo if not emitting text, as wasm binaries don't support that yet
@kripken kripken force-pushed the asm2wasm-debuginfo branch from 23c5752 to 55b7289 Compare February 3, 2017 19:51
@kripken kripken force-pushed the asm2wasm-debuginfo branch from ff420e1 to 3e75a78 Compare February 3, 2017 21:07
@kripken

kripken commented Feb 3, 2017

Copy link
Copy Markdown
Member Author

This should be good now, if anyone wants to review.

Comment thread src/asm2wasm.h Outdated
if (out + 100 >= end) {
Fatal() << "error in handling debug info";
}
if (input[0] == '/' && input[1] == '/' && input[2] == '@' && input[3] == 'l' && input[4] == 'i' && input[5] == 'n' && input[6] == 'e') {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about startsWith(input, "//@line") ? Providing startsWith in a utility header.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing. I worry it will be less efficient, but I guess a modern compiler should be able to do this well...

Comment thread src/asm2wasm.h Outdated
*out++ = *input++;
*out++ = *input++;
*out++ = *input++;
*out++ = *input++;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memcpy?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Comment thread src/asm2wasm.h Outdated
// before it - this is guaranteed to be correct without opts,
// and is usually decently accurate with them.
auto size = strlen(input);
auto upperBound = Index(size * 1.25) + 100;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic numbers are magic :)

Does this 100 match the 100 a few lines down?
Would be good to replace these with named constants for clarity.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point, fixing.

Comment thread src/asm2wasm.h Outdated
Function* processFunction(Ref ast);

public:
CallImport* isDebugInfo(Expression* curr) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is___ makes me thing "boolean". Better name would be just debugInfo, which can still be used in a boolean context.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, yeah. Changing to checkDebugInfo which is consistent with other stuff in wasm.h.

Comment thread src/mixed_arena.h

#include <atomic>
#include <cassert>
#include <cstdlib>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unused

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed for abort, this just wasn't noticed before because other locations happened to have the include anyhow, I think.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense

@kripken

kripken commented Feb 6, 2017

Copy link
Copy Markdown
Member Author

Thanks for the feedback, added a commit with fixes.

Comment thread src/mixed_arena.h

#include <atomic>
#include <cassert>
#include <cstdlib>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense

Comment thread src/asm2wasm.h Outdated
std::string DEBUGINFO_INTRINSIC = EMSCRIPTEN_DEBUGINFO.str;
auto DEBUGINFO_INTRINSIC_SIZE = DEBUGINFO_INTRINSIC.size();
bool seenUseAsm = false;
auto startsWith = [&](const char *s) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this a top-level function as opposed to a lambda

  1. when reading this code it can be assumed that startsWith exists elsewhere, so it doesn't interrupt the flow of reading this function
  2. the way its currently implemented a better name would be inputAtTheCurrentPositionStartsWith("foo"). I feel startsWith(input, "foo") is more explicit at the call site and requires less context for the reader to keep in their head

Both should be inlineable by the compiler so it should be equivalent to the unrolled version, either way.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair points, but which toplevel? Of the object? Or the entire file?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my head it's a top-level function that requires no state, so of the entire file. In theory it could be put in a utils header and shared.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, refactored to top level in the file

Comment thread src/asm2wasm.h Outdated
out += line.size();
*out++ = ')';
*out++ = ';';
} else if (!seenUseAsm && (startsWith("asm'") || startsWith("asm\""))) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could theoretically be optimized by the compiler to be equivalent to what came before, but if (!seenUseAsm && startsWith(input, "asm") && (input[3] == '\'' || input[3] == '"')) is a possible rewrite that should almost-certainly be the same, while being I think equally legible. (I feel ambivalent about this either way though)

Comment thread src/asm2wasm.h
// before it - this is guaranteed to be correct without opts,
// and is usually decently accurate with them.
const auto SCALE_FACTOR = 1.25; // an upper bound on how much more space we need as a multiple of the original
const auto ADD_FACTOR = 100; // an upper bound on how much we write for each debug info element itself

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments are pretty helpful, thanks!

Comment thread src/asm2wasm.h
static bool startsWith(const char* string, const char *prefix) {
while (1) {
if (*prefix == 0) return true;
if (*string++ != *prefix++) return false;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also probably return false if *string == 0 for completeness, e.g. startsWith("foo", "foobar")

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixing.

@jgravelle-google jgravelle-google left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo nits, lgtm!

Comment thread src/asm2wasm.h Outdated
} else if (!seenUseAsm && (startsWith(input, "asm'") || startsWith(input, "asm\""))) {
// end of "use asm" or "almost asm"
seenUseAsm = true;
memcpy(out, input, 5);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth pulling the 5s into a constant?
Actually yeah it's not obvious why 5 and not 4, presumably consuming asm\'\n?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, it's actually the semicolon at the end, "use asm";. I'll clarify.

Comment thread src/asm2wasm.h
strcpy(out, line.c_str());
out += line.size();
*out++ = ')';
*out++ = ';';

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing to do here just a thought/comment/observation:
This interleaving section of strcpy and *out++ = char is pretty verbose and I wish there was a cleaner/more compact way to do it.
Something like outStream << DEBUGINFO_INTRINSIC << '(' << fileIndex << ',' << line << ");", though that would require a very different model of how this is done and probably isn't worth it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, might be a better way. The code is not expected to change much if at all, though, so I think it's maintainable as it is. If it were evolving code definitely it would need to be refactored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants