Skip to content

Add .clang-format#1407

Merged
kripken merged 3 commits into
WebAssembly:masterfrom
aheejin:clang-format
Feb 13, 2018
Merged

Add .clang-format#1407
kripken merged 3 commits into
WebAssembly:masterfrom
aheejin:clang-format

Conversation

@aheejin

@aheejin aheejin commented Feb 6, 2018

Copy link
Copy Markdown
Member

I picked Chromium style as a base style based on @dschuff's suggestion, but I don't care much about particulars as long as it is consistent. We can also use LLVM or Mozilla as a base style. Let me know your preferences. And I modified ColumnLimit to 160 because a lot of code in this repo have columns longer than the usual limit, 80. (I actually kinda prefer 80 personally though 😅 ) Any suggestions are welcome.

@dschuff

dschuff commented Feb 6, 2018

Copy link
Copy Markdown
Member

I'd love to get to a place where everyone always clang-formats so the style is always consistent and nobody ever argues about it (Currently the style is just inconsistent). I like Chromium style more than LLVM but If we can agree on something I don't much care what it is.

@dschuff

dschuff commented Feb 6, 2018

Copy link
Copy Markdown
Member

(also git clang-format) is great when you start out without the whole codebase being formatted. It just formats the code that is changed from master.

Comment thread .clang-format Outdated
@@ -0,0 +1,2 @@
BasedOnStyle: Chromium
ColumnLimit: 160

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. why not have new/changes code formatted to 80 cols like most other projects?

From a cursory look at the binaryen code it looks like most of the long lines are comments. I'm particularly keen on the nicely wrapping comments, so I would love to see this line removed :) But perhaps we can land it like this and remove in the followup CL?

@aheejin aheejin Feb 6, 2018

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to remove this too. If no one else has concerns, I'll remove it.

@binji

binji commented Feb 6, 2018

Copy link
Copy Markdown
Member

Personally, I like Chromium style, but I imagine not everyone knows what it looks like. Maybe it would be good to create a separate commit somewhere with all the C++ files updated so they can see what it would look like?

@kripken

kripken commented Feb 6, 2018

Copy link
Copy Markdown
Member

I like @binji's suggestion to get a "preview" first.

What are the effects of this PR? It doesn't look like it adds any tests for incorrect format changes. Does it just change what clang-format does if it is run? Do we not want it to be run automatically somehow?

Style-wise, I feel strongly about

  • If etc. arms on a new line need curly braces (that one saves me from bugs now and then).

I would prefer but do not feel strongly about

  • Line length limits. 80 feels much too short to me, personally.

@sbc100

sbc100 commented Feb 6, 2018

Copy link
Copy Markdown
Member

Oh yeah, I ben's idea. Then we could see how different the style is to what exists today. it would be good to choose a style based on what the code looks like today.

@sbc100

sbc100 commented Feb 6, 2018

Copy link
Copy Markdown
Member

This change just effects what clang-format does when run in this directory.

It doesn't do any enforcement. Which is probably a good thing initially I guess.

@aheejin

aheejin commented Feb 6, 2018

Copy link
Copy Markdown
Member Author

The reasons I didn't do any actual reformatting here were

  1. I was not sure about doing the massive change on the whole codebase and messing up git blame history by reformatting all the code
  2. I wanted to get everyone's preferences and feedback first on the style.

I like the preview idea too. Maybe I'll go ahead and submit the whole-codebase-reformatting PR for each style so that everyone can take a look. They wouldn't (and shouldn't) be landed; they are just to give a sense of how each style looks. After we decide on a style, we'll see if we need to actually reformat the whole codebase or not.

@sbc100

sbc100 commented Feb 6, 2018

Copy link
Copy Markdown
Member

Maybe you can just evaluate them locally, and upload branches for just the one or two configs that you find to have the minimal impact?

@aheejin

aheejin commented Feb 6, 2018

Copy link
Copy Markdown
Member Author

@kripken

If etc. arms on a new line need curly braces (that one saves me from bugs now and then).

Whichever style we choose, we can make sure this one sticks by overriding it.

Line length limits. 80 feels much too short to me, personally.

I personally prefer 80 cols, but don't mind extending it if others have concerns. Do you have some good number you feel comfortable with in mind?

@aheejin

aheejin commented Feb 6, 2018

Copy link
Copy Markdown
Member Author

@sbc100 That makes sense. I'll do that.

@eholk

eholk commented Feb 6, 2018

Copy link
Copy Markdown
Contributor

Line length limits. 80 feels much too short to me, personally.

I personally prefer 80 cols, but don't mind extending it if others have concerns. Do you have some good number you feel comfortable with in mind?

I don't do much work on binaryen, so I wouldn't put a lot of weight on my opinion, but I've been a staunch believer in 80 columns for most of my life. That said, Rust seems to have standardized on 100 columns and I find that is not only not terrible, but even somewhat pleasant. I could get behind standardizing on 100 columns.

@aheejin

aheejin commented Feb 7, 2018

Copy link
Copy Markdown
Member Author

@kripken Looks like clang-format itself cannot insert or delete any characters like curly braces. It just formats code. If we want to enforce curly braces for one-line ifs, we need clang-tidy instead. I can do that, but maybe that calls for another PR.

@aheejin

aheejin commented Feb 7, 2018

Copy link
Copy Markdown
Member Author

Comparisons on Binaryen

It's hard to tell exactly which style resembles the current codebase the most, but I think either Chromium or LLVM style would be OK. (Webkit uses the indent width of 4 and Mozilla breaks after the return types of top-level functions, which I found as most different things from the current codebase)

Differences between LLVM and Chromium styles

Here are some properties in which LLVM and Chromium have different values. (There are many other properties where their values are the same. I didn't list them up here.) You can compare Chromium vs. LLVM style output for binaryen here. Note that even if we choose one style, we can override each property separately, so it doesn't mean we should accept everything on that style.

AccessModifierOffset

Chromium: -1

class A {
 public:
 private:
};

LLVM: -2

class A {
public:
private:
};

AlignEscapedNewlines

Chromium: left

#define A   \
  int aaaa; \
  int b;    \
  int dddddddddd;

LLVM: right

#define A                                                                      \
  int aaaa;                                                                    \
  int b;                                                                       \
  int dddddddddd;

AllowAllParametersOfDeclarationOnNextLine

Chromium: false

void myFunction(int a,
                int b,
                int c,
                int d,
                int e);

LLVM: true

void myFunction(
    int a, int b, int c, int d, int e);

AllowShortFunctionsOnASingleLine

Chromium: Allow class-inlined functions only

class Foo {
  void f() { foo(); }
};
void f() {
  foo();
}
void f() {
}

LLVM: Allow all

class Foo {
  void f() { foo(); }
};
void f() { bar(); }

AlwaysBreakBeforeMultilineStrings

Chromium: true

aaaa =
    "bbbb"
    "cccc";

LLVM: false

aaaa =  "bbbb"
        "cccc";

AlwaysBreakTemplateDeclarations

Chromium: true

template <typename T>
class C {};

LLVM: false

template <typename T> class C {};

BinPackParameters

Chromium: false

void f(int aaaaaaaaaaaaaaaaaaaa,
       int aaaaaaaaaaaaaaaaaaaa,
       int aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) {}

LLVM: true

void f(int aaaaaaaaaaaaaaaaaaaa, int aaaaaaaaaaaaaaaaaaaa,
       int aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) {}

ConstructorInitializerAllOnOneLineOrOnePerLine

Chromium: true

SomeClass::Constructor()
    : aaaaaaaa(aaaaaaaa),
      aaaaaaaa(aaaaaaaa),
      aaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaa) {
  return 0;
}

LLVM: false

SomeClass::Constructor()
    : aaaaaaaa(aaaaaaaa), aaaaaaaa(aaaaaaaa),
      aaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaa) {
  return 0;
}

IndentCaseLabels

Chromium: true

switch (fool) {
  case 1:
    bar();
    break;
  default:
    plop();
} 

LLVM: false

switch (fool) {
case 1:
  bar();
  break;
default:
  plop();
} 

PointerAlignment

Chromium: left

int* a;

LLVM: right

int *a;

SpacesBeforeTrailingComments

Chromium: 2

void f() {
  if (true) {  // foo1
    f();       // bar
  }            // foo
}

LLVM: 1

void f() {
  if (true) { // foo1
    f();      // bar
  }           // foo
}

@kripken

kripken commented Feb 7, 2018

Copy link
Copy Markdown
Member

Thanks @aheejin, very interesting.

Overall any of the first 3 seem generally ok to me, they are the closest to the current code I think. I'd slightly prefer the LLVM style myself out of those.

I'd love as much line size as possible, personally. If Rust has 100, maybe that's ok, but I'd like 120 even better ;)

I'm not worried about changes to git blame - consistent style is worth the extra effort to add a step to look through a widespread style change.

So if I understand correctly, if I were to run clang-format in the root dir of this repo, it would update all the files to the proper style? And we'd expect people to do that before creating PRs (or we'd do it ourselves periodically)?

@sbc100

sbc100 commented Feb 7, 2018

Copy link
Copy Markdown
Member

Basically yes. As a first step we can ask people to run git clang-format origin/master on PRs. which only formats the lines they touch.

If you want to do a bulk reformat you just run clang-format (probably with -i for inplace since you are in a git checkout).

@aheejin

aheejin commented Feb 7, 2018

Copy link
Copy Markdown
Member Author

@kripken What I did is something like

$ cd binaryen/src
$ find ./ -name '*.cpp' -or -name '*.h' | xargs clang-format -i

(You can specify a style in the command line using something like clang-format -style=llvm .... If you have .clang-format in the top project directory that's not needed.)

@aheejin

aheejin commented Feb 7, 2018

Copy link
Copy Markdown
Member Author

@kripken Oh and I also noticed you preferred type* a over type *a in #1403. If we want to choose the LLVM style maybe we should tweak that, because LLVM defaults to type *a.

@dschuff

dschuff commented Feb 8, 2018

Copy link
Copy Markdown
Member

Here's my bikeshed color preference...
I mentioned before that I preferred chromium but looking at this list here, I actually prefer LLVM style for most of these (I think in general I must have a stronger preference to "use less vertical space" than the Chromium style does). EXCEPT for PointerAlignment which I feel more strongly about than pretty much all of the others combined. So my preference would be "LLVM style except for PointerAlignment" although there's a bit of an argument to be made to take a style altogether with no exceptions.

Actually if we're going to have a long line length, I think LLVM makes more sense because Chromium makes several choices (e.g. no BinPackParameters) that take up extra lines and don't use that extra horizontal space at all.

@dschuff

dschuff commented Feb 8, 2018

Copy link
Copy Markdown
Member

Also, if we want to just reformat everything, after we do it we could then add a presubmit check that runs clang-format over the code and fails if the output would change. That would prevent people from landing misformatted code (at the cost of making all contributors install clang-format, which would raise the barrier to contribution).

@kripken

kripken commented Feb 8, 2018

Copy link
Copy Markdown
Member

Is PointerAlignment the char* x vs char *x issue?

Yeah, I think we should go with char* (not the LLVM way). That prevents confusion in my experience.

@aheejin

aheejin commented Feb 9, 2018

Copy link
Copy Markdown
Member Author

@jgravelle-google Do you have any preferences?

@rongjiecomputer

Copy link
Copy Markdown
Contributor

Not a frequent contributor, but I really prefer IndentCaseLabels: true (which is what current codebase does), for the rest my opinion is the same as everyone else (LLVM + PointerAlignment: left).

@sbc100

sbc100 commented Feb 9, 2018

Copy link
Copy Markdown
Member

Oh me too ... please make sure we have IndentCaseLabels: true. I find it crazy that llvm doesn't do this.

@aheejin

aheejin commented Feb 11, 2018

Copy link
Copy Markdown
Member Author

So, to sum up the responses here,

Things everyone seems to be OK with:

  • LLVM as the base style
  • PointerAlignment: left (as in int* a)

Things people have different preferences on:

  • IndentCaseLabels

    • Our to-be base style (LLVM) sets this to 'false' (don't indent), and @dschuff said he wanted shorter columns
    • @aheejin prefers 'false' as well, but don't mind much
    • @rongjiecomputer, @kripken, @sbc100 seem to prefer 'true' (indent)
    • Current binaryen codebase mostly indents case lines, so 'true' (indent)
    • Conclusion: I think if no one has strong preferences we can do this as 'true' (indent). Would this be OK?
  • ColumnLimit

    • Several people expressed preferences for 80 cols (@dschuff, @aheejin, @sbc100, and @eholk)
    • @eholk also suggested 100 cols can be OK
    • @kripken suggested 120 cols
    • Conclusion: Well, it's hard to decide :) How about 100 as a midpoint?

Let me know if you have further concerns. If there's no more comments, I'll fix the style as

  • Base style: LLVM
  • PointerAlignment: left
  • IndentCaseLabels: true
  • ColumnLimit: 100

@aheejin

aheejin commented Feb 12, 2018

Copy link
Copy Markdown
Member Author

I found some more tweaks that more closely matches the current codebase, like

  • AllowShortCaseLabelsOnASingleLine: true
  • AllowShortBlocksOnASingleLine: true
  • AllowShortFunctionsOnASingleLine: Inline

Maybe I'll give this couple more days to see if I can find more.

@aheejin

aheejin commented Feb 13, 2018

Copy link
Copy Markdown
Member Author

Pushed some tweaks to follow the styles on the current codebase.

  • ContinuationIndentWidth: 2
// ContinuationIndentWidth: 2
int i = longFunction( // Assume this line is very long
  arg); // this line is indented by ContinuationIndentWidth

This was set to 4 in the LLVM style, but the current codebase almost always uses 2.

  • ConstructorInitializerIndentWidth: 2
// ConstructorInitializerIndentWidth: 2
MyClass::MyClass
  : init1(init1), init2(init2), ... // This line is indented by ConstructorInitializerIndentWidth

This was set to 4 in the LLVM style, but the current codebase almost always uses 2 as well.

  • AlignAfterOpenBracket: DontAlign
// AlignAfterOpenBracket: Align
someLongFunction(argument1,
                 argument2);

// AlignAfterOpenBracket: DontAlign
someLongFunction(argument1,
  argument2); 

// AlignAfterpenBracket: AlwaysBreak
someLongFunction(
  argument1, argument2);

LLVM's style sets this to Align, but the current codebase mostly don't align. And given that now we have longer-than-common column limit 100, setting this to Align can make lines like

someVeryVeryVeryVeryVeryVeryVeryVeryVeryVeryVeryVery...VeryVeryLongFunction(arg1,
                                                                            arg2);

I thought about adding these, but ended up not doing that.

  • AllowShortCaseLabelsOnASingleLine: true
  • AllowShortIfStatementsOnASingleLine: true
  • AllowShortBlocksOnASingleLine: true
  • AllowShortLoopsOnASingleLine: true

These allow (more precisely, force) short case labels / if statements / blocks / loops into a single line. For example, if AllowShortIfStatementsOnASingleLine is true,

if (condition)
  foo();

becomes

if (contidion) foo();

If these are set, all short case labels / if statements / blocks / loops are merged into a single line. What I wanted is to allow single lines but not to merge all existing codebase to single lines, but apparently there is no way to do that, because the output of clang-format should be determined based on code and style configurations.

@kripken

kripken commented Feb 13, 2018

Copy link
Copy Markdown
Member

Thanks, lgtm.

How should we move forward after this? A followup that applies the style sounds good to me (even though it changes history).

@kripken kripken merged commit 41faf24 into WebAssembly:master Feb 13, 2018
@aheejin aheejin deleted the clang-format branch February 14, 2018 00:31
@kripken

kripken commented Feb 14, 2018

Copy link
Copy Markdown
Member

Running this locally, I see it transform

-        if (j > 0) std::cout << ", ";
+        if (j > 0)
+          std::cout << ", ";

I thought that we were going to allow it to stay on a single line? Or maybe I misunderstood the discussion above.

@aheejin

aheejin commented Feb 15, 2018

Copy link
Copy Markdown
Member Author

In #1407 (comment), I wrote the reason why I ended up not doing that. Refer to the paragraph starts with

I thought about adding these, but ended up not doing that. ...


In short, if I enables these parameters, all code that are written in

if (j > 0)
  std::out << ", ";

will be all converted to

if (j > 0) std::cout << ", ";

While I wanted to allow the short one-line form, I was not sure if we should merge all existing two-line code into a single line. So there is no "leave code as is" kind of option. For example, if we set AllowShortIfStatementsOnASingleLine to true, all short if statements are merged into a single line if they fit within 100 cols, even if they are currently written in two lines. If we don't set the option, all if statements are converted to the two-line form as you saw. We have to choose one. We can set it to true if you prefer that.

Oh, and one more thing, even if we set AllowShortIfStatementsOnASingleLine to true, it does not handle if with else. Currently there is no option that allows this code:

if (j > 0) do_this();
else do_that();

They are always broken down to

if (j > 0)
  do_this();
else
  do_that();

@kripken

kripken commented Feb 15, 2018

Copy link
Copy Markdown
Member

I see, thanks. Make sense. In that case, we should add { } around those when applying the new style, and never put something (with an else) on the same line.

Btw, about the attribution issue, I had a silly thought: is there a practical way to run clang-format on just the code a person is already the last modifier of? That is, that clang-format would not change git blame much. Then each person could update their own. This doesn't seem trivial to do, but on the other hand, it seems like a useful thing for projects using clang-format and git, which should be quite a lot - maybe something like that exists?

@sbc100

sbc100 commented Feb 15, 2018

Copy link
Copy Markdown
Member

I don't see that much benefit to having git blame drive the clang-format. The problem with the reformatting IMHO is that it adds the extra commit so don't immediately see the last change to a given line. Preserving the author doesn't really buy much when you looking back the history, you really want to see the change itself.. at least not in my experience.

@aheejin

aheejin commented Feb 15, 2018

Copy link
Copy Markdown
Member Author

@kripken

  1. clang-format cannot add characters like { }. We need clang-tidy for that.
  2. I don't think there's a way to do that unless we develop a separate program that 1) runs clang-format to compute the result 2) computes a map of a person to lines and 3) asks each person to commit the corresponding change.

And I agree there's no much benefit to preserving only the author while we can't preserve the last 'history' anyway.

@sbc100

sbc100 commented Feb 15, 2018

Copy link
Copy Markdown
Member

Clang-format can be run on specific lines, right? (that is how git clang-format works) . So I think it is mostly likely possible to pipe the line numbers from blame into clang-format.. I'm only doubting the benefits.

@kripken

kripken commented Feb 15, 2018

Copy link
Copy Markdown
Member

Yeah, maybe the blame is no big deal either way. In that case, since I likely have most of the blame for bugs in this codebase ;) I could do a commit to update the style. I'd manually fix { } as is necessary as you said @aheejin

However, @sbc100 it sounds like you don't think even that is worth doing? In that case are you suggesting people only use the new format on new code going in?

@sbc100

sbc100 commented Feb 15, 2018

Copy link
Copy Markdown
Member

I wasn't expressing an option about reformatting the whole code, vs letting it happen organically.

I'm was just saying that if you are going to reformat the whole codebase its best to do it in one big commit, rather then trying to slice it up based on the author of a given line.

@aheejin

aheejin commented Feb 15, 2018

Copy link
Copy Markdown
Member Author

@kripken Maybe it's better to add .clang-tidy and fix {} thing automatically, to lessen the hassle.

@kripken

kripken commented Feb 15, 2018

Copy link
Copy Markdown
Member

@sbc100 oh ok, I'll prepare a commit then.

@aheejin I think there are not too many of those, I'll see if I can do them manually first.

@kripken

kripken commented Feb 15, 2018

Copy link
Copy Markdown
Member

Hmm yes, there are quite a lot of single-line ifs. So would need clang-tidy, which I don't seem to have installed anywhere, is that not part of clang?

@aheejin

aheejin commented Feb 15, 2018

Copy link
Copy Markdown
Member Author

Unlike clang-format that is a part of clang, clang-tidy is one of clang extra tools, so it isn't built by default unless you check out clang extra tools with llvm and clang as instructed here. clang+llvm official releases contain clang-tidy binaries too, so I think you don't need to build it by yourself.

@sbc100

sbc100 commented Feb 15, 2018

Copy link
Copy Markdown
Member

On linux systems it seems to be packaged (e.g. sudo apt-get install clang-tidy)

@kripken kripken mentioned this pull request Feb 21, 2018
@kripken

kripken commented Feb 21, 2018

Copy link
Copy Markdown
Member

Thanks, I opened #1435 with the formatting changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants