Skip to content

C# extractor: Improve performance by changing the caching#615

Merged
hvitved merged 5 commits into
github:masterfrom
calumgrant:cs/extractor-caching
Dec 21, 2018
Merged

C# extractor: Improve performance by changing the caching#615
hvitved merged 5 commits into
github:masterfrom
calumgrant:cs/extractor-caching

Conversation

@calumgrant
Copy link
Copy Markdown
Contributor

Achieves a ~10% single-threaded speedup (77s -> 69s) by changing the caching to entity->label, instead of entity->id & id->label. This removes one cache lookup in the case of a cache hit, and bypasses the ID cache which can be a little slow.

@calumgrant calumgrant added the C# label Dec 4, 2018
@calumgrant calumgrant requested a review from a team as a code owner December 4, 2018 18:03
Copy link
Copy Markdown
Contributor

@hvitved hvitved left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question.

Label existingLabel;
if (labelCache.TryGetValue(id, out existingLabel))
// Look up the Id in the idLabelCache
if (idLabelCache.TryGetValue(id, out label))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is idLabelCache needed? It appears to me that this is dead code, as entityLabelCache.TryGetValue() returns false iff idLabelCache.TryGetValue() returns false?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking about this, I see that this is indeed not dead code, but can be hit when two different entities share the same ID, and therefor label. Could you put a comment saying something like that here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some ways, it's a bug for two entities to map to the same label, and it could be a performance win if we removed this step.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember, this could happen because two underlying Roslyn symbols could represent the same thing, but I forgot under which circumstances.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some debugging to see where the conflicting labels occurred, which flushed out a few bugs that I've fixed. Overall, the new design gives around a 25% speedup excluding CIL (which doesn't use this code).

Copy link
Copy Markdown
Contributor

@hvitved hvitved left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds awesome with the improved performance. I have some suggestions for changes, as well as some questions.

switch (methodDecl.MethodKind)
var methodKind = methodDecl.MethodKind;

if(methodKind == MethodKind.ExplicitInterfaceImplementation)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space

/// <param name="factory">The entity factory.</param>
/// <param name="init">The initializer for the entity.</param>
/// <returns>The new/existing entity.</returns>
public Entity CreateEntity2<Type, Entity>(ICachedEntityFactory<Type, Entity> factory, Type type) where Entity : ICachedEntity
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateEntity2 -> CreateMaybeNullEntity?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not quite the right name for it. It's also used when you want to cache using the entity as the key, e.g.

  1. When init is null,
  2. When Type is a tuple. In each case the tuple may be different, but you want to index only Item1 (e.g. GetAlreadyCreated)
  3. When you want to use the same init to create different entities depending on Type. E.g. when you want either a Type or a TypeRef.

return true;
var entity = factory.Create(this, type);

if(entityLabelCache.TryGetValue(entity, out var label))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space

private Entity CreateNonNullEntity<Type, Entity>(ICachedEntityFactory<Type, Entity> factory, Type init) where Entity : ICachedEntity
{
if (objectEntityCache.TryGetValue(init, out var cached))
return (Entity)cached;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this cast needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - it's cached as an ICachedEntity and could theoretically give an InvalidCastException if you try to create it with a different type next time.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I failed to see that Entity was a type parameter; misread it as IEntity...

entity.Label = label;
return entity;
}
else
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer branches myself -- I moved the return out of the branches instead.

}

readonly Dictionary<IId, Label> labelCache = new Dictionary<IId, Label>();
readonly Dictionary<IId, ICachedEntity> idLabelCache = new Dictionary<IId, ICachedEntity>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add #if DEBUG_LABELS

{
return factory.CreateEntity(cx, (t1, t2));
}
where Entity : ICachedEntity => factory.CreateEntity2(cx, (t1, t2));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not factory.CreateEntity -- (t1, t2) is never null? Same below

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following on from the comment above - you want to use Tuple.Item1 as the key (the other columns may be different), which is the behaviour of CreateEntity2.


bool NeedsPopulation { get; }

object UnderlyingObject { get; }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why added?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was quite useful for debugging. I'm minded to keep it in if that's ok?

get; private set;
}

object ICachedEntity.UnderlyingObject => symbol;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why added?

entityLabelCache[entity] = label;
DefineLabel(label, id);
if (entity.NeedsPopulation)
Populate(null, entity);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why always null argument, and not init as ISymbol?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a bug....

    Add more tests for duplicated entities, and fix some duplicated entities.
    Update the TupleTypes output - some extraneous results gone so it's probably better.
@pavgust
Copy link
Copy Markdown
Contributor

pavgust commented Dec 20, 2018

This pull request introduces 1 alert when merging d73b28e into a600353 - view on LGTM.com

new alerts:

  • 1 for Redundant ToString() call

Comment posted by LGTM.com

@hvitved hvitved merged commit 5478155 into github:master Dec 21, 2018
cklin pushed a commit that referenced this pull request May 23, 2022
apply the implicit-this patch to the remaining go code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants