Skip to content

4.x: Add MockResolverIT#replace_cluster_test()#335

Merged
dkropachev merged 4 commits into
scylladb:scylla-4.xfrom
Bouncheck:scylla-4.x-215-test-3node
Sep 6, 2024
Merged

4.x: Add MockResolverIT#replace_cluster_test()#335
dkropachev merged 4 commits into
scylladb:scylla-4.xfrom
Bouncheck:scylla-4.x-215-test-3node

Conversation

@Bouncheck

@Bouncheck Bouncheck commented Sep 2, 2024

Copy link
Copy Markdown

Adds MockResolverIT#replace_cluster_test() as a test method that runs the replace cluster scenario and checks if driver managed to reconnect.

In the test we create three node cluster and replace it with completely new one. Hostname is mocked and points to all 3 nodes.

getNodeInetAddress(ccmBridge, 2),
getNodeInetAddress(ccmBridge, 3)
}));
ResolverProvider.setDefaultResolverFactory(mockResolverFactory);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make sure that there is only one instance of MockResolverFactory that is created and registered.
In the same way we do that for CCM_RULE.
I would recomment to have some method CreateAndRegisterMockResolverFactory that does that.

Also we need to move MockResolverFactory initialization to the class level.

@Bouncheck Bouncheck Sep 3, 2024

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure If I understand the intent here. Don't we want to have different test methods in this class that test different scenarios? So in that case we should be able to set different resolvers for each test method.

And for the scenario where we kill old cluster and start new with different IPs don't we want to have the ability to change the resolver to the new IPs too? I think this change would prevent that.

@dkropachev dkropachev Sep 3, 2024

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure If I understand the intent here. Don't we want to have different test methods in this class that test different scenarios? So in that case we should be able to set different resolvers for each test method.

Yes we want.
Not we don't need different resolvers, test can work with resolver targeting only dns records that are relevant to the test.

And for the scenario where we kill old cluster and start new with different IPs don't we want to have the ability to change the resolver to the new IPs too? I think this change would prevent that.

how so ? don't see anything that would block you from doing it.

ResolverProvider is a global factory, when you call ResolverProvider.setDefaultResolverFactory it changes resolverFactory globally.
Now, if you do that twice, second call will override first factory, but test won't be aware of that and fail, next you will have to debug such test to figure out what happend.
Not only that, if classes were already initialized with old resolver factory, you may get half of classes with old factory, half with new, which is going to create lot's of confusion.
To catch this issue early, we need ResolverFactory to throw an exception when this happnes.

@Bouncheck Bouncheck Sep 4, 2024

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that now it's as we discussed

@Bouncheck Bouncheck force-pushed the scylla-4.x-215-test-3node branch 2 times, most recently from 4c6e0b1 to 6281ff1 Compare September 3, 2024 17:17
@Bouncheck

Bouncheck commented Sep 3, 2024

Copy link
Copy Markdown
Author

I'll squash review adjustments commit into others before merge.
Actually we probably don't want to merge this until the reconnection to the new cluster works, which means fixing the unresolved socket getting overwritten.

@Bouncheck Bouncheck force-pushed the scylla-4.x-215-test-3node branch 2 times, most recently from 5a302d9 to 18ac7ff Compare September 4, 2024 11:58
@Bouncheck Bouncheck mentioned this pull request Sep 4, 2024
@Bouncheck Bouncheck force-pushed the scylla-4.x-215-test-3node branch from 18ac7ff to a6754ed Compare September 4, 2024 15:29
@Bouncheck Bouncheck marked this pull request as ready for review September 4, 2024 15:48
* @return new {@link Resolver}.
*/
public static Resolver getResolver(Class<?> clazz) {
public static synchronized Resolver getResolver(Class<?> clazz) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make it work correctly in 100% cases you better use ReadWriteLock, synchronized does not solve any problem here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it solves the problem I described. With synchronized you cannot have getResolver race with setDefaultResolverFactory and it's simpler

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually with synchronized i think we can have normal booleans instead

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it solves the problem I described. With synchronized you cannot have getResolver race with setDefaultResolverFactory and it's simpler

I am pretty sure it is not the case, can you make a test to test it.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, let's convert all of them to regular attributes

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Bouncheck Bouncheck force-pushed the scylla-4.x-215-test-3node branch 2 times, most recently from 3b40118 to 1d8f6f2 Compare September 5, 2024 14:23
@dkropachev

Copy link
Copy Markdown

@Bouncheck , can you please address test failures on CICD

@Bouncheck

Copy link
Copy Markdown
Author

Created #340 . It should solve the 6.1.1 failures.

should_connect_with_mocked_hostname() will use CcmBridge.Builder builder
instead.
@Bouncheck Bouncheck force-pushed the scylla-4.x-215-test-3node branch 3 times, most recently from 55e1557 to 77bca82 Compare September 6, 2024 14:05
Adds another method that runs scenario in which we replace three node
cluster with the completely new three node cluster. This method runs 20 times
called by `run_replace_test_20_times()` test method.
Adds "--wait-other-notice", "--wait-for-binary-proto" if missing.
@Bouncheck Bouncheck force-pushed the scylla-4.x-215-test-3node branch from 77bca82 to e9d0b9c Compare September 6, 2024 14:46
@dkropachev

Copy link
Copy Markdown

@Bouncheck , is it ready ?

@Bouncheck

Copy link
Copy Markdown
Author

Yes, but it was a little flaky with cassandra. Should i reduce number of nodes to 2?

@dkropachev

Copy link
Copy Markdown

Yes, but it was a little flaky with cassandra. Should i reduce number of nodes to 2?

I see it is fine, let's address it later.

@dkropachev dkropachev merged commit 275bade into scylladb:scylla-4.x Sep 6, 2024
@Bouncheck Bouncheck self-assigned this Sep 9, 2024
@dkropachev dkropachev changed the title Add MockResolverIT#replace_cluster_test() 4.x: Add MockResolverIT#replace_cluster_test() Aug 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implements DNS enpoint tests when nodes are being replaced one by one in the cluster

2 participants