Multiple fixes to speed up querys and remove exceptions at shutdown by stephenrauch · Pull Request #45 · python-zeroconf/python-zeroconf

stephenrauch · 2016-03-17T23:35:03Z

Fix ability for a cache lookup to match properly

When querying for a service type, the response is processed. During the
processing, an info lookup is performed. If the info is not found in
the cache, then a query is sent. Trouble is that the info requested is
present in the same packet that triggered the lookup, and a query is not
necessary. But two problems caused the cache lookup to fail.

1) The info was not yet in the cache. The callback was fired before
   all answers in the packet were cached.
2) The test for a cache hit did not work, because the cache hit test
   uses a DNSEntry as the comparison object. But some of the objects in
   the cache are descendants of DNSEntry and have their own __eq__()
   defined, which accesses fields only present on the descendant. Thus the
   test can NEVER work, since the descendant's __eq__() will be used.
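
The comparison problem can be reproduced outside the library. The following is a minimal sketch, not the zeroconf code: `Entry` and `AddressRecord` are hypothetical stand-ins for DNSEntry and one of its descendants, and the field names are assumptions. It relies on the Python rule that when the right-hand operand is an instance of a subclass that overrides `__eq__`, the subclass method is tried first.

```python
class Entry:
    """Hypothetical stand-in for a base record type such as DNSEntry."""

    def __init__(self, name, type_):
        self.name = name
        self.type_ = type_

    def __eq__(self, other):
        return (isinstance(other, Entry)
                and self.name == other.name
                and self.type_ == other.type_)


class AddressRecord(Entry):
    """Hypothetical descendant whose __eq__ needs a field the base lacks."""

    def __init__(self, name, type_, address):
        super().__init__(name, type_)
        self.address = address

    def __eq__(self, other):
        return (isinstance(other, AddressRecord)
                and self.address == other.address)


cache = [AddressRecord("host.local.", 1, "10.0.0.5")]
probe = Entry("host.local.", 1)

# The subclass __eq__ takes priority and rejects the base-class probe,
# so a plain membership test never reports a hit.
naive_hit = probe in cache

# Comparing explicitly on the base-class fields finds the record.
explicit_hit = any(Entry.__eq__(probe, record) for record in cache)
```

Calling the base-class comparison explicitly is only one way to get a working lookup; the sketch does not claim to be the approach taken in this PR.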

Also, continuing the theme of some other recent pull requests, add three
_GLOBAL_DONE tests to avoid doing work after the attempted stop, and
thus avoid generating (harmless, but annoying) exceptions during
shutdown.
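
As a rough illustration of that kind of guard, the sketch below checks a shutdown flag before doing any work. The `_DONE` event and `handle_packet` function are hypothetical names chosen for this example; the library's own `_GLOBAL_DONE` may be a plain variable rather than an event.

```python
import threading

_DONE = threading.Event()  # hypothetical stand-in for a global "done" flag
handled = []


def handle_packet(data):
    """Process one packet unless shutdown has already been requested."""
    if _DONE.is_set():
        return False  # skip work that would touch closed resources
    handled.append(data)
    return True


first = handle_packet(b"before stop")
_DONE.set()  # the attempted stop
second = handle_packet(b"after stop")
```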


coveralls · 2016-03-21T22:31:50Z

Coverage decreased (-0.6%) to 76.031% when pulling 7403dd7 on stephenrauch:master into f33b8f9 on jstasiak:master.

stephenrauch · 2016-03-21T22:42:27Z

Remove unnecessary packet send in ServiceInfo.request()

Added another commit. My understanding of this code was not complete when writing up the previous description. The two fixes above are needed and improved the situation, but more is necessary to keep a simple query from generally taking 5-10 seconds:

When performing an info query via request(), a listener is started, and a packet is formed. As the packet is formed, known answers are taken from the cache and placed into the packet. Then the packet is sent. The packet is self received (via multicast loopback, I assume). At that point the listener is fired and the answers in the packet are propagated back to the object that started the request. This is a really long way around the barn.

This PR queries the cache directly in request() and then calls update_record(). If all of the information is in the cache, then no packet is formed, sent, or received. This approach was taken because, for whatever reason, the reception of the packets on Windows via the loopback was proving to be unreliable. The method has the side benefit of being a whole lot faster.
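
The cache-first idea can be sketched as follows. This is an assumption-laden illustration rather than the actual `ServiceInfo.request()`: the dictionary cache, the `FakeNetwork` class, and the `request` signature are all hypothetical.

```python
class FakeNetwork:
    """Hypothetical transport that only counts outgoing packets."""

    def __init__(self):
        self.sent = 0

    def send(self, question):
        self.sent += 1


def request(cache, network, wanted, update_record):
    """Answer from the cache where possible; send a query only for gaps."""
    missing = []
    for key in wanted:
        record = cache.get(key)
        if record is not None:
            update_record(key, record)  # deliver the cached answer directly
        else:
            missing.append(key)
    for key in missing:
        network.send(key)
    return not missing  # True when the cache satisfied everything


cache = {("host.local.", "SRV"): "port 8080", ("host.local.", "TXT"): "path=/"}
net = FakeNetwork()
seen = {}

complete = request(cache, net, list(cache), seen.__setitem__)
sent_when_cached = net.sent

partial = request(cache, net, [("other.local.", "SRV")], seen.__setitem__)
sent_when_missing = net.sent
```

When every wanted record is cached, nothing goes out on the wire, so the result does not depend on the loopback delivering the packet back.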

This PR also incorporates the join() calls from PR #30. In addition, it moves the two join() calls in close() to their own thread because they can take quite a while to execute.

coveralls · 2016-03-22T23:02:34Z

Coverage decreased (-0.3%) to 76.353% when pulling 9b5e4cb on stephenrauch:master into f33b8f9 on jstasiak:master.

stephenrauch · 2016-03-22T23:03:01Z

Fix locking race condition in Engine.run()

This commit fixes a race condition in which the receive engine was waiting
against its condition variable under a different lock than the one it
used to determine if it needed to wait. This was causing the code to
sometimes take 5 seconds to do anything useful.
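
For reference, the usual way to avoid this class of bug is to check the predicate and wait while holding the same lock. The sketch below shows that pattern with a hypothetical `Mailbox` class; it is not the Engine code.

```python
import threading


class Mailbox:
    """Hypothetical queue where one lock guards both the check and the wait."""

    def __init__(self):
        self._cond = threading.Condition()
        self._items = []

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def get(self, timeout=5.0):
        with self._cond:
            # wait_for re-checks the predicate under the condition's own lock,
            # so a notify cannot slip in between the check and the wait.
            if not self._cond.wait_for(lambda: self._items, timeout=timeout):
                return None
            return self._items.pop(0)


box = Mailbox()
threading.Timer(0.05, box.put, args=("packet",)).start()
received = box.get(timeout=5.0)
empty = box.get(timeout=0.01)
```

With the check and the wait under one lock, the consumer wakes as soon as the item arrives instead of sleeping through the full timeout.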

When fixing the race condition, I also decided to fix the other
correctness issues in the loop, which were likely causing the errors that
led to the inclusion of the 'except Exception' catch-all. This in turn
allowed the EBADF error raised by closing the socket during exit to
be used to get out of the select in a timely manner.

Finally, this allowed reorganizing the shutdown code to shut down from
the front to the back. That is to say, shut down the recv socket first,
which then allows a clean join with the engine thread. After the engine
thread exits, most everything else is inert as all callbacks have been
unwound.

coveralls · 2016-03-26T00:38:38Z

Coverage decreased (-0.2%) to 76.424% when pulling 2f9d535 on stephenrauch:master into f33b8f9 on jstasiak:master.

stephenrauch · 2016-03-26T00:48:33Z

Shutdown the service listeners in an organized fashion

zc.close() was not doing a join() on these threads, and thus could leave some callbacks half done on termination, because they are daemon threads and would quit when the others quit.

The check in also adds names to the various threads to make debugging easier.
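
A minimal sketch of the idea, assuming a hypothetical `ListenerPool` that owns its worker threads (the thread names and the stop event are illustrative, not the names used in this PR):

```python
import threading


class ListenerPool:
    """Hypothetical owner of worker threads that must stop cleanly."""

    def __init__(self, count):
        self._stop = threading.Event()
        self.threads = [
            threading.Thread(target=self._run, name="listener-%d" % i,
                             daemon=True)
            for i in range(count)
        ]
        for thread in self.threads:
            thread.start()

    def _run(self):
        # Event.wait returns True once the stop flag is set.
        while not self._stop.wait(0.05):
            pass

    def close(self):
        self._stop.set()
        for thread in self.threads:
            thread.join()  # do not return until each worker has finished


pool = ListenerPool(2)
names = [thread.name for thread in pool.threads]
pool.close()
still_running = [thread.name for thread in pool.threads if thread.is_alive()]
```

Joining in `close()` means no daemon thread is cut off mid-callback, and the explicit names show up in tracebacks and debugger thread lists.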

This should be my last commit on this topic of thread safety. The code I am using this with is now running well. If you need any assistance in integrating this, please let me know.

stephenrauch · 2016-03-31T01:01:37Z

In doing a bit of review of the other open pull requests, I am pretty sure the code here covers the functionality of the PRs #20, #30 and #41. In addition it fixes Issue #47.

coveralls · 2016-04-02T20:57:20Z

Coverage increased (+13.5%) to 90.11% when pulling 75232cc on stephenrauch:master into f33b8f9 on jstasiak:master.

coveralls · 2016-04-02T21:05:14Z

Coverage increased (+14.2%) to 90.818% when pulling d909942 on stephenrauch:master into f33b8f9 on jstasiak:master.

stephenrauch · 2016-04-02T21:07:25Z

Improve test coverage from 76% to 91%

Add more needed shutdown cleanup found via additional test coverage.

Force timeout calculation from milliseconds to seconds to use floating point.

Initialize ServiceInfo._properties
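
On the timeout item above, the difference between integer and floating-point division is easy to show. This sketch assumes the original problem was integer division truncating sub-second timeouts; the comment does not say so explicitly, and `millis_to_seconds` is a hypothetical helper.

```python
def millis_to_seconds(timeout_ms):
    """Convert a millisecond timeout to float seconds."""
    return timeout_ms / 1000.0


truncated = 500 // 1000  # integer division drops the fraction entirely
seconds = millis_to_seconds(500)
```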

In the last commit note, I said it was likely the last one on this PR. Well this time I really mean it.

stephenrauch changed the title ~~Fix ability for a cache lookup to match properly~~ Multiple fixes to speed up querys and remove exceptions at shutdown Mar 22, 2016

stephenrauch mentioned this pull request Mar 31, 2016

call register_service twice, it will crash #47

Closed

stephenrauch added 5 commits April 2, 2016 13:48

Remove a now invalid test case

7bbee59

With the restructure of shutdown, Listener() now needs to throw EBADF on a closed socket to allow a timely and graceful shutdown.

Shutdown the service listeners in an organized fashion

ad3c248

Also adds names to the various threads to make debugging easier.

Improve test coverage

75232cc

Add more needed shutdown cleanup found via additional test coverage. Force timeout calculation from milliseconds to seconds to use floating point.

stephenrauch force-pushed the master branch from 2f9d535 to 75232cc Compare April 2, 2016 20:54

init ServiceInfo._properties

d909942

stephenrauch mentioned this pull request Apr 9, 2016

A few minors issues are addressed and a new example is added #51

Closed

jstasiak mentioned this pull request Apr 12, 2016

Test Case and fixes for DNSHInfo #49

Merged

stephenrauch merged commit 183cd81 into python-zeroconf:master Jun 20, 2016

stephenrauch merged 7 commits into python-zeroconf:master from stephenrauch:master
