New: AbstractQuerySet.__iter__ is no more greedy by alanjds · Pull Request #389 · apache/cassandra-python-driver

alanjds · 2015-08-20T14:46:24Z

https://datastax.github.io/python-driver/query_paging.html#handling-paged-results explains how to iterate over large querysets, which has been possible since 2.6.0.

However, the cqlengine querysets are fetching all pages before yielding the first result. This leads to timeouts or memory inflation on clients.

This PR makes `__iter__` a non-greedy generator: it yields until the current page is exhausted, and only then fetches the next one.

`__iter__` now yields until the end of the current page, then fetches the next page.
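The difference between the two behaviours can be sketched without a cluster. This is a minimal illustration, not the code in this PR: `make_fake_fetcher`, `greedy_iter`, and `lazy_iter` are hypothetical names, and the "paging state" is simply an integer offset standing in for whatever the driver uses.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical stand-in for one paged query: (rows, next_paging_state).
Page = Tuple[List[int], Optional[int]]
Fetcher = Callable[[Optional[int]], Page]


def make_fake_fetcher(total_rows: int, page_size: int, log: List[int]) -> Fetcher:
    """Build a fake page fetcher that records every page request in `log`."""
    def fetch_page(state: Optional[int]) -> Page:
        start = state or 0
        log.append(start)  # remember which offset was requested
        end = min(start + page_size, total_rows)
        next_state = end if end < total_rows else None
        return list(range(start, end)), next_state
    return fetch_page


def greedy_iter(fetch_page: Fetcher) -> Iterator[int]:
    """Greedy behaviour: load every page first, then hand out rows."""
    rows, state = fetch_page(None)
    while state is not None:
        more, state = fetch_page(state)
        rows.extend(more)
    return iter(rows)


def lazy_iter(fetch_page: Fetcher) -> Iterator[int]:
    """Non-greedy behaviour: yield the current page, fetch the next on demand."""
    state: Optional[int] = None
    while True:
        rows, state = fetch_page(state)
        for row in rows:
            yield row
        if state is None:  # no further pages
            break
```

With 10 rows and a page size of 3, calling `next()` once on `greedy_iter` triggers all four page requests, while `lazy_iter` triggers only the first. The remaining requests happen as the consumer advances past each page boundary.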

datastax-bot · 2015-08-20T14:46:26Z

Hi @alanjds, thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes.

Sincerely,
DataStax Bot.

datastax-bot · 2015-08-20T15:02:57Z

Thank you @alanjds for signing the Contribution License Agreement.

Cheers,
DataStax Bot.

aholmberg · 2015-08-20T15:17:17Z

Thanks a lot for the contribution! We will have a cqlengine-focused release next, so I'll be looking at this then.
I've added a reference to this from an existing JIRA for tracking: https://datastax-oss.atlassian.net/browse/PYTHON-323

alanjds · 2015-08-20T15:37:34Z

Thanks for the information, @aholmberg.

However, I still think that this is not an "improvement" but a "bugfix". I really would expect `__iter__` to fetch on demand. Otherwise it leads to the problems already mentioned: timeouts and memory inflation.

aboudreault · 2016-03-03T21:27:28Z

Thank you @alanjds for your contribution. Your work has been included in #508. I also added some perf improvements in iter and getitem to avoid using count() when possible (that was making iteration slow with large datasets). Feel free to comment.
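One way to index into results without a separate `count()` call is to pull rows from the underlying iterator only until the requested position is cached. The sketch below is an assumption about the general idea, not the code from #508; `LazyResults` and `_fill_to` are hypothetical names.

```python
from typing import Iterable, Iterator, List


class LazyResults:
    """Hypothetical wrapper that caches rows as they are consumed from `source`."""

    def __init__(self, source: Iterable[int]) -> None:
        self._source: Iterator[int] = iter(source)
        self._cache: List[int] = []

    def _fill_to(self, index: int) -> None:
        """Pull rows until the cache holds position `index` or the source ends."""
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                break

    def __getitem__(self, index: int) -> int:
        if index < 0:
            raise IndexError("negative indexes are not supported in this sketch")
        self._fill_to(index)  # no count() needed: read just far enough
        if index >= len(self._cache):
            raise IndexError(index)
        return self._cache[index]

    def __iter__(self) -> Iterator[int]:
        position = 0
        while True:
            self._fill_to(position)
            if position >= len(self._cache):
                return  # source exhausted
            yield self._cache[position]
            position += 1
```

Asking for `results[2]` here consumes exactly three rows from the source, regardless of how many rows the full query would return.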

alanjds added 2 commits August 20, 2015 14:31

Debug for automatic pagination machinery

49abb35

AbstractQuerySet.__iter__ is no more greedy. Fetches on demand.

d433631

`__iter__` now yields until the end of the current page, then fetches the next page.

datastax-bot added the cla-missing label Aug 20, 2015

datastax-bot removed the cla-missing label Aug 20, 2015

aboudreault closed this Mar 3, 2016

alanjds wants to merge 2 commits into apache:master from alanjds:iter_not_greedy