
New: AbstractQuerySet.__iter__ is no longer greedy #389

Closed
alanjds wants to merge 2 commits into
apache:masterfrom
alanjds:iter_not_greedy

@alanjds commented Aug 20, 2015

Contributor

https://datastax.github.io/python-driver/query_paging.html#handling-paged-results describes how to iterate over large querysets, which has been supported since 2.6.0.

However, the cqlengine querysets are fetching all pages before yielding the first result. This leads to timeouts or memory inflation on clients.

This PR makes __iter__ a non-greedy generator: it yields rows until the current page is exhausted, and only then fetches the next one.

The __iter__ now yields until the end of the current page,
then starts fetching the next page.
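
For readers who want the difference spelled out, here is a minimal sketch of greedy versus page-at-a-time iteration. It is not the cqlengine code from this PR: `make_fake_source`, `fetch_page`, `iter_greedy` and `iter_lazy` are hypothetical names, and the paged backend is simulated with plain integers so the sketch runs without a cluster.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page source: takes a paging token and returns (rows, next_token).
# next_token is None once the last page has been returned.
FetchPage = Callable[[Optional[int]], Tuple[List[int], Optional[int]]]


def make_fake_source(total_rows: int, page_size: int, log: List[int]) -> FetchPage:
    """Simulate a paged backend; every fetch appends its start offset to `log`."""
    def fetch_page(token: Optional[int]) -> Tuple[List[int], Optional[int]]:
        start = token or 0
        log.append(start)
        end = min(start + page_size, total_rows)
        next_token = end if end < total_rows else None
        return list(range(start, end)), next_token
    return fetch_page


def iter_greedy(fetch_page: FetchPage) -> Iterator[int]:
    """Greedy behaviour: load every page into memory before yielding anything."""
    rows: List[int] = []
    token: Optional[int] = None
    while True:
        page, token = fetch_page(token)
        rows.extend(page)
        if token is None:
            break
    return iter(rows)


def iter_lazy(fetch_page: FetchPage) -> Iterator[int]:
    """Non-greedy behaviour: yield the current page, then fetch the next one."""
    token: Optional[int] = None
    while True:
        page, token = fetch_page(token)
        for row in page:
            yield row
        if token is None:
            return
```

With the lazy version, asking for the first row triggers a single page fetch and at most one page is held at a time; the greedy version fetches every page up front, which is where the timeouts and memory growth come from.
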
@datastax-bot


Hi @alanjds, thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes.

Sincerely,
DataStax Bot.

@datastax-bot


Thank you @alanjds for signing the Contribution License Agreement.

Cheers,
DataStax Bot.

@aholmberg

Contributor

Thanks a lot for the contribution! We will have a cqlengine-focused release next, so I'll be looking at this then.
I've added a reference to this from an existing JIRA for tracking: https://datastax-oss.atlassian.net/browse/PYTHON-323

@alanjds

alanjds commented Aug 20, 2015

Contributor Author

Thanks for the information, @aholmberg.

However, I still think this is a bugfix rather than an improvement. I would really expect __iter__ to fetch on demand. Otherwise it leads to the annoyances already pointed out: timeouts and memory inflation.

@aboudreault

Copy link
Copy Markdown
Contributor

Thank you @alanjds for your contribution. Your work has been included in #508. I also added some performance improvements in __iter__ and __getitem__ to avoid using count() when possible (it was making iteration slow with large datasets). Feel free to comment.
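
As an illustration of the idea only (this is not the code from #508), indexing can avoid a separate count() by pulling rows from the underlying iterator until the requested position is reached. `LazyResults` below is a hypothetical stand-in for a queryset result cache, and negative indexes are deliberately left unsupported because they would need the full length.

```python
from typing import Iterable, Iterator, List


class LazyResults:
    """Hypothetical result cache that fills itself on demand."""

    def __init__(self, rows: Iterable[int]) -> None:
        self._source: Iterator[int] = iter(rows)
        self._cache: List[int] = []

    def _fill_to(self, index: int) -> None:
        # Pull rows only until the cache covers `index` or the source ends.
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                break

    def __getitem__(self, index: int) -> int:
        if index < 0:
            raise IndexError("negative indexes are not supported in this sketch")
        self._fill_to(index)
        if index >= len(self._cache):
            raise IndexError(index)
        return self._cache[index]
```

Asking for `results[2]` consumes three rows from the source and nothing more; an out-of-range index is only detected once the source is exhausted, which is the trade-off for skipping the count.
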

@aboudreault aboudreault closed this Mar 3, 2016