@@ -93,7 +93,7 @@ for (Row row : rs) {
}
```

- ### Manual paging
+ ### Saving and reusing the paging state

Sometimes it is convenient to save the paging state in order to restore
it later. For example, consider a stateless web service that displays a
@@ -215,3 +215,48 @@ There are two situations where you might want to use the unsafe API:

[gpsu]: http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/ExecutionInfo.html#getPagingStateUnsafe()
[spsu]: http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/Statement.html#setPagingStateUnsafe(byte[])
+
+ ### Offset queries
+
+ Saving the paging state works well when you only let the user move from
+ one page to the next. But it doesn't allow random jumps (like "go
+ directly to page 10"), because you can't fetch a page unless you have
+ the paging state of the previous one. Such a feature would require
+ *offset queries*, but they are not natively supported by Cassandra (see
+ [CASSANDRA-6511](https://issues.apache.org/jira/browse/CASSANDRA-6511)).
+ The rationale is that offset queries are inherently inefficient (the
+ performance will always be linear in the number of rows skipped), so the
+ Cassandra team doesn't want to encourage their use.
+
+ If you really want offset queries, you can emulate them client-side.
+ Performance will still be linear in the number of rows skipped, but that
+ may be acceptable for your use case. For instance, if each page holds 10
+ rows and you show at most 20 pages, reaching the last page means fetching
+ at most 190 extra rows (the 19 pages before it), which is unlikely to be
+ a problem.
+
+ Suppose the page size is 10, the fetch size is 50, and the user asks
+ for page 12 (rows 110 to 119, counting rows from 0):
+
+ * execute the statement a first time (the result set contains rows 0 to
+   49, but you're not going to use them, only the paging state);
+ * execute the statement a second time with the paging state from the
+   first query (rows 50 to 99, again needed only for the paging state);
+ * execute the statement a third time with the paging state from the
+   second query. The result set now contains rows 100 to 149;
+ * skip the first 10 rows of the iterator. Read the next 10 rows and
+   discard the remaining ones.
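
The arithmetic behind these steps can be sketched as follows. This is a minimal illustration, not driver code: `OffsetPlan` and its fields are hypothetical names, pages are assumed to be numbered from 1 and rows from 0, and the calls that execute the statement and carry the paging state from one execution to the next are deliberately left out.

```java
/**
 * Hypothetical helper (not part of the driver): works out how many times
 * the statement must be executed to reach a page, and how many rows of
 * the last result set to skip. Pages are 1-based, rows are 0-based.
 */
final class OffsetPlan {
    final int firstRow;    // index of the first row of the requested page
    final int executions;  // executions needed, including the final one
    final int rowsToSkip;  // rows to skip at the start of the final result set

    private OffsetPlan(int firstRow, int executions, int rowsToSkip) {
        this.firstRow = firstRow;
        this.executions = executions;
        this.rowsToSkip = rowsToSkip;
    }

    static OffsetPlan forPage(int pageNumber, int pageSize, int fetchSize) {
        if (pageNumber < 1 || pageSize < 1 || fetchSize < 1) {
            throw new IllegalArgumentException("all arguments must be >= 1");
        }
        // multiplyExact fails loudly instead of silently overflowing.
        int firstRow = Math.multiplyExact(pageNumber - 1, pageSize);
        // Each execution returns one fetch of fetchSize rows; the requested
        // page starts inside fetch number firstRow / fetchSize (0-based).
        int executions = firstRow / fetchSize + 1;
        int rowsToSkip = firstRow % fetchSize;
        return new OffsetPlan(firstRow, executions, rowsToSkip);
    }

    public static void main(String[] args) {
        OffsetPlan plan = OffsetPlan.forPage(12, 10, 50);
        System.out.println(plan.firstRow + " " + plan.executions + " " + plan.rowsToSkip);
        // prints "110 3 10"
    }
}
```

For page 12 this gives three executions and 10 rows to skip, matching the steps above. When the fetch size is not a multiple of the page size, the requested page can straddle two fetches, so reading it needs rows from the following fetch as well.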
+
+ You'll want to experiment with the fetch size to find the best balance:
+ too small means many background queries; too big means bigger messages
+ and too many unneeded rows returned (we picked 50 above for the sake of
+ example, but it's probably too small -- the default is 5000).
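
To get a rough feel for this trade-off before measuring, the two costs can be estimated side by side. The sketch below is only an estimate under simple assumptions (every execution returns a full fetch, rows are numbered from 0); `FetchCost` and its methods are hypothetical names, and real numbers should come from testing.

```java
/** Hypothetical estimator, not driver code: compares fetch sizes for one jump. */
final class FetchCost {
    /** Executions needed to reach the fetch that contains firstRow (0-based). */
    static int executions(int firstRow, int fetchSize) {
        return firstRow / fetchSize + 1;
    }

    /** Upper bound on rows transferred: each execution returns at most fetchSize rows. */
    static long maxRowsTransferred(int firstRow, int fetchSize) {
        return (long) executions(firstRow, fetchSize) * fetchSize;
    }

    public static void main(String[] args) {
        int firstRow = 110; // first row of page 12 with 10 rows per page
        for (int fetchSize : new int[] {10, 50, 5000}) {
            System.out.println("fetch size " + fetchSize + ": "
                    + executions(firstRow, fetchSize) + " executions, at most "
                    + maxRowsTransferred(firstRow, fetchSize) + " rows transferred");
        }
    }
}
```

With a fetch size of 10 the jump to row 110 takes 12 executions; with 50 it takes 3; with 5000 it takes a single execution but may transfer up to 5000 rows to serve 10.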
+
+ Again, offset queries are inefficient by nature. Emulating them
+ client-side is a compromise for cases where you can afford the
+ performance hit. We recommend that you:
+
+ * test your code at scale with the expected query patterns, to make sure
+   that your assumptions are correct;
+ * set a hard limit on the highest possible page number, to prevent
+   malicious users from triggering queries that would skip a huge number
+   of rows.
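
One way to enforce such a limit is to validate the requested page number before any query runs. The sketch below assumes a limit of 20 pages, as in the earlier example; `PageLimit`, `MAX_PAGE` and the method names are hypothetical, and the right limit depends on your own data and latency budget.

```java
/** Hypothetical guard, not driver code: rejects page numbers above a fixed limit. */
final class PageLimit {
    static final int MAX_PAGE = 20; // assumed limit, taken from the example above

    /** Returns the page number unchanged, or throws if it is out of range (1-based). */
    static int checkedPage(int requested) {
        if (requested < 1 || requested > MAX_PAGE) {
            throw new IllegalArgumentException(
                    "page must be between 1 and " + MAX_PAGE + ", got " + requested);
        }
        return requested;
    }

    /** Worst case: rows skipped to reach the last allowed page. */
    static int maxRowsSkipped(int pageSize) {
        return Math.multiplyExact(MAX_PAGE - 1, pageSize);
    }

    public static void main(String[] args) {
        System.out.println(maxRowsSkipped(10)); // prints 190
    }
}
```

Rejecting the request up front keeps the worst case bounded: with 10 rows per page and at most 20 pages, no request can skip more than 190 rows.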