Commit f4b8e1c

Merge pull request #1674 from tseaver/1551-bigquery-job_query_status

Expand/clarify synchronous query usage docs.

2 parents: f29d0bc + 9ed6997

File tree

1 file changed (+75, -13 lines)
docs/bigquery-usage.rst

Lines changed: 75 additions & 13 deletions
@@ -291,26 +291,88 @@ Run a query which can be expected to complete within bounded time:
     >>> from gcloud import bigquery
     >>> client = bigquery.Client()
-    >>> query = """\
-    SELECT count(*) AS age_count FROM dataset_name.person_ages
-    """
-    >>> query = client.run_sync_query(query)
+    >>> QUERY = """\
+    ... SELECT count(*) AS age_count FROM dataset_name.person_ages
+    ... """
+    >>> query = client.run_sync_query(QUERY)
+    >>> query.timeout_ms = 1000
+    >>> query.run()  # API request
+    >>> query.complete
+    True
+    >>> len(query.schema)
+    1
+    >>> field = query.schema[0]
+    >>> field.name
+    u'count'
+    >>> field.field_type
+    u'INTEGER'
+    >>> field.mode
+    u'NULLABLE'
+    >>> query.rows
+    [(15,)]
+    >>> query.total_rows
+    1
+
+If the rows returned by the query do not fit into the initial response,
+then we need to fetch the remaining rows via ``fetch_data``:
+
+.. doctest::
+
+    >>> from gcloud import bigquery
+    >>> client = bigquery.Client()
+    >>> QUERY = """\
+    ... SELECT * FROM dataset_name.person_ages
+    ... """
+    >>> query = client.run_sync_query(QUERY)
+    >>> query.timeout_ms = 1000
+    >>> query.run()  # API request
+    >>> query.complete
+    True
+    >>> query.total_rows
+    1234
+    >>> query.page_token
+    '8d6e452459238eb0fe87d8eb191dd526ee70a35e'
+    >>> rows = query.rows
+    >>> token = query.page_token  # for initial request
+    >>> while True:
+    ...     do_something_with(query.schema, rows)
+    ...     if token is None:
+    ...         break
+    ...     rows, _, token = query.fetch_data(page_token=token)
+
+
+If the query takes longer than the timeout allowed, ``query.complete``
+will be ``False``. In that case, we need to poll the associated job until
+it is done, and then fetch the results:
+
+.. doctest::
+
+    >>> from gcloud import bigquery
+    >>> client = bigquery.Client()
+    >>> QUERY = """\
+    ... SELECT * FROM dataset_name.person_ages
+    ... """
+    >>> query = client.run_sync_query(QUERY)
     >>> query.timeout_ms = 1000
     >>> query.run()  # API request
+    >>> query.complete
+    False
+    >>> job = query.job
     >>> retry_count = 100
-    >>> while retry_count > 0 and not job.complete:
+    >>> while retry_count > 0 and job.state == 'running':
     ...     retry_count -= 1
     ...     time.sleep(10)
-    ...     query.reload()  # API request
-    >>> query.schema
-    [{'name': 'age_count', 'type': 'integer', 'mode': 'nullable'}]
-    >>> query.rows
-    [(15,)]
+    ...     job.reload()  # API call
+    >>> job.state
+    'done'
+    >>> token = None  # for initial request
+    >>> while True:
+    ...     rows, _, token = query.fetch_data(page_token=token)
+    ...     do_something_with(query.schema, rows)
+    ...     if token is None:
+    ...         break
 
-.. note::
 
-   If the query takes longer than the timeout allowed, ``job.complete``
-   will be ``False``: we therefore poll until it is completed.
 
 Querying data (asynchronous)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
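The token-driven pagination loop in the diff can be exercised without touching BigQuery. Below is a minimal sketch: ``PagedSource`` and ``collect_all_rows`` are hypothetical stand-ins (not part of gcloud), with ``fetch_data`` returning ``(rows, total_rows, next_token)`` the way the docs show the query object behaving.

```python
# Minimal sketch of the token-driven pagination loop from the doctests.
# PagedSource is a hypothetical stand-in for the query object: its
# fetch_data returns (rows, total_rows, next_token), as in the docs.

class PagedSource(object):
    def __init__(self, pages):
        self._pages = pages  # list of per-page row lists

    def fetch_data(self, page_token=None):
        index = 0 if page_token is None else int(page_token)
        rows = self._pages[index]
        has_more = index + 1 < len(self._pages)
        next_token = str(index + 1) if has_more else None
        total_rows = sum(len(page) for page in self._pages)
        return rows, total_rows, next_token


def collect_all_rows(source):
    """Drain every page, mirroring the ``while True`` loop in the docs."""
    collected = []
    token = None  # for initial request
    while True:
        rows, _, token = source.fetch_data(page_token=token)
        collected.extend(rows)
        if token is None:
            break
    return collected


source = PagedSource([[(1,), (2,)], [(3,)], [(4,), (5,)]])
print(collect_all_rows(source))  # [(1,), (2,), (3,), (4,), (5,)]
```

The loop terminates because the source returns ``None`` as the token for its last page, matching the contract the docs describe for ``fetch_data``.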

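The poll-until-done loop can likewise be sketched with a fake job. ``FakeJob`` and ``wait_for_job`` are hypothetical helpers standing in for ``query.job`` and the retry loop; the structure mirrors the doctest, with the 10-second sleep made configurable so the sketch runs instantly.

```python
import time


class FakeJob(object):
    """Hypothetical stand-in for query.job: reload() flips state to
    'done' after a fixed number of polls."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done
        self.state = 'running'

    def reload(self):  # stands in for the API call
        self._remaining -= 1
        if self._remaining <= 0:
            self.state = 'done'


def wait_for_job(job, retry_count=100, delay=0.0):
    """Poll until the job leaves 'running' or retries are exhausted."""
    while retry_count > 0 and job.state == 'running':
        retry_count -= 1
        time.sleep(delay)  # the docs sleep 10 seconds between polls
        job.reload()
    return job.state


print(wait_for_job(FakeJob(polls_until_done=3)))  # 'done'
```

Bounding the loop with ``retry_count`` means a stuck job leaves ``state`` as ``'running'`` rather than hanging forever, which is the reason the docs cap the polling at 100 attempts.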