Opening this feature request for discussion. There is currently no way to set a timeout for cases where to_dataframe() keeps running and eventually stalls because a query returns results too large for a pandas DataFrame.
A timeout given to QueryJob.to_dataframe() should probably be passed through to result(), with the remaining time used to construct the DataFrame.
There is some ambiguity in how to handle RowIterator.to_dataframe because it does not call result(), so there are two separate timeouts that can be given:
```python
client.query(sql).result(timeout=10).to_dataframe(timeout=10)
```
It would likely be confusing to users that a timeout given to to_dataframe() applies to both the query job and the DataFrame construction when called on a QueryJob, but only to the DataFrame construction when called on a RowIterator.
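For discussion, here is a rough sketch of how QueryJob.to_dataframe() could split a single timeout budget between the two phases. The helper name, the fake job/iterator classes, and the budget-splitting logic are all illustrative assumptions, not the actual google-cloud-bigquery implementation:

```python
import time

# Illustrative sketch: spend part of the timeout waiting on result(),
# then hand whatever budget remains to DataFrame construction.
def to_dataframe_with_timeout(job, timeout):
    start = time.monotonic()
    rows = job.result(timeout=timeout)                # query-job portion
    remaining = timeout - (time.monotonic() - start)
    if remaining <= 0:
        raise TimeoutError("query consumed the entire timeout budget")
    return rows.to_dataframe(timeout=remaining)       # DataFrame portion

# Minimal stand-ins so the sketch runs without a BigQuery client.
class _FakeRowIterator:
    def to_dataframe(self, timeout):
        self.received_timeout = timeout
        return "<DataFrame>"

class _FakeQueryJob:
    def result(self, timeout):
        time.sleep(0.05)          # pretend the query takes ~50 ms
        self.rows = _FakeRowIterator()
        return self.rows

job = _FakeQueryJob()
df = to_dataframe_with_timeout(job, timeout=1.0)
```

Under this scheme, a RowIterator obtained directly would still only ever see the DataFrame-construction portion, which is exactly the asymmetry described above.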