Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files
🚀 New features to boost your workflow:
|
temoto
left a comment
There was a problem hiding this comment.
- tests are required
Thanks for your work, great job.
|
Great review. @temoto I have completed a new cookie handler according to your suggestion. I will update the PR later.
|
|
tests/test_cookie.py python{2,3}/httplib2test.py is original, now deprecated test suite, and
|
b2172ca to
190f31b
Compare
There was a problem hiding this comment.
Super, thanks. It's going to be the best cookie implementation.
There is only few more things to change and you can leave to me if you think it's too much work. All credit is yours if I do this too.
- it would probably be good idea to extract all cookie related code to separate file cookie.py
- CookieJar should only contain public API for cookie storage, same as other (null/database/etc) jar implementation; common helpers like normalize_path should go to
Cookieor top level functions - CookieJar
_load_saveprobably should be private - your call - please create Dummy/Null/Noop-CookieJar that always returns empty cookies and ignores
set, then you can forceHttpobject to always have a cookie manager (jar) present; also then you can remove Http.get/set/clear_cookie shortcuts because it's accessible withHttp.cookie_jar.get/set/clear - modify headers once in _request, before any _conn_request call, that way cookie related changes are less spread across code
- please run
black python?/httplib2/cookie.pyto fix double quotes and long lines
It's alright, I will take it.
Good idea, Will do.
No problem. Will do.
I think it's better to keep them as public API. So user still can loads / saves cookies manually when the cookie persistence function is disabled(didn't provide cookie
It's fine to implement a NullCookieJar to elimiate the instead of
Will do. |
190f31b to
73ae9b9
Compare
Nice. Maybe |
temoto
left a comment
There was a problem hiding this comment.
Thanks, you are very fast. Marked important changes. You can skip minor if you like.
|
|
||
| if self.filename: | ||
| self.save() | ||
|
|
There was a problem hiding this comment.
(important)
all CookieJar methods below are not related to file storage, should go to Cookie or module level functions
There was a problem hiding this comment.
_match_path and _match_domain are used by get_header to help search cookies when walking through the <domain>/<path>/<name> tree. They're not related to cookie directly, since at that moment no cookie is get. And they are only used in CookieJar. I think it makes sense to place them in CookieJar as private methods.
There was a problem hiding this comment.
get_header is out of scope file storage (CookieJar) too. Imagine SqliteCookieJar, it needs all these methods but it definitely must not couple with private methods of file storage (CookieJar).
There was a problem hiding this comment.
Sorry, I still didn't get your point about this change.
The httplib2.CookieJar is not a file storage. It's a cookie jar in memory with functions to load cookies from/or save to file. Just like get()/set()/clear(), the get_header() is not special, it means getting the cookies in the format of HTTP header. Similarly, extract_header means setting the cookies from HTTP header.
Maybe it will be much clear if separating the FileCookieJar from the CookieJar. For example:
class CookieJarBase(CookieJar):
<null operations>
NullCookieJar = CookieJarBase
class CookieJar(CookieJarBase):
1. Virtual interfaces to persist the cookies: load()/save()
2. Implement all other operations(also includes `get_header`/`extract_header`) on self._cookies
class FileCookieJar(CookieJar):
Re-implement save()/load() interface with file storage
class SqliteCookieJar(CookieJar): <-- your example
Re-implement save()/load() interface with sqlite databaseThere was a problem hiding this comment.
The word "jar" means storage for cookies. So CookieJar should contain code that is related to storing cookies: in memory, file or somewhere else. get_header, extract_header are helper functions that are not related to storage.
It's very fine and clear for some functions to be at module level. There is no need to implement Java style class hierarchy. That virtual/implement/override nightmare is part of what made Python stdlib HTTP client so hard to use in the first place. Because you can't clearly read code: you start reading a method, it calls something in parent, look there, which calls something not in parent, look back, it's very easy for machine, but we should care about people.
If you want to separate cookie access and storage, it's a great idea but additional work. Because right now, each set() writing to disk is a performance problem.
class CookieJar(object):
"""Cookie in-memory storage"""
def __init__(self):
self._items = {}
def get/set/clear(): # the cookie access methods only
def __hash__(): # to help storage know when contents were modified and should be written to disk/db/etc
# not related to any storage or cookie access
def get_header():
def extract_header():
# and then separate storage, named as json and all other serialization APIs
def load(fileobj) -> CookieJar:
def dump(fileobj, CookieJar):
# this way, new database backend would only implement load/dumpThere was a problem hiding this comment.
Variant of your code with base jar is also fine:
class NullCookieJar(object):
def get/set/clear(): ...
# no load/save/get_header
class MemoryCookieJar(object):
def get/set/clear(): ...
# no load/save/get_header
# inherit MemoryCookieJar only as API convenience, no virtual/override methods
class FileCookieJar(MemoryCookieJar):
def load/save(): ...
# no get/set/clear/get_header
class SqliteCookieJar(object):
def get() -> db.SELECT
def set() -> db.INSERT/UPDATE
def clear() -> db.DELETE
# no load/save/get_header
class BatchSqliteCookieJar(MemoryCookieJar):
def load() -> db.SELECT
def save() -> db.INSERT/UPDATE
# no get/set/clear/get_headerThere was a problem hiding this comment.
Please let me try to explain the issue here from another point.
CookieJar is public API, in sense that new storage backend must implement some methods for everything to work. So we must document what is the API (or at least provide refence implementation that other people may copy/modify). So what's the CookieJar API? You say it includes get_header/extract_header. It's even visible in that NullCookieJar implements get_header (which is wrong). Now somebody needs to store cookies in database. It completely makes sense that they must create get/set/clear, right?
class DatabaseCookieJar(object):
def get/set/clear(): ... # makes sensebut it doesn't work, because Http._set_cookie_header calls your jar get_header. And they think, what my database cookie jar implementation must do differently in get_header? Ah there is (and could be) no new behavior (mind alarm 1). So what are their options?
# option 1
class DatabaseCookieJar(object):
def get/set/clear(): ...
@staticmethod
def get_header(*a, **kw): return CookieJar.get_header(*a, **kw)
# option 2
class DatabaseCookieJar(CookieJar): # why? only to make _set_cookie_header happy
def get/set/clear(): ...And they think, why my database implementation must couple with file cookie storage? There is no good reason. Because _set_cookie_header calls some common helper function via .cookie_jar property.
And yet another point: one may want to construct/parse cookie headers without knowing about file/memory/database cookie storage. Right?
There was a problem hiding this comment.
Variant of your code with base jar is also fine:
class NullCookieJar(object): def get/set/clear(): ... # no load/save/get_header class MemoryCookieJar(object): def get/set/clear(): ... # no load/save/get_header # inherit MemoryCookieJar only as API convenience, no virtual/override methods class FileCookieJar(MemoryCookieJar): def load/save(): ... # no get/set/clear/get_header class SqliteCookieJar(object): def get() -> db.SELECT def set() -> db.INSERT/UPDATE def clear() -> db.DELETE # no load/save/get_header class BatchSqliteCookieJar(MemoryCookieJar): def load() -> db.SELECT def save() -> db.INSERT/UPDATE # no get/set/clear/get_header
Without the load/save virtual interface in MemoryCookieJar, you cannot keep updating the file/sqldatabse on each set/clear operation in the FileCookieJar/BatchSqliteCookieJar
There was a problem hiding this comment.
Yes, you can do like getattr(http.cookie_jar, 'save', lambda: None)() but that is ugly.
temoto
left a comment
There was a problem hiding this comment.
Removing _match_path, _match_domain, get_header from CookieJar is the last important change to be made. Sorry I understand your position but disagree. You have done incredible work. 👍
@temoto I would appreciate it if you can continue the improvement work on this part. I will be busy in the next few weeks. Thank you, for your help and patience. |
Yes, no problem, I will. Thanks for your big contribution. |
Track and set the cookie header automatically.
Cookie session can be enabled by the new parameter
cookie_jar:For cookie persistence, just providing a filename. All the cookie changes will be updated to file in the disk.
A new attribute
cookiesof response object to get parsed cookies:To set custom cookies to the
request, you can add them toHttp.request()method as a parameter directly, the cookies only serve this request.Or you can add them to the cookie jar, these cookies will be managed by cookie jar in subsequent requests.
Fixes #129