bpython / bpython


fix weird boto docstrings #656

Merged

sebastinas merged 1 commit into master from fix-653

Nov 19, 2016

Member

thomasballinger commented Nov 18, 2016

Boto is doing something pretty weird: in Python 3, it makes it possible to end up with bytestring docstrings. We fix this here by always assuming utf8 in this case. Previously we assumed ascii, and did it implicitly by letting string.split(u'\n') turn it into unicode, which was no good.
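To illustrate the difference described above, here is a minimal sketch, not taken from the bpython code base. The sample bytestring is made up; the point is only that a non-ASCII bytestring survives an explicit UTF-8 decode but not an implicit ASCII one, and that mixing bytes with a unicode separator fails on either Python version.

```python
# Hypothetical docstring bytes, standing in for what boto might hand back.
raw = u"Caf\xe9 client\nSecond line".encode("utf-8")

# Decoding as ASCII fails as soon as a non-ASCII byte shows up.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Splitting bytes on a unicode separator: Python 2 decodes implicitly as
# ASCII (UnicodeDecodeError here), Python 3 refuses outright (TypeError).
try:
    raw.split(u"\n")
    mixed_split_ok = True
except (TypeError, UnicodeDecodeError):
    mixed_split_ok = False

# Decoding explicitly as UTF-8 first gives text that splits cleanly.
lines = raw.decode("utf-8").split(u"\n")
```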


        fix weird boto docstrings

ab1cbec

thomasballinger mentioned this pull request

Something about boto3 crashes bpython #653

Closed

sebastinas reviewed

bpython/curtsiesfrontend/replpainter.py

  
            elif isinstance(docstring, str if py3 else unicode):
                pass
            else:
                return []

sebastinas Nov 18, 2016

Contributor

Is the elif and else really necessary? Or in other words: does the elif really cover all valid cases?

thomasballinger Nov 18, 2016

Author Member

The cases to cover:

Py2 bytes -> decode
Py2 unicode -> nop
Py2 something else (integer etc) -> abort

Py3 bytes -> shouldn't happen, but decode
Py3 str -> nop
Py3 something else -> abort

Might be nicer to:

if unicode:
    pass
else:
    try:
        docstring = docstring.decode
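The case list above could be sketched as a small standalone helper. This is an illustration under assumptions, not the code in replpainter.py: the name `normalize_docstring` is hypothetical, and returning `None` stands in for the "abort" case (the reviewed code returns `[]` from the surrounding function instead).

```python
import sys

py3 = sys.version_info[0] >= 3
# `unicode` is only evaluated on Python 2, where it exists.
text_type = str if py3 else unicode  # noqa: F821


def normalize_docstring(docstring):
    """Hypothetical helper: return the docstring as text, or None to abort."""
    if isinstance(docstring, text_type):
        # Py2 unicode / Py3 str: nothing to do.
        return docstring
    if isinstance(docstring, bytes):
        # Py2 str / Py3 bytes: assume UTF-8, as this pull request does.
        return docstring.decode("utf8")
    # Anything else (integer etc.): signal the caller to abort.
    return None
```

For example, `normalize_docstring(b"caf\xc3\xa9")` returns the text `café`, while `normalize_docstring(42)` returns `None`.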

thomasballinger Nov 18, 2016

Author Member

To answer your question: docstrings should always be unicode in Python 3, and in Python 2 they should always be bytestrings, since we're getting them from pydoc.getdoc, which does this normalization. If we somehow got a unicode string in Python 2 that would be OK, but I don't know how that would happen. If we got a bytestring in Python 3, which shouldn't happen, we would try to decode it. So this does cover all valid cases, but it covers some extra ones too.

Now that I see where docstring comes from (pydoc.getdoc) I agree that the else isn't necessary.

The correct thing to do here is to find out the encoding of the source file the docstring comes from, since it doesn't have to be utf8, or at least catch errors here so a bad docstring doesn't crash bpython.
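A rough sketch of what that suggestion might look like, assuming Python 3 only (it relies on `tokenize.detect_encoding` from the standard library). The function names `detect_source_encoding` and `decode_docstring` are hypothetical and are not part of bpython; how the source path would be obtained for a given object is left out.

```python
import tokenize


def detect_source_encoding(path, default="utf-8"):
    """Return the encoding declared by a Python source file, or `default`."""
    try:
        with open(path, "rb") as source:
            # Reads the BOM / coding cookie from the first two lines.
            encoding, _ = tokenize.detect_encoding(source.readline)
    except (OSError, SyntaxError):
        # Unreadable file or invalid coding cookie: fall back.
        return default
    return encoding


def decode_docstring(raw, encoding="utf-8"):
    """Decode a bytestring docstring without ever raising on bad bytes."""
    if not isinstance(raw, bytes):
        return raw
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Last resort: substitute U+FFFD for undecodable bytes.
        return raw.decode("utf-8", errors="replace")
```

With this shape, a docstring that does not match its declared encoding degrades to replacement characters instead of raising.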

sebastinas merged commit f4f05b2 into master

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed

continuous-integration/travis-ci/push The Travis CI build passed