Fix #335 Overloaded varargs method preference by jhubbardnso · Pull Request #336 · jython/jython

jhubbardnso · 2024-06-10T19:27:43Z

The resolution for #221 resulted in non varargs methods/ctors being given preference as designed. However it also resulted in a change in which vararg methods/ctors was selected. To restore the old behavior only update the candidate varagsMatch the first time a candidate match is found. This restores the previous behavior WRT var args while retaining the desired preference for non varargs methods/ctors.

Update the test case to attempt to trigger the issue that the rest of this commit addresses.

The resolution for jython#221 resulted in non varargs methods/ctors being given preference as designed. However it also resulted in a change in which vararg methods/ctors was selected. To restore the old behavior only update the candidate varagsMatch the first time a candidate match is found. This restores the previous behavior WRT var args while retaining the desired preference for non varargs methods/ctors. Update the test case to attempt to trigger the issue that the rest of this commit addresses.

jhubbardnso · 2024-06-10T19:31:52Z

I attempted to update the test case to cover this but I'm not sure that the test case is actually being run. At first I had issues with other tests failing locally (specifically Test org.python.modules._locale._localeTest). Even after ignoring that I couldn't seem to trigger the updated test case to fail. Looking through run report on my terminal I couldn't actually find any evidence of the test being run. I'm happy to work on fixing that if someone wants to point me in the right direction. (I have very little ant experience.)

Stewori · 2024-06-11T10:08:45Z

@jhubbardnso do you have an idea why the regrtest fails? (I cannot look into the details for now.) Is the failing test "bug-compatible" to the old behavior? Or does it reveal a legitimate issue with your patch?

jhubbardnso · 2024-06-11T19:10:49Z

The Ubuntu JDK 8 regression test failure is due to my change. I think that the failing test is bug-compatible with the 2.7.3 behavior but may create issues for @tpoliaw.

Ironically the first failure is exposing exactly the issue I as trying to address! I think the comment @jeff5 left in the mailing list is correct:

The code by which a call is matched to a method always looked somewhat heuristic to me. (I believe this comes from a Greek word meaning "luck".)

Something about my class structure means that the intrinsic ordering has the candidate method which accepts an int... before the candidate method which accepts a boolean.... By restoring the 'use the first match' behavior that previously existed it got my use case working again. However the code is still fundamentally flawed and now one of the test cases which was passing (due to a lucky intrinsic ordering of its methods) is now failing.

The more I look at this the more I'm convinced that the original #221 change is not correct. Consider the methods:

foo(boolean)
foo(int...)

If one were to call foo(3) under the new behavior the non-varargs boolean method would be preferred which results in 3 being converted to the true. This doesn't match the Java behavior. A varargs method is not fundamentally less worthy than a non var args method. To be fair the old behavior might have resolved to the int... method or it might have resovled to the boolean method; it was luck of the draw.

@robertpatrick and @tpoliaw discussed on the original change and I'd appreciate their input there.

I think what needs to happen here is we need a proper heuristic for selecting the optimal function given multiple functions which could all be correct. There seems to be some code for this already in place but it isn't quite being used. I'm probably not going to have the time to dig into that right away.

Depending on how other people feel, it might be nice to go ahead and get this change merged. If others agree that keeping as close as we can to the old behavior is good I can fix-up the tests so they pass but that won't actually address the underlying issue.

For reference here is the test failure

     [exec] ======================================================================
     [exec] FAIL: test_constructor_overloading (test.test_joverload.ComplexOverloadingTests)
     [exec] ----------------------------------------------------------------------
     [exec] Traceback (most recent call last):
     [exec]   File "/home/runner/work/jython/jython/dist/Lib/test/test_joverload.py", line 206, in test_constructor_overloading
     [exec]     self.assertEqual(Reflection.Overloaded(1, 1, 1, 1).constructorVersion, 'int, int...')
     [exec] AssertionError: u'boolean, boolean...' != 'int, int...'
     [exec] 
     [exec] ======================================================================
     [exec] FAIL: test_method_overloading (test.test_joverload.ComplexOverloadingTests)
     [exec] ----------------------------------------------------------------------
     [exec] Traceback (most recent call last):
     [exec]   File "/home/runner/work/jython/jython/dist/Lib/test/test_joverload.py", line 223, in test_method_overloading
     [exec]     self.assertEqual(over.foo(1), "int, int...")
     [exec] AssertionError: u'int...' != 'int, int...'
     [exec]

robertpatrick · 2024-06-11T19:26:18Z

@jhubbardnso There are two interrelated issues:

selecting the overloaded method based on the longest set of matching positional arg types (vs. varargs)
discriminating between various scalar types (byte, boolean, short, int, long, etc.)

The fix in #221 dealt only with number one and restored previous functionality that was working in Jython 2.2.1 but was broken along the way to 2.7.3. Everyone was aware that this did not resolve number 2 (which was broken in every version of Jython even before #221).

Number 2 is very tricky because Jython never really handled this situation. Even if you were able to fix this completely, it would almost certainly break backwards compatibility with existing scripts. Whether that is ok or not is not my decision but, if asked, I would vote against it until the next major release.

tpoliaw · 2024-06-11T20:25:23Z

Glad more people are looking at this, it'd be good for it to be more reliable. Without having looked at this again since 221, is this problem stemming from booleans being ints in python? Or everything being having a truthiness? I don't think there can be a canonical answer and we're going to have to make an arbitrary decision somewhere if foo(int) and foo(bool) are both valid options.

(Going from memory of how it works so might be missing something), could we rely on java reflection to pick the method for us instead of getting the full list of options and then picking ourselves? If every argument had tojava called, could we pass the types to class.getMethod? If that fails we could fall back on the current coercion approach and see if there's any way to get it to run.

robertpatrick · 2024-06-11T22:42:56Z

I think that one of the fundamental problems is converting between Python types and Java types, since there isn’t a definitive one-to-one conversion (e.g., a Python int could be a boolean, short, int, long, or the wrapped version of each in Java). So given a Python int as an argument and an overloaded Java method where there are versions with each Java type, which one do you match against?

Stewori · 2024-06-12T10:26:19Z

I think what would be needed is a scoring mechanism, where each type scores by "similarness". That would require a similarity score table between Java and Python types but should be doable. One issue is that the choice between short, int, long and BigInteger would have to be value-dependent (at least in future Jython 3). https://peps.python.org/pep-0237/ https://stackoverflow.com/questions/2104884/how-does-python-manage-int-and-long
To make a draw less likely, scores of earlier args should be weighted higher, i.e. there should be a weight function for arg position. Args that fall into the vararg range of a method would score against the vararg type, perhaps with some slight additional penalty for being vararg. Finally the highest scoring method is selected (or lowest scoring, if it is more intuitive to score by a type-difference measure).
Implementing such a mechanism would be a bigger change, although it might have enough flexibility to design it such that all use cases may work as expected. I think such an effort would only make sense if coordinated also with Jython 3.

A perhaps simpler approach to eliminate order ambiguity would be to sort the methods "type-lexicographically" wherein types would be assigned a well-defined order (probably by bit number).

It is left as an exercise to the reader to figure out how to deal with array types :) Oh, and modern Java may feature overloaded method signatures that only differ in bounds of generics.

Okay, maybe the idea to rely more on Java reflection is a better starting point.

a Python int could be a boolean, short, int, long, or the wrapped version of each in Java

I think Java would handle at least one conversion direction automatically, i.e. that e.g. a short can also be used as an int, an int as a long, etc. The only real issue would be boolean. What would be needed is an iteration that at first converts every type to its closest fit, asks Java (in succeed early manner), and repeats with other conversion variants. It would have to try every int<->boolean exchange that makes sense (after scanning the Java signatures for possible boolean/int positions). Probably doable, but not trivial. In that light, perhaps the scoring mechanism would even be simpler and more systematic. The scoring mechanism would probably also be the most efficient one, since it can utilize lookup tables and the score for each individual arg should be computable in more or less O(1).

robertpatrick · 2024-06-12T11:51:06Z

I don’t think Java will automatically handle converting from, for example, a PyInt (the class representing a Python int) to a Java int or Integer. All of this type conversion is currently handled by the Jython code.

I am not trying to dissuade you from making this work better—I think this is a large and complex problem. I also had designs on fixing this part of the problem but it was extremely messy so I settled for restoring the general matching functionality that ironically was broken by someone trying to improve varargs matching.

Stewori · 2024-06-12T12:48:13Z

I don’t think Java will automatically handle converting from, for example, a PyInt (the class representing a Python int) to a Java int or Integer.

I mean after initially converting PyInt->int, PyFloat->double, etc. Java would solve its own numeric tower except for boolean. Yes, PyLong->BigInteger rarely exceeds the long range, so the (possibly sometimes misfitting) PyLong->long conversion should be attempted as well. That issue would be more severe for Jython 3 and then should probably be value-dependent (if the value is available at conversion time), i.e. use the tightest fit so Java would resolve the numeric tower as broadly as possible.

I think this is a large and complex problem

Agreed. Still, it needs to be solved and apparently, the previous solutions have been too naive to address the complexity. That's why I am suggesting to take a step back and re-evaluate the problem as a whole. Then I just see the scoring approach popping up as a solution that can address the complexity adequately.

fixing this part of the problem but it was extremely messy

I know, this can quickly become messy - I dealt with similar stuff in other projects. (Big topic in pytypes and more recently in APIdia, for resolving javadoc @links. There I'm using a score-based approach as a fallback for badly written/ambiguous cases. Still haven't got it right for overridden methods that only differ in bounds of generics, but that case is extremely rare.)
Anyway, for Jython I would think the score-based approach would be the least becoming-messy-prone one because the scoring IMHO requires compared to other approaches the least decision structures and special case handling. I would hope it could be implemented rather systematically and in sufficiently modular manner to address additional types and type structures in the future (e.g. to take Python type annotations into account). At this point, however, that is just an intuition.
Do you have experience from your solution designs that speaks against a scoring approach?

robertpatrick · 2024-06-12T14:16:06Z

Do you have experience from your solution designs that speaks against a scoring approach?

No, but it’s not clear exactly what the changes would be. The theory seems good but let’s see how implementing it works out.

Stewori · 2024-06-13T12:17:21Z

Looking at ReflectedArgs.precedence and ReflectedArgs.compare, there seems to be already kind of scoring idea. That is used to assign an order to method signatures. So, the existing attempt is (correct me if I'm wrong) to bring the signatures into one order that permits a greedy/fail-fast/succeed-fast check for later calls. That attempt is probably the core of the simplification that now causes trouble.

I think it is necessary to let ReflectedArgs.matches return an int score rather than a boolean. It should evaluate to 0 if there is no match and be a higher value the better the match. Computation of the score can rely on per-arg scores produced by ReflectedArgs.compare, which should previously be enriched. It already has some scoring idea, i.e. +-1 for insignificant mismatch, +-2 for significant mismatch. That should be expanded into a proper, partially ordering distance measure, perhaps utilizing score values already used in ReflectedArgs.precedence.

Then the selection mechanism in PyReflectedFunction/Constructor can be adjusted to check all matches and select the highest score among the feasible, i.e. non-zero scoring, ones. That is potentially less performant than the previous succeed-fast algorithm. OTOH, there are usually only very few overloaded method signatures and the score computation may be negligible (unmeasurable?) compared to the reflection call overhead (and other performance issues of Jython).

The sketched approach shows how the scoring mechanism can be implemented with minimal changes to existing API, in fact only the return type of ReflectedArgs.matches would have to change. Okay, better style is to introduce the new variant with a different name, e.g. match_score, keep the old one for reference as deprecated, unused method. Then again, although theoretically public API, it is fairly internal to Jython.

jhubbardnso · 2024-07-18T14:04:25Z

Thanks all for your input and sorry that it's taken me so long to reply. I'm also of the opinion that applying an actual order to the list of candidate methods and choosing the best match is the right solution here. I'll try to find some time to implement that.
For now I've gone ahead and closed this as IMO it isn't worth merging. It also turns out that this doesn't fully revert the previous behavior change (I'm not sure why that is) so even if this were merged it wouldn't turn 2.7.3-beta or 2.7.4 into something we could actually use as-is.

jeff5 · 2024-07-20T09:32:55Z

This was never going to make 2.7.4 anyway, so there always was time to think carefully.

The existing code contains some sort of similarity score, but I recall not being able to work out when it influenced the answer.

Is the problem simply that __tojava__ tries too hard to please? I was thinking (Jython 3) that there should be a way to ask whether coercion is possible before demanding it. Perhaps it could return a "distance" that would put Python 1 closer to Java int than to boolean.

If there are N arguments in the call and M methods with the name, the process is one of elimination from a set of M alternatives in at most N steps. I think the set shrinks pretty quickly in almost all real cases, so that better algorithms don't seem worth seeking.

I think it must be possible to work left-to-right asking the question "could we accept this object as a ... " for each of the signatures still in play and reach the same conclusion Java would for primitives. If you reach the end carrying more than one possibility, isn't this where the aggregate score comes in?

Stewori · 2024-07-20T16:23:11Z

If you reach the end carrying more than one possibility, isn't this where the aggregate score comes in?

I think the current implementation returns the first fit, i.e. would never reach the end carrying multiple possibilities. That's what I refer to as "greedy/succeed fast". Indeed I suppose the better solution would be to carry possibly multiple possibilities to the end and score the candidates. Currently the scoring is used to sort the signatures into a most promising order for the greedy approach.

jhubbardnso closed this Jul 18, 2024

jhubbardnso mentioned this pull request May 12, 2026

Fix #335 Improve overloaded varargs resolution #425

Open

Conversation

jhubbardnso commented Jun 10, 2024

Uh oh!

jhubbardnso commented Jun 10, 2024

Uh oh!

Stewori commented Jun 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jhubbardnso commented Jun 11, 2024

Uh oh!

robertpatrick commented Jun 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tpoliaw commented Jun 11, 2024

Uh oh!

robertpatrick commented Jun 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Stewori commented Jun 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

robertpatrick commented Jun 12, 2024

Uh oh!

Stewori commented Jun 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

robertpatrick commented Jun 12, 2024

Uh oh!

Stewori commented Jun 13, 2024

Uh oh!

jhubbardnso commented Jul 18, 2024

Uh oh!

jeff5 commented Jul 20, 2024

Uh oh!

Stewori commented Jul 20, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Stewori commented Jun 11, 2024 •

edited

Loading

robertpatrick commented Jun 11, 2024 •

edited

Loading

robertpatrick commented Jun 11, 2024 •

edited

Loading

Stewori commented Jun 12, 2024 •

edited

Loading

Stewori commented Jun 12, 2024 •

edited

Loading