Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This one is definitely not ready yet. A summary of my discussion with the AI on this one:
Issue #4926: False Positives on Array Access with Mixed Types
Problem Statement
When assigning to different fields of arrays that have a mixed base type (e.g., from
json_decode()), Phan emits false positivePhanTypeInvalidDimOffsetwarnings. The issue occurs because after assigning to one field, Phan creates a restrictiveArrayShapeTypethat only includes the assigned field, causing errors when accessing other fields.Test Cases from Issue
Case 1: Simple Function Parameter
Case 2: Assignment from json_decode
Case 3: Assignment with List Annotation
Root Cause Analysis
AssignmentVisitor Issue (Write Side)
In
src/Phan/Analysis/AssignmentVisitor.php, thevisitDim()method handles assignments to array dimensions like$arr[$key] = $value.Original Behavior:
This creates an
array{x:int}when assigning$arr["x"] = 5, even if$arroriginally had typemixed|array{x:string,y:string}. When this restrictive shape is merged with the original mixed type, subsequent accesses to$arr["y"]fail.Why This Happens:
$GLOBALS['a']['b'] = 'value', we WANT restrictive shapes to track what fields existjson_decode(), we DON'T want restrictive shapes because the full structure might exist at runtimeUnionTypeVisitor Issue (Read Side)
In
src/Phan/AST/UnionTypeVisitor.php, theresolveArrayShapeElementTypes()method handles reading from arrays like$arr[$key].Original Behavior:
When
resolveArrayShapeElementTypesForOffset()returnsfalse(meaning array shapes in the union don't have the key), it would emitPhanTypeInvalidDimOffseterrors even if OTHER types in the union (likemixedor generic arrays) could have the key.Case 2 & 3 Discovery:
Cases 2 and 3 weren't failing on the WRITE side (AssignmentVisitor) but on the READ side. After fixing the write side for case 1, cases 2 and 3 still failed because:
list<array{x:int}>|list<array{x:string,y:string}>(from PHPDoc annotation + assignment inference)$barArr[0]["y"], Phan checked if ANY array shape in the union had field "y"array{x:int}didn't have "y", causing an errorarray{x:string,y:string}DID have "y", so no error should occurListType Discovery:
Case 3 revealed that
ListTypewasn't properly handled when extracting element types. The code treatedListTypeas "could have any key" which is incorrect - lists only accept integer keys >= 0.Solutions Implemented
1. AssignmentVisitor Fix (Write Side)
File:
src/Phan/Analysis/AssignmentVisitor.phpLines: 885-936
Strategy: Detect whether the base array comes from an external mixed source vs. a new structure being created.
Key Decisions:
$GLOBALSalways use array shapes (for tracking fields)Edge Case: GLOBALS Test
The
tests/rasmus_files/src/0024_GLOBALS.phptest showed that$GLOBALSmust always use array shapes, not generic arrays, even though it has a mixed base type. This is because GLOBALS is being POPULATED, not read from an external source.2. UnionTypeVisitor Fix (Read Side)
File:
src/Phan/AST/UnionTypeVisitor.phpLines: 2088-2140
Strategy: Don't emit errors if ANY type in the union (PHPDoc or real) could have the accessed key.
Original Bug:
The original code had:
This treated
ListTypeas "could have any key" which is wrong. Lists can only have integer keys >= 0, not string keys like "aces".Fix:
Split ListType into its own case with proper validation:
3. ListType Element Extraction
File:
src/Phan/AST/UnionTypeVisitor.phpLines: 2200-2216
Problem: When accessing
$barArr[0]where$barArrislist<array{x:string,y:string}>, Phan wasn't extracting the element typearray{x:string,y:string}from the list.Solution:
Failed Attempts:
$type->asGenericArrayType($code_base)->genericArrayElementTypes(true, $code_base)- wrong signature$type->asGenericArrayType(KEY_INT)- still wrong, asGenericArrayType expects int enumgenericArrayElementType()(singular) which returns a Type, then convert to UnionTypeTest Results
✅ What Works
Issue #4926 Test Cases:
Existing Tests:
$deck['aces']on alist<mixed>12 Test Failures:
Analysis:
Most failures appear to be related to:
Most Concerning: 0816_json_decode_objects.php
This is a FALSE NEGATIVE - a warning that SHOULD be emitted but isn't:
Expected:
PhanTypeArraySuspiciousNullable Suspicious array access to $content of nullable type ?bool|?float|?int|?list<mixed>|?stringActual: No warning
The fix may have suppressed this warning because
list<mixed>is in the union and we now consider it "could have any key".Attempted Solutions That Failed
Attempt 1: Simple Mixed Check
Problem: Broke GLOBALS test because $GLOBALS has mixed type but needs array shapes.
Attempt 2: Simple Variable Check
Problem: Case 2 failed because
$barArr["x"]is a simple variable.Attempt 3: Check for Array Shapes in Union
Problem: Many mixed unions have array shapes from PHPDoc annotations, so this check was too broad.
Attempt 4: Wrong ListType Method Calls
Solution: Use
genericArrayElementType()(singular) which returns a Type, then convert.Open Questions
1. Is the "Could Have Key" Logic Too Permissive?
The current fix checks if ANY type in a union could have a key. This means:
array{x:int}|mixedaccessing["y"]→ No error (mixed could have "y")array{x:int}|list<mixed>accessing["y"]→ No error (generic array could have "y")Is this the right behavior? Or should we:
2. Should Empty List Emit Warnings?
Test 0816_json_decode_objects.php expects a warning when accessing
$content[0]where$contentis?bool|?float|?int|?list<mixed>|?string.The list is nullable AND could be empty, so accessing
[0]is suspicious. But our fix treatslist<mixed>as "could have any integer key" which suppresses the nullable warning.Should we:
3. Should We Track Assignment Context?
Currently we distinguish external sources by checking if the root variable has generic/mixed types. But what if:
array{}(empty)array{x:string,y:string}$arr["x"] = 5Should the assignment create a restrictive
array{x:int}or respect the PHPDoc annotation and keep the full shape?4. Performance Impact?
The fix adds:
Has this introduced any measurable performance penalty on large codebases?
Recommendations
Option A: Ship with Known Regressions
Option B: Refine the Fix
Option C: Hybrid Approach
Files Modified
src/Phan/Analysis/AssignmentVisitor.php
src/Phan/AST/UnionTypeVisitor.php
tests/files/src/4926_array_mixed_access.php (new)
tests/files/expected/4926_array_mixed_access.php.expected (new)
Test Commands
Debugging Techniques Used
@phan-debug-var Annotation
Type Flow Tracing
Created test files with progressive steps to see where types changed:
/tmp/test_barr0_detailed.php- Showed when array shapes were created/tmp/test_union_access.php- Demonstrated union type field access/tmp/test_just_index0.php- Isolated ListType element extractionTest-Driven Investigation
Conclusion
The fix successfully resolves issue #4926 for all three reported cases by:
However, the fix introduces 12 test regressions that need investigation. The most concerning is a false negative in json_decode nullable array access. The root issue is balancing between:
Further refinement may be needed to find the right balance, possibly with configuration options for strictness levels or special handling for nullable types.