-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding methods for sequential processing and filtering of class reflections #1475
base: 6.52.x
Are you sure you want to change the base?
Adding methods for sequential processing and filtering of class reflections #1475
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Patch has very good quality, and is worth looking at, but I think you spiked an approach and pushed it too hard here, leading to a bigger than necessary change upfront.
Before deciding on which approach to pick, we should attempt a second patch that:
- isolates filtering into a single, dedicated source locator implementation
- makes the default implementation of
iterable<ReflectionClass>
aGenerator
(happy to break BC there, and release a new major version!)
public function iterateClasses(?SourceFilter $filter = null): Generator | ||
{ | ||
/** @var Generator<ReflectionClass> $allClasses */ | ||
$allClasses = $this->sourceLocator->iterateIdentifiersByType( | ||
$this, | ||
new IdentifierType(IdentifierType::IDENTIFIER_CLASS), | ||
$filter, | ||
); | ||
|
||
yield from $allClasses; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think SourceFilter
should be given at call-time: it should be an implementation detail of the source locator only, IMO.
/** | ||
* Iterate classes available in the scope specified by the SourceLocator. | ||
* | ||
* @return Generator<ReflectionClass> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I'd say that we can change the default reflectAllClasses()
API to return an iterable<ReflectionClass>
, rather than adding new API here.
Yes, it's a breaking change, but it's a good one to have
public function iterateIdentifiersByType( | ||
Reflector $reflector, | ||
IdentifierType $identifierType, | ||
?SourceFilter $sourceFilter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned elsewhere: this is IMO something that shouldn't be given at call-time, but should be instead intrinsic of the way you are looking at a codebase (effectively, a FilteredSourceLocator
)
interface SourceFilter | ||
{ | ||
public function getKey(): string; | ||
|
||
public function isAllowed(string $source, ?string $name, ?string $filename = null): bool; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this API is too specific to your use-case, to be considered as part of the library itself.
Also, following general FP patterns, and having BC boundaries in mind, I'd rather say that isAllowed()
should be relying on a userland callback, and since each locator operates slightly differently, it's hard to design a generic Context
object to be used for filtering decisions
new AggregateFilter( | ||
new FileSizeFilter(10000), | ||
new SourceContainsFilter(['class ReflectionMethod', 'class ReflectionClass']) | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I see, you attempted to design this code so that you could push a runtime filter down the stack, into the lookup operation, for example, to select all classes in a specific namespace.
I understand the rationale, but am a bit conflicted on the design, and the attempt to create a structured filter here, which is extremely risky due to the internals of source locators being squishy, whilst this interface being public
.
I don't have a clear design/suggestion for you right now, but I think a first attempt should be done at creating a single FilteringSourceLocator
that can wrap a $next
one and apply filtering at that level, operating on a LocatedSource|null
: that would make the use-case more specific, and would isolate the filtering change onto a single implementation.
$content = file_get_contents($this->fileName); | ||
if (! $content) { | ||
return null; | ||
} | ||
|
||
$name = $identifier->getName(); | ||
|
||
if ( | ||
$name !== Identifier::WILDCARD && | ||
! str_starts_with($name, ReflectionClass::ANONYMOUS_CLASS_NAME_PREFIX) | ||
) { | ||
|
||
$shortName = array_reverse(explode('\\', $identifier->getName()))[0]; | ||
|
||
if (! str_contains($content, $shortName)) { | ||
return null; | ||
} | ||
} | ||
|
||
return new LocatedSource( | ||
file_get_contents($this->fileName), | ||
$identifier->getName(), | ||
$content, | ||
$name, | ||
$this->fileName, | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably attempt filtering in an alternative implementation of SingleFileSourceLocator
, then look for who is instantiating SingleFileSourceLocator
, and see if we can somehow inject the constructor for it there 🤔
It's just an idea, but it would allow for filtering and transforming sources before skipping or passing them to parsing/reflection
Related: #614 |
@Ocramius , thank you for the review! I agree that the changes turned out to be too big. In this case, I suggest splitting this work into 3 separate parts with 3 separate PRs:
I'll start doing this in the specified order if you don't mind :) UPD: first part completed here: #1476 |
would be great to have some kind of benchmark to compare the performance before/after this PR |
I made a small optimization of the
FileIteratorSourceLocator
so that all files are not loaded into the array before filtering, but only after it, and also added new functionality for sequential work with files and the ability to filter them.Motivation
When working with really big projects, performance issues often arise. They can be solved with some techniques, such as sequential file processing. This was my main motivation for this work.
What has been done?
*Reflection
have been added toSourceLocators
;DefaultReflector
for sequentially obtaining reflect classes;*Reflection
. In most cases, filters are triggered before all complex operations with AST, so filtering in this way is a much more productive method than obtaining individual classes usingreflectClass
;FileIteratorSourceLocator
has been changed. Now they are processed sequentially, and not immediately loaded into an array. This reduces resource consumption and speeds up the work of this locator.