Conversation
…neural network layers.
…hensive unit tests and fixed broken source file link in the documentation
…d updated documentation with fixed source file link. Added `Parametric` interface to define parameterized layers.
…ith `NumPower` utilities
…ical stability during inference, and gradient computation logic
…ical stability during inference, and gradient computation logic
…itional tests and updated shape handling
…itional tests and updated shape handling
…interface definition for output layers.
…entation and unit tests
…d/backward passes
…ence/backward passes, unit tests, and documentation updates
…ith `NumPower` utilities
…rd/inference/backward passes, unit tests
…ference/backward passes, unit tests
…rward/inference/backward passes, unit tests
Pull request overview
This PR appears to migrate parts of the neural-network stack toward NumPower-backed implementations and PHPUnit attribute-based tests, adding a new NumPower-based Network and several NumPower-based layers plus an MLPRegressor variant under a new namespace.
Changes:
- Added NumPower-based neural network primitives (Network, layer contracts, multiple layers) and snapshotting support.
- Added extensive PHPUnit tests for the new NumPower-based components and modernized some existing tests to use PHPUnit attributes.
- Updated initializers to pass `loc: 0.0` to NumPower normal/truncatedNormal calls and adjusted a shape assertion exception message.
Reviewed changes
Copilot reviewed 57 out of 58 changed files in this pull request and generated 6 comments.
Summary per file:
| File | Description |
|---|---|
| tests/Specifications/SamplesAreCompatibleWithDistanceTest.php | Converts test annotations to PHPUnit attributes. |
| tests/Regressors/MLPRegressors/MLPRegressorTest.php | Adds test coverage for new Rubix\ML\Regressors\MLPRegressor\MLPRegressor. |
| tests/NeuralNet/Snapshots/SnapshotTest.php | Adds tests for new snapshot implementation. |
| tests/NeuralNet/NumPower/NumPowerTest.php | Adds a basic NumPower transpose behavior test. |
| tests/NeuralNet/Networks/NetworkTest.php | Adds tests for new Rubix\ML\NeuralNet\Networks\Network. |
| tests/NeuralNet/Layers/Swish/SwishTest.php | Adds tests for NumPower-based Swish layer. |
| tests/NeuralNet/Layers/Placeholder1DTest.php | Normalizes strict_types formatting. |
| tests/NeuralNet/Layers/Placeholder1D/Placeholder1DTest.php | Adds tests for NumPower-based Placeholder1D layer. |
| tests/NeuralNet/Layers/PReLU/PReLUTest.php | Adds tests for NumPower-based PReLU layer. |
| tests/NeuralNet/Layers/Noise/NoiseTest.php | Adds tests for NumPower-based Noise layer. |
| tests/NeuralNet/Layers/Multiclass/MulticlassTest.php | Adds tests for NumPower-based Multiclass output layer. |
| tests/NeuralNet/Layers/Dropout/DropoutTest.php | Adds tests for NumPower-based Dropout layer. |
| tests/NeuralNet/Layers/Dense/DenseTest.php | Adds tests for NumPower-based Dense layer. |
| tests/NeuralNet/Layers/Continuous/ContinuousTest.php | Adds tests for NumPower-based Continuous output layer. |
| tests/NeuralNet/Layers/Binary/BinaryTest.php | Adds tests for NumPower-based Binary output layer. |
| tests/NeuralNet/Layers/BatchNorm/BatchNormTest.php | Adds tests for NumPower-based BatchNorm layer. |
| tests/NeuralNet/Layers/Activation/ActivationTest.php | Adds tests for NumPower-based Activation layer wrapper. |
| tests/NeuralNet/FeedForwards/FeedForwardTest.php | Adds tests for NumPower-based FeedForward wrapper. |
| tests/NeuralNet/FeedForwardTest.php | Migrates existing FeedForward test to PHPUnit attributes. |
| tests/Helpers/GraphvizTest.php | Fixes CoversClass target and migrates to PHPUnit attributes. |
| tests/Datasets/Generators/SwissRoll/SwissRollTest.php | Adds tests for SwissRoll generator. |
| src/Traits/AssertsShapes.php | Switches to project InvalidArgumentException and updates message. |
| src/Regressors/MLPRegressor/MLPRegressor.php | Introduces new NumPower-based MLPRegressor under a sub-namespace. |
| src/NeuralNet/Snapshots/Snapshot.php | Adds Snapshot utility for restoring parametric layer parameters. |
| src/NeuralNet/Parameters/Parameter.php | Changes cloning to deep-copy via array roundtrip for stability. |
| src/NeuralNet/Networks/Network.php | Adds NumPower-based Network implementation (infer/roundtrip/exportGraphviz). |
| src/NeuralNet/Layers/Swish/Swish.php | Adds NumPower-based Swish layer implementation. |
| src/NeuralNet/Layers/Placeholder1D/Placeholder1D.php | Adds NumPower-based Placeholder1D input layer. |
| src/NeuralNet/Layers/PReLU/PReLU.php | Adds NumPower-based PReLU layer. |
| src/NeuralNet/Layers/Noise/Noise.php | Adds NumPower-based Noise layer. |
| src/NeuralNet/Layers/Multiclass/Multiclass.php | Adds NumPower-based Multiclass output layer. |
| src/NeuralNet/Layers/Dropout/Dropout.php | Adds NumPower-based Dropout layer. |
| src/NeuralNet/Layers/Dense/Dense.php | Adds NumPower-based Dense layer. |
| src/NeuralNet/Layers/Continuous/Continuous.php | Adds NumPower-based Continuous output layer. |
| src/NeuralNet/Layers/Binary/Binary.php | Adds NumPower-based Binary output layer. |
| src/NeuralNet/Layers/BatchNorm/BatchNorm.php | Adds NumPower-based BatchNorm layer. |
| src/NeuralNet/Layers/Base/Contracts/Parametric.php | Adds Parametric layer contract for parameter enumeration/restore. |
| src/NeuralNet/Layers/Base/Contracts/Output.php | Adds Output layer contract including backprop API. |
| src/NeuralNet/Layers/Base/Contracts/Layer.php | Adds common Layer contract (width/initialize/forward/infer). |
| src/NeuralNet/Layers/Base/Contracts/Input.php | Adds Input layer marker contract. |
| src/NeuralNet/Layers/Base/Contracts/Hidden.php | Adds Hidden layer contract including backprop API. |
| src/NeuralNet/Layers/Activation/Activation.php | Adds NumPower-based Activation wrapper layer. |
| src/NeuralNet/Initializers/Xavier/XavierNormal.php | Updates truncatedNormal call to include explicit loc parameter. |
| src/NeuralNet/Initializers/Normal/TruncatedNormal.php | Updates truncatedNormal call to include explicit loc parameter. |
| src/NeuralNet/Initializers/Normal/Normal.php | Updates normal call to include explicit loc parameter. |
| src/NeuralNet/Initializers/LeCun/LeCunNormal.php | Updates truncatedNormal call to include explicit loc parameter. |
| src/NeuralNet/Initializers/He/HeNormal.php | Updates truncatedNormal call to include explicit loc parameter. |
| src/NeuralNet/FeedForwards/FeedForward.php | Adds NumPower-based FeedForward wrapper (extends new Network namespace). |
| src/Datasets/Generators/SwissRoll/SwissRoll.php | Adds SwissRoll dataset generator implemented with NumPower. |
| phpunit.xml | Sets phpunit process memory_limit to 256M. |
| docs/neural-network/hidden-layers/swish.md | Updates docs namespace/path for Swish. |
| docs/neural-network/hidden-layers/prelu.md | Updates docs namespace/path for PReLU. |
| docs/neural-network/hidden-layers/placeholder1d.md | Adds docs for Placeholder1D. |
| docs/neural-network/hidden-layers/noise.md | Updates docs namespace/path for Noise. |
| docs/neural-network/hidden-layers/dropout.md | Updates docs namespace/path for Dropout. |
| docs/neural-network/hidden-layers/dense.md | Updates docs namespace/path for Dense. |
| docs/neural-network/hidden-layers/batch-norm.md | Updates docs namespace/path for BatchNorm. |
| docs/neural-network/hidden-layers/activation.md | Updates docs namespace/path for Activation. |
Excerpt from `src/NeuralNet/Layers/Swish/Swish.php`:

```php
        // Gradient of the loss with respect to beta
        // dL/dbeta = sum_over_batch(dL/dy * dy/dbeta)
        // Here we use a simplified formulation: dL/dbeta ~ sum(dOut * input)
        $dBetaFull = NumPower::multiply($dOut, $this->input);

        // Sum over the batch axis (axis = 1) to obtain a gradient vector [width]
        $dBeta = NumPower::sum($dBetaFull, axis: 1);

        $this->beta->update($dBeta, $optimizer);

        $input = $this->input;
        $output = $this->output;

        $this->input = $this->output = null;

        return new Deferred([$this, 'gradient'], [$input, $output, $dOut]);
    }

    /**
     * Calculate the gradient for the previous layer.
     *
     * @internal
     *
     * @param NDArray $input
     * @param NDArray $output
     * @param NDArray $dOut
     * @return NDArray
     */
    public function gradient(NDArray $input, NDArray $output, NDArray $dOut) : NDArray
    {
        $derivative = $this->differentiate($input, $output);

        return NumPower::multiply($derivative, $dOut);
    }

    /**
     * Return the parameters of the layer.
     *
     * @internal
     *
     * @throws \RuntimeException
     * @return Generator<Parameter>
     */
    public function parameters() : Generator
    {
        if (!$this->beta) {
            throw new RuntimeException('Layer has not been initialized.');
        }

        yield 'beta' => $this->beta;
    }

    /**
     * Restore the parameters in the layer from an associative array.
     *
     * @internal
     *
     * @param Parameter[] $parameters
     */
    public function restore(array $parameters) : void
    {
        $this->beta = $parameters['beta'];
    }

    /**
     * Compute the Swish activation function and return a matrix.
     *
     * @param NDArray $input
     * @throws RuntimeException
     * @return NDArray
     */
    protected function activate(NDArray $input) : NDArray
    {
        if (!$this->beta) {
            throw new RuntimeException('Layer has not been initialized.');
        }

        // Reshape beta vector [width] to column [width, 1] for broadcasting
        $betaCol = NumPower::reshape($this->beta->param(), [$this->width(), 1]);

        $zHat = NumPower::multiply($betaCol, $input);

        $activated = $this->sigmoid->activate($zHat);

        return NumPower::multiply($activated, $input);
    }

    /**
     * Calculate the derivative of the activation function at a given output.
     * Formulation: derivative = (output / input) * (1 - output) + output
     *
     * @param NDArray $input
     * @param NDArray $output
     * @throws RuntimeException
     * @return NDArray
     */
    protected function differentiate(NDArray $input, NDArray $output) : NDArray
    {
        if (!$this->beta) {
            throw new RuntimeException('Layer has not been initialized.');
        }

        // Prevent division by zero if the input contains zero values
        $denominator = NumPower::add($input, EPSILON);
        $term1 = NumPower::divide($output, $denominator);

        $oneMinusOutput = NumPower::subtract(1.0, $output);
        $product = NumPower::multiply($term1, $oneMinusOutput);

        return NumPower::add($product, $output);
    }
```
Swish’s backprop does not account for the trainable beta parameter: (1) differentiate() computes dy/dx as sigma + x * sigma * (1 - sigma), which is only correct when beta = 1; the correct derivative includes a beta factor: sigma + beta * x * sigma * (1 - sigma). (2) back() computes dBeta as sum(dOut * input), but dy/dbeta depends on x^2 * sigma * (1 - sigma). This will produce incorrect gradients once beta updates away from 1 and can prevent the layer from training properly. Please update both gradient calculations to include beta and use the correct dy/dbeta formulation.
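To make the suggestion concrete, here is a rough sketch of what the corrected formulation could look like, following the broadcasting conventions already used in the snippet above (beta reshaped to a [width, 1] column, batch on axis 1, sigma recovered from the cached output as output / input, as differentiate() already does). Variable names are illustrative, not a drop-in patch:

```php
// sigma(beta * x), recovered from the cached Swish output: output / input
$sigma = NumPower::divide($output, NumPower::add($input, EPSILON));
$oneMinusSigma = NumPower::subtract(1.0, $sigma);

// Broadcast beta [width] to a column [width, 1], matching activate()
$betaCol = NumPower::reshape($this->beta->param(), [$this->width(), 1]);

// dy/dx = sigma + beta * x * sigma * (1 - sigma)   (used by differentiate())
$derivative = NumPower::add(
    $sigma,
    NumPower::multiply(
        NumPower::multiply($betaCol, $input),
        NumPower::multiply($sigma, $oneMinusSigma)
    )
);

// dy/dbeta = x^2 * sigma * (1 - sigma); dL/dbeta sums dOut * dy/dbeta over the batch axis (used by back())
$dyDbeta = NumPower::multiply(
    NumPower::multiply($input, $input),
    NumPower::multiply($sigma, $oneMinusSigma)
);
$dBeta = NumPower::sum(NumPower::multiply($dOut, $dyDbeta), axis: 1);
```

With these two changes the gradient for the previous layer and the beta update both stay correct as beta moves away from 1.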
Excerpt from `src/NeuralNet/Layers/Dropout/Dropout.php`:

```php
    public function forward(NDArray $input) : NDArray
    {
        // Build dropout mask using NumPower's uniform RNG. Each unit is kept
        // with probability (1 - ratio) and scaled by $this->scale.
        $shape = $input->shape();

        // Uniform random numbers in [0, 1) with same shape as input
        $rand = NumPower::uniform($shape, 0.0, 1.0);

        // mask = (rand > ratio) * scale
        $mask = NumPower::greater($rand, $this->ratio);
        $mask = NumPower::multiply($mask, $this->scale);

        $output = NumPower::multiply($input, $mask);

        $this->mask = $mask;

        return $output;
    }
```
Dropout::forward() uses randomness but does not verify the layer has been initialized (width is set in initialize()) before applying the mask. Other layers consistently throw when used uninitialized; calling forward() on an uninitialized Dropout would currently succeed silently. Consider adding an initialization guard (and potentially handling ratio/stdDev edge cases consistently) to keep layer lifecycle behavior uniform.
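One possible shape for the guard, assuming the layer records its width in a nullable property set by initialize() (the `$width` property name is an assumption about this layer's internals):

```php
    public function forward(NDArray $input) : NDArray
    {
        // Mirror the other layers: refuse to run before initialize() has been called.
        if ($this->width === null) {
            throw new RuntimeException('Layer has not been initialized.');
        }

        // ... existing mask generation and scaling as in the snippet above ...
    }
```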
Excerpt from `tests/NeuralNet/Layers/Dropout/DropoutTest.php`:

```php
    #[Test]
    #[TestDox('Method forward() applies dropout mask with correct shape and scaling')]
    public function testForward() : void
    {
        $this->layer->initialize($this->fanIn);

        $forward = $this->layer->forward($this->input);

        $inputArray = $this->input->toArray();
        $forwardArray = $forward->toArray();

        self::assertSameSize($inputArray, $forwardArray);

        $scale = 1.0 / (1.0 - 0.5); // ratio = 0.5

        $nonZero = 0;
        $total = 0;

        foreach ($inputArray as $i => $row) {
            foreach ($row as $j => $x) {
                $y = $forwardArray[$i][$j];
                $total++;

                if (abs($x) < 1e-12) {
                    // If input is (near) zero, output should also be ~0
                    self::assertEqualsWithDelta(0.0, $y, 1e-7);
                    continue;
                }

                if (abs($y) < 1e-12) {
                    // Dropped unit
                    continue;
                }

                $nonZero++;

                // Kept unit should be scaled input
                self::assertEqualsWithDelta($x * $scale, $y, 1e-6);
            }
        }

        // Roughly (1 - ratio) of units should be non-zero; allow wide tolerance
        $expectedKept = (1.0 - 0.5) * $total;
        self::assertGreaterThan(0, $nonZero);
        self::assertLessThan($total, $nonZero);
        self::assertEqualsWithDelta($expectedKept, $nonZero, $total * 0.5);
    }
```
This test is probabilistic and can be flaky: with ratio=0.5 on a 3x3 input, there is a non-trivial chance (~0.4%) that all units are dropped or none are dropped, which will fail the assertGreaterThan(0, $nonZero) / assertLessThan($total, $nonZero) assertions. To make CI stable, seed the RNG used by NumPower (if supported) or use a deterministic mask (e.g., restore a precomputed mask) for assertions about dropout behavior.
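If NumPower's RNG cannot be seeded from the test, one deterministic alternative is to drop the count-based assertions and check only the per-element invariant, which holds for every possible mask: each output is either ~0 (dropped) or the input scaled by 1 / (1 - ratio) (kept). A sketch reusing the existing fixture's `$this->layer` and `$this->input`:

```php
    #[Test]
    #[TestDox('Method forward() zeroes or rescales every unit')]
    public function testForwardInvariant() : void
    {
        $this->layer->initialize($this->fanIn);

        $forward = $this->layer->forward($this->input);

        $inputArray = $this->input->toArray();
        $forwardArray = $forward->toArray();

        $scale = 1.0 / (1.0 - 0.5); // ratio = 0.5

        foreach ($inputArray as $i => $row) {
            foreach ($row as $j => $x) {
                $y = $forwardArray[$i][$j];

                // Regardless of which mask was drawn, every unit is either
                // dropped (~0) or kept and scaled by 1 / (1 - ratio).
                $dropped = abs($y) < 1e-7;
                $kept = abs($y - $x * $scale) < 1e-6;

                self::assertTrue($dropped || $kept, "Element [$i][$j] is neither dropped nor correctly scaled.");
            }
        }
    }
```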
Excerpt from `tests/NeuralNet/Layers/Noise/NoiseTest.php`:

```php
    #[Test]
    #[TestDox('Computes forward pass that adds Gaussian noise with correct shape and scale')]
    public function testForwardAddsNoiseWithCorrectProperties() : void
    {
        $this->layer->initialize($this->fanIn);

        $forward = $this->layer->forward($this->input);

        self::assertInstanceOf(NDArray::class, $forward);

        $inputArray = $this->input->toArray();
        $forwardArray = $forward->toArray();

        // 1) Shape is preserved
        self::assertSameSize($inputArray, $forwardArray);

        // 2) At least one element differs (very high probability)
        $allEqual = true;
        foreach ($inputArray as $i => $row) {
            if ($row !== $forwardArray[$i]) {
                $allEqual = false;
                break;
            }
        }
        self::assertFalse($allEqual, 'Expected forward output to differ from input due to noise.');

        // 3) Empirical std dev of (forward - input) is ~ stdDev, within tolerance
        $diffs = [];
        foreach ($inputArray as $i => $row) {
            foreach ($row as $j => $v) {
                $diffs[] = $forwardArray[$i][$j] - $v;
            }
        }

        $n = count($diffs);
        $mean = array_sum($diffs) / $n;

        $var = 0.0;
        foreach ($diffs as $d) {
            $var += ($d - $mean) * ($d - $mean);
        }
        $var /= $n;
        $std = sqrt($var);

        // Mean of noise should be near 0, std near $this->stdDev
        self::assertEqualsWithDelta(0.0, $mean, 2e-1); // +/-0.2 around 0
        self::assertEqualsWithDelta(0.1, $std, 1e-1); // +/-0.1 around 0.1
    }
```
This test relies on random Gaussian noise without seeding the RNG and makes statistical assertions (mean/std) on only 9 samples. Even with wide tolerances, it can intermittently fail depending on the random draw and underlying RNG implementation. To avoid flaky CI, seed the RNG used by NumPower (if supported) or change the assertions to deterministic properties (shape/type) and/or use a fixed, injected noise tensor for verification.
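For illustration, a sketch of a variant restricted to properties that hold for any random draw (shape preserved, finite float values), reusing the fixture's `$this->layer` and `$this->input`; the mean/std checks could then be kept in a separate test that runs on a larger, seeded sample if NumPower exposes seeding:

```php
    #[Test]
    #[TestDox('Method forward() preserves shape and produces finite values')]
    public function testForwardDeterministicProperties() : void
    {
        $this->layer->initialize($this->fanIn);

        $forward = $this->layer->forward($this->input);

        self::assertInstanceOf(NDArray::class, $forward);

        $inputArray = $this->input->toArray();
        $forwardArray = $forward->toArray();

        // Shape is preserved for every draw of the noise tensor.
        self::assertSameSize($inputArray, $forwardArray);

        foreach ($forwardArray as $i => $row) {
            self::assertSameSize($inputArray[$i], $row);

            foreach ($row as $value) {
                // Additive Gaussian noise on finite inputs stays finite.
                self::assertIsFloat($value);
                self::assertTrue(is_finite($value));
            }
        }
    }
```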
Excerpt from `src/NeuralNet/Layers/Base/Contracts/Parametric.php`:

```php
    /**
     * Return the parameters of the layer.
     *
     * @return Generator<\Rubix\ML\NeuralNet\Parameter>
```
The Parametric interface docblock has an incorrect return type reference (Generator<\Rubix\ML\NeuralNet\Parameter>). The actual Parameter type is Rubix\ML\NeuralNet\Parameters\Parameter (and the file imports that class). Updating the docblock improves static analysis and avoids confusing API docs.
Suggested change:

```diff
-     * @return Generator<\Rubix\ML\NeuralNet\Parameter>
+     * @return Generator<Parameter>
```