
Commit 35e94d0

Merge branch 'main' into keras-v3
Parents: 8284757 + 3d4c8f3


66 files changed: +1700, -295 lines

.pre-commit-config.yaml (+1, -1)

@@ -41,7 +41,7 @@ repos:
         args: ["--py310-plus"]
 
   - repo: https://github.com/pycqa/flake8
-    rev: 7.1.2
+    rev: 7.2.0
     hooks:
     - id: flake8
       exclude: docs/conf.py

CITATION.cff (+2, -1)

@@ -4,7 +4,8 @@ type: software
 authors:
   - given-names: "FastML Team"
 title: "hls4ml"
-version: "v1.0.0"
+version: "v1.1.0"
+date-released: "2025-03-17"
 doi: 10.5281/zenodo.1201549
 repository-code: "https://github.com/fastmachinelearning/hls4ml"
 url: "https://fastmachinelearning.org/hls4ml"

README.md (+2, -2)

@@ -73,9 +73,9 @@ If you use this software in a publication, please cite the software
 @software{fastml_hls4ml,
   author       = {{FastML Team}},
   title        = {fastmachinelearning/hls4ml},
-  year         = 2024,
+  year         = 2025,
   publisher    = {Zenodo},
-  version      = {v1.0.0},
+  version      = {v1.1.0},
   doi          = {10.5281/zenodo.1201549},
   url          = {https://github.com/fastmachinelearning/hls4ml}
 }

docs/advanced/extension.rst (+2, -2)

@@ -5,9 +5,9 @@ Extension API
 ``hls4ml`` natively supports a large number of neural network layers.
 But what if a desired layer is not supported?
 If it is standard enough and its implementation would benefit the community as a whole, we would welcome a contribution to add it to the standard set of supported layers.
-However, if it is a somewhat niche custom layer, there is another approach we can take to extend hls4ml through the *extension API*.
+However, if it is a somewhat niche custom layer, there is another approach we can take to extend hls4ml through the *extension API*. This feature is supported for both Keras and PyTorch layers.
 
-This documentation will walk through a complete `complete end-to-end example <https://github.com/fastmachinelearning/hls4ml/blob/main/test/pytest/test_extensions.py>`_, which is part of our testing suite.
+Complete end-to-end examples are available for both `Keras <https://github.com/fastmachinelearning/hls4ml/blob/main/test/pytest/test_extensions.py>`_ and `PyTorch <https://github.com/fastmachinelearning/hls4ml/blob/main/test/pytest/test_extensions_pytorch.py>`_ as part of our testing suite. The description here uses the Keras example.
 To implement a custom layer in ``hls4ml`` with the extension API, the required components are:
 
 * Your custom layer class

docs/intro/setup.rst (+38, -18)

@@ -20,14 +20,8 @@ If you want to use our :doc:`profiling <../advanced/profiling>` toolbox, you mig
 
    pip install hls4ml[profiling]
 
-``hls4ml`` is also available as a ``conda`` package in the ``conda-forge`` repository. To install, run:
-
 .. warning::
-   Version of hls4ml available on ``conda-forge`` is outdated, we recommend installing with ``pip`` to get the latest version.
-
-.. code-block::
-
-      conda install -c conda-forge hls4ml
+   Previously, versions of hls4ml were made available on ``conda-forge``. These are outdated and should NOT be used. Installing with ``pip`` is currently the only supported method.
 
 Development version
 -------------------

@@ -90,29 +84,55 @@ Here we give line-by-line instructions to demonstrate the general workflow.
 .. code-block:: python
 
    import hls4ml
+   import tensorflow as tf
+   from tensorflow.keras.layers import Activation, Dense
+
+   # Construct a basic keras model
+   model = tf.keras.models.Sequential()
+   model.add(Dense(64, input_shape=(16,), name='Dense', kernel_initializer='lecun_uniform', kernel_regularizer=None))
+   model.add(Activation(activation='elu', name='Activation'))
+   model.add(Dense(32, name='Dense2', kernel_initializer='lecun_uniform', kernel_regularizer=None))
+   model.add(Activation(activation='elu', name='Activation2'))
+
+   # This is where you would train the model in a real-world scenario
 
-   # Fetch a keras model from our example repository
-   # This will download our example model to your working directory and return an example configuration file
-   config = hls4ml.utils.fetch_example_model('KERAS_3layer.json')
+   # Generate an hls configuration from the keras model
+   config = hls4ml.utils.config_from_keras_model(model)
 
-   # You can print it to see some default parameters
+   # You can print the config to see some default parameters
    print(config)
 
-   # Convert it to a hls project
-   hls_model = hls4ml.converters.keras_to_hls(config)
+   # Convert the model to an hls project using the config
+   hls_model = hls4ml.converters.convert_from_keras_model(
+       model=model,
+       hls_config=config,
+       backend='Vitis'
+   )
+
+Once converted to an HLS project, you can connect the project into the Python runtime and use it to run predictions on a numpy array:
+
+.. code-block:: python
+
+   import numpy as np
+
+   # Compile the hls project and link it into the Python runtime
+   hls_model.compile()
+
+   # Generate random input data
+   X_input = np.random.rand(100, 16)
 
-   # Print full list of example model if you want to explore more
-   hls4ml.utils.fetch_example_list()
+   # Run the model on the input data
+   hls_prediction = hls_model.predict(X_input)
 
-After that, you can use :code:`Vivado HLS` to synthesize the model:
+After that, you can use :code:`Vitis HLS` to synthesize the model:
 
 .. code-block:: python
 
-   # Use Vivado HLS to synthesize the model
+   # Use Vitis HLS to synthesize the model
    # This might take several minutes
    hls_model.build()
 
-   # Print out the report if you want
+   # Optional: print out the report
    hls4ml.report.read_vivado_report('my-hls-test')
 
 Done! You've built your first project using ``hls4ml``! To learn more about our various API functionalities, check out our tutorials `here <https://github.com/fastmachinelearning/hls4ml-tutorial>`__.

docs/intro/status.rst (+5, -4)

@@ -89,14 +89,15 @@ A summary of the on-going status of the ``hls4ml`` tool is in the table below.
 
 Other feature notes:
 
-* ``hls4ml`` is tested on Linux, and supports
+* ``hls4ml`` is tested on the following platforms. Newer versions might work just fine, but try at your own risk.
 
   * Vivado HLS versions 2018.2 to 2020.1
-  * Intel HLS versions 20.1 to 21.4
-  * Vitis HLS versions 2022.2 to 2024.1
+  * Intel HLS versions 20.1 to 21.4; versions \> 21.4 have not been tested.
+  * Vitis HLS versions 2022.2 to 2024.1. Versions \<= 2022.1 are known not to work.
   * Catapult HLS versions 2024.1_1 to 2024.2
   * oneAPI versions 2024.1 to 2025.0
 
-* Windows and macOS are not supported
+* ``hls4ml`` supports Linux and requires Python \>= 3.10. hls4ml does not require a specific Linux distribution version, and we recommend following the requirements of the HLS tool you are using.
+* Windows and macOS are not supported. Setting up ``hls4ml`` on these platforms, for example using the Windows Subsystem for Linux (WSL), should be possible, but we do not provide support for such use cases.
 * BDT support has moved to the `Conifer <https://github.com/thesps/conifer>`__ package
 
 Example Models

hls4ml/backends/catapult/passes/bn_quant.py (+1, -1)

@@ -96,7 +96,7 @@ def transform(self, model, node):
             bn_layer.get_weights('scale').data, bn_layer.get_weights('bias').data, node.get_attr('threshold', 0.5)
         )
         # Remove the BatchNormalization layer
-        model.remove_node(bn_layer, rewire=True)
+        model.remove_node(bn_layer)
         # Replace the old Activation layer with this one
         model.replace_node(node, bnbt_layer)

hls4ml/backends/fpga/passes/clone.py (+5, -5)

@@ -61,8 +61,8 @@ def match(self, node):
 
         # Check if the output is used more than once
         output_map = node.get_output_use_map()
-        in_output = node.name in node.model.outputs
         for output in node.outputs:
+            in_output = output in node.model.outputs
             if len(output_map[output]) + in_output > 1:
                 # model output also need a stream
                 return True

@@ -72,10 +72,10 @@ def match(self, node):
     def transform(self, model, node):
 
         output_map = node.get_output_use_map()
-        in_output = node.name in node.model.outputs
 
         transformed = False
         for output in node.outputs:
+            in_output = output in node.model.outputs
             n_outputs = len(output_map[output]) + in_output
             if n_outputs == 1:
                 continue

@@ -90,8 +90,8 @@ def transform(self, model, node):
             init_stream_idx = 1
             if in_output:
                 # If the value is used as output, add one extra stream
-                idx = node.model.outputs.index(node.name)
-                node.model.outputs[idx] = node.name + '_cpy1'
+                idx = node.model.outputs.index(output)
+                node.model.outputs[idx] = output + '_cpy1'
                 init_stream_idx = 2
             for i, layer in enumerate(output_map[output], init_stream_idx):
                 idx = layer.inputs.index(output)

@@ -102,7 +102,7 @@ def transform(self, model, node):
                 'clone_' + node.name,
                 attrs,
                 [output],
-                [output + '_cpy' + str(i + 1) for i in range(n_outputs)],
+                [f'{output}_cpy{i + 1}' for i in range(n_outputs)],
             )
             for i in range(n_outputs):
                 key = output + '_cpy' + str(i + 1)
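The clone.py change above fixes a per-output bookkeeping bug: the old code checked once per node whether `node.name` was a model output, so a multi-output node whose output *tensor* names (rather than the node name) appear in `model.outputs` was miscounted. A plain-Python sketch, with hypothetical names (`split1`, `branch_a`/`branch_b`) standing in for real graph objects, shows why the membership test must move inside the loop:

```python
# Hypothetical multi-output node: 'branch_b' is consumed by one layer
# AND is a model-level output, so it needs a clone (2 readers total).
node_name = 'split1'
node_outputs = ['branch_a', 'branch_b']
model_outputs = ['branch_b']
output_map = {'branch_a': ['dense1'], 'branch_b': ['dense2']}

# Old (buggy) logic: one flag for the whole node, keyed on node.name
in_output_old = node_name in model_outputs  # False: 'split1' is not an output name
needs_clone_old = [o for o in node_outputs
                   if len(output_map[o]) + in_output_old > 1]

# New logic: check each output tensor individually
needs_clone_new = []
for output in node_outputs:
    in_output = output in model_outputs
    if len(output_map[output]) + in_output > 1:
        needs_clone_new.append(output)

print(needs_clone_old, needs_clone_new)  # [] ['branch_b']
```

With the old flag, `branch_b` counted only its one downstream consumer and was never cloned, starving the model-output stream.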

hls4ml/backends/fpga/passes/final_reshape.py (+1, -2)

@@ -12,8 +12,7 @@ def match(self, node):
     def transform(self, model, node):
         if model.config.get_config_value('IOType') == 'io_parallel':
             print('WARNING: Final layer is a Reshape, which does not affect the output for io_parallel; removing it')
-            # remove, but don't rewire because it's the output layer
-            model.remove_node(node, rewire=False)
+            model.remove_node(node)
             return True
         elif model.config.get_config_value('IOType') == 'io_stream':
             print(

hls4ml/backends/fpga/passes/hgq_proxy_model.py (+1, -1)

@@ -53,7 +53,7 @@ def match(self, node: Layer):
 
     def transform(self, model, node: FixedPointQuantizer):
         if node.fusible:
-            model.remove_node(node, rewire=True)
+            model.remove_node(node)
             return True
 
         if model.config.config['IOType'] != 'io_parallel':

hls4ml/backends/fpga/passes/remove_softmax.py (+1, -1)

@@ -9,5 +9,5 @@ def match(self, node):
         return is_softmax and remove_softmax
 
     def transform(self, model, node):
-        model.remove_node(node, rewire=True)
+        model.remove_node(node)
         return True
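A recurring change across these passes is that `model.remove_node(...)` loses its `rewire` argument, suggesting the method now reconnects the surviving graph itself. As a rough illustration only (plain Python, not the actual hls4ml `ModelGraph` API), removing a node with rewiring means pointing every consumer of the removed node's output at its input instead:

```python
class Node:
    """Toy graph node: named, with input and output tensor names."""

    def __init__(self, name, inputs, outputs):
        self.name, self.inputs, self.outputs = name, inputs, outputs


def remove_node(nodes, target):
    """Remove target from the list, rewiring its consumers to its input."""
    assert len(target.inputs) == 1, 'sketch handles single-input nodes only'
    for n in nodes:
        if n is target:
            continue
        # Any reference to one of target's outputs now reads target's input
        n.inputs = [target.inputs[0] if i in target.outputs else i
                    for i in n.inputs]
    nodes.remove(target)


# dense -> softmax -> sink; drop the softmax (as remove_softmax.py does)
dense = Node('dense', ['x'], ['dense_out'])
softmax = Node('softmax', ['dense_out'], ['softmax_out'])
sink = Node('sink', ['softmax_out'], ['y'])
graph = [dense, softmax, sink]

remove_node(graph, softmax)
print([n.name for n in graph], sink.inputs)  # ['dense', 'sink'] ['dense_out']
```

The toy `remove_node` here is not the library's implementation, just the general shape of "remove with rewire" that the dropped keyword implies.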

hls4ml/backends/oneapi/oneapi_backend.py (+3)

@@ -10,6 +10,7 @@
 from hls4ml.model.layers import GRU, LSTM, Activation, Conv1D, Conv2D, Dense, Embedding, Layer, SimpleRNN, Softmax
 from hls4ml.model.optimizer import get_backend_passes, layer_optimizer
 from hls4ml.model.types import FixedPrecisionType, IntegerPrecisionType, NamedType
+from hls4ml.report import parse_oneapi_report
 from hls4ml.utils import attribute_descriptions as descriptions
 
 # from hls4ml.report import parse_oneapi_report

@@ -207,6 +208,8 @@ def build(self, model, build_type='fpga_emu', run=False):
         executable = builddir / f'{model.config.get_project_name()}.{build_type}'
         subprocess.run(f'{str(executable)}', shell=True, cwd=builddir, check=True)
 
+        return parse_oneapi_report(model.config.get_output_dir())
+
     @layer_optimizer(Layer)
     def init_base_layer(self, layer):
         reuse_factor = layer.model.config.get_reuse_factor(layer)

hls4ml/backends/oneapi/passes/bn_quant.py (+1, -1)

@@ -149,7 +149,7 @@ def transform(self, model, node):
             bn_layer.get_weights('scale').data, bn_layer.get_weights('bias').data, node.get_attr('threshold', 0.5)
         )
         # Remove the BatchNormalization layer
-        model.remove_node(bn_layer, rewire=True)
+        model.remove_node(bn_layer)
         # Replace the old Activation layer with this one
         model.replace_node(node, bnbt_layer)

hls4ml/backends/oneapi/passes/transform_types.py (+1, -1)

@@ -33,7 +33,7 @@ def transform(self, model, node):
             new_var = self.interface_var_converter.convert(var, pragma='stream')
         elif out_name in node.model.outputs:
             new_var = self.interface_var_converter.convert(var, pragma='stream')
-        if isinstance(var, InplaceTensorVariable):
+        elif isinstance(var, InplaceTensorVariable):
             new_var = self.inplace_stream_var_converter.convert(var, pragma='stream')
         else:
             new_var = self.stream_var_converter.convert(var, pragma='stream')
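This one-word fix matters because in an `if`/`elif` chain only one branch fires. With the old plain `if`, the final `if`/`else` pair always ran, so a variable already matched as a model output had its interface converter silently overwritten by the inplace-stream or plain-stream choice. A reduced sketch (converter names are just illustrative strings, not the real converter classes):

```python
def pick_var_old(out_name, model_outputs, is_inplace):
    new_var = None
    if out_name in model_outputs:
        new_var = 'interface_stream'
    if is_inplace:  # old bug: this if/else always runs and clobbers new_var
        new_var = 'inplace_stream'
    else:
        new_var = 'plain_stream'
    return new_var


def pick_var_new(out_name, model_outputs, is_inplace):
    if out_name in model_outputs:
        new_var = 'interface_stream'
    elif is_inplace:  # fixed: only one branch of the chain fires
        new_var = 'inplace_stream'
    else:
        new_var = 'plain_stream'
    return new_var


# A model output used to lose its interface converter:
print(pick_var_old('y', ['y'], False))  # plain_stream (wrong)
print(pick_var_new('y', ['y'], False))  # interface_stream (correct)
```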

hls4ml/backends/quartus/passes/bn_quant.py (+1, -1)

@@ -96,7 +96,7 @@ def transform(self, model, node):
             bn_layer.get_weights('scale').data, bn_layer.get_weights('bias').data, node.get_attr('threshold', 0.5)
         )
         # Remove the BatchNormalization layer
-        model.remove_node(bn_layer, rewire=True)
+        model.remove_node(bn_layer)
         # Replace the old Activation layer with this one
         model.replace_node(node, bnbt_layer)

hls4ml/backends/vitis/vitis_backend.py (+4)

@@ -50,6 +50,7 @@ def create_initial_config(
         namespace=None,
         write_weights_txt=True,
         write_tar=False,
+        tb_output_stream='both',
         **_,
     ):
         """Create initial configuration of the Vitis backend.

@@ -64,6 +65,8 @@ def create_initial_config(
             write_weights_txt (bool, optional): If True, writes weights to .txt files which speeds up compilation.
                 Defaults to True.
             write_tar (bool, optional): If True, compresses the output directory into a .tar.gz file. Defaults to False.
+            tb_output_stream (str, optional): Controls where to write the output. Options are 'stdout', 'file' and 'both'.
+                Defaults to 'both'.
 
         Returns:
             dict: initial configuration.

@@ -79,6 +82,7 @@ def create_initial_config(
             'Namespace': namespace,
             'WriteWeightsTxt': write_weights_txt,
             'WriteTar': write_tar,
+            'TBOutputStream': tb_output_stream,
         }
 
         return config
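The new `tb_output_stream` keyword flows straight through to the backend config dictionary under the `TBOutputStream` key. As a minimal mock of that mapping (this helper and its validation are illustrative only, not part of the hls4ml API):

```python
def make_tb_config(tb_output_stream='both'):
    """Mirror the keyword-to-config-key mapping added in this diff.

    The explicit validation here is a hypothetical addition; the diff
    itself documents the allowed values but does not enforce them.
    """
    allowed = {'stdout', 'file', 'both'}
    if tb_output_stream not in allowed:
        raise ValueError(f'tb_output_stream must be one of {sorted(allowed)}')
    return {'TBOutputStream': tb_output_stream}


print(make_tb_config())        # {'TBOutputStream': 'both'}
print(make_tb_config('file'))  # {'TBOutputStream': 'file'}
```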

hls4ml/backends/vivado/passes/bn_quant.py (+1, -1)

@@ -96,7 +96,7 @@ def transform(self, model, node):
             bn_layer.get_weights('scale').data, bn_layer.get_weights('bias').data, node.get_attr('threshold', 0.5)
         )
         # Remove the BatchNormalization layer
-        model.remove_node(bn_layer, rewire=True)
+        model.remove_node(bn_layer)
         # Replace the old Activation layer with this one
         model.replace_node(node, bnbt_layer)

hls4ml/backends/vivado/passes/recurrent_templates.py (+67, -1)

@@ -1,6 +1,6 @@
 from hls4ml.backends.backend import get_backend
 from hls4ml.backends.template import FunctionCallTemplate, LayerConfigTemplate
-from hls4ml.model.layers import GRU, LSTM
+from hls4ml.model.layers import GRU, LSTM, TimeDistributed
 
 # recurrent multiplication template
 

@@ -237,3 +237,69 @@ def format(self, node):
             template = recr_function_template
 
         return template.format(**params)
+
+
+time_distributed_config_template = """struct config{index} : nnet::time_distributed_config {{
+    static const unsigned dim = {dim};
+
+    static const unsigned n_time_steps = {n_time_steps};
+    static const unsigned in_height = {in_height};
+    static const unsigned in_width = {in_width};
+    static const unsigned n_chan = {n_chan};
+}};\n"""
+
+time_distributed_loop_start_template = """for (int ts = 0; ts < config{index}::n_time_steps; ts++) {{
+    {loop_mode}
+    nnet::read_time_step_{dim}d<{input_t}, {config}>(ts, {input}, {output});"""
+
+time_distributed_loop_end_template = """    nnet::write_time_step_{dim}d<{output_t}, {config}>(ts, {input}, {output});
+}}"""
+
+time_distributed_include_list = ['nnet_utils/nnet_time_distributed.h']
+
+
+class TimeDistributedConfigTemplate(LayerConfigTemplate):
+    def __init__(self):
+        super().__init__(TimeDistributed)
+        self.template = time_distributed_config_template
+
+    def format(self, node):
+        params = self._default_config_params(node)
+
+        input_shape = node.get_input_variable().shape
+        params['dim'] = len(input_shape)
+        if node.name.endswith('_end'):
+            params['dim'] += 1  # The input variable will be from the wrapped layer, without time dimension
+        params['in_height'] = input_shape[-3] if params['dim'] == 4 else 1
+        params['in_width'] = input_shape[-2] if params['dim'] >= 3 else 1
+        params['n_chan'] = input_shape[-1]
+
+        return self.template.format(**params)
+
+
+class TimeDistributedFunctionTemplate(FunctionCallTemplate):
+    def __init__(self):
+        super().__init__((TimeDistributed), include_header=time_distributed_include_list)
+        self.template_start = time_distributed_loop_start_template
+        self.template_end = time_distributed_loop_end_template
+
+    def format(self, node):
+        params = self._default_function_params(node)
+
+        input_shape = node.get_input_variable().shape
+        params['dim'] = len(input_shape)
+        if node.name.endswith('_end'):
+            params['dim'] += 1  # The input variable will be from the wrapped layer, without time dimension
+
+        loop_mode = node.get_attr('time_step_loop_parallelism')
+        if loop_mode == 'unroll':
+            params['loop_mode'] = '#pragma HLS UNROLL'
+        elif loop_mode == 'pipeline':
+            params['loop_mode'] = '#pragma HLS PIPELINE'
+        else:
+            params['loop_mode'] = ''
+
+        if node.attributes['wrapped_layer'].name == node.name + '_end':
+            return self.template_start.format(**params)
+        else:
+            return self.template_end.format(**params)