Organizations

@mlcommons @LAION-AI @seal-rg @OpenEuroLLM

Niccolo-Ajroldi/README.md

I like ML, in particular deep learning optimization, efficiency, and benchmarks!

Here you can find a copy of my CV.

You can also find me on:

Niccolo-Ajroldi Niccolo-Ajroldi Niccolo-Ajroldi

Highlights

  • May, 2025. ⭐️ Our paper on large-scale evaluations of Weight Averaging has been accepted at ICML 2025!
  • March, 2025. 📄 Our paper on large-scale evaluations of Weight Averaging has been accepted at the ICLR 2025 First Workshop on Open Science for Foundation Models!
  • February, 2025. 🗣️ Gave my first talk! Presented our work on Weight Averaging for large-scale ML at the First AlgoPerf Workshop.
  • October, 2024. 📄 Our paper on Loss Landscape Characterization of Neural Networks without Over-Parametrization has been accepted to NeurIPS 2024!
  • August, 2024. 🎉 Our submission to AlgoPerf scored third 🥉 in the inaugural benchmark results! We scored first among non-industry submissions! Check out the MLCommons blog post and our submissions in the official repo.
  • July, 2024. 📂 Released plainLM, a minimal open-source repository for pre-training Transformers on Language Modeling. It is written in PyTorch, supports distributed training, and contains a minimal Transformer implementation with RoPE, RMSNorm, and GLU.

Selected Publications

When, Where and Why to Average Weights?
Ajroldi, Orvieto, Geiping, to appear in Proceedings of the 42nd International Conference on Machine Learning (ICML 2025)
We perform a large-scale benchmark of weight averaging techniques on AlgoPerf. Our evaluation across seven architectures and datasets reveals that averaging significantly accelerates training and yields considerable efficiency gains across all considered workloads.

Loss Landscape Characterization of Neural Networks without Over-Parametrization
Islamov, Ajroldi, Orvieto, Lucchi, Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
We introduce a new function class that better captures neural network loss landscapes. It ensures convergence for several SGD-based algorithms, and we show its applicability across several deep learning tasks.

Conformal Prediction Bands for Two-Dimensional Functional Time Series
Ajroldi, Diquigiovanni, Fontana, Vantini, Computational Statistics & Data Analysis, 2023.
We develop algorithms to forecast time-evolving surfaces and estimate prediction uncertainty. We introduce estimation techniques for functional autoregressive models and revisit distribution-free uncertainty quantification techniques for this setting.

Continuous and early prediction of Acute Kidney Injury in critically ill patients
Alfieri, Ancona, Tripepi, Rubeis, Ajroldi, Finazzi, Cauda, Fagugli, PLOS ONE, 2023.
We propose a novel ML model to continuously predict Acute Kidney Injury episodes in Intensive Care Units using routinely available data. The model is tested through a multi-centric, multi-national external validation procedure.

Pinned

  1. plainLM (Public)

    Minimal pretraining script for language modeling in PyTorch, supporting torch compilation and DDP. It includes a model implementation and data preprocessing.

    Python 27 6

  2. algorithmic-efficiency (Public)

    Forked from mlcommons/algorithmic-efficiency

    MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.

    Python

  3. PCA-for-Surfaces (Public)

    Principal Component Analysis of surfaces, i.e. functions defined on a bivariate domain.

    R

  4. cluster_101 (Public)

    Some minimal examples of how to submit jobs to SLURM-based or CONDOR-based computing clusters.

    Shell 2

  5. ARMA-Surfaces (Public)

    Simulation of Functional Autoregressive Moving Average Processes for surface data.

    R

  6. Functional-BNP-clustering (Public)

    Bayesian nonparametric clustering of functional data.

    R 2 1