@cerndb

CERN Database and Analytics Group

Popular repositories

  1. dist-keras Public archive

    Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

    Python 623 167

  2. spark-dashboard Public

    Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an Apache Spark Performance Dashboard using container technology.

    Dockerfile 129 20

  3. SparkPlugins Public

    Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics syst…

    Scala 93 14

  4. grafana-mimir-cardinality-dashboards Public

    Grafana Mimir dashboards used for cardinality exploration

    57 9

  5. hdfs-metadata Public

    Tool for gathering block and replica metadata from HDFS. It also builds a heat map showing how replicas are distributed across disks and nodes.

    Java 55 18

  6. SparkDLTrigger Public

    Code and links to the data for the article "Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics"

    Jupyter Notebook 31 14

Repositories

Showing 10 of 67 repositories
  • sparkMeasure Public

    This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics.

    Scala 16 Apache-2.0 3 0 0 Updated Oct 3, 2025
  • pyspark-root-datasource Public

    Python DataSource for Apache Spark 4 to read ROOT files (High Energy Physics) as DataFrames, powered by uproot, awkward, and PyArrow.

    Python 1 Apache-2.0 0 0 0 Updated Oct 2, 2025
  • grafana-mimir-cardinality-dashboards Public

    Grafana Mimir dashboards used for cardinality exploration

    57 Apache-2.0 9 1 0 Updated Sep 17, 2025
  • spark-dashboard Public

    Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an Apache Spark Performance Dashboard using container technology.

    Dockerfile 129 Apache-2.0 20 0 0 Updated Aug 29, 2025
  • SparkTraining Public

    Material for the course "Introduction to Apache Spark APIs for Data Processing" https://sparktraining.web.cern.ch/

    Jupyter Notebook 17 CC-BY-4.0 6 0 0 Updated May 13, 2025
  • SparkPlugins Public

    Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics system with user-provided monitoring probes.

    Scala 93 Apache-2.0 14 3 0 Updated May 9, 2025
  • SparkExecutorPlugins2.4 Public archive

    Spark Executor Plugins Examples for Spark 2.4

    Java 6 Apache-2.0 2 0 0 Updated May 7, 2025
  • opentelemetry-collector-contrib Public Forked from open-telemetry/opentelemetry-collector-contrib

    Contrib repository for the OpenTelemetry Collector

    Go 0 Apache-2.0 3,085 0 0 Updated Apr 12, 2025
  • hadoop-xrootd Public

    Mirror of CERN db/hadoop-xrootd. Hadoop-XRootD Filesystem Connector

    Java 6 Apache-2.0 3 3 1 Updated Sep 25, 2024
  • SparkDLTrigger Public

    Code and links to the data for the article "Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics"

    Jupyter Notebook 31 Apache-2.0 14 0 0 Updated Jun 11, 2024
