Releases: DataDog/datadog-agent

7.69.0

14 Aug 17:01
7.69.0
4623166

Agent

Prelude

Released on: 2025-08-14

Upgrade Notes

  • The Cilium conntracker is now enabled by default in system-probe and expects /sys/fs/bpf to be mounted at /host/sys/fs/bpf in containerized environments. Without this mount, the conntracker fails to load and system-probe's log file contains the line "not loading cilium conntracker since cilium maps are not present". Users who have enabled this feature can either upgrade to the latest Helm chart or add this mount to their container.

New Features

  • Adds additional information and data related to the setsockopt hook.
    • Socket Information:
      • Socket type
      • Socket family
      • Socket protocol
    • Filter Information
      • Disassembled filter
      • Filter hash
  • You can now set the JAVA_TOOL_OPTIONS that JMXFetch uses by setting the jmx_java_tool_options configuration option in the datadog.yaml config file. This allows you to pass additional JVM options to JMXFetch, such as memory settings or system properties.
  • Adds a TracerPayloadModifier to the Trace Agent.
  • pkg/trace/api: Container tags hash is returned as a response header of the info endpoint.
  • Added a new config option, include_ephemeral_containers, to collect Kubernetes ephemeral containers. The option is disabled by default. When enabled, the Agent reports container.* and kubernetes.* metrics for ephemeral containers. It also collects logs and schedules checks for ephemeral containers when configured to do so.
  • Data Streams Monitoring: Adds new feature allowing users to retrieve messages from Kafka topics.
  • The collect_gpu_tags config flag is now enabled by default. The Agent now collects an additional gpu_host host tag for all hosts that have NVIDIA GPUs.
  • Added a new processing rule that excludes truncated logs from being sent to ingest.
  • GPU: Add GPM collector for Hopper and newer NVIDIA GPUs
  • Adds VPN tunnels and route table data collection to SNMP. This can be enabled/disabled using the collect_vpn config.
  • The NTP check on Windows now discovers the primary domain controller (PDC) on domain-joined hosts when use_local_defined_servers is enabled. If the PDC is unavailable, it automatically falls back to registry-defined servers. The check now performs order-insensitive server list comparisons, reduces log noise, and avoids using itself as a time source when running on a domain controller.
  • [Preview] The Agent can now connect to the AWS SSM, AWS Secrets, HashiCorp Vault, and Azure Key Vault secret management solutions to resolve secrets without requiring a user-provided binary. Two new settings are introduced for this: secret_backend_type and secret_backend_config. For more information, see: https://docs.datadoghq.com/agent/configuration/secrets-management
  • Added support in DDOT for the datadogexporter.proxy_url configuration option. This allows users to specify proxy settings for DDOT with the collector configuration.
  • Windows: Add CRL monitoring to the Windows Certificate Store integration.
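
The order-insensitive server list comparison mentioned in the NTP check note above can be pictured with a small sketch. This is only an illustration, not the Agent's code: the function names, the case-insensitive normalization, and the example hostnames are assumptions.

```python
def same_server_list(current, discovered):
    """Return True when both lists name the same servers, ignoring order."""
    def normalize(servers):
        # Assumption: hostnames are trimmed and compared case-insensitively.
        return sorted(s.strip().lower() for s in servers)
    return normalize(current) == normalize(discovered)


def drop_self(servers, local_hostname):
    """Remove the local host from the candidates so a domain controller
    does not pick itself as a time source (hypothetical helper)."""
    local = local_hostname.strip().lower()
    return [s for s in servers if s.strip().lower() != local]


if __name__ == "__main__":
    # Same servers in a different order: no change detected.
    print(same_server_list(["dc1.example.com", "dc2.example.com"],
                           ["dc2.example.com", "dc1.example.com"]))
    print(drop_self(["dc1.example.com", "dc2.example.com"], "dc1.example.com"))
```

Comparing normalized, sorted lists avoids treating a reordered but otherwise identical server list as a configuration change, which is one plausible way to reduce log noise.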

Enhancement Notes

  • The serverless-init build uses the new TracerPayloadModifier to add Function Tags to the _dd.tags.function tag of the Tracer Payload to support serverless trace tagging.
  • Agents are now built with Go 1.24.5.
  • The user is now able to specify which features they want enabled inside of the converter. Previously, the user would have to either enable or disable everything.
  • DDOT now uses zstd compression for logs by default.
  • If a check has both a Go and a Python version, the Go version now has priority by default. This change should not have any visible impact, but if needed, you can disable this configuration by setting prioritize_go_check_loader to false.
  • GPUM: the "status" command now returns status of the system-probe part of GPU monitoring
  • Added new DogStatsD configuration option "dogstatsd_flush_incomplete_buckets". When enabled, DogStatsD will flush all received metrics during shutdown, regardless of which time-interval based bucket they belong to.
  • Agent integration metadata payloads now include the JMX integrations.
  • Allow users to configure the HTTP timeout for the Logs Agent.
  • The Logs Agent no longer falls back to TCP when logs_config.logs_dd_url is configured with an http(s):// prefix.
  • If the Oracle can_connect check is critical, also set the can_query check to critical.
  • Display the number of times each log processor has been used in the Logs Agent status endpoint.
  • Reduce binary size by removing the Sensitive Data Scanner (SDS) from the logs agent.
  • OTLP spans now support the db.namespace semantic convention, which is mapped to db.name for DBM support.
  • Generate a more detailed warning when the Logs Agent tailer limit is reached.
  • Improved the granularity of the Logs Agent pipeline monitor to track the capacity of each individual component of the pipeline.
  • Remote Agent management operations on Windows now attempt to force stop the Agent services if they do not respond to Service Control Manager requests.
  • Remote Agent management on Windows now automatically retries when the MSI returns error 1618 (ERROR_INSTALL_ALREADY_RUNNING).
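
As a rough picture of what the dogstatsd_flush_incomplete_buckets option described above changes, the following sketch models time-bucketed aggregation. It is a simplified illustration under stated assumptions: the class, its methods, and the 10-second interval are hypothetical and do not mirror DogStatsD's internals.

```python
from collections import defaultdict


class BucketedCounter:
    """Toy counter aggregated into fixed time buckets (hypothetical model)."""

    def __init__(self, interval=10, flush_incomplete_buckets=False):
        self.interval = interval  # assumed bucket width in seconds
        self.flush_incomplete_buckets = flush_incomplete_buckets
        self._buckets = defaultdict(int)

    def add(self, timestamp, value):
        # Align the sample to the start of its bucket.
        start = int(timestamp) // self.interval * self.interval
        self._buckets[start] += value

    def flush(self, now, shutting_down=False):
        # Normally only buckets whose time window has fully elapsed are sent.
        # On shutdown, with the option enabled, every bucket is sent.
        flush_all = shutting_down and self.flush_incomplete_buckets
        ready = sorted(
            start for start in self._buckets
            if flush_all or start + self.interval <= now
        )
        return [(start, self._buckets.pop(start)) for start in ready]


if __name__ == "__main__":
    counter = BucketedCounter(interval=10, flush_incomplete_buckets=True)
    counter.add(3, 1)
    counter.add(12, 2)
    print(counter.flush(now=15))                      # completed bucket only
    print(counter.flush(now=15, shutting_down=True))  # remaining open bucket
```

Without the option, the samples in the still-open bucket would be dropped at shutdown in this model, which is the data loss the setting is meant to avoid.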

Bug Fixes

  • Correctly respect the ecs_collect_resource_tags_ec2 variable when calling the ECS Agent. Start caching tags to reduce burden on the ECS Agent. Start logging error responses from the ECS agent.
  • Fix a panic in Docker streams log parsing when stream messages are truncated on transmission.
  • Fix the cgroup reader bug that would prevent the generic container check from sending metrics when the Agent encountered a permission error.
  • Fixes invalid logs compression error in DDOT, sets DDOT logs compression to gzip.
  • Add support for selecting the endpoint resolution method using advanced AD identifiers in Kubernetes endpoint check configurations defined in files or configmaps. This enables static pod check configurations to correctly resolve the endpoint by setting resolve method to "ip".
  • Fixed the serializer exporter for the OSS Collector, which was not setting the correct proxy variables when sending metric data.
  • Fixed Windows installer overwriting install_info from setup scripts. When using Fleet Automation setup scripts, the subsequent MSI installation now skips writing install_info via a new SKIP_INSTALL_INFO flag, preserving the original setup script installation method tracking.
  • Fix Jetson check to correctly parse the output of tegrastats for Orin boards.
  • Fix incorrect container.memory.kernel value when running with Kernel >= 5.19 and cgroupv2
  • Breaking change - Fixes the Oracle service name tag to be service_name instead of service. This corrects the conflict with the APM service tag. This is a breaking change for any users who had been relying on the service tag to be set to the Oracle service name. The service tag can still be set explicitly in the tags configuration if needed.
  • Metrics sent from the process check on the core agent now have the host tag.
  • GPU: fix a bug where the device assigned to a process could be wrong if it updates the CUDA_VISIBLE_DEVICES environment variable during runtime
  • GPUM: fix Kubernetes device allocation detection in Google Kubernetes Engine
  • The NTP check will no longer fail to start if the initial discovery of local NTP servers fails at agent startup.
  • Limit the HTTP timeout on startup to 5 seconds for the Logs Agent.
  • Prevent the process component from running in the cluster worker.
  • Removes an extra copy of agent.exe from the Windows container
  • Remote Agent management operations on Windows now attempt to restart the Agent services after failing to stop the services or uninstall the Agent.
  • Fix Cgroup namespace not properly detected in Workload Protection, leading to incorrect container ID resolution and misqualified detections.

Other Notes

  • Add a new metric to the Agent telemetry for the startup and running states. This will help us track the startup and running states of the Agent.
  • Transparent Huge Pages (THP) usage is now disabled by default in the System Probe and Security Agent. To re-enable their usage, set the system_probe_config.disable_thp or security_agent.disable_thp configuration options to false.

Datadog Cluster Agent

Prelude

Released on: 2025-08-14 Pinned to datadog-agent v7.69.0: CHANGELOG.

Enhancement Notes

  • The auto-instrumentation webhook supports labels and annotations as tags configuration. If any of the label or annotation mappings for the incoming pod correspond to Universal Service Tags (service, env, or version), the webhook will also add the corresponding UST environment variable to the pod (DD_SERVICE, DD_ENV, or `DD...

7.68.3

28 Jul 15:03
7.68.3
874cfce

Agent

Prelude

Released on: 2025-07-28

Upgrade Notes

Enhancement Notes

  • Agents are now built with Go 1.24.5.

Bug Fixes

Datadog Cluster Agent

Prelude

Released on: 2025-07-28 Pinned to datadog-agent v7.68.3: CHANGELOG.

7.68.2

21 Jul 06:37
7.68.2
14cf7e9

Agent

Prelude

Released on: 2025-07-21

Bug Fixes

  • Fix an issue with the Agent pre-install script that caused integrations shipped with the Agent to be removed during an Agent upgrade.
  • Print the correct FIPS status for the Cluster Agent when running in FIPS mode.

Datadog Cluster Agent

Prelude

Released on: 2025-07-21 Pinned to datadog-agent v7.68.2: CHANGELOG.

7.68.1

17 Jul 06:52
7.68.1
1667c7b

Agent

Prelude

Released on: 2025-07-17

Security Notes

  • Bump the secret-generic-connector side binary to 0.2.5

Datadog Cluster Agent

Prelude

Released on: 2025-07-17 Pinned to datadog-agent v7.68.1: CHANGELOG.

7.68.0

10 Jul 12:13
7.68.0
ac9b536

Agent

Prelude

Released on: 2025-07-10

Upgrade Notes

  • Bump the Python version to 3.12.11
  • Change how attribute precedence is handled. All fields are now evaluated across both span and resource attributes, using the following order of precedence (from highest to lowest):
    • datadog.* span attributes
    • datadog.* resource attributes
    • standard span attributes
    • standard resource attributes
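
The precedence order above can be read as a first-match lookup across four sources. The sketch below is a minimal illustration of that ordering only; the function name and the attribute keys used in the example (datadog.service, service.name) are assumptions chosen for readability, not a statement of which keys the Agent evaluates.

```python
from typing import Any, Mapping, Optional


def resolve_attribute(
    datadog_key: str,
    standard_key: str,
    span_attrs: Mapping[str, Any],
    resource_attrs: Mapping[str, Any],
) -> Optional[Any]:
    """Return the first value found, from highest to lowest precedence."""
    lookups = (
        (span_attrs, datadog_key),       # 1. datadog.* span attributes
        (resource_attrs, datadog_key),   # 2. datadog.* resource attributes
        (span_attrs, standard_key),      # 3. standard span attributes
        (resource_attrs, standard_key),  # 4. standard resource attributes
    )
    for attrs, key in lookups:
        if key in attrs:
            return attrs[key]
    return None


if __name__ == "__main__":
    span = {"service.name": "span-svc"}
    resource = {"datadog.service": "dd-resource-svc", "service.name": "res-svc"}
    # The datadog.* resource attribute outranks the standard span attribute.
    print(resolve_attribute("datadog.service", "service.name", span, resource))
```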

New Features

  • Add a port of the Windows integrations-core Python network check to Go. This version is disabled by default but can be enabled with use_networkv2_check in your configuration.
  • Add support for Autodiscovery for RDS Postgres and MySQL databases.
  • Windows: Add remote certificate collection for the Windows Certificate Store integration.
  • Add a System Probe module that will collect software inventory data from the host.
  • Added logs.truncated and associated aggregate tags into /comp/core/agenttelemetry/impl/config.go
  • Workload protection (CWS) can now generate events based on the setsockopt syscall
  • Added a new logs.truncated metric to the Agent that reports the number of logs truncated before being sent. This metric helps monitor log volume loss due to truncation and is tagged by service and source for better visibility.

Enhancement Notes

  • The agent configcheck --verbose command and flares now include a section that lists all collected configurations, both matched and unmatched. This addition aids debugging by revealing which configurations the Agent has detected.
  • Adds the newly supported ap2.datadoghq.com site to the MSI's GUI menu.
  • Individual integrations can now set their own auto multiline configurations, including adding custom samples for logs specific to that integration.
  • Allows RDS autodiscovery to work with an empty tag list. If an empty tag list is provided, autodiscovery does not filter instances based on tags, allowing all RDS instances to be discovered.
  • OpenTelemetry instrumentation scope attributes are now converted into log attributes.
  • Introduce a new sample configuration file, application_monitoring.yaml, to support the Hands Off config feature. This file is automatically placed under /etc/datadog-agent/ on Linux systems only. Users can manually edit the file to apply application monitoring configurations.
  • Agents are now built with Go 1.24.4.
  • ecs_cluster_name is added as a global tag when running on EC2.
  • Improve the memory efficiency of obfuscator key generation.
  • In OTLP metrics ingestion, the instrumentation_scope_metadata_as_tags option is now enabled by default. This means scope attributes are now added as tags to metrics. If you have too many unique values for instrumentation scope attributes, this may cause cardinality issues. To mitigate this, you can disable the behavior by setting datadog.metrics.instrumentation_scope_metadata_as_tags to false.
  • Orchestrator manifests will now be published with all tags present in their metadata counterparts.
  • Single Step Instrumentation now uses the Python tracer major version 3 by default.
  • Refactor the logs-agent auditor to utilize a more testable architecture.
  • Add Kind, ApiVersion, and NodeName to manifests. Add HostName to CollectorManifest.
  • Sensitive text from custom resources is now scrubbed from the manifest. If a field is sensitive, all values within that field are automatically redacted, ensuring that sensitive data is not exposed even in nested structures.
  • Update the registry writer to not write atomically when the Agent runs on ECS Fargate, to reduce a memory leak.
  • Updated Windows container image labels to align with Linux image labels for better OCI compliance. Added standard Open Container Initiative (OCI) labels including image source, revision, and version information.

Bug Fixes

  • APM: Fix an issue where the trace-agent could panic during shutdown trying to obfuscate a SQL payload.
  • APM: Fix an issue where trace-agent could panic with "send on closed channel" during shutdown.
  • Prevent Logs Agent registry entries from being removed prematurely when the log source is still active.
  • Fixed TCP retransmit counts by excluding TCP keep-alive packets. Also fixed potential IRQL corruption and memory corruption related to IPv6 filters.
  • APM: Reduce the log level of APM Traces Received log message to debug. These values are available via metrics so this log is mostly just noisy.
  • Factor dependent services into the timeout when stopping the Agent service on Windows. Operations such as the stop-service Agent subcommand and remote updates now wait longer for the Agent and its subservices to stop before reporting an error.
  • Fixed debug log message for detected locally defined servers in NTP check.
  • Fixes a panic in the checks collector that occasionally occurs when the Agent is shutting down.
  • Fixes Python integrations not being persisted after Agent uninstall. Enables persisting integration during fleet updates.
  • Fixes multiline stacktraces being split up into separate logs when serverless-init is installed in-process.
  • Windows Agent remote updates now submit the remote config task state to the backend. This reduces the time it takes for a remote update to complete.
  • Windows Agent installer now uses absolute path to msiexec.exe instead of PATH lookup, improving installation reliability
  • Fixes telemetry reporting in the Agent Install Script for Windows PowerShell on hosts using PowerShell version less than 6 and without Internet Explorer installed, such as on a Server Core installation.
  • The Datadog Installer service on Windows is now set to manual start. This prevents alerts from tools that monitor automatically started services, such as the Windows Server Manager Dashboard.
  • Fix a bug that resulted in some Orchestrator Kubernetes manifests losing the configured "extraTags".
  • Fix how Live Processes and Live Containers set the hostname when the Agent is running in AWS Fargate.
  • Applies SQL obfuscation logic to OpenTelemetry db semantics. Specifically, db.statement and db.query.text values will be obfuscated along with resource name and sql.query, according to obfuscation settings in the Agent config: https://github.com/DataDog/datadog-agent/blob/1768f80e3f14d0d300b1276ae23ec7c8237dde4c/pkg/config/config_template.yaml#L1226-L1364
  • Ensure serverless deployments send logs with gzip compression.
  • Fix a rare panic that can occur when a log is unable to be written to a TCP-based unreliable endpoint.
  • Fixed a bug where the system.cpu.num_cores metric could be incorrect on certain Windows platforms.
  • Fixed Windows container image metadata to properly include build timestamps and version information.

Other Notes

  • Add Origins for DuckDB, Keda and Supabase
  • Add metric origins for the Windows Certificate Store integration.
  • Add metric origins for new integrations.
  • SystemD units are now written by .deb and .rpm package scripts during the installation process. They were previously part of the package archive. We do not expect this change to affect users.

Datadog Cluster Agent

Prelude

Released on: 2025-07-10 Pinned to datadog-agent v7.68.0: CHANGELOG.

New Features

  • The admission controller can now enable kubelet API logging in the injected agent sidecar.

Enhancement Notes

  • Added a new metric to expose the ksm kube_cronjob_status_last_successful_time metric. The name of the metric is kubernetes_state.cronjob.duration_since_last_successful.
  • Single Step Instrumentation now uses the Python tracer major version 3 by default.

Bug Fixes

  • Stop sending telemetry associated with a DatadogMetric when the object is deleted.
  • Fix a bug in the Kubernetes State Metrics (KSM) check where custom resource metrics were incorrectly named using the kubernetes_state.customresource.<name> pattern instead of the intended kubernetes_state_customresource.<prefix>_<name> format.
  • Fixes a bug in the admission controller webhook that caused volume mounts to be skipped when other webhooks injected init containers after our own volume mounts had been added.
  • Properly take into account the timeZone field of the CronJob objects in the kubernetes_state.cronjob.on_schedule_check service check.

7.67.1

02 Jul 15:16
7.67.1
21d1070

Agent

Prelude

Released on: 2025-07-02

Enhancement Notes

  • Agents are now built with Go 1.23.10.

Bug Fixes

  • Fixes invalid logs compression error in DDOT, sets DDOT logs compression to gzip.

  • Permissions are no longer applied recursively to the Datadog installer data directory on Windows.

    This fixes an issue that causes Agent updates to restrict access to the .NET APM tracer libraries that were previously installed by the DD_APM_INSTRUMENTATION_LIBRARIES option, preventing them from being loaded by IIS.

  • Fixes an issue in Install-Datadog.ps1 that could malform datadog.yaml and cause the Agent to fail to start. When datadog.yaml does not end with a new line the remote_updates option was incorrectly appended to the last line in the file instead of to a new line.
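
The datadog.yaml bug described in the last fix above comes down to appending a line to a file that lacks a trailing newline. The sketch below shows the safe pattern in Python for illustration only; the actual script is PowerShell, and the function name and the sample option values are assumptions.

```python
def append_option(yaml_text: str, option_line: str) -> str:
    """Append a config line, making sure it starts on its own line."""
    # If the existing content does not end with a newline, add one first so
    # the new option is not glued onto the last existing line.
    if yaml_text and not yaml_text.endswith("\n"):
        yaml_text += "\n"
    return yaml_text + option_line + "\n"


if __name__ == "__main__":
    # File without a trailing newline (the failing case described above).
    print(append_option("site: datadoghq.com", "remote_updates: true"))
```

Without the newline check, the result in the example would be the single malformed line "site: datadoghq.comremote_updates: true".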

Datadog Cluster Agent

Prelude

Released on: 2025-07-02 Pinned to datadog-agent v7.67.1: CHANGELOG.

7.67.0

18 Jun 13:48
7.67.0
bdf863c

Agent

Prelude

Released on: 2025-06-18

Upgrade Notes

  • Bump the Python version to 3.12.11
  • Upgraded JMXFetch to [0.49.7](https://github.com/DataDog/jmxfetch/releases/0.49.7), which switches from snakeyaml to snakeyaml-engine, adding support for YAML 1.2. See [0.49.7](https://github.com/DataDog/jmxfetch/releases/tag/0.49.7) for more details.
  • In order to avoid unnecessary DNS queries, the agent now uses FQDN when connecting to Datadog intakes. Specifically, it adds a trailing dot at the end of the Datadog intake hostnames. While most users may not notice this change, it can affect setups where connections between the agent and Datadog intakes are intercepted for deep packet inspection or TLS man-in-the-middle by proxies or firewalls. Users that have such a proxy or L7 firewall should ensure that the rules for agent connections to *.datadoghq.com hosts are also valid for connections to *.datadoghq.com. (with an additional trailing dot) hosts.
  • Update go-sqllexer to 0.1.6.
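
For the FQDN change in the upgrade notes above, the sketch below illustrates what the trailing dot looks like and how a proxy or firewall rule could be written to accept both forms. It is an illustration under assumptions: the function names and the example hostname are hypothetical, and real proxy products express such rules in their own configuration languages.

```python
def to_fqdn(hostname: str) -> str:
    """Add the trailing dot that marks a fully qualified domain name."""
    return hostname if hostname.endswith(".") else hostname + "."


def matches_datadog_rule(hostname: str) -> bool:
    """Hypothetical allow rule for *.datadoghq.com that tolerates the
    optional trailing dot."""
    normalized = hostname.rstrip(".").lower()
    return normalized.endswith(".datadoghq.com")


if __name__ == "__main__":
    host = to_fqdn("intake.datadoghq.com")  # hypothetical intake hostname
    print(host)
    print(matches_datadog_rule(host))
```

A rule that matches only the literal suffix ".datadoghq.com" would reject the dotted form, which is the situation the upgrade note warns about.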

New Features

  • In the Systemd core check add the option to use regular expressions to select units to monitor.

  • Added a new variable extra_dbm to Aurora Autodiscovery. This variable matches the value of the datadoghq.com/dbm tag on the database instance.

  • Released a new ddot-collector container image that packages the [Datadog Distribution of OpenTelemetry Collector](https://docs.datadoghq.com/opentelemetry/setup/ddot_collector/).

  • The MacOS Agent now supports the Network Path feature by including system-probe and the traceroute module.

  • Windows: Added the Windows Certificate Store integration to monitor the expiration of certificates in the local machine certificate store.

  • Introducing a new setting, collect_ec2_instance_info, to collect basic EC2 instance information as host tags. This reproduces some of the behaviors of the AWS integration for users that can't enable it. The [AWS integration](https://docs.datadoghq.com/integrations/amazon_web_services/) should still be used whenever possible, as it offers a better and more in-depth integration.

  • Feature parity between Python disk check and Go disk check. The new version of the disk check is disabled by default for now, but it will be enabled later on. It can be enabled by setting use_diskv2_check: true in your configuration.

  • Pretty-printed/multi-line JSON messages are now aggregated into a single line when auto multiline detection is enabled. This ensures the log is treated as a structured log when processed by Datadog. Aggregation can be disabled by setting logs_config.auto_multi_line.enable_json_aggregation to false.

  • Add a networkv2 check that is a port of the Python network check to Go. This version is disabled by default but can be enabled with use_networkv2_check in your configuration.

  • Adds a new diagnostic check that identifies firewall rules blocking SNMP traps and NetFlow traffic on Windows systems.

  • Enables support for NetPath on Windows client versions. To enable it, set tcp_method to syn_socket in the network_path.d configuration file.

  • SNMP integration now defaults to use the Core loader instead of Python.

  • A new core check, agentprofiling, has been introduced to automatically generate a flare with profiles when the Datadog Agent exceeds a configured memory or CPU usage threshold. When a valid config file is set, the Agent monitors its own memory and CPU usage and, upon crossing the threshold, generates a flare with profiles that is either saved locally or sent to a Zendesk ticket.

    This enhancement simplifies troubleshooting memory-related issues that are difficult to reproduce or time, allowing users to passively capture critical memory data without manual intervention.
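
The JSON aggregation feature described above (collapsing pretty-printed JSON into a single line) can be approximated with a small buffer-and-parse loop. This is a simplified sketch, not the Agent's algorithm: the function name and the heuristic of starting a buffer on a line beginning with "{" are assumptions.

```python
import json


def aggregate_json(lines):
    """Collapse multi-line JSON objects into single compact lines."""
    out, buf = [], []
    for line in lines:
        # Only start buffering on a line that looks like the start of an object.
        if not buf and not line.lstrip().startswith("{"):
            out.append(line)
            continue
        buf.append(line)
        try:
            parsed = json.loads("\n".join(buf))
        except json.JSONDecodeError:
            continue  # object not complete yet, keep buffering
        out.append(json.dumps(parsed, separators=(",", ":")))
        buf = []
    # Anything left never became valid JSON: emit it unchanged.
    out.extend(buf)
    return out


if __name__ == "__main__":
    raw = ["starting", "{", '  "level": "info",', '  "msg": "ok"', "}", "done"]
    for entry in aggregate_json(raw):
        print(entry)
```

A production implementation would also need a bound on how many lines it buffers; this sketch omits that for brevity.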

Enhancement Notes

  • APM: Improve the performance of the Trace Agent's QuantizePeerIPAddresses function, providing a marginal reduction in CPU usage for most workloads.
  • Added a new configuration option, ad_allowed_env_vars, which allows users to restrict which environment variables can be resolved in Autodiscovery check configurations. When set, only the environment variables listed are resolved.
  • Added a new configuration option, ad_disable_env_var_resolution, which lets users disable environment variable resolution in Autodiscovery check configurations.
  • Agents are now built with Go 1.23.9.
  • Enable HA support for Oracle integration.
  • Network devices autodiscovery now deduplicates devices based on their name, description and uptime with config flag use_deduplication.
  • Adds a compression_kind tag to the logs.encoded_bytes_sent telemetry metric, enabling aggregation and monitoring of log compression type usage during rollout.
  • The log agent now uses zstd compression as default for improved performance and reduced bandwidth usage. By default, zstd compression is used when no additional endpoints are configured.
  • Improved logging compression settings across different agent pipelines. Debug logs now clearly indicate whether compression settings are coming from pipeline-specific configuration, global logs configuration, or default fallback settings. This helps debug compression behavior across different pipelines.
  • Improved the behavior of the SQL obfuscator cache key computation. The cache key is now computed conditionally based on whether the cache is enabled.
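
The two Autodiscovery options above, ad_allowed_env_vars and ad_disable_env_var_resolution, can be pictured with the following sketch. It assumes the %%env_<NAME>%% template variable syntax; the function name, the choice to leave disallowed variables unresolved, and the treatment of a missing allowlist as "no restriction" are assumptions made for illustration and may differ from the Agent's behavior.

```python
import re

# Matches Autodiscovery-style environment template variables.
_ENV_VAR = re.compile(r"%%env_([A-Za-z0-9_]+)%%")


def resolve_env_vars(template, environ, allowed=None, disabled=False):
    """Resolve %%env_NAME%% placeholders subject to an optional allowlist."""
    if disabled:
        # Models ad_disable_env_var_resolution: nothing is resolved.
        return template

    def substitute(match):
        name = match.group(1)
        if allowed is not None and name not in allowed:
            # Assumption: disallowed variables are left untouched.
            return match.group(0)
        return environ.get(name, "")

    return _ENV_VAR.sub(substitute, template)


if __name__ == "__main__":
    env = {"DB_HOST": "db.internal", "DB_PASSWORD": "s3cret"}
    tpl = "host: %%env_DB_HOST%% password: %%env_DB_PASSWORD%%"
    print(resolve_env_vars(tpl, env, allowed={"DB_HOST"}))
```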

Known Issues

  • In rare cases, profiles generated by the Agent (including those triggered by the new agentprofiling check) may become corrupted. This is a known limitation of the underlying profile generation system and is not specific to this feature. Corrupted profiles are unusable for analysis. If profiles are still needed, Datadog recommends restarting the Agent and contacting Datadog support for assistance.

Deprecation Notes

  • The remote tagger for the process-agent is now always enabled and cannot be disabled. The process_config.remote_tagger config entry is removed.

Bug Fixes

  • APM: Fix an issue where the Trace-Agent socket could be deleted during an Agent upgrade by the previous Trace-Agent during shutdown.
  • Fixes an issue where the extra_dbname variable in the Aurora Discovery template would default to an empty string if no database name was specified in the cluster resource. It now correctly falls back to the engine's default database name.
  • Prevent the Python script used when installing the Agent RPM from leaving behind bytecode.
  • Do not drop the leading zeroes of the AWS account ID in the account_id tag.
  • Fix SBOM generation when container images are scanned using the overlayfs direct scan method (overlayfs_direct_scan: true).
  • Fix SNMP autodiscovery status to take into account ignored IP addresses.
  • Remove the FIPS Proxy status section from the Agent status page when running the FIPS Agent.
  • Increased the Agent GUI cookie persistence to one year. This ensures uninterrupted session continuity for users who configure an infinite session duration.
  • APM: Fix bug where agent status command would show zero traces being written out.
  • The Windows Agent MSI no longer fails if it is unable to delete temporary files related to extracting the embedded Python distribution.
  • The kubelet core check now respects the timeout parameter of the check configuration file.
  • Fixed potential compatibility issues with non-Datadog intakes by ensuring gzip compression is used when additional endpoints are configured.
  • Fixed event platform forwarder to use correct pipeline-specific compression settings instead of log endpoint settings. All non-log pipelines now default to zstd compression unless configured otherwise.
  • Use FQDNs when the Agent builds intake hostnames with DD_SITE to prevent generating as many DNS queries as there are entries in the search section of the /etc/resolv.conf file. If a full intake URL is explicitly set with the add_url parameter, the parameter is used as-is, and whether to use an FQDN remains the user's choice.
  • [oracle]: Set hostname for Oracle autonomous database.
  • [oracle]: Fix Active Connections with active_session_history: true.
  • Fix incorrect connection stats with active session history (ASH) sampling by sending each ASH snapshot in a separate payload.
  • If a metric transaction can't be sent to the endpoint, this transaction can be serialized to disk. When this occurs, the API key must be sanitized. This ensures that when an API key sourced from a secret is refreshed, the replacer continues to sanitize the new key.
  • Fix rare panic in the flush mechanism of the serverless logs pipeline
  • SNMP: Correctly decode strings with trailing 00s.
  • Avoid running the Agent MSI a second time when rolling back a remote upgrade on Windows.
  • Do not fail remote upgrade on Windows when the Agent service takes more than 3 minutes to start
  • Windows: Prevent unnecessary failing access to process memory when a process is protected
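
The default log compression behavior described in this release (zstd by default, gzip when additional endpoints are configured) reduces to a small decision. The sketch below is illustrative only: the function name and parameters are hypothetical, and the rule that an explicitly configured value takes priority is an assumption.

```python
from typing import Optional, Sequence


def select_log_compression(
    additional_endpoints: Sequence[str],
    configured: Optional[str] = None,
) -> str:
    """Pick the compression kind for the logs pipeline."""
    if configured:
        # Assumption: an explicit setting overrides the defaults.
        return configured
    # gzip keeps compatibility with non-Datadog intakes; otherwise use zstd.
    return "gzip" if additional_endpoints else "zstd"


if __name__ == "__main__":
    print(select_log_compression([]))
    print(select_log_compression(["https://logs.example.com"]))
```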

Other Notes

  • Add metrics origins for wlan integration.
  • The compression behavior is now also determined by the presence of additional endpoints:
    • When additional endpoints are configured: gzip compression is used
    • When no additional endpoints are configured: the default zstd compression is used
  • Add system. prefix to wlan.* metrics. Rename transmit_rate and receive_rate metrics to txrate and rxrate res...

7.66.1

04 Jun 08:28
7.66.1
7fd1e9a

Agent

Prelude

Released on: 2025-06-03

Bug Fixes

  • Fixes issue parsing pod list from kubelet when the InPlacePodVerticalScaling feature gate is enabled on the cluster.

Datadog Cluster Agent

Prelude

Released on: 2025-06-03 Pinned to datadog-agent v7.66.1: CHANGELOG.

7.66.0

22 May 13:31
7.66.0
8005fe1

Agent

Known issue

This version (and previous) of the Datadog Agent is not compatible with Kubernetes 1.33+ versions due to the Feature Gate InPlacePodVerticalScaling that became enabled by default. This flag modifies the kubelet /pods output preventing the correct behaviour of the Datadog Agent. The recommendation is to upgrade to Agent v7.66.1, which is fully compatible with the latest (and previous) Kubernetes versions. More details can be found in this issue.

Prelude

Released on: 2025-05-22

Upgrade Notes

  • If you use a custom Agent username and password on Windows with an Active Directory domain account and you want to remotely upgrade the Agent using Fleet Automation then you must provide the DDAGENTUSER_PASSWORD option when upgrading to 7.66 or later. For more information see the features release notes.
  • Breaking change: Added a new feature flag disable_operation_and_resource_name_logic_v2 in DD_APM_FEATURES that replaces enable_operation_and_resource_name_logic_v2. The new operation name logic for OTLP is now opt-out instead of opt-in.

New Features

  • Added a new WLAN check that monitors the Wi-Fi interface on the host system. This check is only available for macOS systems.

  • Fleet Automation now supports remote upgrades when using a custom Agent username and password on Windows.

    Windows stores the password as an encrypted LSA local private data object that is only accessible to local Administrators. Windows Service Manager stores service account passwords in the same location. For more information, see the Microsoft documentation on Storing Private Data and Private Data Objects.

    Uninstalling the Agent removes the encrypted password from the LSA.

    To avoid providing and manually managing the account password, consider using a Group Managed Service Account (gMSA). For more information, see Installing the Agent with a gMSA account.

  • Adds support for persisting non-core integrations during Agent upgrades on Windows platforms. To disable, set the INSTALL_PYTHON_THIRD_PARTY_DEPS="0" property during the installation of the MSI.

  • Adds the ability for the Agent to tail logs via the kubelet's API.

  • Support multiple authentication methods for a subnet in network devices autodiscovery.

  • Use cdpCacheSecondaryMgmtAddr and cdpCacheAddress for CDP topology links in case cdpCachePrimaryMgmtAddr is empty or of an unsupported type.

  • Enable language detection via tracers metadata

Enhancement Notes

  • Add a build option (--glibc, enabled by default) to build the Agent in a glibc environment. In other libc environments such as musl, the Agent should be built with the --no-glibc option. The option enables the system-probe GPU module and the corechecks GPU collector, which use github.com/NVIDIA/go-nvml, a library that depends on a glibc-extended definition.
  • OpenTelemetry instrumentation scope attributes are now converted into span attributes.
  • Utilize distributed senders and rtt fairness algorithms to improve logs pipeline throughput and fairness.
  • Adds | separator to the DogstatsD debug table (from agent dogstatsd-stats) when not requesting JSON output. This allows the table to render properly in markdown format.
  • Get the k8s cluster name from an AKS node label if present
  • APM: Improve debug logging for ignore_resources configuration by showing what rule resulted in a trace being ignored.
  • Added an option for the Oracle integration to template the database instance identifier.
  • The Oracle integration now supports the empty_default_hostname option to omit host from metrics
  • Added the exclude_hostname option to the Oracle integration
  • The OTLP receiver gRPC server in the trace agent now respects the config otlp_config.receiver.protocols.grpc.max_recv_msg_size_mib or env var DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_MAX_RECV_MSG_SIZE_MIB.
  • Added a new expvar "submission_error_count" on the process endpoint, shown during user status checks.
  • Reduce the memory footprint of the logs pipeline by eliminating unnecessary fields in the log payload.
  • The Windows Agent MSI now sets the Agent account password to the provided DDAGENTUSER_PASSWORD value when it is a local account. Previously, if the provided password did not match the account password, the Agent would fail to start.
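
    For the otlp_config.receiver.protocols.grpc.max_recv_msg_size_mib enhancement above, resolving the limit might look like the sketch below. It assumes the environment variable takes precedence over the configuration file value and that the setting is expressed in MiB, as the name suggests; the function name and signature are hypothetical and do not reflect the trace agent's actual code.

    ```python
    from typing import Mapping, Optional

    ENV_VAR = "DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_MAX_RECV_MSG_SIZE_MIB"


    def max_recv_msg_size_bytes(
        env: Mapping[str, str], config_value: Optional[object]
    ) -> Optional[int]:
        """Return the gRPC receive limit in bytes, or None if unset.

        Assumption: the environment variable overrides the config file value.
        """
        raw = env.get(ENV_VAR, config_value)
        if raw is None:
            return None
        mib = int(raw)  # accepts an integer (10) as well as a string ("10")
        return mib * 1024 * 1024  # MiB to bytes
    ```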

Bug Fixes

  • Addressed a bug in the cluster-agent API that prevented tag extraction for annotations from working due to client-side filtering. The fix was implemented in both the node-agent and the cluster-agent. Now, node-agent clients specify the annotations filter when querying the cluster-agent.
  • APM: Fix an issue where apm_config.ignore_resources only removes the root span instead of discarding the whole trace when using OTLP ingestion.
  • When using OTLP ingest with metrics, the instrumentation_scope_metadata_as_tags option now outputs the instrumentation_scope tag instead of the deprecated instrumentation_library tag.
  • Prevents an index out of range error when matching inspect layer digests to history layers on some images.
  • Fix clusterchecks dispatching on the Cloud Foundry Cluster Agent
  • Fixed an issue in the KSM check where, when using pod_collection_mode: node_kubelet, the Agent reported incorrect values for the kubernetes_state.container.status_report.count.waiting metric.
  • Fixes a bug where Agent OTLP ingestion fails to start when the config otlp_config.receiver.protocols.grpc.max_recv_msg_size_mib or env var DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_MAX_RECV_MSG_SIZE_MIB is set to a string.
  • Process Agent files are added to the flare archive instead of displaying "no session token provided".
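
    The apm_config.ignore_resources fix above amounts to discarding every span of a trace when its root span matches, rather than removing only the root. A minimal sketch, assuming a trace is a list of span dictionaries, that the root span is the one without a parent_id, and that ignore_resources entries are regular expressions matched against the root span's resource; the data layout and helper names are hypothetical.

    ```python
    import re
    from typing import Dict, List, Optional

    Span = Dict[str, object]
    Trace = List[Span]


    def root_span(trace: Trace) -> Optional[Span]:
        # Assumption: the root span has no (or a zero) parent_id.
        for span in trace:
            if not span.get("parent_id"):
                return span
        return None


    def filter_traces(traces: List[Trace], ignore_resources: List[str]) -> List[Trace]:
        """Drop whole traces whose root span resource matches an ignore rule."""
        patterns = [re.compile(p) for p in ignore_resources]
        kept = []
        for trace in traces:
            root = root_span(trace)
            if root is not None:
                resource = str(root.get("resource", ""))
                if any(p.search(resource) for p in patterns):
                    continue  # discard the entire trace, child spans included
            kept.append(trace)
        return kept
    ```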

Other Notes

  • Add a new Agent telemetry tag auth to the API telemetry metrics. This tag is used to evaluate the impact of reworking the authentication system for inter-process communication.
  • Add a new metric counter to the Agent telemetry for status rendering errors. This is used to detect potential issues with an ongoing change in the status rendering.

Datadog Cluster Agent

Prelude

Released on: 2025-05-22
Pinned to datadog-agent v7.66.0: CHANGELOG.

New Features

  • For workload selection in auto-instrumentation, users can now use the Kubernetes-native valueFrom as an alternative to value in ddTraceConfigs. This enables dynamic, user-defined, and label-based propagation of values, such as DD_SERVICE, to the tracing SDKs.
  • Collect EndpointSlice manifests in the orchestrator check.
  • Tag resources from Cluster Agent Orchestrator check with all static tags.

Bug Fixes

  • Fix major data races in the orchestrator check for the Kubernetes resource collection.
  • Fix data race in autodiscovery cluster checks provider.
  • Fix data race in autodiscovery kube services provider.
  • Fixes an issue with auto-instrumentation where DD_SERVICE is sometimes inconsistent between containers and init containers.
  • The auto-instrumentation webhook no longer mutates the istio-proxy container. This fixes an issue with Kubernetes-native sidecars and the istio service mesh where a standard sidecar is moved to be the first init container by istio after it was mutated by auto-instrumentation.
  • The cluster-agent kubernetes_metadata API now supports client-specified annotation filtering. Clients can pass filters as repeated query parameters, for example '?filter=abc&filter=def'.
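
    A client could build the repeated filter parameters shown above as in the sketch below. Only the query string format comes from the release note; the helper name is hypothetical, and the endpoint path is deliberately left out.

    ```python
    from typing import List
    from urllib.parse import urlencode


    def build_filter_query(annotations: List[str]) -> str:
        """Return a query string with one 'filter' parameter per annotation."""
        if not annotations:
            return ""
        # A list of pairs keeps the order and repeats the 'filter' key.
        return "?" + urlencode([("filter", name) for name in annotations])
    ```

    Calling build_filter_query(["abc", "def"]) yields '?filter=abc&filter=def'; urlencode also percent-encodes characters that are not safe in a query string.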

7.65.2

14 May 14:58
7.65.2
0e9956b

Agent

Prelude

Release on: 2025-05-13

Bug Fixes

  • (datadog-fips-agent) Ensure the post-install script always rebuilds fipsmodule.cnf in case the embedded OpenSSL is updated.
  • The embedded OpenSSL on Windows no longer links against zlib (which wasn't included), preventing errors related to accidentally loading a version of zlib installed on the host.
  • On Windows, restarting the datadogagent service now also restarts the Datadog Installer service to ensure configuration changes take effect.

Datadog Cluster Agent

Prelude

Released on: 2025-05-13
Pinned to datadog-agent v7.65.2: CHANGELOG.

Bug Fixes

  • Fix wrong computation of the init container resources in the autoinstrumentation webhook.