Releases: apache/skywalking

v10.1.0

29 Sep 13:30

Download

https://skywalking.apache.org/downloads/

Notice

Don't download the source code from this page.
Please follow the build document if you want to build the source code yourself.

A Version of PERFORMANCE

  • Huge UI Performance Improvement. Metrics widgets queries are bundled by leveraging the GraphQL capabilities.
  • Parallel Queries Support in GraphQL engine. Improve query performance.
  • Significantly improve the performance of OTEL metrics handler. Reduce CPU and GC costs in OTEL metrics processes.
  • By adopting BanyanDB 0.7, native database performance and stability are improved.

Project

  • E2E: bump up the version of the opentelemetry-collector to 0.102.1.
  • Push snapshot data-generator docker image to ghcr.io.
  • Bump up skywalking-infra-e2e to work around GHA removing docker-compose v1.
  • Bump up CodeQL GitHub Actions.
  • Fix wrong phase of delombok plugin to reduce build warnings.
  • Use ci-friendly revision to set the project version.

OAP Server

  • Fix wrong indices in the eBPF Profiling related models.
  • Support excluding traffic from specific namespaces in the eBPF Access Log receiver.
  • Add Golang as a supported language for Elasticsearch.
  • Remove unnecessary BanyanDB flushing logs(info).
  • Increase SW_CORE_GRPC_MAX_MESSAGE_SIZE to 50MB.
  • Support to query relation metrics through PromQL.
  • Support trace MQE query for debugging.
  • Add Component ID(158) for the Solon framework.
  • Fix metrics tag in HTTP handler of browser receiver plugin.
  • Increase alarm_record#message column length to 2000 from 200.
  • Remove alarm_record#message column indexing.
  • Add Python as a supported language for Pulsar.
  • Make more proper histogram buckets for the persistence_timer_bulk_prepare_latency,
    persistence_timer_bulk_execute_latency and persistence_timer_bulk_all_latency metrics in PersistenceTimer.
  • [Breaking Change] Update Nacos version to 2.3.2. Nacos 1.x servers can't serve as cluster coordinator or configuration server.
  • Support tracing trace query(SkyWalking and Zipkin) for debugging.
  • Fix BanyanDB metrics query: used the wrong Downsampling type to find the schema.
  • Support fetching Cilium flows to monitor network traffic between Cilium services.
  • Support labelCount function in the OAL engine.
  • Support BanyanDB internal measure query execution tracing.
  • BanyanDB client config: raise the default maxBulkSize to 10000, add flushTimeout and set its default to 10s.
  • Polish BanyanDB group and schema creation logic to fix the schema creation failure issue in distributed race conditions.
  • Support tracing topology query for debugging.
  • Fix expression of graph Current QPS in MySQL dashboard.
  • Support tracing logs query for debugging.
  • BanyanDB: fix Tag autocomplete data storage and query.
  • Support aggregation operators in PromQL query.
  • Update the kubernetes HTTP latency related metrics source unit from ns to ms.
  • Support BanyanDB internal stream query execution tracing.
  • Fix Elasticsearch, MySQL, RabbitMQ dashboards typos and missing expressions.
  • BanyanDB: Zipkin Module set service as Entity for improving the query performance.
  • MQE: check the metrics value before doing a binary operation to improve robustness.
  • Replace workaround with Armeria native supported context path.
  • Add an http endpoint wrapper for health check.
  • Bump up Armeria and transitive dependencies.
  • BanyanDB: if the model column is already a @BanyanDB.TimestampColumn, set @BanyanDB.NoIndexing on it to reduce indexes.
  • BanyanDB: stream sort-by time query, use internal time-series rather than index to improve the query performance.
  • Bump up graphql-java to 21.5.
  • Add an Unknown Node when the received Kubernetes peer address is not known in the current cluster.
  • Fix NPE thrown by PriorityQueue when CounterWindow is increased concurrently.
  • Fix formatting the endpoint name with an empty string.
  • Support async query for the composite GraphQL query.
  • Get endpoint list order by timestamp desc.
  • Support sort queries on metrics generated by eBPF receiver.
  • Fix the compatibility with Grafana 11 when using label_values query variables.
  • Nacos as config server and cluster coordinator supports configuration contextPath.
  • Update the endpoint name format to <Method>:<Path> in eBPF Access Log Receiver.
  • Add self-observability metrics for OpenTelemetry receiver.
  • Support service level metrics aggregate when missing pod context in eBPF Access Log Receiver.
  • Fix query getGlobalTopology throwing an exception when no services are found for the given Layer.
  • Fix the previous analysis result missing in the ALS k8s-mesh analyzer.
  • Fix findEndpoint query requires keyword when using BanyanDB.
  • Support analyzing the ztunnel mapped IP address and mTLS mode in eBPF Access Log Receiver.
  • Adapt BanyanDB Java Client 0.7.0.
  • Add SkyWalking Java Agent self observability dashboard.
  • Add Component ID(5022) for the GoFrame framework.
  • Bump up protobuf java dependencies to 3.25.5.
  • BanyanDB: support using native term searching for keyword in query findEndpoint and getAlarm.
  • BanyanDB: support TLS connection and configuration.
  • PromQL service: query API supports RFC3339 time format.
  • Improve the performance of OTEL metrics handler.
  • PromQL service: fix operators result missing rangeExpression flag.
  • BanyanDB: use TimestampRange to improve "events" query for BanyanDB.
  • Optimize network_address_alias table to reduce the number of the index.
  • PromQL service: support round brackets operator.
  • Support query Alarm message Tag for auto-complete.
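
As an illustration of the RFC 3339 time format that the PromQL query API item above refers to, the following Python sketch formats timestamps and builds a query string. The parameter names (`query`, `start`, `end`, `step`) follow the Prometheus HTTP API convention and are an assumption here, as is the helper naming; these notes do not specify them.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Normalize to UTC and use the 'Z' suffix allowed by RFC 3339.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_range_query(expr: str, start: datetime, end: datetime, step: str) -> str:
    """Build a URL-encoded query string (parameter names are assumed)."""
    return urlencode(
        {
            "query": expr,
            "start": to_rfc3339(start),
            "end": to_rfc3339(end),
            "step": step,
        }
    )
```

The timestamps are normalized to UTC before formatting so that the same instant always produces the same string, regardless of the caller's local offset.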

UI

  • Highlight search log keywords.
  • Add Error URL in the browser log.
  • Add a SolonMVC icon.
  • Adding cilium icon and i18n for menu.
  • Fix the mismatch between the unit and calculation of the "Network Bandwidth Usage" widget in Windows-Service Dashboard.
  • Make a maximum 20 entities per query in service/instance/endpoint list widgets.
  • Polish error nodes in trace widget.
  • Introduce flame graph to the trace profiling.
  • Correct services and instances when changing page numbers.
  • Improve metric queries to make page opening brisker.
  • Bump up dependencies to fix CVEs.
  • Add a loading view for initialization page.
  • Fix a bug for selectors when clicking the refresh icon.
  • Fix health check to OAP backend.
  • Add Service, ServiceInstance, Endpoint dashboard forwarder to Kubernetes Topologies.
  • Fix pagination for service/instance list widgets.
  • Add queries for alarm tags.
  • Add skywalking java agent self observability menu.

Documentation

  • Update the version description supported by zabbix receiver.
  • Move the Official Dashboard docs to marketplace docs.
  • Add marketplace introduction docs under quick start menu to reduce the confusion of finding feature docs.
  • Update Windows Metrics(Swap -> Virtual Memory)

All issues and pull requests are here

10.0.1

30 May 03:42

Download

https://skywalking.apache.org/downloads/

Notice

Don't download the source code from this page.
Please follow the build document if you want to build the source code yourself.

Project

  • Add SBOM (Software Bill of Materials) to the project.

OAP Server

  • Fix LAL test query api.
  • Add component libraries of Derby/Sybase/SQLite/DB2/OceanBase jdbc driver.
  • Fix setting the wrong interval to day level measure schema in BanyanDB installation process.

UI

  • Fix widget title and tips.
  • Fix statistics span data.
  • Fix browser log display.
  • Fix the topology layout when there are multiple independent network relationships.

All issues and pull requests are here

10.0.0

13 May 01:02

Download

https://skywalking.apache.org/downloads/

Notice

Don't download the source code from this page.
Please follow the build document if you want to build the source code yourself.

Service Hierarchy

[Images: Service Hierarchy, Hierarchy Graph]

Run with BanyanDB 0.6 in the Cluster Mode

Project

  • Support Java 21 runtime.
  • Support oap-java21 image for Java 21 runtime.
  • Upgrade OTEL collector version to 0.92.0 in all e2e tests.
  • Switch CI macOS runner to m1.
  • Upgrade PostgreSQL driver to 42.4.4 to fix CVE-2024-1597.
  • Remove CLI(swctl) from the image.
  • Remove CLI_VERSION variable from Makefile build.
  • Add BanyanDB to docker-compose quickstart.
  • Bump up Armeria, jackson, netty, jetcd and grpc to fix CVEs.
  • Bump up BanyanDB Java Client to 0.6.0.

OAP Server

  • Add layer parameter to the global topology graphQL query.
  • Add is_present function in MQE to check whether the list metrics has a value or not.
  • Remove unreasonable default configurations for gRPC thread executor.
  • Remove gRPCThreadPoolQueueSize (SW_RECEIVER_GRPC_POOL_QUEUE_SIZE) configuration.
  • Allow excluding ServiceEntries in some namespaces when looking up ServiceEntries as a final resolution method of
    service metadata.
  • Set up the length of source and dest IDs in relation entities of service, instance, endpoint, and process to 250(was
    200).
  • Support build Service/Instance Hierarchy and query.
  • Change the string field in Elasticsearch storage from keyword type to text type if its length is set to more than 32766.
  • [Breaking Change] Change the configuration field of ui_template and ui_menu in Elasticsearch storage from keyword type to text.
  • Support Service Hierarchy auto matching, add auto matching layer relationships (upper -> lower) as following:
    • MESH -> MESH_DP
    • MESH -> K8S_SERVICE
    • MESH_DP -> K8S_SERVICE
    • GENERAL -> K8S_SERVICE
  • Add namespace suffix for K8S_SERVICE_NAME_RULE/ISTIO_SERVICE_NAME_RULE and metadata-service-mapping.yaml as default.
  • Allow using a dedicated port for ALS receiver.
  • Fix log query by traceId in JDBCLogQueryDAO.
  • Support handling the eBPF access log protocol.
  • Fix SumPerMinFunctionTest error function.
  • Remove unnecessary annotations and functions from Meter Functions.
  • Add max and min functions for MAL down sampling.
  • Fix critical bug of uncontrolled memory cost of TopN statistics. Change topN group key from StorageId to entityId + timeBucket.
  • Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
    • MYSQL -> K8S_SERVICE
    • POSTGRESQL -> K8S_SERVICE
    • SO11Y_OAP -> K8S_SERVICE
    • VIRTUAL_DATABASE -> MYSQL
    • VIRTUAL_DATABASE -> POSTGRESQL
  • Add Golang as a supported language for AMQP.
  • Support available layers of service in the topology.
  • Add count aggregation function for MAL
  • Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
    • NGINX -> K8S_SERVICE
    • APISIX -> K8S_SERVICE
    • GENERAL -> APISIX
  • Add Golang as a supported language for RocketMQ.
  • Support Apache RocketMQ server monitoring.
  • Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
    • ROCKETMQ -> K8S_SERVICE
    • VIRTUAL_MQ -> ROCKETMQ
  • Fix ServiceInstance in query.
  • Mock /api/v1/status/buildinfo for PromQL API.
  • Fix table exists check in the JDBC Storage Plugin.
  • Fix day-based table rolling time range strategy in JDBC storage.
  • Add maxInboundMessageSize (SW_DCS_MAX_INBOUND_MESSAGE_SIZE) configuration to change the max inbound message size of DCS.
  • Fix Service Layer when building Events in the EventHookCallback.
  • Add Golang as a supported language for Pulsar.
  • Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
    • RABBITMQ -> K8S_SERVICE
    • VIRTUAL_MQ -> RABBITMQ
  • Remove Column#function mechanism in the kernel.
  • Make query readMetricValue always return the average value of the duration.
  • Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
    • KAFKA -> K8S_SERVICE
    • VIRTUAL_MQ -> KAFKA
  • Support ClickHouse server monitoring.
  • Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
    • CLICKHOUSE -> K8S_SERVICE
    • VIRTUAL_DATABASE -> CLICKHOUSE
  • Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
    • PULSAR -> K8S_SERVICE
    • VIRTUAL_MQ -> PULSAR
  • Add Golang as a supported language for Kafka.
  • Support displaying the port services listen to from OAP and UI during server start.
  • Refactor data-generator to support generating metrics.
  • Fix AvgHistogramPercentileFunction legacy name.
  • [Breaking Change] Labeled Metrics support multiple labels.
    • Storage: store all label names and values instead of only the values.
    • MQE:
      • Support querying by multiple labels(name and value) instead of using _ as the anonymous label name.
      • aggregate_labels function supports aggregating by specific labels.
      • relabels function requires target label and rename label name and value.
    • PromQL:
      • Support querying by multiple labels(name and value) instead of using labels as the anonymous label name.
      • Remove general labels labels/relabels/label function.
      • API /api/v1/labels and /api/v1/label/<label_name>/values support return matched metrics labels.
    • OAL:
      • Deprecate percentile function and introduce percentile2 function instead.
  • Bump up Kafka to fix CVE.
  • Fix NullPointerException in Istio ServiceEntry registry.
  • Remove unnecessary componentIds as series ID in the ServiceRelationClientSideMetrics and ServiceRelationServerSideMetrics entities.
  • Fix no error being thrown when part of the expression does not match any expression node in MQE and PromQL.
  • Remove kafka-fetcher/default/createTopicIfNotExist as the creation is automatic since #7326 (v8.7.0).
  • Fix inaccurate Nginx service metrics.
  • Fix/Change Windows metrics name(Swap -> Virtual Memory)
    • memory_swap_free -> memory_virtual_memory_free
    • memory_swap_total -> memory_virtual_memory_total
    • memory_swap_percentage -> memory_virtual_memory_percentage
  • Fix/Change UI init setting for Windows Swap -> Virtual Memory
  • Fix Memory Swap Usage/Virtual Memory Usage display with UI init.(Linux/Windows)
  • Fix inaccurate APISIX metrics.
  • Fix inaccurate MongoDB Metrics.
  • Support Apache ActiveMQ server monitoring.
  • Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
    • ACTIVEMQ -> K8S_SERVICE
  • Calculate Nginx service HTTP Latency by MQE.
  • MQE query: make metadata not return null.
  • MQE labeled metrics Binary Operation: return empty value if the labels not match rather than report error.
  • Fix inaccurate Hierarchy of RabbitMQ Server monitoring metrics.
  • Fix inaccurate MySQL/MariaDB, Redis, PostgreSQL metrics.
  • Support DoubleValue,IntValue,BoolValue in OTEL metrics attributes.
  • [Breaking Change] gRPC metrics exporter unified the metric value type and supports labeled metrics.
  • Add component definition(ID=152) for c3p0(JDBC3 Connection and Statement Pooling).
  • Fix MQE top_n global query.
  • Fix inaccurate Pulsar and Bookkeeper metrics.
  • MQE support sort_values and sort_label_values functions.
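
The Service Hierarchy auto matching relationships listed above (upper -> lower) can be read as a directed graph. The Python sketch below collects the pairs from these notes into a lookup table and walks it to find every layer reachable below a given layer. The helper names are hypothetical, and this only illustrates the listed relationships; it is not how the OAP server implements matching.

```python
from collections import defaultdict

# (upper, lower) pairs copied from the release notes above.
RELATIONS = [
    ("MESH", "MESH_DP"), ("MESH", "K8S_SERVICE"), ("MESH_DP", "K8S_SERVICE"),
    ("GENERAL", "K8S_SERVICE"), ("MYSQL", "K8S_SERVICE"),
    ("POSTGRESQL", "K8S_SERVICE"), ("SO11Y_OAP", "K8S_SERVICE"),
    ("VIRTUAL_DATABASE", "MYSQL"), ("VIRTUAL_DATABASE", "POSTGRESQL"),
    ("NGINX", "K8S_SERVICE"), ("APISIX", "K8S_SERVICE"), ("GENERAL", "APISIX"),
    ("ROCKETMQ", "K8S_SERVICE"), ("VIRTUAL_MQ", "ROCKETMQ"),
    ("RABBITMQ", "K8S_SERVICE"), ("VIRTUAL_MQ", "RABBITMQ"),
    ("KAFKA", "K8S_SERVICE"), ("VIRTUAL_MQ", "KAFKA"),
    ("CLICKHOUSE", "K8S_SERVICE"), ("VIRTUAL_DATABASE", "CLICKHOUSE"),
    ("PULSAR", "K8S_SERVICE"), ("VIRTUAL_MQ", "PULSAR"),
    ("ACTIVEMQ", "K8S_SERVICE"),
]


def build_index(relations):
    """Map each upper layer to the set of its direct lower layers."""
    index = defaultdict(set)
    for upper, lower in relations:
        index[upper].add(lower)
    return index


def reachable_lower_layers(index, layer):
    """Return every layer reachable below `layer`, sorted by name."""
    seen = set()
    stack = [layer]
    while stack:
        current = stack.pop()
        for lower in index.get(current, ()):
            if lower not in seen:
                seen.add(lower)
                stack.append(lower)
    return sorted(seen)


INDEX = build_index(RELATIONS)
```

For example, `reachable_lower_layers(INDEX, "MESH")` walks from MESH to MESH_DP and K8S_SERVICE, while a layer that only appears on the lower side, such as K8S_SERVICE, yields an empty list.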

UI

  • Fix the mismatch between the unit and calculation of the "Network Bandwidth Usage" widget in Linux-Service Dashboard.
  • Add theme change animation.
  • Implement the Service and Instance hierarchy topology.
  • Support Tabs in the widget visible when MQE expressions.
  • Support search on Marketplace.
  • Fix default route.
  • Fix layout on the Log widget.
  • Fix Trace associates with Log widget.
  • Add isDefault to the dashboard configuration.
  • Add expressions to dashboard configurations on the dashboard list page.
  • Update Kubernetes related UI templates to adapt to data from the eBPF access log.
  • Fix dashboard K8S-Service-Root metrics expression.
  • Add dashboards for Service/Instance Hierarchy.
  • Fix MQE in dashboards when using Card widget.
  • Optimize tooltips style.
  • Fix resizing window causes the trace graph to display incorrectly.
  • Add the not found page(404).
  • Enhance VNode logic and support multiple Trace IDs in span's ref.
  • Add the layers field and associate layers dashboards for the service topology nodes.
  • Fix Nginx-Instance metrics to instance level.
  • Update tabs of the Kubernetes service page.
  • Add Airflow menu i18n.
  • Add Support for dragging in the trace panel.
  • Add workflow icon.
  • Metrics support multiple labels.
  • Support the SINGLE_VALUE for table widgets.
  • Remove the General metric mode and related logical code.
  • Remove metrics for unreal nodes in the topology.
  • Enhance the Trace widget for batch consuming spans.
  • Clean the unused elements in the UI-templates.

Documentation

  • Update the release doc to remove the announcement as the tests are through e2e rather than manually.
  • Update the release notification mail a little.
  • Polish docs structure. Move customization docs separately from the introduction docs.
  • Add webhook/gRPC hooks settings example for backend-alarm.md.
  • Begin the process of SWIP - SkyWalking Improvement Proposal.
  • Add `SWIP-1 Create and det...

9.7.0

01 Dec 18:19

Download

https://skywalking.apache.org/downloads/

Notice

Don't download the source code from this page.
Please follow the build document if you want to build the source code yourself.

Dark Mode

The default style mode is changed to dark mode; light mode is still available.

New Design Log View

A new design for the log view is currently available. Easier to locate the logs, and more space for the raw text.

Project

  • Bump Java agent to 9.1-dev in the e2e tests.
  • Bump up netty to 4.1.100.
  • Update Groovy 3 to 4.0.15.
  • Support packaging the project in JDK21. Compiler source and target remain in JDK11.

OAP Server

  • ElasticSearchClient: Add deleteById API.
  • Fix Custom alarm rules are overwritten by 'resource/alarm-settings.yml'
  • Support Kafka Monitoring.
  • Support Pulsar server and BookKeeper server Monitoring.
  • [Breaking Change] Elasticsearch storage merge all management data indices into one index management,
    including ui_template,ui_menu,continuous_profiling_policy.
  • Add a release mechanism for alarm windows when it is expired in case of OOM.
  • Fix Zipkin trace receiver response: make the HTTP status code from 200 to 202.
  • Update BanyanDB Java Client to 0.5.0.
  • Fix getInstances query in the BanyanDB Metadata DAO.
  • BanyanDBStorageClient: Add keepAliveProperty API.
  • Fix table exists check in the JDBC Storage Plugin.
  • Enhance extensibility of HTTP Server library.
  • Adjust AlarmRecord alarmMessage column length to 512.
  • Fix EventHookCallback build event: build the layer from Service's Layer.
  • Fix AlarmCore doAlarm: catch exception for each callback to avoid interruption.
  • Optimize queryBasicTraces in TraceQueryEsDAO.
  • Fix WebhookCallback send incorrect messages, add catch exception for each callback HTTP Post.
  • Fix AlarmRule expression validation: add labeled metrics mock data for check.
  • Support collect ZGC memory pool metrics.
  • Add a component ID for Netty-http (ID=151).
  • Add a component ID for Fiber (ID=5021).
  • BanyanDBStorageClient: Add define(Property property, PropertyStore.Strategy strategy) API.
  • Correct the file format and fix typos in the filenames for monitoring Kafka's e2e tests.
  • Support extract timestamp from patterned datetime string in LAL.
  • Support output key parameters in the booting logs.
  • Fix cannot query zipkin traces with annotationQuery parameter in the JDBC related storage.
  • Fix limit doesn't work for findEndpoint API in ES storage.
  • Isolate MAL CounterWindow cache by metric name.
  • Fix JDBC Log query order.
  • Change the DataCarrier IF_POSSIBLE strategy to use ArrayBlockingQueue implementation.
  • Change the policy of the queue(DataCarrier) in the L1 metric aggregate worker to IF_POSSIBLE mode.
  • Add self-observability metric metrics_aggregator_abandon to count the number of abandon metrics.
  • Support Nginx monitoring.
  • Fix BanyanDB Metadata Query: make query single instance/process return full tags to avoid NPE.
  • Replace go2sky with the Go agent in the E2E tests.
  • Replace Metrics v2 protocol with MQE in UI templates and E2E Test.
  • Fix incorrect apisix metrics otel rules.
  • Support Scratch The OAP Config Dump.
  • Support increase/rate function in the MQE query language.
  • Group service endpoints into _abandoned when endpoints have high
    cardinality.
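
The last item above, grouping high-cardinality endpoints into _abandoned, can be pictured with the following Python sketch. It keeps a per-service set of known endpoint names and maps any new name to `_abandoned` once a limit is reached. The class name, the limit parameter, and the first-come-first-kept policy are all assumptions made for illustration; these notes do not describe the actual criteria used by the OAP server.

```python
ABANDONED = "_abandoned"


class EndpointGrouper:
    """Hypothetical helper that caps distinct endpoint names per service."""

    def __init__(self, max_endpoints_per_service: int):
        # The limit is an illustrative parameter, not a documented setting.
        self._limit = max_endpoints_per_service
        self._seen = {}

    def name_for(self, service: str, endpoint: str) -> str:
        known = self._seen.setdefault(service, set())
        if endpoint in known:
            # Already tracked endpoints keep their own name.
            return endpoint
        if len(known) >= self._limit:
            # Too many distinct names: fold the new one into one bucket.
            return ABANDONED
        known.add(endpoint)
        return endpoint
```

With a limit of 2, the first two distinct endpoints of a service keep their names and a third is reported as `_abandoned`, which bounds the number of endpoint entities a single service can create.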

UI

  • Add new menu for kafka monitoring.
  • Fix independent widget duration.
  • Fix the display height of the link tree structure.
  • Replace the name by shortName on service widget.
  • Refactor: update pagination style. No visualization style change.
  • Apply MQE on K8s layer UI-templates.
  • Fix icons display in trace tree diagram.
  • Fix: update tooltip style to support multiple metrics scrolling view in a metrics graph.
  • Add a new widget to show jvm memory pool detail.
  • Fix: avoid querying data with empty parameters.
  • Add a title and a description for trace segments.
  • Add Netty icon for Netty HTTP plugin.
  • Add Pulsar menu i18n files.
  • Refactor Logs view.
  • Implement the Dark Theme.
  • Change UI templates for Text widgets.
  • Add Nginx menu i18n.
  • Fix the height for trace widget.
  • Polish list style.
  • Fix Log associate with Trace.
  • Enhance layout for broken Topology widget.
  • Fix calls metric with call type for Topology widget.
  • Fix changing metrics config for Topology widget.
  • Fix routes for Tab widget.
  • Remove OpenFunction(FAAS layer) relative UI templates and menu item.
  • Fix: change colors to match dark theme for Network Profiling.
  • Remove the description of OpenFunction in the UI i18n.
  • Reduce component chunks to improve page loading resource time.

Documentation

  • Separate storage docs to different files, and add an estimated timeline for BanyanDB(end of 2023).
  • Add topology configuration in UI-Grafana doc.
  • Add missing metrics to the OpenTelemetry Metrics doc.
  • Polish docs of Concepts and Designs.
  • Fix incorrect notes of slowCacheReadThreshold.
  • Update OAP setup and cluster coordinator docs to explain new booting parameters table in the logs, and how to setup
    cluster mode.

All issues and pull requests are here

9.6.0

03 Sep 14:33

Download

https://skywalking.apache.org/downloads/

Notice

Don't download the source code from this page.
Please follow the build document if you want to build the source code yourself.

New Alerting Kernel

  • MQE(Metrics Query Expression) and a new notification mechanism are supported.

Support Loki LogQL

  • Newly added support for Loki LogQL and Grafana Loki Dashboard for SkyWalking collected logs

WARNING

  • ElasticSearch 6 storage related tests are removed. It worked, but is no longer guaranteed because it has officially reached end of life.

Project

  • Bump up Guava to 32.0.1 to avoid the lib listed as vulnerable due to CVE-2020-8908. This API is never used.
  • Maven artifact skywalking-log-recevier-plugin is renamed to skywalking-log-receiver-plugin.
  • Bump up cli version 0.11 to 0.12.
  • Bump up the version of ASF parent pom to v30.
  • Make builds reproducible for automatic releases CI.

OAP Server

  • Add Neo4j component ID(112) language: Python.
  • Add Istio ServiceEntry registry to resolve unknown IPs in ALS.
  • Wrap deleteProperty API to the BanyanDBStorageClient.
  • [Breaking change] Remove matchedCounter from HttpUriRecognitionService#feedRawData.
  • Remove patterns from HttpUriRecognitionService#feedRawData and add max 10 candidates of raw URIs for each pattern.
  • Add component ID for WebSphere.
  • Fix AI Pipeline uri caching NullPointer and IllegalArgument Exceptions.
  • Fix NPE in metrics query when the metric does not exist.
  • Remove E2E tests for Istio < 1.15, ElasticSearch < 7.16.3, they might still work but are not supported as planned.
  • Scroll all results in ElasticSearch storage and refactor scrolling logics, including Service, Instance, Endpoint,
    Process, etc.
  • Improve Kubernetes coordinator to remove Terminating OAP Pods in cluster.
  • Support SW_CORE_SYNC_PERIOD_HTTP_URI_RECOGNITION_PATTERN and SW_CORE_TRAINING_PERIOD_HTTP_URI_RECOGNITION_PATTERN
    to control the period of training and sync HTTP URI recognition patterns. And shorten the default period to 10s for
    sync and 60s for training.
  • Fix ElasticSearch scroller bug.
  • Add component ID for Aerospike(ID=149).
  • Packages with name recevier are renamed to receiver.
  • BanyanDBMetricsDAO handles storeIDTag in multiGet for BanyanDBModelExtension.
  • Fix endpoint grouping-related logic and enhance the performance of PatternTree retrieval.
  • Fix metric session cache saving after batch insert when using mysql-connector-java.
  • Support dynamic UI menu query.
  • Add comment for docker/.env to explain the usage.
  • Fix wrong environment variable name SW_OTEL_RECEIVER_ENABLED_OTEL_RULES to right SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES.
  • Fix instance query in JDBC implementation.
  • Set the SW_QUERY_MAX_QUERY_COMPLEXITY default value to 3000(was 1000).
  • Accept length=4000 parameter value of the event. It was 2000.
  • Tolerate parameter value in illegal JSON format.
  • Update BanyanDB Java Client to 0.4.0
  • Support aggregate Labeled Value Metrics in MQE.
  • [Breaking change] Change the default label name in MQE from label to _.
  • Bump up grpc version to 1.53.0.
  • [Breaking change] Removed '&' symbols from shell scripts to avoid OAP server process running as a background process.
  • Revert part of #10616 to fix the unexpected changes: if there is no data we should return an array with 0s,
    but in #10616, an empty array is returned.
  • Cache all service entity in memory for query.
  • Bump up jackson version to 2.15.2.
  • Increase the default memory size to avoid OOM.
  • Bump up graphql-java to 21.0.
  • Add Echo component ID(5015) language: Golang.
  • Fix index out of bounds exception in aggregate_labels MQE function.
  • Support MongoDB Server/Cluster monitoring powered by OTEL.
  • Do not print configurations values in logs to avoid sensitive info leaked.
  • Create the latest index before retrieving indexes by aliases to avoid the 404 exception. This just prevents some interference from manual operations.
  • Add more Go VM metrics, as new skywalking-go agent provided since its 0.2 release.
  • Add component ID for Lock (ID=5016).
  • [Breaking change] Adjust the structure of hooks in the alarm-settings.yml. Support multiple configs for each hook types and specifying the hooks in the alarm rule.
  • Bump up Armeria to 1.24.3.
  • Fix BooleanMatch and BooleanNotEqualMatch doing Boolean comparison.
  • Support LogQL HTTP query APIs.
  • Add Mux Server component ID(5017) language: Golang.
  • Remove ElasticSearch 6.3.2 from our client lib tests.
  • Bump up ElasticSearch server 8.8.1 to 8.9.0 for latest e2e testing. 8.1.0, 7.16.3 and 7.17.10 are still tested.
  • Add OpenSearch 2.8.0 to our client lib tests.
  • Use listening mode for apollo implementation of dynamic configuration.
  • Add view_as_seq function in MQE for listing metrics in the given prioritized sequence.
  • Fix the wrong default value of k8sServiceNameRule if it's not explicitly set.
  • Improve PromQL to allow for multiple metric operations within a single query.
  • Fix MQE Binary Operation between labeled metrics and other type of value result.
  • Add component ID for Nacos (ID=150).
  • Support Compare Operation in MQE.
  • Fix the Kubernetes resource cache not refreshed.
  • Fix wrong classpath that might cause OOM in startup.
  • Enhance the serviceRelation in MAL by adding settings for the delimiter and component fields.
  • [Breaking change] Support MQE in the Alerting. The Alarm Rules configuration(alarm-settings.yml),
    add expression field and remove metrics-name/count/threshold/op/only-as-condition fields and remove composite-rules configuration.
  • Check results in ALS as per downstream/upstream instead of per log.
  • Fix GraphQL query listInstances not using endTime query
  • Do not start server and Kafka consumer in init mode.
  • Add Iris component ID(5018).
  • Add OTLP Tracing support as a Zipkin trace input.
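
Because this release corrects the environment variable name SW_OTEL_RECEIVER_ENABLED_OTEL_RULES to SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES, a deployment that still sets the old name may need updating. The Python sketch below scans an environment mapping for the old name. The two variable names come from the notes above; the function name and the message text are hypothetical.

```python
import os

# Old name -> corrected name, as stated in the release notes above.
RENAMED_VARIABLES = {
    "SW_OTEL_RECEIVER_ENABLED_OTEL_RULES": "SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES",
}


def find_legacy_settings(env):
    """Return a message for each old variable set without its replacement."""
    findings = []
    for old_name, new_name in RENAMED_VARIABLES.items():
        if old_name in env and new_name not in env:
            findings.append(f"{old_name} is set; rename it to {new_name}")
    return findings


if __name__ == "__main__":
    # Check the current process environment and print any findings.
    for message in find_legacy_settings(os.environ):
        print(message)
```

Passing the environment as a plain mapping keeps the check easy to run against a parsed `.env` file or a container spec as well as the live process environment.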

UI

  • Fix metric name browser_app_error_rate in Browser-Root dashboard.
  • Fix display name of endpoint_cpm for endpoint list in General-Service dashboard.
  • Implement customize menus and marketplace page.
  • Fix minTraceDuration and maxTraceDuration types.
  • Fix init minTime to Infinity.
  • Bump dependencies to fix vulnerabilities.
  • Add scss variables.
  • Fix the title of instance list and notices in the continue profiling.
  • Add a link to explain the expression metric, add units in the continue profiling widget.
  • Calculate string width to set Tabs name width.
  • [Breaking change] Removed '&' symbols from shell scripts to avoid web application server process running as a background process.
  • Reset chart label.
  • Fix service associates instances.
  • Remove node-sass.
  • Fix commit error on Windows.
  • Apply MQE on MYSQL, POSTGRESQL, REDIS, ELASTICSEARCH and DYNAMODB layer UI-templates.
  • Apply MQE on Virtual-Cache layer UI-templates
  • Apply MQE on APISIX, AWS_EKS, AWS_GATEWAY and AWS_S3 layer UI templates.
  • Apply MQE on RabbitMQ Dashboards.
  • Apply MQE on Virtual-MQ layer UI-templates
  • Apply MQE on Infra-Linux layer UI-templates
  • Apply MQE on Infra-Windows layer UI-templates
  • Apply MQE on Browser layer UI-templates.
  • Implement MQE on topology widget.
  • Fix getEndpoints keyword blank.
  • Implement a breadcrumb component as navigation.

Documentation

  • Add Go agent into the server agent documentation.
  • Add data unit description in the configuration of continuous profiling policy.
  • Remove storage extension doc, as it is expired.
  • Remove how to add menu doc, as SkyWalking supports marketplace and new backend-based setup.
  • Separate contribution docs to a new menu structure.
  • Add a doc to explain how to manage i18n.
  • Add a doc to explain OTLP Trace support.
  • Fix typo in dynamic-config-configmap.md.
  • Fix out-dated docs about Kafka fetcher.
  • Remove 3rd part fetchers from the docs, as they are not maintained anymore.

All issues and pull requests are here

9.5.0

18 Jun 16:20

Download

https://skywalking.apache.org/downloads/

Notice

Don't download the source code from this page.
Please follow the build document if you want to build the source code yourself.

New Topology Layout

Elasticsearch Server Monitoring

Project

  • Fix Duplicate class found due to the delombok goal.

OAP Server

  • Fix wrong layer of metric user error in DynamoDB monitoring.
  • ElasticSearch storage does not check field types when OAP running in no-init mode.
  • Support to bind TLS status as a part of component for service topology.
  • Fix component ID priority bug.
  • Fix component ID of topology overlap due to storage layer bugs.
  • [Breaking Change] Enhance JDBC storage through merging tables and managing day-based table rolling.
  • [Breaking Change] Sharding-MySQL implementations and tests are removed because the day-based rolling mechanism is now available by default.
  • Fix otel k8s-cluster rule add namespace dimension for MAL aggregation calculation(Deployment Status,Deployment Spec Replicas)
  • Support continuous profiling feature.
  • Support collect process level related metrics.
  • Fix K8sRetag reads the wrong k8s service from the cache due to a possible namespace mismatch.
  • [Breaking Change] Support cross-thread trace profiling. The data structure and query APIs are changed.
  • Fix PromQL HTTP API /api/v1/labels response missing service label.
  • Fix possible NPE when initializing IntList.
  • Support parsing PromQL expressions that have empty labels in the braces for metadata queries.
  • Support alarm metric OP !=.
  • Support indicating in metrics queries whether value == 0 represents an actual zero or no data.
  • Fix NPE when querying non-existent series indexes in ElasticSearch storage.
  • Support collecting memory buff/cache metrics in VM monitoring.
  • PromQL: Remove empty values from the query result, and fix the /api/v1/metadata param limit possibly causing an out-of-bounds error.
  • Support monitoring the total number metrics of k8s StatefulSet and DaemonSet.
  • Support Amazon API Gateway monitoring.
  • Bump up graphql-java to fix cve.
  • Bump up Kubernetes Java client.
  • Support Redis Monitoring.
  • Add component ID for amqp, amqp-producer and amqp-consumer.
  • Support no-proxy mode for aws-firehose receiver
  • Bump up armeria to 1.23.1
  • Support Elasticsearch Monitoring.
  • Fix PromQL HTTP API /api/v1/series response missing service label when matching metric.
  • Support ServerSide TopN for BanyanDB.
  • Add component ID for Jersey.
  • Remove OpenCensus support, the related codes and docs as it's sunsetting.
  • Support dynamic configuration of searchableTracesTags
  • Support exportErrorStatusTraceOnly to export only the error status trace segments through the Kafka channel.
  • Add component ID for Grizzly.
  • Fix potential NPE in Zipkin receiver when the Span is missing some fields.
  • Filter out unknown_cluster metric data.
  • Support RabbitMQ Monitoring.
  • Support Redis slow logs collection.
  • Fix data loss when querying continuous profiling task records.
  • Adapt the continuous profiling task query GraphQL.
  • Support Metrics Query Expression (MQE), which allows users to do simple query-stage calculation through the expression.
  • Deprecated metrics query v2 protocol.
  • Deprecated record query protocol.
  • Add component ID for go-redis.
  • Add OpenSearch 2.8.0 to test case.
  • Add ai-pipeline module.
  • Support HTTP URI formatting through ai-pipeline to do pattern recognition.
  • Add new HTTP URI grouping engine with benchmark.
  • [Breaking Change] Use the new HTTP URI grouping engine to replace the old regex based mechanism.
  • Support sumLabeled in MAL.
  • Migrate from kubernetes-client/java to fabric8 client.
  • Envoy ALS generated relation metrics treat HTTP status codes >= 400 as an error at the client side.
  • Add cause message field when querying continuous profiling tasks.

UI

  • Revert: cpm5d function. This feature is cancelled from backend.
  • Fix: alerting link breaks on the topology.
  • Refactor Topology widget to make it more hierarchical.
    1. Choose User as the first node.
    2. If User node is absent, choose the busiest node(which has the most calls of all).
    3. Do a left-to-right flow process.
    4. At the same level, list nodes from top to bottom in alphabetical order.
  • Fix filter ID when ReadRecords metric associates with trace.
  • Add AWS API Gateway menu.
  • Change trace profiling protocol.
  • Add Redis menu.
  • Optimize data types.
  • Support isEmptyValue flag for metrics query.
  • Add elasticsearch menu.
  • [Clean UI templates before upgrade] Set showSymbol: true, so that data points show on the Line graph.
    Please clean ui_template index in elasticsearch storage or table in JDBC storage.
  • [Clean UI templates before upgrade] UI templates: Simplify metric name with the label.
  • Add MQ menu.
  • Add Jersey icon.
  • Fix: set endpoint and instance selectors with url parameters correctly.
  • Bump up dependencies versions icons-vue 1.1.4, element-plus 2.1.0, nanoid 3.3.6, postcss 8.4.23
  • Add OpenTelemetry log protocol support.
  • [Breaking Change] Configuration key enabledOtelRules is renamed to enabledOtelMetricsRules and
    the corresponding environment variable is renamed to SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES.
  • Add grizzly icon.
  • Fix: the Instance List data display error.
  • Fix: set topN type to Number.
  • Support Metrics Query Expression (MQE), which allows users to do simple query-stage calculation through the expression.
  • Bump up zipkin ui dependency to 2.24.1.
  • Bump up vite to 4.0.5.
  • Apply MQE on General and Virtual-Database layer UI-templates.
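
The four layout rules listed for the refactored Topology widget in the UI notes above can be illustrated with a small sketch. This is not the UI's actual implementation: the function name layout_levels, the (source, target) call-pair input, and the use of "number of calls touching a node" as the measure of the busiest node are all assumptions made for illustration.

```python
from collections import defaultdict


def layout_levels(calls, user_node="User"):
    """Group topology nodes into left-to-right levels.

    calls: list of (source, target) pairs, one per call relation (assumed input shape).
    Returns a list of levels; each level is a list of node names in alphabetical order.
    Nodes that are not reachable from the first node are left out of this sketch.
    """
    nodes = sorted({name for pair in calls for name in pair})
    if not nodes:
        return []

    touching = defaultdict(int)   # how many calls touch each node (assumed "busiest" measure)
    outgoing = defaultdict(set)   # downstream neighbours of each node
    for source, target in calls:
        touching[source] += 1
        touching[target] += 1
        outgoing[source].add(target)

    # Rule 1: prefer the User node. Rule 2: otherwise take the busiest node.
    root = user_node if user_node in nodes else max(nodes, key=lambda n: touching[n])

    # Rule 3: walk the graph left to right, one level per hop.
    levels, seen, frontier = [], {root}, [root]
    while frontier:
        # Rule 4: nodes on the same level are listed alphabetically.
        levels.append(sorted(frontier))
        next_level = set()
        for node in frontier:
            next_level.update(t for t in outgoing[node] if t not in seen)
        seen |= next_level
        frontier = sorted(next_level)
    return levels
```

For example, with the calls User -> gateway, gateway -> orders, gateway -> billing, and orders -> db, the sketch places User first, then gateway, then billing and orders on the same level, then db.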

Documentation

  • Add Profiling related documentations.
  • Add SUM_PER_MIN to MAL documentation.
  • Make the log-related docs clearer and easier to extend for more formats.
  • Update the cluster management and advanced deployment docs.

All issues and pull requests are here

9.4.0

09 Mar 04:01

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.

PromQL and Grafana Support


Zipkin Lens UI Bundled


AWS S3 and DynamoDB monitoring


Project

  • Bump up Zipkin and Zipkin lens UI dependency to 2.24.0.
  • Bump up Apache parent pom version to 29.
  • Bump up Armeria version to 1.21.0.
  • Clean up maven pom.xmls.
  • Bump up Java version to 11.
  • Bump up snakeyaml to 2.0.

OAP Server

  • Add ServerStatusService in the core module to provide a new way to expose booting status to other modules.
  • Adds Micrometer as a new component.(ID=141)
  • Refactor session cache in MetricsPersistentWorker.
  • Cache enhancement - don't read new metrics from database in minute dimensionality.
    // When
    // (1) the time bucket of the server's latest stability status is provided
    //     1.1 the OAP has booted successfully
    //     1.2 the current dimensionality is in minute.
    //     1.3 the OAP cluster is rebalanced due to scaling
    // (2) the metrics are from the time after the timeOfLatestStabilitySts
    // (3) the metrics don't exist in the cache
    // the kernel should NOT try to load it from the database.
    //
    // Notice, about condition (2),
    // for the specific minute of booted successfully, the metrics are expected to load from database when
    // it doesn't exist in the cache.
  • Remove the offset of metric session timeout according to worker creation sequence.
  • Correct MetricsExtension annotations declarations in manual entities.
  • Support component IDs' priority in process relation metrics.
  • Remove abandon logic in MergableBufferedData, which caused unexpected no-update.
  • Fix LastUpdateTimestamp not being set, which caused the metrics session to expire.
  • Rename MAL rule spring-sleuth.yaml to spring-micrometer.yaml.
  • Fix memory leak in Zipkin API.
  • Remove the dependency of refresh_interval of ElasticSearch indices from elasticsearch/flushInterval config. Now,
    it uses core/persistentPeriod + 5s as refresh_interval for all indices instead.
  • Change elasticsearch/flushInterval to 5s(was 15s).
  • Optimize flushInterval of ElasticSearch BulkProcessor to avoid extra periodical flush in the continuous bulk streams.
  • Fix an unexpected dot being added when exp is a pure metric name and expPrefix != null.
  • Support monitoring MariaDB.
  • Remove measure/stream specific interval settings in BanyanDB.
  • Add global-specific settings used to override global configurations (e.g segmentIntervalDays, blockIntervalHours) in BanyanDB.
  • Use TTL-driven interval settings for the measure-default group in BanyanDB.
  • Fix wrong group of non time-relative metadata in BanyanDB.
  • Refactor StorageData#id to the new StorageID object from a String type.
  • Support multiple component IDs in the service topology level.
  • Add ElasticSearch.Keyword annotation to declare the target field type as keyword.
  • [Breaking Change] Column component_id of service_relation_client_side and service_relation_server_side have been replaced by component_ids.
  • Support priority definition in the component-libraries.yml.
  • Enhance service topology query. When there are multiple components detected from the server side,
    the component type of the node would be determined by the priority, which was random in the previous release.
  • Remove component_id from service_instance_relation_client_side and service_instance_relation_server_side.
  • Make the satellite E2E test more stable.
  • Add Istio 1.16 to test matrix.
  • Register ValueColumn as Tag for Record in BanyanDB storage plugin.
  • Bump up Netty to 4.1.86.
  • Remove unnecessary additional columns when storage is in logical sharding mode.
  • The cluster coordinator supports a watch mechanism for notifying RemoteClientManager and ServerStatusService.
  • Fix ServiceMeshServiceDispatcher overwriting the ServiceDispatcher debug file when SW_OAL_ENGINE_DEBUG is enabled.
  • Use groupBy and in operators to optimize topology query for BanyanDB storage plugin.
  • Support server status watcher for MetricsPersistentWorker to check whether the metrics require initialization.
  • Fix incorrect meter values when using the sumPerMinLabeled or sumHistogramPercentile MAL function.
  • Fix attached events not being displayed when querying traces in the Zipkin Lens UI.
  • Remove time_bucket for both Stream and Measure kinds in BanyanDB plugin.
  • Merge TIME_BUCKET of Metrics and Record into StorageData.
  • Support no layer in the listServices query.
  • Fix time_bucket of ServiceTraffic not set correctly in slowSql of MAL.
  • Correct the TopN record query DAO of BanyanDB.
  • Tweak interval settings of BanyanDB.
  • Support monitoring AWS Cloud EKS.
  • Bump BanyanDB Java client to 0.3.0-rc1.
  • Remove id tag from measures.
  • Add BanyanDB.MeasureField to mark a column as a BanyanDB Measure field.
  • Add BanyanDB.StoreIDTag to store a process's id for searching.
  • [Breaking Change] The supported version of ShardingSphere-Proxy is upgraded from 5.1.2 to 5.3.1. Due to the changes of ShardingSphere's API, versions before 5.3.1 are not compatible.
  • Add the eBPF network profiling E2E Test in the per storage.
  • Fix TCP service instances lacking instance properties like pod and namespace, which caused Pod log not to work for TCP workloads.
  • Add Python HBase happybase module component ID(94).
  • Fix gRPC alarm being unable to update settings from the dynamic configuration source.
  • Add batchOfBytes configuration to limit the size of bulk flush.
  • Add Python Websocket module component ID(7018).
  • [Optional] Optimize single trace query performance by customizing routing in ElasticSearch. SkyWalking trace segments and Zipkin spans are using trace ID for routing. This is OFF by default, controlled by storage/elasticsearch/enableCustomRouting.
  • Enhance OAP HTTP server to support HTTPS
  • Remove handler scan in the otel receiver and use manual initialization instead.
  • Add aws-firehose-receiver to support collecting AWS CloudWatch metrics (OpenTelemetry format). Note that HTTPS/TLS setup
    is not supported. Because AWS Firehose sends proxy-style requests
    (https://... instead of /aws/firehose/metrics), there must be a proxy (Nginx, Envoy, etc.) in front of the receiver.
  • Avoid Antlr dependency versions differing between compile time and runtime.
  • Now PrometheusMetricConverter#escapedName also supports converting / to _.
  • Add missing TCP throughput metrics.
  • Refactor @Column annotation, swap Column#name and ElasticSearch.Column#columnAlias and rename ElasticSearch.Column#columnAlias to ElasticSearch.Column#legacyName.
  • Add Python HTTPX module component ID(7019).
  • Migrate tests from junit 4 to junit 5.
  • Refactor http-based alarm plugins and extract common logic to HttpAlarmCallback.
  • Support Amazon Simple Storage Service (Amazon S3) metrics monitoring
  • Support processing Sum metrics with the AGGREGATION_TEMPORALITY_DELTA case.
  • Support Amazon DynamoDB monitoring.
  • Support prometheus HTTP API and promQL.
  • Scope in the Entity of Metrics query v1 protocol is no longer required and is corrected automatically. The scope is determined based on the metric itself.
  • Add explicit ReadTimeout for ConsulConfigurationWatcher to avoid IllegalArgumentException: Cache watchInterval=10sec >= networkClientReadTimeout=10000ms.
  • Fix DurationUtils.getDurationPoints exceed, when startTimeBucket equals endTimeBucket.
  • Support processing OpenTelemetry ExponentialHistogram metrics.
  • Add FreeRedis component ID(3018).
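
The cache enhancement conditions quoted in the OAP Server notes above (don't read new metrics from the database in minute dimensionality) can be read as a small decision function. The sketch below is only an interpretation of those comments, not the OAP kernel code: the function name, the parameter names, and the use of None for "no stability status provided" are hypothetical.

```python
from typing import Optional


def should_load_from_database(
    in_cache: bool,
    is_minute_dimensionality: bool,
    time_of_latest_stability: Optional[int],
    metric_time_bucket: int,
) -> bool:
    """Decide whether a metric missing from the session cache needs a database read.

    time_of_latest_stability: time bucket of the server's latest stability status
    (after a successful boot or a cluster rebalance), or None when not provided.
    """
    if in_cache:
        # The cached value is used; no database read is needed.
        return False
    if not is_minute_dimensionality:
        # The optimization only applies to the minute dimensionality.
        return True
    if time_of_latest_stability is None:
        # Condition (1) is not met: no stability status is available yet.
        return True
    # Condition (2): only metrics strictly after the stability time bucket skip the read.
    # Metrics of the stability minute itself are still loaded from the database.
    return metric_time_bucket <= time_of_latest_stability
```

With a stability time bucket of 202303090401, a metric for 202303090402 that is absent from the cache is treated as new and skips the database read, while a metric for 202303090401 is still loaded.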

UI

  • Add Zipkin Lens UI to webapp, and proxy it to context path /zipkin.
  • Migrate the build tool from vue cli to Vite4.
  • Fix the Instance Relation and Endpoint Relation dashboards not showing up.
  • Add Micrometer icon.
  • Update MySQL UI to support MariaDB.
  • Add AWS menu for supporting AWS monitoring.
  • Add missing FastAPI logo.
  • Update the log details page to support the formatted display of JSON content.
  • Fix build config.
  • Avoid being unable to drag process nodes for the first time.
  • Add node folder into ignore list.
  • Add ElPopconfirm to component types.
  • Add an iframe widget for zipkin UI.
  • Optimize graph tooltips to make them more friendly.
  • Bump json5 from 1.0.1 to 1.0.2.
  • Add websockets icon.
  • Implement independent mode for widgets.
  • Bump http-cache-semantics from 4.1.0 to 4.1.1.
  • Update menus for OpenFunction.
  • Add auto refresh to the independent mode for widgets.
  • Fix: clear trace ID on the Log and Trace widgets after using association.
  • Fix: reset duration for query conditions after time range changes.
  • Add AWS S3 menu.
  • Refactor: optimize side bar component to make it more friendly.
  • Fix: remove duplicate popup message for query result.
  • Add logo for HTTPX.
  • Refactor: optimize the attached events visualization in the trace widget.
  • Update BanyanDB client to 0.3.1.
  • Add AWS DynamoDB menu.
  • Fix: add auto period to the independent mode for widgets.
  • Optimize menus and add Windows monitoring menu.
  • Add a calculation for the cpm5dAvg.
  • Add a cpm5d calculation.
  • Fix data processing error in the eBPF profiling widget.
  • Support for double quotes in SlowSQL statements.
  • Fix: the wrong position of the menu when clicking the topology node.


9.3.0

04 Dec 03:44

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.

Metrics Association

Dashboard Pop-up Trace Query

APISIX Dashboard


Use Sharding MySQL as the Database


Virtual Cache Performance


Virtual MQ Performance


Project

  • Bump up the embedded swctl version in OAP Docker image.

OAP Server

  • Add component ID(133) for impala JDBC Java agent plugin and component ID(134) for impala server.
  • Use prepareStatement in H2SQLExecutor#getByIDs.(No function change).
  • Bump up snakeyaml to 1.32 for fixing CVE.
  • Fix DurationUtils.convertToTimeBucket missing date format verification.
  • Enhance LAL to support converting LogData to DatabaseSlowStatement.
  • [Breaking Change] Change the LAL script format(Add layer property).
  • Adapt ElasticSearch 8.1+, migrate from removed APIs to recommended APIs.
  • Support monitoring MySQL slow SQLs.
  • Support analyzing cache related spans to provide metrics and slow commands for cache services from client side
  • Optimize virtual database, fix dynamic config watcher NPE when default value is null
  • Remove physical index existing check and keep template existing check only to avoid meaningless retry wait
    in no-init mode.
  • Make sure the instance list is ordered in the TTL processor to prevent the TTL timer from never running.
  • Support monitoring PostgreSQL slow SQLs.
  • [Breaking Change] Support sharding MySQL database instances and tables by Shardingsphere-Proxy. SQL-Database requires removing tables log_tag/segment_tag/zipkin_query before OAP starts, if bump up from previous releases.
  • Fix meter functions avgHistogram, avgHistogramPercentile, avgLabeled, sumHistogram having data conflict when
    downsampling.
  • Force sorting of the readLabeledMetricsValues result in case the storage (database) doesn't return data consistent
    with the parameter list.
  • Fix the wrong watch semantics in Kubernetes watchers, which caused heavy traffic to the API server in some Kubernetes clusters. Use the Get State and Start at Most Recent semantic instead of Start at Exact, because the history of change events is not needed; see https://kubernetes.io/docs/reference/using-api/api-concepts/#semantics-for-watch.
  • Unify query services and DAOs codes time range condition to Duration.
  • [Breaking Change]: Remove prometheus-fetcher plugin, please use OpenTelemetry to scrape Prometheus metrics and
    set up SkyWalking OpenTelemetry receiver instead.
  • BugFix: histogram metrics sent to MAL should be treated as OpenTelemetry style, not Prometheus style:
    (-infinity, explicit_bounds[i]] for i == 0
    (explicit_bounds[i-1], explicit_bounds[i]] for 0 < i < size(explicit_bounds)
    (explicit_bounds[i-1], +infinity) for i == size(explicit_bounds)
    
  • Support Golang runtime metrics analysis.
  • Add APISIX metrics monitoring
  • Support skywalking-client-js reporting an empty service version and page path; set the default version to latest and
    the default page path to / (root). Fix the error
    fetching data (/browser_app_page_pv0) : Can't split endpoint id into 2 parts.
  • [Breaking Change] Limit the max length of a trace/log/alarm tag's key=value: set the max length of column tags
    in tables log_tag/segment_tag/alarm_record_tag, column query in zipkin_query, and column tag_value in tag_autocomplete to 256.
    SQL-Database requires altering the length of these columns or removing these tables before OAP starts when bumping up from previous releases.
  • Optimize the creation conditions of profiling task.
  • Lazy load the Kubernetes metadata and switch from event-driven to polling. Previously, watchers were set up to watch Kubernetes metadata changes, which works well when deployments change and SkyWalking can react to the changes in real time. However, when the cluster has many events (such as in a large cluster or a special Kubernetes engine like OpenShift), the requests sent from SkyWalking become unpredictable, i.e. SkyWalking might send massive requests to the Kubernetes API server, causing heavy load on the API server. This change switches from the watcher mechanism to a polling mechanism: SkyWalking polls the metadata at a specified interval, so that the requests sent to the API server are predictable (~10 requests every interval, 3 minutes) and the request count is constant regardless of the cluster's changes. With this change SkyWalking can't react to cluster changes immediately, but the delay is acceptable in this case.
  • Optimize the query time of tasks in ProfileTaskCache.
  • Fix metrics being put into the wrong slot of the window in the alerting kernel.
  • Support sumPerMinLabeled in MAL.
  • Bump up jackson databind, snakeyaml, grpc dependencies.
  • Support export Trace and Log through Kafka.
  • Add new config initialization mechanism of module provider. This is a ModuleManager lib kernel level change.
  • [Breaking Change] Support new records query protocol, rename the column named service_id to entity_id to support different entities.
    Please re-create top_n_database_statement index/table.
  • Remove improper self-obs metrics in JvmMetricsHandler(for Kafka channel).
  • gRPC stream canceling code is not logged as an error when the client cancels the stream. The client
    cancels the stream when the pod is terminated.
  • [Breaking Change] Change the way of loading MAL rules(support pattern).
  • Move k8s relative MAL files into /otel-rules/k8s.
  • [Breaking Change] Refactor service mesh protobuf definitions and split TCP-related metrics to individual definition.
  • Add TCP{Service,ServiceInstance,ServiceRelation,ServiceInstanceRelation} sources and split TCP-related entities out from
    original Service,ServiceInstance,ServiceRelation,ServiceInstanceRelation.
  • [Breaking Change] TCP-related source names are changed, fields of TCP-related sources are changed, please refer to the latest oal/tcp.oal file.
  • Do not log error logs when failed to create ElasticSearch index because the index is created already.
  • Add virtual MQ analysis for native traces.
  • Support Python runtime metrics analysis.
  • Support sampledTrace in LAL.
  • Support multiple rules with different names under the same layer of LAL script.
  • (Optimization) Reduce the buffer size(queue) of MAL(only) metric streams. Set L1 queue size as 1/20, L2 queue size as 1/2.
  • Support monitoring MySQL/PostgreSQL in the cluster mode.
  • [Breaking Change] Migrate to BanyanDB v0.2.0.
    • Adopt new OR logical operator for,
      1. MeasureIDs query
      2. BanyanDBProfileThreadSnapshotQueryDAO query
      3. Multiple Event conditions query
      4. Metrics query
    • Simplify Group check and creation
    • Partially apply UITemplate changes
    • Support index_only
    • Return CompletableFuture<Void> directly from BanyanDB client
    • Optimize data binary parse methods in *LogQueryDAO
    • Support different indexType
    • Support configuration for TTL and (block|segment) intervals
  • Elasticsearch storage: Provide system environment variable(SW_STORAGE_ES_SPECIFIC_INDEX_SETTINGS) and support specify the settings (number_of_shards/number_of_replicas) for each index individually.
  • Elasticsearch storage: Support update index settings (number_of_shards/number_of_replicas) for the index template after rebooting.
  • Optimize MQ Topology analysis. Use entry span's peer from the consumer side as source service when no producer instrumentation(no cross-process reference).
  • Refactor JDBC storage implementations to reuse logics.
  • Fix ClassCastException in LoggingConfigWatcher.
  • Support span attached event concept in Zipkin and SkyWalking trace query.
  • Support span attached events on Zipkin lens UI.
  • Force UTF-8 encoding in JsonLogHandler of kafka-fetcher-plugin.
  • Fix max length to 512 of entity, instance and endpoint IDs in trace, log, profiling, topN tables(JDBC storages). The value was 200 by default.
  • Add component IDs(135, 136, 137) for EventMesh server and client-side plugins.
  • Bump up Kafka client to 2.8.1 to fix CVE-2021-38153.
  • Remove lengthEnvVariable for Column as it never works as expected.
  • Add LongText to support longer logs persistent as a text type in ElasticSearch, instead of a keyword, to avoid length limitation.
  • Fix wrong system variable name SW_CORE_ENABLE_ENDPOINT_NAME_GROUPING_BY_OPENAPI. It was opaenapi.
  • Fix not-time-series model blocking OAP boots in no-init mode.
  • Fix ShardingTopologyQueryDAO.loadServiceRelationsDetectedAtServerSide invoking the backend without the serviceIds parameter.
  • Changed system variable SW_SUPERDATASET_STORAGE_DAY_STEP to SW_STORAGE_ES_SUPER_DATASET_DAY_STEP to be consistent with other ES storage related variables.
  • Fix ESEventQueryDAO missing metric_table boolQuery criteria.
  • Add default entity name(_blank) if absent ...
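
The OpenTelemetry-style bucket boundaries listed in the histogram BugFix entry above (upper bound inclusive, lower bound exclusive) can be checked with a short sketch. The helper names bucket_index and bucket_range are hypothetical and are not part of MAL; the sketch only reproduces the three interval rules from that entry.

```python
import math
from bisect import bisect_left


def bucket_index(value, explicit_bounds):
    """Return the index i of the bucket that contains value.

    Bucket i covers (explicit_bounds[i-1], explicit_bounds[i]], so a value equal
    to a bound belongs to the bucket whose upper edge is that bound.
    explicit_bounds must be sorted in ascending order.
    """
    # bisect_left finds the first bound that is >= value.
    return bisect_left(explicit_bounds, value)


def bucket_range(i, explicit_bounds):
    """Return (exclusive lower edge, upper edge) of bucket i.

    The upper edge is inclusive, except for the last bucket, which is open to +infinity.
    """
    lower = -math.inf if i == 0 else explicit_bounds[i - 1]
    upper = math.inf if i == len(explicit_bounds) else explicit_bounds[i]
    return lower, upper
```

With explicit_bounds = [10, 50, 100], a value of 10 falls into bucket 0, 10.5 into bucket 1, 100 into bucket 2, and 101 into the final bucket that is open to +infinity.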

9.2.0

01 Sep 12:54

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.

eBPF Network Profiling for K8s Pod


Event and Metrics Association


MySQL Server Monitoring


PostgreSQL Server Monitoring


Project

  • [Critical] Fix a low performance issue of metrics persistence in the ElasticSearch storage implementation. A single
    metric could wait for an unnecessary 7~10s (System Env Variable SW_STORAGE_ES_FLUSH_INTERVAL) in releases 8.8.0
    through 9.1.0.
  • Upgrade Armeria to 1.16.0, Kubernetes Java client to 15.0.1.

OAP Server

  • Add more entities for Zipkin to improve performance.
  • ElasticSearch: scroll id should be updated when scrolling as it may change.
  • Mesh: fix only last rule works when multiple rules are defined in metadata-service-mapping.yaml.
  • Support sending alarm messages to PagerDuty.
  • Support Zipkin kafka collector.
  • Add VIRTUAL detect type to Process for Network Profiling.
  • Add component ID(128) for Java Hutool plugin.
  • Add Zipkin query exception handler, response error message for illegal arguments.
  • Fix a NullPointerException in the endpoint analysis, which would cause missing MQ-related LocalSpan in the trace.
  • Add forEach, processRelation function to MAL expression.
  • Add expPrefix, initExp in MAL config.
  • Add component ID(7015) for Python Bottle plugin.
  • Remove legacy OAL percentile functions, p99, p95, p90, p75, p50 func(s).
  • Revert #8066. Keep all metrics persistent even if they hold the default value.
  • Skip loading UI templates if folder is empty or doesn't exist.
  • Optimize ElasticSearch query performance by using _mGet and physical index name rather than alias in these
    scenarios, (a) Metrics aggregation (b) Zipkin query (c) Metrics query (d) Log query
  • Support the NETWORK type of eBPF Profiling task.
  • Support sumHistogram in MAL.
  • [Breaking Change] Make the eBPF Profiling task support the service instance level.
    Index/table ebpf_profiling_task is required to be re-created when bumping up from previous releases.
  • Fix race condition in BanyanDB storage.
  • Support SUM_PER_MIN downsampling in MAL.
  • Support sumHistogramPercentile in MAL.
  • Add VIRTUAL_CACHE to Layer, to fix the icon of a conjectured Redis server not showing on the topology.
  • [Breaking Change] Elasticsearch storage merge all metrics/meter and records(without super datasets) indices into one
    physical index template metrics-all and records-all on the default setting.
    Provide system environment variable(SW_STORAGE_ES_LOGIC_SHARDING) to shard metrics/meter indices into
    multi-physical indices as the previous versions(one index template per metric/meter aggregation function).
    In the current one index mode, users still could choose to adjust ElasticSearch's shard
    number(SW_STORAGE_ES_INDEX_SHARDS_NUMBER) to scale out.
    More details please refer to New ElasticSearch storage option explanation in 9.2.0
    and backend-storage doc
  • [Breaking Change] Index/table ebpf_profiling_schedule added a new column ebpf_profiling_schedule_id,
    the H2/MySQL/TiDB/PostgreSQL storage users are required to re-create it when bumping up from previous releases.
  • Fix Zipkin trace query the max size of spans.
  • Add tls and https component IDs for Network Profiling.
  • Support Elasticsearch column alias for the compatibility between storage logicSharding model and no-logicSharding model.
  • Support MySQL monitoring.
  • Support PostgreSQL monitoring.
  • Fix query services by serviceId error when Elasticsearch storage SW_STORAGE_ES_QUERY_MAX_SIZE > 10000.
  • Support sending alarm messages to Discord.
  • Fix query history process data failure.
  • Optimize TTL mechanism for Elasticsearch storage, skip executed indices in one TTL rotation.
  • Add Kubernetes support module to share codes between modules and reduce calls to Kubernetes API server.
  • Bump up Kubernetes Java client to fix cve.
  • Adapt OpenTelemetry native metrics protocol.
  • [Breaking Change] rename configuration folder from otel-oc-rules to otel-rules.
  • [Breaking Change] rename configuration field from enabledOcRules to enabledOtelRules and
    environment variable name from SW_OTEL_RECEIVER_ENABLED_OC_RULES to SW_OTEL_RECEIVER_ENABLED_OTEL_RULES.
  • [Breaking Change] Fix JDBC TTL to delete additional tables data.
    SQL Database requires removing segment,segment_tag, logs, logs_tag, alarms, alarms_tag, zipkin_span, zipkin_query before OAP starts.
  • SQL Database: add @SQLDatabase.ExtraColumn4AdditionalEntity to support adding an extra column from parent to an additional table.
  • Add component ID(131) for Java Micronaut plugin
  • Add component ID(132) for Nats java client plugin

UI

  • Fix query conditions for the browser logs.
  • Implement a url parameter to activate tab index.
  • Fix the interval failing to clear when autoRefresh is switched off.
  • Optimize log tables.
  • Fix the log detail pop-up page not working.
  • Optimize table widget to hide the whole metric column when no metric is set.
  • Implement the Event widget. Remove event menu.
  • Fix span detail text overlap.
  • Add Python Bottle Plugin Logo.
  • Implement an association between widgets(line, bar, area graphs) with time.
  • Fix tag dropdown style.
  • Hide the copy button when db.statement is empty.
  • Fix legend metrics for topology.
  • Dashboard: Add metrics association.
  • Dashboard: Fix FaaS-Root document link and topology service relation dashboard link.
  • Dashboard: Fix Mesh-Instance metric Throughput.
  • Dashboard: Fix Mesh-Service-Relation metric Throughput
    and Proxy Sidecar Internal Latency in Nanoseconds (Client Response).
  • Dashboard: Fix Mesh-Instance-Relation metric Throughput.
  • Enhance associations for the Event widget.
  • Add event widgets in dashboard where applicable.
  • Fix the dashboard list search box not working.
  • Fix short time range.
  • Fix event widget incompatibility in Safari.
  • Refactor the tags component to support searching for tag keys and values.
  • Associate the log widget and the trace widget with each other, and remove log tables from the trace widget.
  • Add log widget to general service root.
  • Associate the event widget with the trace and log widget.
  • Add the MYSQL layer and update layer routers.
  • Fix query order for trace list.
  • Add a calculation to convert seconds to days.
  • Add Spring Sleuth dashboard to general service instance.
  • Support the process dashboard and create the time range text widget.
  • Fix picking calendar with a wrong time range and setting a unique value for dashboard grid key.
  • Add PostgreSQL to Database sub-menu.
  • Implement the network profiling widget.
  • Add Micronaut icon for Java plugin.
  • Add Nats icon for Java plugin.
  • Bump moment and @vue/cli-plugin-e2e-cypress.
  • Add Network Profiling for Service Mesh DP instance and K8s pod panels.

Documentation

  • Fix invalid links in release docs.
  • Clean up doc about event metrics.
  • Add a table for metric calculations in the ui doc.
  • Add an explanation for alerting kernel and its in-memory window mechanism.
  • Add more docs for widget details.
  • Update the alarm doc to introduce the configuration property key.
  • Fix dependency license's NOTICE and binary jar included issues in the source release.
  • Add eBPF CPU profiling doc.

All issues and pull requests are here

9.1.0

10 Jun 03:01

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.

eBPF profiling


On-demand Pod Log


Project

  • [IMPORTANT] Remove InfluxDB 1.x and Apache IoTDB 0.X as storage options, check details
    at here. Remove converter-moshi 2.5.0, influx-java 2.15,
    iotdb java 0.12.5, thrift 0.14.1, moshi 1.5.0, msgpack 0.8.16 dependencies. Remove InfluxDB and IoTDB relative codes
    and E2E tests.
  • Upgrade OAP dependencies zipkin to 2.23.16, H2 to 2.1.212, Apache Freemarker to 2.3.31, gRPC-java 1.46.0, netty to
    4.1.76.
  • Upgrade Webapp dependencies, spring-cloud-dependencies to 2021.0.2, logback-classic to 1.2.11
  • [IMPORTANT] Add BanyanDB storage implementation. Notice BanyanDB is currently under active development
    and SHOULD NOT be used in production cluster.

OAP Server

  • Add component definition(ID=127) for Apache ShenYu (incubating).
  • Fix Zipkin receiver: Decode spans error, missing Layer for V9 and wrong time bucket when generating Service and
    Endpoint.
  • [Refactor] Move SQLDatabase(H2/MySQL/PostgreSQL), ElasticSearch and BanyanDB specific configurations out of column.
  • Support BanyanDB global index for entities. Log and Segment record entities declare this new feature.
  • Remove unnecessary analyzer settings in columns of templates. Many were added due to analyzer's default value.
  • Simplify the Kafka Fetch configuration in cluster mode.
  • [Breaking Change] Update the eBPF Profiling task to the service level, please delete
    index/table: ebpf_profiling_task, process_traffic.
  • Fix the issue that an event can't split the service ID into 2 parts.
  • Fix OAP Self-Observability metric GC Time calculation.
  • Set SW_QUERY_MAX_QUERY_COMPLEXITY default value to 1000
  • Webapp module (for UI) enabled compression.
  • [Breaking Change] Add layer field to event; reporting an event without a layer is not allowed.
  • Fix ES flush thread stops when flush schedule task throws exception, such as ElasticSearch flush failed.
  • Fix ES BulkProcessor in BatchProcessEsDAO was initialized multiple times and created multiple ES flush schedule tasks.
  • HTTPServer supports registering handlers with allowed HTTP methods.
  • [Critical] Revert "Enhance DataCarrier#MultipleChannelsConsumer to add priority" to avoid consuming issues.
  • Fix the problem that some configurations (such as group.id) did not take effect due to the override order when using
    the kafkaConsumerConfig property to extend the configuration in Kafka Fetcher.
  • Remove build time from the OAP version.
  • Add data-generator module to run OAP in testing mode, generating mock data for testing.
  • Support receiving Kubernetes processes through the gRPC protocol.
  • Fix the problem that ES indices (TimeSeriesTable, e.g. endpoint_traffic, alarm_record) were not created even after
    rerunning in init-mode. This problem caused the OAP server to fail to start after it had been down for more than a day.
  • Support autocomplete tags in traces query.
  • [Breaking Change] Replace all configurations **_JETTY_** to **_REST_**.
  • Add the eBPF profiling support field to the process entity.
  • E2E: fix the log test missing verification of LAL and metrics.
  • Enhance Converter mechanism in kernel level to make BanyanDB native feature more effective.
  • Add TermsAggregation properties collect_mode and execution_hint.
  • Add "execution_hint": "map", "collect_mode": "breadth_first" for aggregation and topology query to improve 5-10x
    performance.
  • Clean up scroll contexts after use.
  • Support autocomplete tags in logs query.
  • Enhance the deprecated MetricQuery(v1) getValues query to run as asynchronous concurrent queries.
  • Fix the pod match error when the service has multiple selectors in a Kubernetes environment.
  • VM monitoring adapts the 0.50.0 of the opentelemetry-collector.
  • Add Envoy internal cost metrics.
  • Remove Layer concept from ServiceInstance.
  • Remove unnecessary onCompleted on gRPC onError callback.
  • Remove Layer concept from Process.
  • Update to list all eBPF profiling schedulers without duration.
  • Storage(ElasticSearch): add search options to tolerate nonexistent indices.
  • Fix the problem that MQ has the wrong Layer type.
  • Fix NoneStream model has wrong downsampling(was Second, should be Minute).
  • SQL Database: provide @SQLDatabase.AdditionalEntity to support creating additional tables from a model.
  • [Breaking Change] SQL Database: remove SQL Database config maxSizeOfArrayColumn and numOfSearchableValuesPerTag.
  • [Breaking Change] SQL Database: move Tags list from Segment,Logs,Alarms to their additional table.
  • [Breaking Change] Remove total field in Trace, Log, Event, Browser log, and alarm list query.
  • Support OFF_CPU eBPF Profiling.
  • Fix SumAggregationBuilder#build should use the SumAggregation rather than MaxAggregation.
  • Add TiDB, OpenSearch, Postgres storage options to Trace and eBPF Profiling E2E testing.
  • Add OFF CPU eBPF Profiling E2E Testing.
  • Fix searchableTag as rpc.status_code and http.status_code. status_code had been removed.
  • Fix scroll query failure exception.
  • Add profileDataQueryBatchSize config in Elasticsearch Storage.
  • Add APIs to query Pod log on demand.
  • Remove OAL for events.
  • Simplify the index name formatting logic in ES storage.
  • Add instance properties extractor in MAL.
  • Support Zipkin trace collection and the Zipkin trace query API.
  • [Breaking Change] Zipkin receiver mechanism changes and traces do not stream into OAP Segment anymore.
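
To illustrate the TermsAggregation items above, here is a minimal sketch of an Elasticsearch terms aggregation request body carrying the two hints named in this changelog, "execution_hint": "map" and "collect_mode": "breadth_first". The helper name build_terms_aggregation, the aggregation name by_term, and the field entity_id are hypothetical and chosen only for illustration; this is not the query the OAP server actually issues.

```python
import json


def build_terms_aggregation(field: str, size: int = 100) -> dict:
    """Build a search body with a single terms aggregation.

    Hypothetical helper: only the two hint keys and their values are
    taken from the changelog; the rest is illustrative.
    """
    return {
        "size": 0,  # return aggregation buckets only, no hits
        "aggs": {
            "by_term": {  # hypothetical aggregation name
                "terms": {
                    "field": field,
                    "size": size,
                    # Aggregate on field values directly instead of global ordinals.
                    "execution_hint": "map",
                    # Collect top-level buckets first, then compute sub-aggregations.
                    "collect_mode": "breadth_first",
                }
            }
        },
    }


if __name__ == "__main__":
    # "entity_id" is an assumed field name, not one confirmed by the changelog.
    print(json.dumps(build_terms_aggregation("entity_id"), indent=2))
```

The resulting dictionary can be serialized and sent as the body of a search request; whether these hints help depends on the cardinality of the field and the shape of the data.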

UI

  • General service instance: move Thread Pool from JVM to Overview, fix JVM GC Count calculation.
  • Add Apache ShenYu (incubating) component LOGO.
  • Show more metrics on service/instance/endpoint list on the dashboards.
  • Support average values of metrics on the service/list/endpoint table widgets, with pop-up linear graph.
  • Fix the viewLogs button query returning no data.
  • Fix UTC when page loads.
  • Implement the eBPF profile widget on dashboard.
  • Optimize the trace widget.
  • Avoid invalid query for topology metrics.
  • Add the alarm and log tag tips.
  • Fix spans details and task logs.
  • Verify query params to avoid invalid queries.
  • Mobile terminal adaptation.
  • Fix: set dropdown for the Tab widget, init instance/endpoint relation selectors, update sankey graph.
  • Add eBPF Profiling widget into General service, Service Mesh and Kubernetes tabs.
  • Fix jump to endpoint-relation dashboard template.
  • Fix set graph options.
  • Remove the Layer field from the Instance and Process.
  • Fix date time picker display when set hour to 0.
  • Implement tags auto-complete for Trace and Log.
  • Support multiple trees for the flame graph.
  • Fix unnecessary page re-rendering when the URL changes.
  • Remove unexpected data for exporting dashboards.
  • Fix duration time.
  • Remove the total field from query conditions.
  • Fix minDuration and maxDuration for the trace filter.
  • Add Log configuration for the browser templates.
  • Fix query conditions for the browser logs.
  • Add Spanish Translation.
  • Visualize the OFF CPU eBPF profiling.
  • Add Spanish language to UI.
  • Sort spans with startTime or spanId in a segment.
  • Visualize an on-demand log widget.
  • Fix activating the correct tab index after renaming a tab in the Tabs widget.
  • FaaS dashboard supports on-demand logs (OpenFunction/functions-framework-go version > 0.3.0).

Documentation

  • Add eBPF agent into probe introduction.

All issues and pull requests are here